CN103544255A - Text semantic relativity based network public opinion information analysis method - Google Patents
Text semantic relativity based network public opinion information analysis method
- Publication number
- CN103544255A (CN201310482522.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a text semantic relativity based network public opinion information analysis system. The system comprises a network public opinion information acquisition module, a public opinion information extraction module, a public opinion information preprocessing module, a public opinion information mining module and a public opinion information analysis module. The network public opinion information acquisition module is used for acquiring various public opinion information rich in content from a webpage. The public opinion information extraction module and the public opinion information preprocessing module are used for preliminarily filtering and segmenting the acquired public opinion information, extracting meta-information of a text part, creating a feature semantic network diagram of texts, and performing weighting computation and feature extraction to provide services for public opinion information mining. The public opinion information mining module is used for classifying the texts by adopting a semantic similarity based improved text clustering analysis method. The public opinion information analysis module is used for performing OLAP (on-line analytical processing) multi-dimensional statistics on mined data of the public opinion information, and analyzing public opinion evaluation indices to provide support for relevant public opinion decision making. By the system, the problem that semantic information of words in the texts is incomplete is solved, and clustering analysis and hot topic extraction of dynamic data in a large-scale network environment are realized efficiently.
Description
Technical field
The present invention relates to the field of network information technology, and specifically to a network public opinion information analysis method based on text semantic relevance.
Background
In today's society the Internet has penetrated daily life, and social platforms such as microblogs, forums, and blogs have become important channels through which people obtain information, express opinions, and spread information. Through these network platforms, public opinion information spreads rapidly and attracts wide attention; its speed, reach, and influence far exceed those of traditional media. The anonymity and interactivity of cyberspace, and its freedom from constraints of time and place, make network public opinion a powerful force that affects social development and stability. Positive network public opinion, like "positive energy", promotes social development; negative network public opinion harms social stability and can trigger a public opinion crisis. Strengthening the monitoring, analysis, and management of network public opinion information is therefore of practical importance for maintaining social order and building a harmonious society. Monitoring network public opinion information in time, making correct judgments and decisions, responding quickly, and taking effective measures to defuse public opinion crises have become the focus and the difficulty of network public opinion management.
Summary of the invention
In view of the characteristics of network public opinion information described in the background above and the problems to be solved in managing it, the invention provides a network public opinion information analysis method based on text semantic relevance.
The technical solution adopted by the invention is a network public opinion information analysis method based on text semantic relevance. The method uses a network public opinion information analysis system comprising a network public opinion information acquisition module, a public opinion information extraction module, a public opinion information preprocessing module, a public opinion information mining module, a public opinion information analysis module, and a public opinion information database, and comprises the following steps:
a. The network public opinion information acquisition module collects public opinion information from web pages and stores it in the public opinion information database;
b. The public opinion information extraction module and the public opinion information preprocessing module preliminarily filter and segment the public opinion information collected in step a and extract the content information contained in the text, providing data services for public opinion information mining;
c. On the basis of step b, the public opinion information mining module applies an improved text clustering analysis method based on semantic similarity, generates class descriptors, and screens the text information contained in the clustering result; it computes class features with a TF-IDF term-frequency feature calculation method based on feature statistics to obtain class feature words, selects nouns as candidate class feature words, sorts the candidates by weight, takes the candidates with larger weights as class keywords, and uses the semantic relations between class keywords to form the classification result; it identifies and establishes new network public opinion topics, and detects and tracks content related to existing topics;
d. Finally, the public opinion information analysis module performs OLAP multidimensional statistical analysis on the public opinion data mined in step c and analyzes public opinion evaluation indices such as the attention paid to topic content and the sentiment orientation of topics.
In step a, the public opinion information acquisition module collects from network public opinion information sources. Unlike a general web crawler, it not only crawls web pages but also formats the page content and extracts the topic and content of the public opinion; the resulting data is saved as txt or html files and stored in the public opinion information database. To avoid being blocked, the acquisition module combines three techniques: time-shared access, periodic changes of IP address, and single sign-on through a simulated browser. The acquisition module proceeds as follows: starting from the URLs of predefined topic-related web pages, it obtains the text information in each page and extracts new URLs from the current page into a queue, until the public opinion information satisfying the conditions has been collected and the URL queue is empty. The collected web page text is stored in the public opinion information database by field, for the public opinion information extraction module to call.
In step b, the public opinion information extraction module removes irrelevant content from the web page, such as advertisements, navigation information, pictures, copyright notices, and other noise data; extracts the meta-information of the body text that is useful for public opinion analysis; and reconstructs the text so that information representative of the topic is grouped together. After this extraction, the public opinion information preprocessing module performs Chinese word segmentation, stop-word filtering, named entity recognition, part-of-speech tagging, syntactic parsing, and feature word extraction on the collected sources, and builds a forward index and an inverted index. It then builds the text feature semantic network graph: the entities E contained in the text are the nodes of the graph, and the semantic relation between two entities is a directed edge. Semantic relations between entities, combined with term frequency information, give the node weights, and the weight of a directed edge represents the importance of the entity relation in the text. The entities E comprise thing entities NE, event entities VE, and event relation entities RE. Finally, the module counts the term frequency and document frequency of the text and performs feature word extraction, choosing the words that reflect the features of the text to represent it.
Text analysis such as text mining of network public opinion information and natural language processing starts with word segmentation. Drawing on domestic research in Chinese word segmentation, the method uses the word segmentation, part-of-speech tagging, and named entity recognition functions of the ICTCLAS Chinese lexical analysis system, developed by the Institute of Computing Technology, Chinese Academy of Sciences, to segment the public opinion text and extract words whose length is greater than two. After segmentation, stop words that do not help a computer understand the text are filtered out, and words with parts of speech such as noun, verb, adnoun, and verbal adjective are retained to obtain the candidate feature word set; this reduces the index size, improves retrieval efficiency, and improves accuracy. A forward index and an inverted index are built over the segmented text documents to support interactive user queries. After segmentation, part-of-speech tagging, and stop-word removal, the feature semantic network graph of the text is built, the term frequency and document frequency of the text are counted, and weighting and feature extraction are then carried out.
In step c, the public opinion information mining module works on the text data set produced by the information extraction module after the text collection has been preprocessed (Chinese word segmentation, stop-word filtering, and analysis of structured tag information). Using the text semantic feature description structure built from the text feature semantic network graph, it computes the semantic similarity between texts with a similarity evaluation method, builds a similarity matrix, and applies the improved text clustering analysis method based on semantic similarity to generate the clustering result. From the clustering result it generates class descriptors and screens the text information contained in the result. It computes class features with the TF-IDF term-frequency feature calculation method based on feature statistics to obtain candidate class feature words, selects nouns as candidates, sorts them by weight, determines the class keywords from the weight values, and uses the semantic relations between class keywords to form the classification result. The results are built into a knowledge base, which can also be configured to support text mining functions such as public opinion topic discovery and public opinion sentiment classification.
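The class keyword selection in step c can be illustrated with a minimal sketch. The source does not give the exact TF-IDF formula, so the smoothed weight below, the function name `class_keywords`, the `(word, pos)` input shape, and the `"n"` noun tag are all assumptions made for illustration, not the patented computation.

```python
import math
from collections import Counter

def class_keywords(clusters, top_k=3):
    """Pick up to top_k noun keywords per cluster by a TF-IDF style weight.

    clusters: dict mapping cluster id -> list of (word, pos) pairs.
    pos == "n" marks a noun (assumed tagging convention).
    """
    # Term frequency of nouns inside each cluster (nouns are the candidates).
    tf = {cid: Counter(w for w, pos in tokens if pos == "n")
          for cid, tokens in clusters.items()}
    # Number of clusters in which each candidate noun occurs.
    df = Counter(w for counts in tf.values() for w in counts)
    n = len(clusters)
    result = {}
    for cid, counts in tf.items():
        total = sum(counts.values()) or 1
        # Assumed smoothed weight: relative frequency * log((n + 1) / df).
        weights = {w: (c / total) * math.log((n + 1) / df[w])
                   for w, c in counts.items()}
        # Sort by descending weight; ties are broken alphabetically.
        ranked = sorted(weights, key=lambda w: (-weights[w], w))
        result[cid] = ranked[:top_k]
    return result
```

Each cluster is treated here as one document for the document-frequency count, so a noun that appears in every cluster receives a low weight and is unlikely to become a class keyword.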
In step d, the public opinion information analysis module performs OLAP multidimensional statistical analysis on the data mined in step c and stored in the public opinion information database. It analyzes public opinion evaluation indices such as topic attention, content sensitivity, propagation and diffusion, and publication influence, helping the relevant departments to grasp public opinion dynamics in time, release public opinion information at the right moment, and make correct decisions.
Compared with the prior art, the invention has the following beneficial effects:
1. Current network public opinion information is massive, dynamic, incomplete, and diverse in form, and existing analysis methods often ignore the correlations within the text content, which makes their results inaccurate. The invention builds a text feature semantic network graph model of the public opinion text, introducing semantic associations between words and links to the surrounding context into the text description structure; combined with the improved text clustering algorithm based on semantic similarity, it mines the contextually and semantically related content in public opinion text.
2. By building the text feature semantic network graph, the contextual relations between words in the text are organized into a directed graph of feature items and weights. This preserves the contextual structure of the words while strengthening their contextual semantics, describes the implicit semantic information and topic features of the text well, and solves the problem of missing word semantic information in the text.
3. The improved text clustering algorithm based on semantic similarity is suited to cluster analysis of dynamic data and discovery of hot public opinion topics in a large-scale network environment. By computing text semantic similarity and building a text semantic similarity matrix, it mines in depth the contextually and semantically related content in public opinion text and detects and tracks new topic events in time. Each class is represented by multiple centers, and the maximum similarity between a text and the centers of a class is taken as the similarity between the text and that class; this improves system efficiency, and the benefit to cluster analysis becomes more pronounced as the number of texts grows.
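The multi-center class representation in item 3 can be sketched as follows. The source says only that the maximum similarity over the centers of a class is used; the cosine measure, the vector inputs, the `threshold` value, and the function names are assumptions for illustration and stand in for the patent's semantic similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def class_similarity(text_vec, centers):
    """Similarity of a text to a class: the maximum over the class centers."""
    return max(cosine(text_vec, center) for center in centers)

def assign(text_vec, classes, threshold=0.5):
    """Return the id of the most similar class, or None when no class
    reaches the (hypothetical) threshold, which a caller could treat
    as a candidate new topic."""
    best_id, best_sim = None, threshold
    for class_id, centers in classes.items():
        sim = class_similarity(text_vec, centers)
        if sim >= best_sim:
            best_id, best_sim = class_id, sim
    return best_id
```

Comparing a text against a handful of centers per class, rather than against every member text, is what keeps the cost of assignment from growing with the size of each class.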
Description of the drawings
Fig. 1 is the workflow diagram of an embodiment of the network public opinion information analysis method based on text semantic relevance.
Detailed description
The invention is further described below with reference to the drawing and specific embodiments, but its embodiments are not limited to these.
As shown in Fig. 1, the method uses a network public opinion information analysis system comprising a network public opinion information acquisition module, a public opinion information extraction module, a public opinion information preprocessing module, a public opinion information mining module, a public opinion information analysis module, and a public opinion information database. Its processing flow is:
(1) Public opinion information acquisition
Network public opinion information sources are collected. Unlike a general web crawler, the acquisition module not only crawls web pages but also formats the page content and extracts the useful public opinion information, such as the topic and content of the public opinion; the resulting data is saved as txt or html files and written to the raw public opinion information database. The concrete steps are as follows. According to a preset acquisition strategy, starting from the URLs of multiple seed pages, the module sends HTTP requests (using the GET method) through the common ports, and the remote server returns an HTML document according to the request. After collecting all the information in the returned document, the acquisition module first saves it to a cache and then sends it to the database for storage, obtaining the text information in the web page. While obtaining web page text, it keeps extracting newly found hyperlink URLs from the current page to visit and discards hyperlink URLs that have already been visited. This loop repeats until the web page text satisfying the search strategy has been collected and the queue of URLs to visit is empty. The collected web page text is stored in the database by field, for the public opinion information extraction module to call.
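The acquisition loop described above can be sketched as a breadth-first traversal with a URL queue and a visited set. This is a minimal illustration under stated assumptions: `fetch` and `is_relevant` are hypothetical callbacks supplied by the caller (the sketch performs no network access itself), `max_pages` is an added safety limit not mentioned in the source, and links are assumed to be absolute URLs in double-quoted `href` attributes.

```python
import re
from collections import deque

# Simplified link pattern: double-quoted href values only (assumption).
HREF = re.compile(r'href="([^"]+)"', re.IGNORECASE)

def crawl(seed_urls, fetch, is_relevant, max_pages=100):
    """Collect relevant pages starting from seed URLs.

    fetch(url) -> HTML string, or None if the page cannot be retrieved.
    is_relevant(html) -> True if the page satisfies the collection strategy.
    """
    queue = deque(seed_urls)   # URLs waiting to be visited
    visited = set()            # URLs already visited, never fetched twice
    collected = {}             # url -> HTML of the pages that were kept
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        if is_relevant(html):
            collected[url] = html
        # Extract new hyperlinks from the current page and enqueue them.
        for link in HREF.findall(html):
            if link not in visited:
                queue.append(link)
    return collected
```

The loop ends when the queue is empty, which matches the stopping condition in the text; resolving relative links and the anti-blocking measures are left out of the sketch.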
The acquisition module usually adopts an anti-blocking strategy that combines several techniques, such as time-shared access, periodic changes of IP address, and single sign-on through a simulated browser. Many sites, such as forums, blogs, and microblogs, can be accessed only after the user logs in. Simulating a browser is the easier approach here: the WebBrowser control provided by Microsoft's .NET development tool Visual Studio 2008 is used to call the API of the Microsoft Internet Explorer browser, and SSO single sign-on is simulated by submitting the user name and password. Once the user login information has finished loading, the page jumps to the corresponding URL, keywords are submitted for retrieval, and the source file of the required web page is obtained.
The collected web page text information consists of two parts: Web content information, and Web structure and usage record information. Web content information includes textual content such as news headlines, body text, and comments; Web structure and usage record information includes statistics such as click counts, page views, and comment counts.
(2) Public opinion information extraction
The collected web page information contains noise data such as advertisements, navigation information, pictures, and copyright notices, whereas public opinion analysis really needs only the meta-information of the body text. The irrelevant content is discarded and the meta-information of the body text that is useful for public opinion analysis is extracted, in support of the subsequent mining and analysis of the text. The flow is as follows:
(2-1) First, the Tidy tool is used to normalize the HTML tags of the web page text. The HTML Parser tool is then used to build an HTML tree with the HTML tags as its nodes. This representation makes the HTML code easier to manage and manipulate and allows better structured mining of the code.
(2-2) Information such as title, keywords, body text, length, update time, and URL is extracted from the collected sources. The title is the text between the <TITLE> and </TITLE> tags; keywords are contained in the META tags in the head of the HTML file and can be extracted from the META tag information; time information can be extracted by pattern matching and web page analysis.
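The title and keyword extraction in (2-2) can be sketched with Python's standard `html.parser` module. This is an illustration, not the Tidy and HTML Parser tool chain named in the text; the class and function names are hypothetical, and the sketch assumes the keywords are comma-separated in a META tag whose `name` attribute is `keywords`.

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collects the <title> text and the META keywords of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def extract_meta(html):
    """Return (title, keywords) for an HTML string."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.keywords
```

Pages that separate keywords with other delimiters, such as full-width commas, would need an adjusted split.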
(2-3) The body text is extracted as follows: select suitable keywords, obtain the URLs of the relevant web pages, and fetch the HTML source of each page from the server at that URL; delete the useless tag lines in the page source, keeping the body content; replace paragraph marks in the HTML code (such as </p> and <br>) with special symbols (such as *[/p]* and *[/br]*), replace carriage returns and line feeds with a line separator, and store the content line by line so that the format of the page content is preserved; extract the text lying between the HTML tags (delimited by "<" and ">") on each line; replace the special symbols (such as *[/p]* and *[/br]*) with carriage returns to keep the original paragraphs of the text; finally, remove the HTML escape sequences (such as &quot; and &lt;) from the resulting string and use regular expression matching to extract the final text.
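The tag-stripping part of (2-3) can be sketched with regular expressions. This is a simplified illustration: it uses a single placeholder for both paragraph marks, flattens existing line breaks to spaces instead of keeping a line-by-line store, decodes escape sequences rather than deleting them, and the function name `extract_body_text` is hypothetical.

```python
import html
import re

PARA = "*[/p]*"  # placeholder symbol for paragraph marks, as named in the text

def extract_body_text(source):
    """Strip HTML tags from page source while keeping paragraph breaks."""
    # Drop script and style blocks, which carry no body text.
    text = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", "", source)
    # Flatten existing line breaks so only paragraph marks decide the layout.
    text = re.sub(r"[\r\n]+", " ", text)
    # Replace paragraph marks (</p>, <br>) with the placeholder.
    text = re.sub(r"(?i)</p\s*>|<br\s*/?>", PARA, text)
    # Remove every remaining tag, keeping the text between tags.
    text = re.sub(r"<[^>]+>", "", text)
    # Restore paragraph breaks, then decode escapes such as &quot; and &lt;.
    text = text.replace(PARA, "\n")
    text = html.unescape(text)
    # Normalize whitespace and drop empty lines.
    lines = [re.sub(r"\s+", " ", line).strip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)
```

Escape sequences are decoded only after the tags are removed, so text that was written as `&lt;b&gt;` in the page is kept as literal characters instead of being mistaken for a tag.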
After the title, keywords, body text, length, update time, URL, and related information have been extracted from the collected sources, the public opinion information extraction module also reconstructs the text.
Text reconstruction analyzes the structural features of the forms in which public opinion information appears, such as online news, forum posts, and microblog posts, and of the text itself. The information that represents the topic forms a "gist block" and the remaining information forms a "content block", which improves the clustering result.
For web news, the title and the first paragraph form the "gist block", and the remaining news description and the comments form the "content block".
For forum posts, the title and the opening post form the "gist block". Replies and follow-up posts are cleaned: posts with no Chinese characters and posts that use only stock evaluation words are removed, and a number of the remaining posts are selected to form the "content block".
(3) Public opinion information preprocessing
After extraction, preprocessing such as Chinese word segmentation, named entity recognition, part-of-speech tagging, syntactic parsing, and feature word extraction is carried out, and the results are saved to the database. Text analysis such as text mining of network public opinion information and natural language processing starts with word segmentation. Drawing on domestic research in Chinese word segmentation, the Chinese lexical analysis system ICTCLAS, developed by the Institute of Computing Technology, Chinese Academy of Sciences, is used to segment the text and tag parts of speech, and words whose length is greater than two are extracted. ICTCLAS provides Chinese word segmentation, part-of-speech tagging, and new word recognition; it performs named entity recognition with a role model method; and it lets users define custom dictionaries as needed, giving high segmentation precision and good segmentation results. The code is as follows:
After segmentation, stop words that do not help a computer understand the text are filtered out, and words with parts of speech such as noun, verb, adnoun, and verbal adjective are retained to obtain the candidate feature word set. This avoids bloated text, reduces the index size, and improves retrieval efficiency and accuracy.
A forward index and an inverted index are built over the segmented text to support interactive user queries. For the forward index, the top N words by term frequency are selected to represent each text, expressed as a hash table of the form <file name, keyword list>. Once the forward index is built, each keyword is looked up across the texts to find all file names that contain it, forming a file name list; this yields the inverted index, expressed as a hash table of the form <keyword, file name list>.
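The two hash tables can be sketched with plain Python dictionaries. This is a minimal illustration that assumes the texts are already segmented into word lists; the function names and the alphabetical tie-break are assumptions, and the sketch is independent of any particular indexing library.

```python
from collections import Counter, defaultdict

def build_forward_index(docs, top_n=3):
    """Forward index: file name -> top_n words by term frequency.

    docs: dict mapping file name -> list of segmented words.
    """
    forward = {}
    for name, words in docs.items():
        counts = Counter(words)
        # Highest frequency first; ties are broken alphabetically.
        ranked = sorted(counts, key=lambda w: (-counts[w], w))
        forward[name] = ranked[:top_n]
    return forward

def build_inverted_index(forward):
    """Inverted index: keyword -> list of file names that contain it."""
    inverted = defaultdict(list)
    for name in sorted(forward):
        for word in forward[name]:
            inverted[word].append(name)
    return dict(inverted)
```

A keyword query then becomes a single dictionary lookup in the inverted index, which is what makes interactive queries cheap.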
Index construction and index retrieval are implemented on the open-source Apache project Lucene, which provides a complete query engine, index engine, and text analysis engine; Hadoop is used to store and manage the massive index files.
The index is built as follows:
1. Create the index writer object IndexWriter. A lexical analyzer must be supplied when this object is created, and different analyzers use different dictionaries. Selecting ThesaurusAnalyzer makes it possible to extract a synopsis;
2. Create a Document object for each result set taken from the database;
3. Create a Field object for each data element in the result set and add it to the Document object;
4. Write the Document object.
An index search proceeds as follows: first create a query parser, which requires parameters such as the Field object name and the corresponding lexical analyzer; then obtain a query object from the query parser and the keywords; finally obtain the retrieval result set, which consists of Document objects, from the query object.
After segmentation, part-of-speech tagging, and stop-word removal, the feature semantic network graph of the text is built, the term frequency and document frequency of the text are counted, and weighting and feature extraction are then carried out.
The text feature semantic network graph is a directed graph that expresses public opinion information through entities and their semantic relations. The entities E contained in the text (thing entities NE, event entities VE, and event relation entities RE) are the nodes of the graph, and the semantic relation between two entities is a directed edge. Semantic relations between entities, combined with term frequency information, give the node weights; the weight of a directed edge represents the importance of the entity relation in the text. The graph is built by introducing concept-based merging and simplifying the network node weights, so as to extract the core semantics of the text: nodes that represent the same word are merged and their weights are added; directed edges are then merged and their weights are added. The resulting text feature semantic network graph describes the semantic information and topic features of the text. The concepts are defined as follows:
C1: A things entity NE is defined as NE(id, concept, property, power), where id is the entity identifier, concept is the entity concept, property is the entity attribute, and power is the weight.
C2: An event entity VE is defined as VE(id, concept, property, power, isN, subT, objT1, objT2). In addition to the data items of NE, isN indicates whether the event is negated, subT is the table header of the subject entity, and objT1 and objT2 are the table headers of object entities 1 and 2.
C3: An event relation entity RE is defined as RE(id, concept, property, power, isN, subT, objT). An RE can be described completely by a single subject-object entity pair.
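The three entity definitions can be written down as simple record types. The sketch below keeps the field names of C1 to C3; the field types, the default values, and the choice of Python dataclasses are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NE:
    """Things entity: NE(id, concept, property, power)."""
    id: str
    concept: str
    property: str
    power: float  # weight


@dataclass
class VE(NE):
    """Event entity: the NE items plus negation flag, subject, two objects."""
    isN: bool = False            # True if the event is negated
    subT: Optional[str] = None   # header (id) of the subject entity
    objT1: Optional[str] = None  # header (id) of object entity 1
    objT2: Optional[str] = None  # header (id) of object entity 2


@dataclass
class RE(NE):
    """Event relation entity: one subject-object pair is enough."""
    isN: bool = False
    subT: Optional[str] = None
    objT: Optional[str] = None


city = NE(id="n1", concept="city", property="place", power=1.0)
warn = VE(id="v1", concept="warn", property="action", power=1.0,
          subT="n2", objT1="n1")
```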
The analysis steps of the text feature semantic network graph model are as follows:
S1: When analyzing a text, first build the feature semantic network graph of each sentence, taking the sentence as the unit. Analyze sentence by sentence which NEs each sentence produces, and record each NE and its attribute information in the entity information table.
S2: After the NEs have been analyzed, analyze the VEs and register the concept, attribute, subject, and object of each VE. VEs with the same subject and object are represented as the same VE; otherwise they are given different ids.
S3: Next, analyze the REs, taking care to distinguish them from NEs and VEs, and register the concept, attribute, subject, and object of each RE in the entity information table.
S4: When the analysis is finished, the entity information table of the sentence is obtained. The table describes the relations between entities and is used to construct the entity relation graph, in which the relations between NE and VE, between RE and NE or VE, and between an entity E and an attribute T are visualized with different line styles.
S5: Starting from the feature semantic network graph built for the first sentence, merge in the graphs of the subsequent sentences, merging nodes first and directed edges second.
S6: When merging nodes, nodes whose words are identical or whose semantic similarity meets the threshold condition are merged and their node weights are added; otherwise the node is retained.
S7: When merging directed edges, the directed edges that exist between the merged nodes are merged and their weights are added.
S8: Update the weights of the edges adjacent to a newly merged node to the weight of that node, to strengthen the semantic relations between nodes.
S9: After the feature semantic network graphs of all sentences have been merged, output the result; the feature semantic network graph of the whole text is then complete.
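The merging in steps S5 to S7 can be sketched as follows. This is a minimal illustration under stated assumptions: a graph is a dictionary with a `nodes` map (word to weight) and an `edges` map (word pair to weight), the function name `merge_graphs`, the optional `similar` callback, and the default threshold 0.8 are hypothetical, and the edge re-weighting of step S8 is left out.

```python
def merge_graphs(base, other, similar=None, threshold=0.8):
    """Merge sentence graph `other` into `base` (S5 to S7 only)."""
    nodes = dict(base["nodes"])
    edges = dict(base["edges"])
    mapping = {}  # node of `other` -> node it ends up as in the result

    # S6: merge identical (or sufficiently similar) nodes, adding weights.
    for word, weight in other["nodes"].items():
        target = word if word in nodes else None
        if target is None and similar is not None:
            for existing in nodes:
                if similar(word, existing) >= threshold:
                    target = existing
                    break
        if target is None:
            nodes[word] = weight      # no match: keep as a new node
            mapping[word] = word
        else:
            nodes[target] += weight   # match: add the node weights
            mapping[word] = target

    # S7: merge directed edges between the merged nodes, adding weights.
    for (src, dst), weight in other["edges"].items():
        key = (mapping[src], mapping[dst])
        edges[key] = edges.get(key, 0.0) + weight

    return {"nodes": nodes, "edges": edges}


s1 = {"nodes": {"flood": 1.0, "city": 1.0},
      "edges": {("flood", "city"): 1.0}}
s2 = {"nodes": {"flood": 2.0, "city": 1.0, "rescue": 1.0},
      "edges": {("flood", "city"): 0.5, ("flood", "rescue"): 1.0}}
merged = merge_graphs(s1, s2)
```

Folding the graphs of all sentences into the first one in this way gives the graph of the whole text, as in step S9.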
The next step assigns weights to part-of-speech features so that the text is characterized accurately. Based on Chinese part-of-speech features and the elements of a complete event description (time, place, person, and event content), and in combination with the Chinese part-of-speech tag set of the Chinese Academy of Sciences, text feature weights are assigned as follows: the title has weight 3, subtitles and keywords have weight 2, the abstract has weight 1.5, and the first and last sentences of a paragraph have weight 1.3.
After preprocessing, different labels are set for the title, body, and replies of a public opinion text; when weights are calculated, the label information of each keyword is read to assign the positional weight of the word.
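The positional weighting can be sketched as a lookup table applied while counting terms. The weights 3, 2, 1.5, and 1.3 come from the text; the label names, the default weight 1.0 for other positions, and the function name `weighted_term_frequency` are assumptions for illustration.

```python
POSITION_WEIGHTS = {
    "title": 3.0,
    "subtitle": 2.0,
    "keyword": 2.0,
    "abstract": 1.5,
    "paragraph_first": 1.3,  # first sentence of a paragraph
    "paragraph_last": 1.3,   # last sentence of a paragraph
}
DEFAULT_WEIGHT = 1.0  # assumption: the text gives no weight for other positions


def weighted_term_frequency(tagged_tokens):
    """Sum positional weights per token; input is (token, label) pairs."""
    scores = {}
    for token, label in tagged_tokens:
        weight = POSITION_WEIGHTS.get(label, DEFAULT_WEIGHT)
        scores[token] = scores.get(token, 0.0) + weight
    return scores


scores = weighted_term_frequency([
    ("flood", "title"),
    ("flood", "body"),
    ("rescue", "abstract"),
])
```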
(4) Public opinion information mining
The public opinion information mining module works on the text data set produced by the information extraction module after the text collection has been preprocessed (Chinese word segmentation, stop-word filtering, and analysis of structured label information). Using the text semantic feature description structure built from the text feature semantic network graph, it calculates the semantic similarity between texts with the similarity evaluation method, builds a similarity matrix, and generates clustering results with the improved text clustering analysis method based on semantic similarity. Category descriptors are generated from the clustering results, and the text information contained in the clustering results is filtered out. Category features are counted with the TFIDF term-frequency feature calculation method based on feature statistics to obtain candidate category feature words; nouns are selected as candidate category feature words and sorted by weight, the weight value determines which candidates become category keywords, and the semantic relations between category keywords are used to form the classification results. The mining results are built into a knowledge base, which can also be configured to support text mining functions such as public opinion topic discovery and public opinion sentiment classification.
First, the similarity between texts is defined and calculated. It measures how related the topics discussed in two texts are, and $Sim(D_1, D_2)$ denotes the similarity between text $D_1$ and text $D_2$. The similarity takes values between 0 and 1 and is proportional to the degree of similarity between $D_1$ and $D_2$. The larger the similarity between texts, the more closely their topics are related. The semantic similarity between texts is evaluated as follows:
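Let the texts obtained after the public opinion information extraction and preprocessing of step b be $D_1(t_{11}, t_{12}, t_{13}, \ldots, t_{1m})$ and $D_2(t_{21}, t_{22}, t_{23}, \ldots, t_{2m})$. Calculate the similarity between every keyword $t_{1i}$ of text $D_1$ and every keyword $t_{2j}$ of text $D_2$ to form the following similarity matrix:
$$M(D_1, D_2) = \begin{pmatrix} Sim_{11} & \cdots & Sim_{1m} \\ \vdots & \ddots & \vdots \\ Sim_{m1} & \cdots & Sim_{mm} \end{pmatrix}$$
where $Sim_{ij}$ ($1 \le i, j \le m$) denotes the similarity between keyword $t_{1i}$ of text $D_1$ and keyword $t_{2j}$ of text $D_2$; $M(D_1, D_2)$ denotes the similarity matrix between text $D_1$ and text $D_2$; $i$ is the keyword number of text $D_1$; and $m$ is the keyword number of text $D_2$;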
The word similarity formula is $S(T_1, T_2) = \max_{i=1,2,\ldots,n;\ j=1,2,\ldots,m} S(y_{1i}, y_{2j})$; that is, the word similarity is the maximum of the similarities over all senses of the two words (a sense being one of the several meanings that a word has).
Traverse the similarity matrix $M$, find the keyword pair with the largest similarity value Sim, and delete the corresponding row and column. Then continue traversing $M$ to find the keyword pair with the largest remaining similarity value, and repeat until $M$ is empty. Finally, use the resulting sequence of maximum-similarity keyword pairs to obtain the semantic similarity of texts $D_1$ and $D_2$, calculated as follows:
where max is the maximum value of the similarity Sim; $i$ is the keyword number of text $D_1$; and $j$ is the keyword number of text $D_2$.
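The word-level maximum over senses and the greedy traversal of the similarity matrix can be sketched as below. The function names are hypothetical. The closing formula for the text similarity is not available in this text, so the final normalization (sum of the selected maxima divided by the mean of the two keyword counts) is an assumption, not the formula of the method.

```python
def word_similarity(senses_a, senses_b, sense_similarity):
    """Maximum similarity over all sense pairs of two words."""
    return max(sense_similarity(a, b) for a in senses_a for b in senses_b)


def greedy_text_similarity(matrix):
    """Repeatedly pick the largest remaining entry, then drop its row and column."""
    rows = set(range(len(matrix)))
    cols = set(range(len(matrix[0]))) if matrix else set()
    n_rows, n_cols = len(rows), len(cols)
    picked = []
    while rows and cols:
        value, r, c = max((matrix[r][c], r, c) for r in rows for c in cols)
        picked.append(value)
        rows.remove(r)
        cols.remove(c)
    if not picked:
        return 0.0
    # Assumed normalization: divide by the mean keyword count of the two texts.
    return sum(picked) / ((n_rows + n_cols) / 2)


# Toy sense similarity: 1.0 for identical senses, 0.1 otherwise (assumption).
toy = lambda a, b: 1.0 if a == b else 0.1
bank_vs_money = word_similarity(["money", "river"], ["money"], toy)

m = [[0.9, 0.2],
     [0.4, 0.8]]
text_sim = greedy_text_similarity(m)
```

In the example, the traversal first selects 0.9 (row 0, column 0) and then 0.8 (row 1, column 1), so each keyword is matched at most once.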
The improved text clustering analysis method based on semantic similarity is described as follows:
1. For all collected texts after preprocessing, apply TFIDF weighting to all category keywords and extract the m best feature keywords to form the original keyword-based feature vector Di*.
2. Using the knowledge base, preprocess the keywords in the original keyword-based feature vector Di*: find the vocabulary in the knowledge base that matches each keyword and replace the keyword with it, forming a new feature vector $D_i = (T_1, T_2, \ldots, T_i)$, $i = 1, 2, 3, \ldots, m$.
3. Form the m feature vectors $D_i$ of the n texts, use the text semantic similarity formula to calculate the semantic similarity between the collected texts, form the similarity matrix M of the text collection, and obtain the average similarity MA of all feature vectors. The calculation formula is as follows:
4. Set three similarity thresholds: a duplication threshold of 0.9, a topic center threshold of 0.5, and a new topic threshold of 0.3;
5. Compare each text with the central topic. If the similarity between the text and the initial center of the central topic is greater than the duplication threshold 0.9, the text is considered to have the same content under the same topic; if the similarity is less than the new topic threshold 0.3, a new class is created for the text; if the similarity is in the range 0 to 0.5, the text is a core-content text that discusses a different aspect of the same topic and is marked as a second center. Proceeding in this way yields a hierarchical clustering result with multiple centers.
6. For the multi-center topic representation, the maximum of the similarities between a text and each center in a class is taken as the similarity between the text and that class.
The improved text clustering algorithm based on semantic similarity is suited to clustering dynamic data and discovering public opinion topic hotspots in a large-scale network environment; it can detect new events in time and detect and track new public opinion topics. Representing a public opinion topic by multiple centers within a class effectively improves the running efficiency of the system, and the effect becomes more pronounced as the number of texts grows.
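The threshold logic of the clustering steps can be sketched as a single assignment function. The thresholds 0.9, 0.5, and 0.3 are from the text; everything else is an assumption for illustration: the function and variable names, the use of Jaccard similarity as a stand-in for the text semantic similarity, and the reading that a similarity between 0.3 and 0.5 creates a further center while a similarity above 0.5 and up to 0.9 makes the text an ordinary member of the class.

```python
DUPLICATE, CENTER, NEW_TOPIC = 0.9, 0.5, 0.3


def assign(text, clusters, similarity):
    """Place `text` into `clusters` (list of {"centers", "members"} dicts)."""
    best_score, best_cluster = -1.0, None
    for cluster in clusters:
        # Multi-center rule: class similarity is the maximum over its centers.
        score = max(similarity(text, c) for c in cluster["centers"])
        if score > best_score:
            best_score, best_cluster = score, cluster

    if best_cluster is None or best_score < NEW_TOPIC:
        clusters.append({"centers": [text], "members": [text]})
        return "new_topic"

    best_cluster["members"].append(text)
    if best_score > DUPLICATE:
        return "duplicate"
    if best_score <= CENTER:
        # Assumed reading: 0.3 <= score <= 0.5 marks another center of the topic.
        best_cluster["centers"].append(text)
        return "new_center"
    return "member"


def jaccard(a, b):
    # Stand-in for the text semantic similarity (assumption).
    return len(a & b) / len(a | b)


clusters = []
labels = [
    assign(frozenset({"flood", "river", "city"}), clusters, jaccard),
    assign(frozenset({"flood", "river", "city"}), clusters, jaccard),
    assign(frozenset({"flood", "river", "rescue", "aid"}), clusters, jaccard),
    assign(frozenset({"stock", "market"}), clusters, jaccard),
    assign(frozenset({"flood", "river", "city", "rain"}), clusters, jaccard),
]
```

Because each text is compared only with the centers of each class rather than with every member, the number of comparisons stays small as the collection grows.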
(5) Public opinion information analysis
The public opinion information analysis module performs OLAP multidimensional statistical analysis on the data that were mined in step c and stored in the public opinion information database. It analyzes public opinion evaluation indicators such as the attention paid to public opinion topic content and the sentiment tendency of public opinion topics, supporting the relevant departments in keeping track of public opinion dynamics in time, releasing public opinion information at the appropriate moment, and making correct decisions.
The public opinion topics produced by collection, processing, and mining analysis are expressed as $T = (T_1, T_2, \ldots, T_n)$, where $T_i$ denotes a text of the public opinion topic. The attention of a public opinion topic text is expressed as $T_i = \alpha N_p + \beta N_r$, and the attention measure of the public opinion topic is:
where $\alpha$ and $\beta$ are weights, $N_p$ is the number of clicks on a public opinion topic text, and $N_r$ is the number of comments; Np_i is the number of clicks on the i-th public opinion topic text, and Nr_i is the number of comments on the i-th public opinion topic text. Because $N_p > N_r$, $\alpha$ is set to 0.02 and $\beta$ to 0.98 on the basis of statistics.
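The per-text attention $T_i = \alpha N_p + \beta N_r$ is straightforward to compute. In the sketch below the weights 0.02 and 0.98 are from the text, while the function names and the topic-level aggregation (a plain sum over the topic's texts) are assumptions, since the topic-level formula is not available in this text.

```python
ALPHA, BETA = 0.02, 0.98  # weights for clicks and comments, from the text


def text_attention(clicks, comments, alpha=ALPHA, beta=BETA):
    """Attention of one topic text: alpha * clicks + beta * comments."""
    return alpha * clicks + beta * comments


def topic_attention(texts):
    """Assumed aggregation: sum of per-text attention over (clicks, comments)."""
    return sum(text_attention(clicks, comments) for clicks, comments in texts)


single = text_attention(1000, 50)
total = topic_attention([(1000, 50), (500, 10)])
```

With these weights one comment counts as much as 49 clicks, which compensates for clicks being far more numerous than comments.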
The sentiment tendency of a public opinion topic is described on the basis of the clustering analysis data of the topic's texts. A threshold is set first; only when the tendency measure of a text exceeds the threshold does the text show polarity (positive or negative). If the tendency measure of the text is positive, the text is a positive comment; otherwise it is a negative comment.
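The polarity rule can be sketched in a few lines. The function name, the threshold value 0.1, the "neutral" label, and the choice to compare the magnitude of the tendency measure with the threshold are all assumptions for illustration; the text fixes neither the threshold nor how the measure is computed.

```python
def polarity(score, threshold=0.1):
    """Classify a tendency measure; scores within the threshold show no polarity."""
    if abs(score) <= threshold:
        return "neutral"
    return "positive" if score > 0 else "negative"


results = [polarity(0.5), polarity(-0.5), polarity(0.05)]
```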
After collection, preprocessing, information extraction, mining, and analysis, detailed data on public opinion topics are obtained and processed according to the established public opinion indicator evaluation system, and the results support decision-making.
Claims (9)
1. A network public opinion information analysis method based on text semantic relatedness, characterized in that it uses a network public opinion information analysis system comprising a network public opinion information acquisition module, a public opinion information extraction module, a public opinion information preprocessing module, a public opinion information mining module, a public opinion information analysis module, and a public opinion information database, and comprises the following steps:
a. the network public opinion information acquisition module collects public opinion information of all kinds from web pages and stores it in the public opinion information database;
b. the public opinion information extraction module and the public opinion information preprocessing module perform preliminary filtering and segmentation on the public opinion information collected in step a, extract the content information contained in the text, and provide data services for public opinion information mining;
c. on the basis of step b, the public opinion information mining module applies the improved text clustering analysis method based on semantic similarity, generates category descriptors, and filters out the text information contained in the clustering results; counts category features with the TFIDF term-frequency feature calculation method based on feature statistics to obtain category feature words; selects nouns as candidate category feature words and sorts them by weight, taking the candidates with larger weights as category keywords; uses the semantic relations between category keywords to form the classification results; and identifies and establishes new network public opinion topics and detects and tracks content related to existing public opinion topics;
d. finally, the public opinion information analysis module performs OLAP multidimensional statistical analysis on the data mined in step c and analyzes public opinion evaluation indicators such as the attention paid to public opinion topic content and the sentiment tendency of public opinion topics.
2. The network public opinion information analysis method based on text semantic relatedness according to claim 1, characterized in that, in step a, the public opinion information acquisition module collects network public opinion information sources; it not only crawls web pages but also formats the page content and extracts the topic and content of the public opinion; the resulting data are saved as txt or html files and stored in the public opinion information database; and the network public opinion information acquisition module counters blocking by combining three techniques: time-shared access, periodic change of IP address, and single sign-on through a simulated browser.
3. The network public opinion information analysis method based on text semantic relatedness according to claim 2, characterized in that the public opinion information acquisition module specifically starts from the URLs of predefined topic-related web pages, obtains the text information in each page, and extracts new URLs from the current page and puts them into a queue, until the public opinion information satisfying the conditions has been fully collected and the URL queue is empty; the collected web page text information is stored in the public opinion information database by field category for the public opinion information extraction module to call.
4. The network public opinion information analysis method based on text semantic relatedness according to claim 1, characterized in that, in step b, the public opinion information extraction module removes irrelevant content from web pages, extracts the meta-information of the body part that is useful for public opinion analysis, and restructures the text so that information representative of the topic is gathered together; the public opinion information preprocessing module, after the collected public opinion information sources have been extracted by the public opinion information extraction module, performs Chinese word segmentation, stop-word filtering, named entity recognition, part-of-speech tagging, syntactic parsing, and feature word extraction, and builds a forward index and an inverted index; a text feature semantic network graph is built in which the entities E contained in the text are the nodes of the graph and the semantic relation between two entities is a directed edge, the semantic relations between entities combined with term frequency information give the node weights, the weight of a directed edge represents the importance of the entity relation in the text, and the entities E comprise things entities NE, event entities VE, and event relation entities RE; the term frequency and document frequency information of the text is counted, and feature words are then extracted, choosing the vocabulary that embodies the features of the text to represent the text.
5. The network public opinion information analysis method based on text semantic relatedness according to claim 4, characterized in that, in step c, the public opinion information mining module works on the text data set produced by the information extraction module after the text collection has been preprocessed, the preprocessing comprising Chinese word segmentation, stop-word filtering, and analysis of structured label information; using the text semantic feature description structure built from the text feature semantic network graph, it calculates the semantic similarity between texts with the similarity evaluation method, builds a similarity matrix, and generates clustering results with the improved text clustering analysis method based on semantic similarity; category descriptors are generated from the clustering results, and the text information contained in the clustering results is filtered out; category features are counted with the TFIDF term-frequency feature calculation method based on feature statistics to obtain candidate category feature words; nouns are selected as candidate category feature words and sorted by weight, the weight value determines which candidates become category keywords, and the semantic relations between category keywords are used to form the classification results; and the mining results are built into a knowledge base.
6. The network public opinion information analysis method based on text semantic relatedness according to claim 4 or 5, characterized in that the text feature semantic network graph is a directed graph that expresses public opinion information through entities and their semantic relations; words represented by network nodes are merged and their node weights are added; directed edges are then merged and their weights are added; and the resulting text feature semantic network graph describes the semantic information and topic features of the text.
7. The network public opinion information analysis method based on text semantic relatedness according to claim 5, characterized in that the semantic similarity between texts is evaluated as follows:
Let the texts obtained after the public opinion information extraction and preprocessing of step b be $D_1(t_{11}, t_{12}, t_{13}, \ldots, t_{1m})$ and $D_2(t_{21}, t_{22}, t_{23}, \ldots, t_{2m})$. Calculate the similarity between every keyword $t_{1i}$ of text $D_1$ and every keyword $t_{2j}$ of text $D_2$ to form the following similarity matrix:
$$M(D_1, D_2) = \begin{pmatrix} Sim_{11} & \cdots & Sim_{1m} \\ \vdots & \ddots & \vdots \\ Sim_{m1} & \cdots & Sim_{mm} \end{pmatrix}$$
where $Sim_{ij}$ ($1 \le i, j \le m$) denotes the similarity between keyword $t_{1i}$ of text $D_1$ and keyword $t_{2j}$ of text $D_2$; $M(D_1, D_2)$ denotes the similarity matrix between text $D_1$ and text $D_2$; $i$ is the keyword number of text $D_1$; and $m$ is the keyword number of text $D_2$;
The word similarity formula is $S(T_1, T_2) = \max_{i=1,2,\ldots,n;\ j=1,2,\ldots,m} S(y_{1i}, y_{2j})$; the word similarity is the maximum of the similarities over all senses of the two words, a sense being one of the several meanings that a word has;
Traverse the similarity matrix $M$, find the keyword pair with the largest similarity value Sim, and delete the corresponding row and column; then continue traversing $M$ to find the keyword pair with the largest remaining Sim value, and repeat until $M$ is empty; finally, use the resulting sequence of maximum-similarity keyword pairs to obtain the semantic similarity of texts $D_1$ and $D_2$, calculated as follows:
where max is the maximum value of the similarity Sim; $i$ is the keyword number of text $D_1$; and $j$ is the keyword number of text $D_2$.
8. The network public opinion information analysis method based on text semantic relatedness according to claim 7, characterized in that the improved text clustering analysis method based on semantic similarity is:
1) for all collected texts after preprocessing, apply TFIDF weighting to all category keywords and extract the m best feature keywords to form the original keyword-based feature vector Di*;
2) using the knowledge base, preprocess the keywords in the original keyword-based feature vector Di*: find the vocabulary in the knowledge base that matches each keyword and replace the keyword with it, forming a new feature vector $D_i = (T_1, T_2, \ldots, T_i)$, $i = 1, 2, 3, \ldots, m$;
3) form the m feature vectors $D_i$ of the n texts, use the text semantic similarity formula to calculate the semantic similarity between the collected texts, form the similarity matrix M of the text collection, and obtain the average similarity MA of all feature vectors; the calculation formula is as follows:
4) set three similarity thresholds: a duplication threshold of 0.9, a topic center threshold of 0.5, and a new topic threshold of 0.3;
5) compare each text with the central topic; if the similarity between the text and the initial center of the central topic is greater than the duplication threshold 0.9, the text is considered to have the same content under the same topic; if the similarity is less than the new topic threshold 0.3, a new class is created for the text; if the similarity is in the range 0 to 0.5, the text is a core-content text that discusses a different aspect of the same topic and is marked as a second center; proceeding in this way yields a hierarchical clustering result with multiple centers;
6) for the multi-center topic representation, the maximum of the similarities between a text and each center in a class is taken as the similarity between the text and that class.
9. The network public opinion information analysis method based on text semantic relatedness according to claim 1, characterized in that, in step d, the public opinion information analysis module performs OLAP multidimensional statistical analysis on the data that were mined in step c and stored in the public opinion information database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310482522.5A CN103544255B (en) | 2013-10-15 | 2013-10-15 | Text semantic relativity based network public opinion information analysis method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310482522.5A CN103544255B (en) | 2013-10-15 | 2013-10-15 | Text semantic relativity based network public opinion information analysis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103544255A true CN103544255A (en) | 2014-01-29 |
CN103544255B CN103544255B (en) | 2017-01-11 |
Family
ID=49967707
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310482522.5A Active CN103544255B (en) | 2013-10-15 | 2013-10-15 | Text semantic relativity based network public opinion information analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103544255B (en) |
Cited By (153)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103838886A (en) * | 2014-03-31 | 2014-06-04 | 辽宁四维科技发展有限公司 | Text content classification method based on representative word knowledge base |
CN103841216A (en) * | 2014-04-01 | 2014-06-04 | 深圳市科盾科技有限公司 | Network public opinion monitoring system based on cloud platform |
CN103886051A (en) * | 2014-03-13 | 2014-06-25 | 电子科技大学 | Comment analysis method based on entities and features |
CN103902674A (en) * | 2014-03-19 | 2014-07-02 | 百度在线网络技术(北京)有限公司 | Method and device for collecting evaluation data of specific subject |
CN103902659A (en) * | 2014-03-04 | 2014-07-02 | 深圳市至高通信技术发展有限公司 | Public opinion analysis method and corresponding device |
CN103927545A (en) * | 2014-03-14 | 2014-07-16 | 小米科技有限责任公司 | Clustering method and device |
CN104199829A (en) * | 2014-07-25 | 2014-12-10 | 中国科学院自动化研究所 | Emotion data classifying method and system |
CN104268194A (en) * | 2014-09-19 | 2015-01-07 | 国家电网公司 | Method for dynamically generating public opinion brief report |
CN104346425A (en) * | 2014-07-28 | 2015-02-11 | 中国科学院计算技术研究所 | Method and system of hierarchical internet public sentiment indication system |
CN104504150A (en) * | 2015-01-09 | 2015-04-08 | 成都布林特信息技术有限公司 | News public opinion monitoring system |
CN104699763A (en) * | 2015-02-11 | 2015-06-10 | 中国科学院新疆理化技术研究所 | Text similarity measuring system based on multi-feature fusion |
CN104820629A (en) * | 2015-05-14 | 2015-08-05 | 中国电子科技集团公司第五十四研究所 | Intelligent system and method for emergently processing public sentiment emergency |
CN104899339A (en) * | 2015-07-01 | 2015-09-09 | 北京奇虎科技有限公司 | Method and system for classifying POI (Point of Interest) information |
CN104915359A (en) * | 2014-03-14 | 2015-09-16 | 华为技术有限公司 | Theme label recommending method and device |
CN104915453A (en) * | 2015-07-01 | 2015-09-16 | 北京奇虎科技有限公司 | Method, device and system for classifying POI information |
CN105183478A (en) * | 2015-09-11 | 2015-12-23 | 中山大学 | Webpage reestablishing method and device based on color transmission |
CN105183803A (en) * | 2015-08-25 | 2015-12-23 | 天津大学 | Personalized search method and search apparatus thereof in social network platform |
CN105279277A (en) * | 2015-11-12 | 2016-01-27 | 百度在线网络技术(北京)有限公司 | Knowledge data processing method and device |
CN105389389A (en) * | 2015-12-10 | 2016-03-09 | 安徽博约信息科技有限责任公司 | Network public opinion transmission situation media linked analysis method |
CN105447202A (en) * | 2015-12-31 | 2016-03-30 | 宁波公众信息产业有限公司 | Internet information collecting system |
WO2016058267A1 (en) * | 2014-10-17 | 2016-04-21 | 任子行网络技术股份有限公司 | Chinese website classification method and system based on characteristic analysis of website homepage |
CN105677873A (en) * | 2016-01-11 | 2016-06-15 | 中国电子科技集团公司第十研究所 | Text information associating and clustering collecting processing method based on domain knowledge model |
CN105677802A (en) * | 2015-12-31 | 2016-06-15 | 宁波公众信息产业有限公司 | Internet information analysis system |
CN105740238A (en) * | 2016-03-04 | 2016-07-06 | 北京理工大学 | Method for constructing event relationship strength graph fusing sentence meaning information |
CN105956070A (en) * | 2016-04-28 | 2016-09-21 | 优品财富管理有限公司 | Method and system for integrating repetitive records |
CN105956069A (en) * | 2016-04-28 | 2016-09-21 | 优品财富管理有限公司 | Network information collection and analysis method and network information collection and analysis system |
CN105992194A (en) * | 2015-01-30 | 2016-10-05 | 阿里巴巴集团控股有限公司 | Network data content acquiring method and network data content acquiring device |
CN106126558A (en) * | 2016-06-16 | 2016-11-16 | 东软集团股份有限公司 | A kind of public sentiment monitoring method and device |
CN106156192A (en) * | 2015-04-21 | 2016-11-23 | 北大方正集团有限公司 | Public sentiment data clustering method and public sentiment data clustering system |
CN106156041A (en) * | 2015-03-26 | 2016-11-23 | 科大讯飞股份有限公司 | Hot information finds method and system |
CN106294358A (en) * | 2015-05-14 | 2017-01-04 | 北京大学 | The search method of a kind of information and system |
CN106294542A (en) * | 2016-07-25 | 2017-01-04 | 北京市信访矛盾分析研究中心 | A kind of letters and calls data mining methods of marking and system |
CN106294619A (en) * | 2016-08-01 | 2017-01-04 | 上海交通大学 | Public sentiment intelligent supervision method |
CN106528581A (en) * | 2015-09-15 | 2017-03-22 | 阿里巴巴集团控股有限公司 | Text detection method and apparatus |
CN106599054A (en) * | 2016-11-16 | 2017-04-26 | 福建天泉教育科技有限公司 | Method and system for title classification and push |
CN106649367A (en) * | 2015-10-30 | 2017-05-10 | 北京国双科技有限公司 | Method and device for detecting popularization degree of keyword |
CN106651696A (en) * | 2016-11-16 | 2017-05-10 | 福建天泉教育科技有限公司 | Approximate question push method and system |
CN104217718B (en) * | 2014-09-03 | 2017-05-17 | 陈飞 | Method and system for voice recognition based on environmental parameter and group trend data |
CN106776724A (en) * | 2016-11-16 | 2017-05-31 | 福建天泉教育科技有限公司 | A kind of exercise question sorting technique and system |
CN107016068A (en) * | 2017-03-21 | 2017-08-04 | 深圳前海乘方互联网金融服务有限公司 | Knowledge mapping construction method and device |
CN107038156A (en) * | 2017-04-28 | 2017-08-11 | 北京清博大数据科技有限公司 | A kind of hot spot of public opinions Forecasting Methodology based on big data |
CN107045524A (en) * | 2016-12-30 | 2017-08-15 | 中央民族大学 | A kind of method and system of network text public sentiment classification |
CN107066585A (en) * | 2017-04-17 | 2017-08-18 | 济南大学 | A kind of probability topic calculates the public sentiment monitoring method and system with matching |
CN107077640A (en) * | 2014-09-03 | 2017-08-18 | 邓白氏公司 | Analyzed via experience ownership, it is qualification and intake unstructured data sources system and processing |
CN107085608A (en) * | 2017-04-21 | 2017-08-22 | 上海喆之信息科技有限公司 | A kind of effective network hotspot monitoring system |
CN107093021A (en) * | 2017-04-21 | 2017-08-25 | 深圳市创艺工业技术有限公司 | Electricity power engineering goods and materials contract is honoured an agreement sincere public sentiment monitoring system |
CN107145516A (en) * | 2017-04-07 | 2017-09-08 | 北京捷通华声科技股份有限公司 | A kind of Text Clustering Method and system |
CN107220236A (en) * | 2017-05-23 | 2017-09-29 | 武汉朱雀闻天科技有限公司 | It is a kind of to determine the doubtful naked method and device for borrowing student |
CN107231570A (en) * | 2017-06-13 | 2017-10-03 | 中国传媒大学 | News data content characteristic obtains system and application system |
CN107276854A (en) * | 2017-07-27 | 2017-10-20 | 中兴软创科技股份有限公司 | A kind of method of MOLAP statistical analyses under big data |
CN107291697A (en) * | 2017-06-29 | 2017-10-24 | 浙江图讯科技股份有限公司 | A kind of semantic analysis, electronic equipment, storage medium and its diagnostic system |
CN107291808A (en) * | 2017-05-16 | 2017-10-24 | 南京邮电大学 | It is a kind of that big data sorting technique is manufactured based on semantic cloud |
CN107292743A (en) * | 2017-06-07 | 2017-10-24 | 前海梧桐(深圳)数据有限公司 | The intelligent decision making method and its system invested and financed for enterprise |
CN107315778A (en) * | 2017-05-31 | 2017-11-03 | 温州市鹿城区中津先进科技研究院 | A kind of natural language the analysis of public opinion method based on big data sentiment analysis |
CN107358344A (en) * | 2017-06-29 | 2017-11-17 | 浙江图讯科技股份有限公司 | Enterprise's hidden danger management method and its management system, electronic equipment and storage medium |
CN107430633A (en) * | 2015-11-03 | 2017-12-01 | 慧与发展有限责任合伙企业 | The representative content through related optimization being associated to data-storage system |
CN107491438A (en) * | 2017-08-25 | 2017-12-19 | 前海梧桐(深圳)数据有限公司 | Business decision elements recognition method and its system based on natural language |
CN107527289A (en) * | 2017-08-25 | 2017-12-29 | 百度在线网络技术(北京)有限公司 | A kind of investment combination industry distribution method, apparatus, server and storage medium |
CN107679977A (en) * | 2017-09-06 | 2018-02-09 | 广东中标数据科技股份有限公司 | A kind of tax administration platform and implementation method based on semantic analysis |
CN107679084A (en) * | 2017-08-31 | 2018-02-09 | 平安科技(深圳)有限公司 | Cluster labels generation method, electronic equipment and computer-readable recording medium |
CN107918644A (en) * | 2017-10-31 | 2018-04-17 | 北京锐思爱特咨询股份有限公司 | News subject under discussion analysis method and implementation system in reputation Governance framework |
CN107918633A (en) * | 2017-03-23 | 2018-04-17 | 广州思涵信息科技有限公司 | Sensitive public sentiment content identification method and early warning system based on semantic analysis technology |
CN108052527A (en) * | 2017-11-08 | 2018-05-18 | 中国传媒大学 | Method is recommended in film bridge piecewise analysis based on label system |
CN108062306A (en) * | 2017-12-29 | 2018-05-22 | 国信优易数据有限公司 | A kind of index system establishment system and method for business environment evaluation |
CN108090040A (en) * | 2016-11-23 | 2018-05-29 | 北京国双科技有限公司 | A kind of text message sorting technique and system |
CN108170666A (en) * | 2017-11-29 | 2018-06-15 | 同济大学 | A kind of improved method based on TF-IDF keyword extractions |
CN108197638A (en) * | 2017-12-12 | 2018-06-22 | 阿里巴巴集团控股有限公司 | The method and device classified to sample to be assessed |
CN108287922A (en) * | 2018-02-28 | 2018-07-17 | 福州大学 | A kind of text data viewpoint abstract method for digging of fusion topic attribute and emotion information |
CN108363784A (en) * | 2018-01-20 | 2018-08-03 | 西北工业大学 | A kind of public sentiment trend estimate method based on text machine learning |
CN108536762A (en) * | 2018-03-21 | 2018-09-14 | 上海蔚界信息科技有限公司 | A kind of high-volume text data automatically analyzes scheme |
CN108550380A (en) * | 2018-04-12 | 2018-09-18 | 北京深度智耀科技有限公司 | A kind of drug safety information monitoring method and device based on public network |
CN108595466A (en) * | 2018-02-09 | 2018-09-28 | 中山大学 | A kind of filtering of internet information and Internet user's information and net note structure analysis method |
CN108628994A (en) * | 2018-04-28 | 2018-10-09 | 广东亿迅科技有限公司 | A kind of public sentiment data processing system |
CN108681977A (en) * | 2018-03-27 | 2018-10-19 | 成都律云科技有限公司 | A kind of lawyer's information processing method and system |
CN108804594A (en) * | 2018-05-28 | 2018-11-13 | 国家计算机网络与信息安全管理中心 | A kind of construction method and device of news content full-text search engine |
CN108932291A (en) * | 2018-05-23 | 2018-12-04 | 福建亿榕信息技术有限公司 | Power grid public sentiment evaluation method, storage medium and computer |
CN109145085A (en) * | 2018-07-18 | 2019-01-04 | 北京市农林科学院 | The calculation method and system of semantic similarity |
CN109189934A (en) * | 2018-11-13 | 2019-01-11 | 平安科技(深圳)有限公司 | Public sentiment recommended method, device, computer equipment and storage medium |
CN109214008A (en) * | 2018-09-28 | 2019-01-15 | 珠海中科先进技术研究院有限公司 | A kind of sentiment analysis method and system based on keyword extraction |
CN109299271A (en) * | 2018-10-30 | 2019-02-01 | 腾讯科技(深圳)有限公司 | Training sample generation, text data, public sentiment event category method and relevant device |
CN109376237A (en) * | 2018-09-04 | 2019-02-22 | 中国平安人寿保险股份有限公司 | Prediction technique, device, computer equipment and the storage medium of client's stability |
CN109408808A (en) * | 2018-09-12 | 2019-03-01 | 中国传媒大学 | A kind of appraisal procedure and assessment system of artistic works |
CN109446409A (en) * | 2018-09-19 | 2019-03-08 | 杭州安恒信息技术股份有限公司 | A kind of recognition methods of the target object of doubtful multiple level marketing behavior |
CN109526027A (en) * | 2018-11-27 | 2019-03-26 | 中国移动通信集团福建有限公司 | A kind of cell capacity optimization method, device, equipment and computer storage medium |
CN109558586A (en) * | 2018-11-02 | 2019-04-02 | 中国科学院自动化研究所 | A kind of speech of information is according to from card methods of marking, equipment and storage medium |
CN109582953A (en) * | 2018-11-02 | 2019-04-05 | 中国科学院自动化研究所 | A kind of speech of information is according to support methods of marking, equipment and storage medium |
CN109635074A (en) * | 2018-11-13 | 2019-04-16 | 平安科技(深圳)有限公司 | A kind of entity relationship analysis method and terminal device based on public feelings information |
CN109635107A (en) * | 2018-11-19 | 2019-04-16 | 北京亚鸿世纪科技发展有限公司 | The method and device of semantic intellectual analysis and the event scenarios reduction of multi-data source |
WO2019085355A1 (en) * | 2017-11-01 | 2019-05-09 | 平安科技(深圳)有限公司 | Public sentiment clustering analysis method for internet news, application server, and computer-readable storage medium |
CN109766438A (en) * | 2018-12-12 | 2019-05-17 | 平安科技(深圳)有限公司 | Biographic information extracting method, device, computer equipment and storage medium |
CN109891517A (en) * | 2016-10-25 | 2019-06-14 | 皇家飞利浦有限公司 | The clinical diagnosis assistant of knowledge based figure |
CN110019720A (en) * | 2017-12-19 | 2019-07-16 | 优酷网络技术(北京)有限公司 | A kind of content of comment, which is separately won, takes method and system |
CN110046292A (en) * | 2018-12-13 | 2019-07-23 | 阿里巴巴集团控股有限公司 | Public sentiment data processing method, device, equipment and storage medium |
CN110110156A (en) * | 2019-04-04 | 2019-08-09 | 平安科技(深圳)有限公司 | Industry public sentiment monitoring method, device, computer equipment and storage medium |
CN110119416A (en) * | 2019-05-16 | 2019-08-13 | 重庆八戒传媒有限公司 | A kind of service data analysis system and method |
CN110134844A (en) * | 2019-04-04 | 2019-08-16 | 平安科技(深圳)有限公司 | Subdivision field public sentiment monitoring method, device, computer equipment and storage medium |
CN110188196A (en) * | 2019-04-29 | 2019-08-30 | 同济大学 | A kind of text increment dimension reduction method based on random forest |
CN110188168A (en) * | 2019-05-24 | 2019-08-30 | 北京邮电大学 | Semantic relation recognition methods and device |
CN110222172A (en) * | 2019-05-15 | 2019-09-10 | 北京邮电大学 | A kind of multi-source network public sentiment Topics Crawling method based on improvement hierarchical clustering |
CN110348539A (en) * | 2019-07-19 | 2019-10-18 | 知者信息技术服务成都有限公司 | Short text correlation method of discrimination |
CN110472055A (en) * | 2019-08-21 | 2019-11-19 | 北京百度网讯科技有限公司 | Method and apparatus for labeled data |
CN110532492A (en) * | 2019-08-27 | 2019-12-03 | 东北大学 | A kind of forum data management classification system and method |
CN110633373A (en) * | 2018-06-20 | 2019-12-31 | 上海财经大学 | Automobile public opinion analysis method based on knowledge graph and deep learning |
CN110705288A (en) * | 2019-09-29 | 2020-01-17 | 武汉海昌信息技术有限公司 | Big data-based public opinion analysis system |
CN110727794A (en) * | 2018-06-28 | 2020-01-24 | 上海传漾广告有限公司 | System and method for collecting and analyzing network semantics and summarizing and analyzing content |
CN110852090A (en) * | 2019-11-07 | 2020-02-28 | 中科天玑数据科技股份有限公司 | Public opinion crawling mechanism characteristic vocabulary extension system and method |
CN110968668A (en) * | 2019-11-29 | 2020-04-07 | 中国农业科学院农业信息研究所 | Method and device for calculating similarity of network public sentiment subjects based on hyper-network |
CN110991190A (en) * | 2019-11-29 | 2020-04-10 | 华中科技大学 | Document theme enhanced self-attention network, text emotion prediction system and method |
CN110990389A (en) * | 2019-11-29 | 2020-04-10 | 上海易点时空网络有限公司 | Method and device for simplifying question bank and computer readable storage medium |
CN111144575A (en) * | 2019-12-05 | 2020-05-12 | 支付宝(杭州)信息技术有限公司 | Public opinion early warning model training method, early warning method, device, equipment and medium |
CN111160019A (en) * | 2019-12-30 | 2020-05-15 | 中国联合网络通信集团有限公司 | Public opinion monitoring method, device and system |
CN111241077A (en) * | 2020-01-03 | 2020-06-05 | 四川新网银行股份有限公司 | Financial fraud behavior identification method based on internet data |
CN111259635A (en) * | 2020-01-09 | 2020-06-09 | 智业软件股份有限公司 | Method and system for completing and predicting medical record written text |
CN111291186A (en) * | 2020-01-21 | 2020-06-16 | 北京捷通华声科技股份有限公司 | Context mining method and device based on clustering algorithm and electronic equipment |
CN111291162A (en) * | 2020-02-26 | 2020-06-16 | 深圳前海微众银行股份有限公司 | Quality test example sentence mining method, device, equipment and computer readable storage medium |
CN111401074A (en) * | 2020-04-03 | 2020-07-10 | 山东爱城市网信息技术有限公司 | Short text emotion tendency analysis method, system and device based on Hadoop |
CN111435594A (en) * | 2019-01-14 | 2020-07-21 | 珠海格力电器股份有限公司 | Method and device for acquiring cooking parameters of cooking appliance and cooking appliance |
WO2020164204A1 (en) * | 2019-02-11 | 2020-08-20 | 平安科技(深圳)有限公司 | Text template recognition method and apparatus, and computer readable storage medium |
CN111563190A (en) * | 2020-04-07 | 2020-08-21 | 中国电子科技集团公司第二十九研究所 | Multi-dimensional analysis and supervision method and system for user behaviors of regional network |
CN111708886A (en) * | 2020-06-11 | 2020-09-25 | 国网天津市电力公司 | Public opinion analysis terminal and public opinion text analysis method based on data driving |
CN111797333A (en) * | 2020-06-04 | 2020-10-20 | 南京擎盾信息科技有限公司 | Public opinion spreading task display method and device |
CN111831922A (en) * | 2020-07-14 | 2020-10-27 | 深圳市众创达企业咨询策划有限公司 | Recommendation system and method based on internet information |
CN111914096A (en) * | 2020-07-06 | 2020-11-10 | 同济大学 | Public transport passenger satisfaction evaluation method and system based on public opinion knowledge graph |
CN111914141A (en) * | 2020-07-30 | 2020-11-10 | 广州城市信息研究所有限公司 | Public opinion knowledge base construction method and public opinion knowledge base |
CN112084298A (en) * | 2020-07-31 | 2020-12-15 | 北京明略昭辉科技有限公司 | Public opinion theme processing method and device based on rapid BTM |
CN112184323A (en) * | 2020-10-13 | 2021-01-05 | 上海风秩科技有限公司 | Evaluation label generation method and device, storage medium and electronic equipment |
CN112214576A (en) * | 2020-09-10 | 2021-01-12 | 深圳价值在线信息科技股份有限公司 | Public opinion analysis method, device, terminal equipment and computer readable storage medium |
CN112348421A (en) * | 2019-08-08 | 2021-02-09 | 北京国双科技有限公司 | Data processing method and device |
CN112464653A (en) * | 2020-12-03 | 2021-03-09 | 合肥天源迪科信息技术有限公司 | Real-time event identification and matching method based on communication short message |
CN112528197A (en) * | 2020-11-20 | 2021-03-19 | 四川新网银行股份有限公司 | System and method for monitoring network public sentiment in real time based on artificial intelligence |
CN112541105A (en) * | 2019-09-20 | 2021-03-23 | 福建师范大学地理研究所 | Keyword generation method, public opinion monitoring method, device, equipment and medium |
CN112650848A (en) * | 2020-12-30 | 2021-04-13 | 交控科技股份有限公司 | Urban railway public opinion information analysis method based on text semantic related passenger evaluation |
CN111158973B (en) * | 2019-12-05 | 2021-06-18 | 北京大学 | Web application dynamic evolution monitoring method |
CN113032653A (en) * | 2021-04-02 | 2021-06-25 | 盐城师范学院 | Big data-based public opinion monitoring platform |
CN113282702A (en) * | 2021-03-16 | 2021-08-20 | 广东医通软件有限公司 | Intelligent retrieval method and retrieval system |
CN113468333A (en) * | 2021-09-02 | 2021-10-01 | 华东交通大学 | Event detection method and system fusing hierarchical category information |
CN113822038A (en) * | 2021-06-03 | 2021-12-21 | 腾讯科技(深圳)有限公司 | Abstract generation method and related device |
CN113836307A (en) * | 2021-10-15 | 2021-12-24 | 国网北京市电力公司 | Power supply service work order hotspot discovery method, system and device and storage medium |
CN114281994A (en) * | 2021-12-27 | 2022-04-05 | 盐城工学院 | Text clustering integration method and system based on three-layer weighting model |
CN114385890A (en) * | 2022-03-22 | 2022-04-22 | 深圳市世纪联想广告有限公司 | Internet public opinion monitoring system |
CN114386422A (en) * | 2022-01-14 | 2022-04-22 | 淮安市创新创业科技服务中心 | Intelligent aid decision-making method and device based on enterprise pollution public opinion extraction |
CN114462393A (en) * | 2022-04-12 | 2022-05-10 | 安徽数智建造研究院有限公司 | Webpage text information extraction method and device, terminal equipment and storage medium |
CN114491207A (en) * | 2022-01-18 | 2022-05-13 | 平安普惠企业管理有限公司 | Public opinion analysis method and related product |
CN114692593A (en) * | 2022-03-21 | 2022-07-01 | 中国刑事警察学院 | Network information safety monitoring and early warning method |
CN115082947A (en) * | 2022-07-12 | 2022-09-20 | 江苏楚淮软件科技开发有限公司 | Paper letter rapid collecting, sorting and reading system |
CN115757793A (en) * | 2022-11-29 | 2023-03-07 | 石家庄赞润信息技术有限公司 | Topic analysis and early warning method and system based on artificial intelligence and cloud platform |
CN116521858A (en) * | 2023-04-20 | 2023-08-01 | 浙江浙里信征信有限公司 | Context semantic sequence comparison method based on dynamic clustering and visualization |
CN117743376A (en) * | 2024-02-19 | 2024-03-22 | 蓝色火焰科技成都有限公司 | Big data mining method, device and storage medium for digital financial service |
CN117786249A (en) * | 2023-12-27 | 2024-03-29 | 王冰 | Network real-time hot topic mining analysis and public opinion extraction system |
CN117910467A (en) * | 2024-03-15 | 2024-04-19 | 成都启英泰伦科技有限公司 | Word segmentation processing method in offline voice recognition process |
CN118520174A (en) * | 2024-07-19 | 2024-08-20 | 西安银信博锐信息科技有限公司 | Customer behavior feature extraction method based on data analysis |
CN118656495A (en) * | 2024-08-20 | 2024-09-17 | 湖南数据产业集团有限公司 | Public opinion publishing traceability method, device, equipment and storage medium thereof |
CN118656496A (en) * | 2024-08-21 | 2024-09-17 | 舟谱数据技术南京有限公司 | NLP-based search data management method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101529418A (en) * | 2006-01-19 | 2009-09-09 | 维里德克斯有限责任公司 | Systems and methods for acquiring analyzing mining data and information |
CN101788988A (en) * | 2009-01-22 | 2010-07-28 | 蔡亮华 | Information extraction method |
US20120030206A1 (en) * | 2010-07-29 | 2012-02-02 | Microsoft Corporation | Employing Topic Models for Semantic Class Mining |
CN102708096A (en) * | 2012-05-29 | 2012-10-03 | 代松 | Network intelligence public sentiment monitoring system based on semantics and work method thereof |
- 2013-10-15: CN application CN201310482522.5A (patent CN103544255B), status: Active
Non-Patent Citations (1)
Title |
---|
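Sun Shuang (孙爽): "Research on Text Clustering Algorithms Based on Semantic Similarity" (基于语义相似度的文本聚类算法的研究), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |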
Cited By (218)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103902659A (en) * | 2014-03-04 | 2014-07-02 | 深圳市至高通信技术发展有限公司 | Public opinion analysis method and corresponding device |
CN103902659B (en) * | 2014-03-04 | 2017-06-27 | 深圳市至高通信技术发展有限公司 | A kind of the analysis of public opinion method and corresponding device |
CN103886051A (en) * | 2014-03-13 | 2014-06-25 | 电子科技大学 | Comment analysis method based on entities and features |
CN103927545A (en) * | 2014-03-14 | 2014-07-16 | 小米科技有限责任公司 | Clustering method and device |
CN104915359A (en) * | 2014-03-14 | 2015-09-16 | 华为技术有限公司 | Theme label recommending method and device |
CN103927545B (en) * | 2014-03-14 | 2017-10-17 | 小米科技有限责任公司 | Clustering method and relevant apparatus |
CN103902674A (en) * | 2014-03-19 | 2014-07-02 | 百度在线网络技术(北京)有限公司 | Method and device for collecting evaluation data of specific subject |
CN103838886A (en) * | 2014-03-31 | 2014-06-04 | 辽宁四维科技发展有限公司 | Text content classification method based on representative word knowledge base |
CN103841216A (en) * | 2014-04-01 | 2014-06-04 | 深圳市科盾科技有限公司 | Network public opinion monitoring system based on cloud platform |
CN104199829A (en) * | 2014-07-25 | 2014-12-10 | 中国科学院自动化研究所 | Emotion data classifying method and system |
CN104199829B (en) * | 2014-07-25 | 2017-07-04 | 中国科学院自动化研究所 | Affection data sorting technique and system |
CN104346425B (en) * | 2014-07-28 | 2017-10-31 | 中国科学院计算技术研究所 | A kind of method and system of the internet public feelings index system of stratification |
CN104346425A (en) * | 2014-07-28 | 2015-02-11 | 中国科学院计算技术研究所 | Method and system of hierarchical internet public sentiment indication system |
CN107077640B (en) * | 2014-09-03 | 2021-07-06 | 邓白氏公司 | System and process for analyzing, qualifying, and ingesting unstructured data sources via empirical attribution |
CN107077640A (en) * | 2014-09-03 | 2017-08-18 | 邓白氏公司 | Analyzed via experience ownership, it is qualification and intake unstructured data sources system and processing |
CN104217718B (en) * | 2014-09-03 | 2017-05-17 | 陈飞 | Method and system for voice recognition based on environmental parameter and group trend data |
CN104268194A (en) * | 2014-09-19 | 2015-01-07 | 国家电网公司 | Method for dynamically generating public opinion brief report |
CN105574047A (en) * | 2014-10-17 | 2016-05-11 | 任子行网络技术股份有限公司 | Website main page feature analysis based Chinese website sorting method and system |
WO2016058267A1 (en) * | 2014-10-17 | 2016-04-21 | 任子行网络技术股份有限公司 | Chinese website classification method and system based on characteristic analysis of website homepage |
CN104504150A (en) * | 2015-01-09 | 2015-04-08 | 成都布林特信息技术有限公司 | News public opinion monitoring system |
CN104504150B (en) * | 2015-01-09 | 2017-09-29 | 成都布林特信息技术有限公司 | News public sentiment monitoring system |
CN105992194B (en) * | 2015-01-30 | 2019-10-29 | 阿里巴巴集团控股有限公司 | The acquisition methods and device of network data content |
CN105992194A (en) * | 2015-01-30 | 2016-10-05 | 阿里巴巴集团控股有限公司 | Network data content acquiring method and network data content acquiring device |
CN104699763B (en) * | 2015-02-11 | 2017-10-17 | 中国科学院新疆理化技术研究所 | The text similarity gauging system of multiple features fusion |
CN104699763A (en) * | 2015-02-11 | 2015-06-10 | 中国科学院新疆理化技术研究所 | Text similarity measuring system based on multi-feature fusion |
CN106156041B (en) * | 2015-03-26 | 2019-05-28 | 科大讯飞股份有限公司 | Hot information finds method and system |
CN106156041A (en) * | 2015-03-26 | 2016-11-23 | 科大讯飞股份有限公司 | Hot information finds method and system |
CN106156192A (en) * | 2015-04-21 | 2016-11-23 | 北大方正集团有限公司 | Public sentiment data clustering method and public sentiment data clustering system |
CN104820629B (en) * | 2015-05-14 | 2018-01-30 | 中国电子科技集团公司第五十四研究所 | A kind of intelligent public sentiment accident emergent treatment system and method |
CN106294358A (en) * | 2015-05-14 | 2017-01-04 | 北京大学 | The search method of a kind of information and system |
CN104820629A (en) * | 2015-05-14 | 2015-08-05 | 中国电子科技集团公司第五十四研究所 | Intelligent system and method for emergently processing public sentiment emergency |
CN104915453A (en) * | 2015-07-01 | 2015-09-16 | 北京奇虎科技有限公司 | Method, device and system for classifying POI information |
CN104899339A (en) * | 2015-07-01 | 2015-09-09 | 北京奇虎科技有限公司 | Method and system for classifying POI (Point of Interest) information |
CN105183803A (en) * | 2015-08-25 | 2015-12-23 | 天津大学 | Personalized search method and search apparatus thereof in social network platform |
CN105183478A (en) * | 2015-09-11 | 2015-12-23 | 中山大学 | Webpage reestablishing method and device based on color transmission |
CN105183478B (en) * | 2015-09-11 | 2018-11-23 | 中山大学 | A kind of webpage reconstructing method and its device based on color transfer |
CN106528581B (en) * | 2015-09-15 | 2019-05-07 | 阿里巴巴集团控股有限公司 | Method for text detection and device |
CN106528581A (en) * | 2015-09-15 | 2017-03-22 | 阿里巴巴集团控股有限公司 | Text detection method and apparatus |
CN106649367A (en) * | 2015-10-30 | 2017-05-10 | 北京国双科技有限公司 | Method and device for detecting popularization degree of keyword |
CN107430633A (en) * | 2015-11-03 | 2017-12-01 | 慧与发展有限责任合伙企业 | The representative content through related optimization being associated to data-storage system |
WO2017080220A1 (en) * | 2015-11-12 | 2017-05-18 | 百度在线网络技术(北京)有限公司 | Knowledge data processing method and apparatus |
CN105279277A (en) * | 2015-11-12 | 2016-01-27 | 百度在线网络技术(北京)有限公司 | Knowledge data processing method and device |
CN105389389B (en) * | 2015-12-10 | 2018-09-25 | 安徽博约信息科技股份有限公司 | A kind of network public-opinion propagation situation medium control analysis method |
CN105389389A (en) * | 2015-12-10 | 2016-03-09 | 安徽博约信息科技有限责任公司 | Network public opinion transmission situation media linked analysis method |
CN105677802A (en) * | 2015-12-31 | 2016-06-15 | 宁波公众信息产业有限公司 | Internet information analysis system |
CN105447202A (en) * | 2015-12-31 | 2016-03-30 | 宁波公众信息产业有限公司 | Internet information collecting system |
CN105677873B (en) * | 2016-01-11 | 2019-03-26 | 中国电子科技集团公司第十研究所 | Text Intelligence association cluster based on model of the domain knowledge collects processing method |
CN105677873A (en) * | 2016-01-11 | 2016-06-15 | 中国电子科技集团公司第十研究所 | Text information associating and clustering collecting processing method based on domain knowledge model |
CN105740238A (en) * | 2016-03-04 | 2016-07-06 | 北京理工大学 | Method for constructing event relationship strength graph fusing sentence meaning information |
CN105740238B (en) * | 2016-03-04 | 2019-02-01 | 北京理工大学 | A kind of event relation intensity map construction method merging sentence justice information |
CN105956070A (en) * | 2016-04-28 | 2016-09-21 | 优品财富管理有限公司 | Method and system for integrating repetitive records |
CN105956069A (en) * | 2016-04-28 | 2016-09-21 | 优品财富管理有限公司 | Network information collection and analysis method and network information collection and analysis system |
CN106126558B (en) * | 2016-06-16 | 2019-09-20 | 东软集团股份有限公司 | A kind of public sentiment monitoring method and device |
CN106126558A (en) * | 2016-06-16 | 2016-11-16 | 东软集团股份有限公司 | A kind of public sentiment monitoring method and device |
CN106294542B (en) * | 2016-07-25 | 2018-03-30 | 北京市信访矛盾分析研究中心 | A kind of letters and calls data mining methods of marking and system |
CN106294542A (en) * | 2016-07-25 | 2017-01-04 | 北京市信访矛盾分析研究中心 | A kind of letters and calls data mining methods of marking and system |
CN106294619A (en) * | 2016-08-01 | 2017-01-04 | 上海交通大学 | Public sentiment intelligent supervision method |
CN109891517A (en) * | 2016-10-25 | 2019-06-14 | 皇家飞利浦有限公司 | The clinical diagnosis assistant of knowledge based figure |
CN106776724A (en) * | 2016-11-16 | 2017-05-31 | 福建天泉教育科技有限公司 | A kind of exercise question sorting technique and system |
CN106599054A (en) * | 2016-11-16 | 2017-04-26 | 福建天泉教育科技有限公司 | Method and system for title classification and push |
CN106651696A (en) * | 2016-11-16 | 2017-05-10 | 福建天泉教育科技有限公司 | Approximate question push method and system |
CN108090040A (en) * | 2016-11-23 | 2018-05-29 | 北京国双科技有限公司 | A kind of text message sorting technique and system |
CN107045524A (en) * | 2016-12-30 | 2017-08-15 | 中央民族大学 | A kind of method and system of network text public sentiment classification |
CN107045524B (en) * | 2016-12-30 | 2019-12-27 | 中央民族大学 | Method and system for classifying network text public sentiments |
CN107016068A (en) * | 2017-03-21 | 2017-08-04 | 深圳前海乘方互联网金融服务有限公司 | Knowledge mapping construction method and device |
CN107918633B (en) * | 2017-03-23 | 2021-07-02 | 广州思涵信息科技有限公司 | Sensitive public opinion content identification method and early warning system based on semantic analysis technology |
CN107918633A (en) * | 2017-03-23 | 2018-04-17 | 广州思涵信息科技有限公司 | Sensitive public sentiment content identification method and early warning system based on semantic analysis technology |
CN107145516A (en) * | 2017-04-07 | 2017-09-08 | 北京捷通华声科技股份有限公司 | A kind of Text Clustering Method and system |
CN107145516B (en) * | 2017-04-07 | 2021-03-19 | 北京捷通华声科技股份有限公司 | Text clustering method and system |
CN107066585B (en) * | 2017-04-17 | 2019-10-01 | 济南大学 | A kind of probability topic calculates and matched public sentiment monitoring method and system |
CN107066585A (en) * | 2017-04-17 | 2017-08-18 | 济南大学 | A kind of probability topic calculates the public sentiment monitoring method and system with matching |
CN107093021A (en) * | 2017-04-21 | 2017-08-25 | 深圳市创艺工业技术有限公司 | Electricity power engineering goods and materials contract is honoured an agreement sincere public sentiment monitoring system |
CN107085608A (en) * | 2017-04-21 | 2017-08-22 | 上海喆之信息科技有限公司 | A kind of effective network hotspot monitoring system |
CN107038156A (en) * | 2017-04-28 | 2017-08-11 | 北京清博大数据科技有限公司 | A kind of hot spot of public opinions Forecasting Methodology based on big data |
CN107291808A (en) * | 2017-05-16 | 2017-10-24 | 南京邮电大学 | It is a kind of that big data sorting technique is manufactured based on semantic cloud |
CN107220236A (en) * | 2017-05-23 | 2017-09-29 | 武汉朱雀闻天科技有限公司 | It is a kind of to determine the doubtful naked method and device for borrowing student |
CN107315778A (en) * | 2017-05-31 | 2017-11-03 | 温州市鹿城区中津先进科技研究院 | A kind of natural language the analysis of public opinion method based on big data sentiment analysis |
CN107292743A (en) * | 2017-06-07 | 2017-10-24 | 前海梧桐(深圳)数据有限公司 | The intelligent decision making method and its system invested and financed for enterprise |
CN107231570A (en) * | 2017-06-13 | 2017-10-03 | 中国传媒大学 | News data content characteristic obtains system and application system |
CN107358344A (en) * | 2017-06-29 | 2017-11-17 | 浙江图讯科技股份有限公司 | Enterprise's hidden danger management method and its management system, electronic equipment and storage medium |
CN107291697A (en) * | 2017-06-29 | 2017-10-24 | 浙江图讯科技股份有限公司 | A kind of semantic analysis, electronic equipment, storage medium and its diagnostic system |
CN107358344B (en) * | 2017-06-29 | 2021-09-03 | 浙江图讯科技股份有限公司 | Enterprise hidden danger management method and management system thereof, electronic equipment and storage medium |
CN107276854B (en) * | 2017-07-27 | 2021-11-09 | 浩鲸云计算科技股份有限公司 | MOLAP statistical analysis method under big data |
CN107276854A (en) * | 2017-07-27 | 2017-10-20 | 中兴软创科技股份有限公司 | A kind of method of MOLAP statistical analyses under big data |
CN107527289A (en) * | 2017-08-25 | 2017-12-29 | 百度在线网络技术(北京)有限公司 | A kind of investment combination industry distribution method, apparatus, server and storage medium |
CN107491438A (en) * | 2017-08-25 | 2017-12-19 | 前海梧桐(深圳)数据有限公司 | Business decision elements recognition method and its system based on natural language |
CN107679084B (en) * | 2017-08-31 | 2021-09-28 | 平安科技(深圳)有限公司 | Clustering label generation method, electronic device and computer readable storage medium |
CN107679084A (en) * | 2017-08-31 | 2018-02-09 | 平安科技(深圳)有限公司 | Cluster labels generation method, electronic equipment and computer-readable recording medium |
CN107679977A (en) * | 2017-09-06 | 2018-02-09 | 广东中标数据科技股份有限公司 | A kind of tax administration platform and implementation method based on semantic analysis |
CN107918644B (en) * | 2017-10-31 | 2020-12-08 | 北京锐思爱特咨询股份有限公司 | News topic analysis method and implementation system in reputation management framework |
CN107918644A (en) * | 2017-10-31 | 2018-04-17 | 北京锐思爱特咨询股份有限公司 | News subject under discussion analysis method and implementation system in reputation Governance framework |
WO2019085355A1 (en) * | 2017-11-01 | 2019-05-09 | 平安科技(深圳)有限公司 | Public sentiment clustering analysis method for internet news, application server, and computer-readable storage medium |
CN108052527A (en) * | 2017-11-08 | 2018-05-18 | 中国传媒大学 | Method is recommended in film bridge piecewise analysis based on label system |
CN108170666A (en) * | 2017-11-29 | 2018-06-15 | 同济大学 | A kind of improved method based on TF-IDF keyword extractions |
CN108197638A (en) * | 2017-12-12 | 2018-06-22 | 阿里巴巴集团控股有限公司 | The method and device classified to sample to be assessed |
CN108197638B (en) * | 2017-12-12 | 2020-03-20 | 阿里巴巴集团控股有限公司 | Method and device for classifying sample to be evaluated |
CN110019720A (en) * | 2017-12-19 | 2019-07-16 | 优酷网络技术(北京)有限公司 | A kind of content of comment, which is separately won, takes method and system |
CN108062306A (en) * | 2017-12-29 | 2018-05-22 | 国信优易数据有限公司 | A kind of index system establishment system and method for business environment evaluation |
CN108363784A (en) * | 2018-01-20 | 2018-08-03 | 西北工业大学 | A kind of public sentiment trend estimate method based on text machine learning |
CN108595466A (en) * | 2018-02-09 | 2018-09-28 | 中山大学 | A kind of filtering of internet information and Internet user's information and net note structure analysis method |
CN108287922B (en) * | 2018-02-28 | 2022-03-08 | 福州大学 | Text data viewpoint abstract mining method fusing topic attributes and emotional information |
CN108287922A (en) * | 2018-02-28 | 2018-07-17 | 福州大学 | A kind of text data viewpoint abstract method for digging of fusion topic attribute and emotion information |
CN108536762A (en) * | 2018-03-21 | 2018-09-14 | 上海蔚界信息科技有限公司 | A kind of high-volume text data automatically analyzes scheme |
CN108681977A (en) * | 2018-03-27 | 2018-10-19 | 成都律云科技有限公司 | A kind of lawyer's information processing method and system |
CN108681977B (en) * | 2018-03-27 | 2022-05-31 | 成都律云科技有限公司 | Lawyer information processing method and system |
CN108550380A (en) * | 2018-04-12 | 2018-09-18 | 北京深度智耀科技有限公司 | A kind of drug safety information monitoring method and device based on public network |
CN108628994A (en) * | 2018-04-28 | 2018-10-09 | 广东亿迅科技有限公司 | A kind of public sentiment data processing system |
CN108932291B (en) * | 2018-05-23 | 2022-08-23 | 福建亿榕信息技术有限公司 | Power grid public opinion evaluation method, storage medium and computer |
CN108932291A (en) * | 2018-05-23 | 2018-12-04 | 福建亿榕信息技术有限公司 | Power grid public sentiment evaluation method, storage medium and computer |
CN108804594A (en) * | 2018-05-28 | 2018-11-13 | 国家计算机网络与信息安全管理中心 | A kind of construction method and device of news content full-text search engine |
CN110633373B (en) * | 2018-06-20 | 2023-06-09 | 上海财经大学 | Automobile public opinion analysis method based on knowledge graph and deep learning |
CN110633373A (en) * | 2018-06-20 | 2019-12-31 | 上海财经大学 | Automobile public opinion analysis method based on knowledge graph and deep learning |
CN110727794A (en) * | 2018-06-28 | 2020-01-24 | 上海传漾广告有限公司 | System and method for collecting and analyzing network semantics and summarizing and analyzing content |
CN109145085A (en) * | 2018-07-18 | 2019-01-04 | 北京市农林科学院 | The calculation method and system of semantic similarity |
CN109145085B (en) * | 2018-07-18 | 2020-11-27 | 北京市农林科学院 | Semantic similarity calculation method and system |
CN109376237A (en) * | 2018-09-04 | 2019-02-22 | 中国平安人寿保险股份有限公司 | Prediction technique, device, computer equipment and the storage medium of client's stability |
CN109376237B (en) * | 2018-09-04 | 2024-05-28 | 中国平安人寿保险股份有限公司 | Client stability prediction method, device, computer equipment and storage medium |
CN109408808B (en) * | 2018-09-12 | 2023-08-22 | 中国传媒大学 | Evaluation method and evaluation system for literature works |
CN109408808A (en) * | 2018-09-12 | 2019-03-01 | 中国传媒大学 | A kind of appraisal procedure and assessment system of artistic works |
CN109446409A (en) * | 2018-09-19 | 2019-03-08 | 杭州安恒信息技术股份有限公司 | A kind of recognition methods of the target object of doubtful multiple level marketing behavior |
CN109214008A (en) * | 2018-09-28 | 2019-01-15 | 珠海中科先进技术研究院有限公司 | A kind of sentiment analysis method and system based on keyword extraction |
CN109299271B (en) * | 2018-10-30 | 2022-04-05 | 腾讯科技(深圳)有限公司 | Training sample generation method, text data method, public opinion event classification method and related equipment |
CN109299271A (en) * | 2018-10-30 | 2019-02-01 | 腾讯科技(深圳)有限公司 | Training sample generation, text data, public sentiment event category method and relevant device |
CN109582953B (en) * | 2018-11-02 | 2023-04-07 | 中国科学院自动化研究所 | Data support scoring method and equipment for information and storage medium |
CN109558586A (en) * | 2018-11-02 | 2019-04-02 | 中国科学院自动化研究所 | A kind of speech of information is according to from card methods of marking, equipment and storage medium |
CN109558586B (en) * | 2018-11-02 | 2023-04-18 | 中国科学院自动化研究所 | Self-evidence scoring method, equipment and storage medium for statement of information |
CN109582953A (en) * | 2018-11-02 | 2019-04-05 | 中国科学院自动化研究所 | A kind of speech of information is according to support methods of marking, equipment and storage medium |
CN109189934A (en) * | 2018-11-13 | 2019-01-11 | 平安科技(深圳)有限公司 | Public sentiment recommended method, device, computer equipment and storage medium |
CN109635074A (en) * | 2018-11-13 | 2019-04-16 | 平安科技(深圳)有限公司 | A kind of entity relationship analysis method and terminal device based on public feelings information |
CN109635074B (en) * | 2018-11-13 | 2024-05-07 | 平安科技(深圳)有限公司 | Entity relationship analysis method and terminal equipment based on public opinion information |
CN109635107A (en) * | 2018-11-19 | 2019-04-16 | 北京亚鸿世纪科技发展有限公司 | Method and device for intelligent semantic analysis and event scene restoration from multiple data sources |
CN109526027B (en) * | 2018-11-27 | 2022-07-01 | 中国移动通信集团福建有限公司 | Cell capacity optimization method, device, equipment and computer storage medium |
CN109526027A (en) * | 2018-11-27 | 2019-03-26 | 中国移动通信集团福建有限公司 | Cell capacity optimization method, device, equipment and computer storage medium |
CN109766438A (en) * | 2018-12-12 | 2019-05-17 | 平安科技(深圳)有限公司 | Biographic information extracting method, device, computer equipment and storage medium |
CN110046292B (en) * | 2018-12-13 | 2024-04-23 | 创新先进技术有限公司 | Public opinion data processing method, device, equipment and storage medium |
CN110046292A (en) * | 2018-12-13 | 2019-07-23 | 阿里巴巴集团控股有限公司 | Public sentiment data processing method, device, equipment and storage medium |
CN111435594A (en) * | 2019-01-14 | 2020-07-21 | 珠海格力电器股份有限公司 | Method and device for acquiring cooking parameters of cooking appliance and cooking appliance |
WO2020164204A1 (en) * | 2019-02-11 | 2020-08-20 | 平安科技(深圳)有限公司 | Text template recognition method and apparatus, and computer readable storage medium |
CN110110156A (en) * | 2019-04-04 | 2019-08-09 | 平安科技(深圳)有限公司 | Industry public sentiment monitoring method, device, computer equipment and storage medium |
CN110134844A (en) * | 2019-04-04 | 2019-08-16 | 平安科技(深圳)有限公司 | Subdivision field public sentiment monitoring method, device, computer equipment and storage medium |
CN110188196A (en) * | 2019-04-29 | 2019-08-30 | 同济大学 | Text increment dimension reduction method based on random forest |
CN110188196B (en) * | 2019-04-29 | 2021-10-08 | 同济大学 | Random forest based text increment dimension reduction method |
CN110222172A (en) * | 2019-05-15 | 2019-09-10 | 北京邮电大学 | Multi-source network public opinion topic mining method based on improved hierarchical clustering |
CN110222172B (en) * | 2019-05-15 | 2021-03-16 | 北京邮电大学 | Multi-source network public opinion theme mining method based on improved hierarchical clustering |
CN110119416A (en) * | 2019-05-16 | 2019-08-13 | 重庆八戒传媒有限公司 | Service data analysis system and method |
CN110188168A (en) * | 2019-05-24 | 2019-08-30 | 北京邮电大学 | Semantic relation recognition method and device |
CN110188168B (en) * | 2019-05-24 | 2021-09-03 | 北京邮电大学 | Semantic relation recognition method and device |
CN110348539A (en) * | 2019-07-19 | 2019-10-18 | 知者信息技术服务成都有限公司 | Short text relevance discrimination method |
CN112348421A (en) * | 2019-08-08 | 2021-02-09 | 北京国双科技有限公司 | Data processing method and device |
CN110472055B (en) * | 2019-08-21 | 2021-09-14 | 北京百度网讯科技有限公司 | Method and device for marking data |
CN110472055A (en) * | 2019-08-21 | 2019-11-19 | 北京百度网讯科技有限公司 | Method and apparatus for labeling data |
CN110532492A (en) * | 2019-08-27 | 2019-12-03 | 东北大学 | Forum data management classification system and method |
CN112541105A (en) * | 2019-09-20 | 2021-03-23 | 福建师范大学地理研究所 | Keyword generation method, public opinion monitoring method, device, equipment and medium |
CN110705288A (en) * | 2019-09-29 | 2020-01-17 | 武汉海昌信息技术有限公司 | Big data-based public opinion analysis system |
CN110852090B (en) * | 2019-11-07 | 2024-03-19 | 中科天玑数据科技股份有限公司 | Mechanism characteristic vocabulary expansion system and method for public opinion crawling |
CN110852090A (en) * | 2019-11-07 | 2020-02-28 | 中科天玑数据科技股份有限公司 | Public opinion crawling mechanism characteristic vocabulary extension system and method |
CN110968668A (en) * | 2019-11-29 | 2020-04-07 | 中国农业科学院农业信息研究所 | Method and device for calculating similarity of network public sentiment subjects based on hyper-network |
CN110991190A (en) * | 2019-11-29 | 2020-04-10 | 华中科技大学 | Document theme enhanced self-attention network, text emotion prediction system and method |
CN110968668B (en) * | 2019-11-29 | 2023-03-14 | 中国农业科学院农业信息研究所 | Method and device for calculating similarity of network public sentiment topics based on hyper-network |
CN110990389A (en) * | 2019-11-29 | 2020-04-10 | 上海易点时空网络有限公司 | Method and device for simplifying question bank and computer readable storage medium |
CN111158973B (en) * | 2019-12-05 | 2021-06-18 | 北京大学 | Web application dynamic evolution monitoring method |
CN111144575A (en) * | 2019-12-05 | 2020-05-12 | 支付宝(杭州)信息技术有限公司 | Public opinion early warning model training method, early warning method, device, equipment and medium |
CN111160019B (en) * | 2019-12-30 | 2023-08-15 | 中国联合网络通信集团有限公司 | Public opinion monitoring method, device and system |
CN111160019A (en) * | 2019-12-30 | 2020-05-15 | 中国联合网络通信集团有限公司 | Public opinion monitoring method, device and system |
CN111241077B (en) * | 2020-01-03 | 2023-06-09 | 四川新网银行股份有限公司 | Identification method of financial fraud based on internet data |
CN111241077A (en) * | 2020-01-03 | 2020-06-05 | 四川新网银行股份有限公司 | Financial fraud behavior identification method based on internet data |
CN111259635A (en) * | 2020-01-09 | 2020-06-09 | 智业软件股份有限公司 | Method and system for completing and predicting medical record written text |
CN111291186B (en) * | 2020-01-21 | 2024-01-09 | 北京捷通华声科技股份有限公司 | Context mining method and device based on clustering algorithm and electronic equipment |
CN111291186A (en) * | 2020-01-21 | 2020-06-16 | 北京捷通华声科技股份有限公司 | Context mining method and device based on clustering algorithm and electronic equipment |
CN111291162A (en) * | 2020-02-26 | 2020-06-16 | 深圳前海微众银行股份有限公司 | Quality test example sentence mining method, device, equipment and computer readable storage medium |
CN111291162B (en) * | 2020-02-26 | 2024-04-09 | 深圳前海微众银行股份有限公司 | Quality inspection example sentence mining method, device, equipment and computer readable storage medium |
CN111401074A (en) * | 2020-04-03 | 2020-07-10 | 山东爱城市网信息技术有限公司 | Short text emotion tendency analysis method, system and device based on Hadoop |
CN111563190A (en) * | 2020-04-07 | 2020-08-21 | 中国电子科技集团公司第二十九研究所 | Multi-dimensional analysis and supervision method and system for user behaviors of regional network |
CN111797333B (en) * | 2020-06-04 | 2021-04-20 | 南京擎盾信息科技有限公司 | Public opinion spreading task display method and device |
CN111797333A (en) * | 2020-06-04 | 2020-10-20 | 南京擎盾信息科技有限公司 | Public opinion spreading task display method and device |
CN111708886A (en) * | 2020-06-11 | 2020-09-25 | 国网天津市电力公司 | Public opinion analysis terminal and public opinion text analysis method based on data driving |
CN111914096A (en) * | 2020-07-06 | 2020-11-10 | 同济大学 | Public transport passenger satisfaction evaluation method and system based on public opinion knowledge graph |
CN111914096B (en) * | 2020-07-06 | 2024-02-02 | 同济大学 | Public opinion knowledge graph-based public transportation passenger satisfaction evaluation method and system |
CN111831922A (en) * | 2020-07-14 | 2020-10-27 | 深圳市众创达企业咨询策划有限公司 | Recommendation system and method based on internet information |
CN111831922B (en) * | 2020-07-14 | 2021-02-05 | 深圳市众创达企业咨询策划有限公司 | Recommendation system and method based on internet information |
CN111914141A (en) * | 2020-07-30 | 2020-11-10 | 广州城市信息研究所有限公司 | Public opinion knowledge base construction method and public opinion knowledge base |
CN112084298A (en) * | 2020-07-31 | 2020-12-15 | 北京明略昭辉科技有限公司 | Public opinion theme processing method and device based on rapid BTM |
CN112214576B (en) * | 2020-09-10 | 2024-02-06 | 深圳价值在线信息科技股份有限公司 | Public opinion analysis method, public opinion analysis device, terminal equipment and computer readable storage medium |
CN112214576A (en) * | 2020-09-10 | 2021-01-12 | 深圳价值在线信息科技股份有限公司 | Public opinion analysis method, device, terminal equipment and computer readable storage medium |
CN112184323A (en) * | 2020-10-13 | 2021-01-05 | 上海风秩科技有限公司 | Evaluation label generation method and device, storage medium and electronic equipment |
CN112528197B (en) * | 2020-11-20 | 2023-07-07 | 四川新网银行股份有限公司 | System and method for monitoring network public opinion in real time based on artificial intelligence |
CN112528197A (en) * | 2020-11-20 | 2021-03-19 | 四川新网银行股份有限公司 | System and method for monitoring network public sentiment in real time based on artificial intelligence |
CN112464653A (en) * | 2020-12-03 | 2021-03-09 | 合肥天源迪科信息技术有限公司 | Real-time event identification and matching method based on communication short message |
CN112650848A (en) * | 2020-12-30 | 2021-04-13 | 交控科技股份有限公司 | Urban railway public opinion information analysis method based on text semantic related passenger evaluation |
CN113282702B (en) * | 2021-03-16 | 2023-12-19 | 广东医通软件有限公司 | Intelligent retrieval method and retrieval system |
CN113282702A (en) * | 2021-03-16 | 2021-08-20 | 广东医通软件有限公司 | Intelligent retrieval method and retrieval system |
CN113032653A (en) * | 2021-04-02 | 2021-06-25 | 盐城师范学院 | Big data-based public opinion monitoring platform |
CN113822038A (en) * | 2021-06-03 | 2021-12-21 | 腾讯科技(深圳)有限公司 | Abstract generation method and related device |
CN113468333A (en) * | 2021-09-02 | 2021-10-01 | 华东交通大学 | Event detection method and system fusing hierarchical category information |
CN113836307A (en) * | 2021-10-15 | 2021-12-24 | 国网北京市电力公司 | Power supply service work order hotspot discovery method, system and device and storage medium |
CN113836307B (en) * | 2021-10-15 | 2024-02-20 | 国网北京市电力公司 | Power supply service work order hot spot discovery method, system, device and storage medium |
CN114281994A (en) * | 2021-12-27 | 2022-04-05 | 盐城工学院 | Text clustering integration method and system based on three-layer weighting model |
CN114386422A (en) * | 2022-01-14 | 2022-04-22 | 淮安市创新创业科技服务中心 | Intelligent aid decision-making method and device based on enterprise pollution public opinion extraction |
CN114386422B (en) * | 2022-01-14 | 2023-09-15 | 淮安市创新创业科技服务中心 | Intelligent auxiliary decision-making method and device based on enterprise pollution public opinion extraction |
CN114491207A (en) * | 2022-01-18 | 2022-05-13 | 平安普惠企业管理有限公司 | Public opinion analysis method and related product |
CN114692593A (en) * | 2022-03-21 | 2022-07-01 | 中国刑事警察学院 | Network information safety monitoring and early warning method |
CN114385890B (en) * | 2022-03-22 | 2022-05-20 | 深圳市世纪联想广告有限公司 | Internet public opinion monitoring system |
CN114385890A (en) * | 2022-03-22 | 2022-04-22 | 深圳市世纪联想广告有限公司 | Internet public opinion monitoring system |
CN114462393A (en) * | 2022-04-12 | 2022-05-10 | 安徽数智建造研究院有限公司 | Webpage text information extraction method and device, terminal equipment and storage medium |
CN115082947A (en) * | 2022-07-12 | 2022-09-20 | 江苏楚淮软件科技开发有限公司 | Paper letter rapid collecting, sorting and reading system |
CN115082947B (en) * | 2022-07-12 | 2023-08-15 | 江苏楚淮软件科技开发有限公司 | Paper letter quick collecting, sorting and reading system |
CN115757793B (en) * | 2022-11-29 | 2023-09-05 | 海南达润丰企业管理合伙企业(有限合伙) | Topic analysis early warning method and system based on artificial intelligence and cloud platform |
CN115757793A (en) * | 2022-11-29 | 2023-03-07 | 石家庄赞润信息技术有限公司 | Topic analysis and early warning method and system based on artificial intelligence and cloud platform |
CN116521858A (en) * | 2023-04-20 | 2023-08-01 | 浙江浙里信征信有限公司 | Context semantic sequence comparison method based on dynamic clustering and visualization |
CN116521858B (en) * | 2023-04-20 | 2024-04-30 | 浙江浙里信征信有限公司 | Context semantic sequence comparison method based on dynamic clustering and visualization |
CN117786249A (en) * | 2023-12-27 | 2024-03-29 | 王冰 | Network real-time hot topic mining analysis and public opinion extraction system |
CN117743376A (en) * | 2024-02-19 | 2024-03-22 | 蓝色火焰科技成都有限公司 | Big data mining method, device and storage medium for digital financial service |
CN117743376B (en) * | 2024-02-19 | 2024-05-03 | 蓝色火焰科技成都有限公司 | Big data mining method, device and storage medium for digital financial service |
CN117910467B (en) * | 2024-03-15 | 2024-05-10 | 成都启英泰伦科技有限公司 | Word segmentation processing method in offline voice recognition process |
CN117910467A (en) * | 2024-03-15 | 2024-04-19 | 成都启英泰伦科技有限公司 | Word segmentation processing method in offline voice recognition process |
CN118520174A (en) * | 2024-07-19 | 2024-08-20 | 西安银信博锐信息科技有限公司 | Customer behavior feature extraction method based on data analysis |
CN118656495A (en) * | 2024-08-20 | 2024-09-17 | 湖南数据产业集团有限公司 | Public opinion publishing traceability method, device, equipment and storage medium thereof |
CN118656496A (en) * | 2024-08-21 | 2024-09-17 | 舟谱数据技术南京有限公司 | NLP-based search data management method and system |
Also Published As
Publication number | Publication date |
---|---|
CN103544255B (en) | 2017-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103544255B (en) | Text semantic relativity based network public opinion information analysis method | |
Chen et al. | Websrc: A dataset for web-based structural reading comprehension | |
CN107229668B (en) | Text extraction method based on keyword matching | |
CN106649260B (en) | Product characteristic structure tree construction method based on comment text mining | |
CN101593200B (en) | Method for classifying Chinese webpages based on keyword frequency analysis | |
CN103365924B (en) | Method, device and terminal for internet information search | |
CN101231661B (en) | Method and system for digging object grade knowledge | |
CN108280114B (en) | Deep learning-based user literature reading interest analysis method | |
CN103023714B (en) | Network topic-based activity and cluster topology analysis system and method | |
CN112650848A (en) | Urban railway public opinion information analysis method based on text semantic related passenger evaluation | |
CN113822067A (en) | Key information extraction method and device, computer equipment and storage medium | |
Yin et al. | Facto: a fact lookup engine based on web tables | |
CN103324666A (en) | Topic tracing method and device based on micro-blog data | |
CN103678564A (en) | Internet product research system based on data mining | |
CN105068991A (en) | Big data based public sentiment discovery method | |
CN106815307A (en) | Public culture knowledge graph platform and method of using it | |
CN105279277A (en) | Knowledge data processing method and device | |
CN111899089A (en) | Enterprise risk early warning method and system based on knowledge graph | |
CN104978332B (en) | User-generated content label data generation method, device and correlation technique and device | |
CN103389998A (en) | Novel Internet commercial intelligence information semantic analysis technology based on cloud service | |
CN104268148A (en) | Forum page information auto-extraction method and system based on time strings | |
CN103309862A (en) | Webpage type recognition method and system | |
CN102929902A (en) | Character splitting method and device based on Chinese retrieval | |
CN108416034B (en) | Information acquisition system based on financial heterogeneous big data and control method thereof | |
CN102654873A (en) | Tourism information extraction and aggregation method based on Chinese word segmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 2022-04-25. Address after: Room 1505, No. 9-1, Taihu East Road, Xinbei District, Changzhou City, Jiangsu Province, 213000. Patentee after: CHANGZHOU HUALONG NETWORK TECHNOLOGY CO.,LTD. Address before: No. 1 Gehu Lake Road, Wujin District, Changzhou City, Jiangsu Province, 213164. Patentee before: CHANGZHOU University |