CN104281694A - Analysis system of emotional tendency of text - Google Patents
- Publication number: CN104281694A
- Application number: CN201410537881.0A
- Authority: CN (China)
- Prior art keywords: text, analysis system, module, trend analysis, texts
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an analysis system for the emotional tendency of text. The analysis system comprises a sample training module, an entity extraction module, a feature extraction module and an emotional tendency recognition module. The sample training module receives the texts to be analyzed and trains on samples to obtain a discrimination template. The entity extraction module extracts the entities of the texts to be discriminated and filters out texts without entities. The feature extraction module extracts the tendency-related features of the texts. The emotional tendency recognition module discriminates the tendency of the texts using the maximum entropy method. With this arrangement, forums and blogs in the field of the enterprise user are collected, the text of the web pages is extracted, the emotional tendency of each text and the entity it targets are obtained through the analysis system, and a chart showing changes in the image of the enterprise and its competitors is generated automatically, thereby improving the accuracy of judging the emotional tendency of text through text classification.
Description
Technical field
The present invention relates to the field of grid computing technology, and in particular to a text sentiment tendency analysis system.
Background art
To search for and analyze the image of an enterprise and its products (CIS), sentiment classification is usually used, and texts are divided into several classes according to the degree of their tendency. The tendency of a text is determined not only by words such as polarity words and degree words, but also by the relative positions of these words and by their relation to entity words. Text classification, however, can only consider word features, so current methods that use text classification to judge the sentiment tendency of text all have low accuracy.
Summary of the invention
To solve the technical problem described in the background art, the present invention proposes a text sentiment tendency analysis system that improves the accuracy of judging the sentiment tendency of text by text classification.
The text sentiment tendency analysis system proposed by the present invention comprises:
A sample training module, for receiving the texts to be analyzed and training on samples to obtain a discrimination template;
An entity extraction module, for extracting entities from the texts to be discriminated and filtering out texts that contain no entity;
A feature extraction module, for extracting the tendency-related features in the texts;
A sentiment tendency identification module, for discriminating the tendency of the texts using the maximum entropy method.
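The entity extraction and filtering step can be pictured as a dictionary lookup. The following is a minimal sketch only: the dictionary contents, the whitespace tokenization, and the names `ENTITY_DICT`, `extract_entities` and `filter_texts` are illustrative assumptions and do not come from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical entity dictionary: surface form -> entity type.
ENTITY_DICT: Dict[str, str] = {
    "acme": "enterprise",
    "acmephone": "product",
    "globex": "competitor",
}


def extract_entities(text: str) -> List[Tuple[str, str]]:
    """Return (entity, type) pairs found in the text, in order of appearance."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    return [(tok, ENTITY_DICT[tok]) for tok in tokens if tok in ENTITY_DICT]


def filter_texts(texts: List[str]) -> List[Tuple[str, List[Tuple[str, str]]]]:
    """Keep only the texts that mention at least one known entity."""
    kept = []
    for text in texts:
        entities = extract_entities(text)
        if entities:  # texts without any entity are dropped
            kept.append((text, entities))
    return kept
```

Chinese text would need a word segmenter in place of `split()`; the lookup and filtering logic would stay the same.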
Preferably, the tendency-related features comprise: polarity words, dimension words, qualifiers, and negation words.
Preferably, an entity dictionary, a polarity dictionary, a dimension dictionary, a qualifier dictionary and other related dictionaries are established before tendency analysis is performed on the texts.
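As a rough illustration of how such dictionaries could yield tendency-related features, the sketch below records each polarity word together with a preceding negation or dimension word and its distance to the nearest entity word. The dictionary contents, the feature naming scheme and the function `extract_features` are assumptions made for this example; a qualifier dictionary could be handled in the same way.

```python
from typing import List, Set

# Hypothetical dictionaries; real ones would be built before the analysis.
POLARITY = {"good": 1, "great": 1, "bad": -1, "poor": -1}
DIMENSION = {"very", "slightly", "extremely"}
NEGATION = {"not", "never", "no"}


def extract_features(text: str, entities: Set[str]) -> List[str]:
    """Emit string features for every polarity word found in the text."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    entity_positions = [i for i, tok in enumerate(tokens) if tok in entities]
    features = []
    for i, tok in enumerate(tokens):
        if tok not in POLARITY:
            continue
        features.append(f"polarity={tok}")
        prev = tokens[i - 1] if i > 0 else ""
        if prev in NEGATION:
            features.append(f"negated={tok}")
        if prev in DIMENSION:
            features.append(f"dimension={prev}_{tok}")
        if entity_positions:
            # Relative position of the polarity word to the nearest entity,
            # capped at 5 to keep the feature space small.
            dist = min(abs(i - p) for p in entity_positions)
            features.append(f"entity_distance={min(dist, 5)}")
    return features
```

Features of this kind capture word position and the relation to entity words, which plain bag-of-words classification does not.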
Preferably, the entity extraction module is specifically used for:
Preprocessing;
Calculating term weights;
Building a learning model from the preprocessed training set and constructing a classifier;
Testing the performance of the constructed classifier on the test-set documents with a chosen test method, and continually feeding back and learning to improve the classifier's performance until it [...].
Preferably, the preprocessing is specifically: representing the document set in a form that is easy for a computer to process, according to the classification model adopted.
Preferably, the calculation of term weights is specifically: representing the importance of each term in a document using a suitable weight calculation method.
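The patent does not name a particular weighting method. One common choice is TF-IDF, sketched below under that assumption; the function name `tf_idf` and the token-list input format are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by term frequency times inverse document frequency."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    weighted = []
    for doc in docs:
        term_freq = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weighted
```

With this unsmoothed form, a term that occurs in every document receives weight 0, which reflects that it does not help tell documents apart.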
Preferably, the feature extraction module is specifically used for:
Extracting the feature words in the text by keyword extraction or feature extraction;
Vectorizing the documents with a vector space model;
Calculating the similarity between documents, and selecting an appropriate algorithm for clustering.
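The three steps above can be sketched as follows, assuming cosine similarity over term-frequency vectors and a greedy single-pass clustering. Both are stand-ins chosen for brevity, since the patent leaves the similarity measure and the clustering algorithm open; the names `vectorize`, `cosine` and `cluster` and the threshold value are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def vectorize(tokens: List[str]) -> Dict[str, float]:
    """Represent a document as a sparse term-frequency vector."""
    return dict(Counter(tokens))


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(vectors: List[Dict[str, float]], threshold: float = 0.5) -> List[List[int]]:
    """Greedy single-pass clustering: a document joins the first cluster
    whose first member is similar enough, otherwise it starts a new cluster."""
    clusters: List[List[int]] = []
    for idx, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters
```

The function returns clusters as lists of document indices, so the original texts can be looked up afterwards.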
In the present invention, forums and blogs in the field of the user enterprise are collected and the text in the web pages is extracted; text sentiment tendency analysis yields the sentiment tendency of each text and the entity it targets (the enterprise, its products, competitors, and so on); and a chart of how the image of the enterprise and its competitors changes is generated automatically, thereby improving the accuracy of judging the sentiment tendency of text by text classification.
Brief description of the drawings
Fig. 1 shows a text sentiment tendency analysis system proposed by an embodiment of the present invention;
Fig. 2 is a functional diagram of the text classification module in Fig. 1;
Fig. 3 is a functional diagram of the text clustering module in Fig. 1.
Detailed description of the embodiments
As shown in Fig. 1, an embodiment of the present invention proposes a text sentiment tendency analysis system, comprising: a sample training module 10, for receiving the texts to be analyzed and training on samples to obtain a discrimination template; an entity extraction module 20, for extracting entities from the texts to be discriminated and filtering out texts that contain no entity; a feature extraction module 30, for extracting the tendency-related features in the texts (polarity words, dimension words, qualifiers, negation words, and so on); and a sentiment tendency identification module 40, for discriminating the tendency of the texts using the maximum entropy method. In addition, an entity dictionary, a polarity dictionary, a dimension dictionary, a qualifier dictionary and other related dictionaries are established before tendency analysis is performed on the texts.
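The maximum entropy step can be illustrated with a small pure-Python classifier. This is a sketch under assumptions rather than the patent's implementation: it treats maximum entropy classification as multinomial logistic regression over binary string features, trains by plain batch gradient ascent, and uses hypothetical feature names such as `polarity=good`.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


class MaxEntClassifier:
    """Minimal maximum entropy classifier over binary string features."""

    def __init__(self, labels: List[str]):
        self.labels = labels
        # One weight per (feature, label) pair, starting at zero.
        self.weights: Dict[Tuple[str, str], float] = defaultdict(float)

    def predict_proba(self, features: List[str]) -> Dict[str, float]:
        scores = {y: sum(self.weights[(f, y)] for f in features) for y in self.labels}
        top = max(scores.values())  # subtract the maximum for numerical stability
        exps = {y: math.exp(s - top) for y, s in scores.items()}
        total = sum(exps.values())
        return {y: e / total for y, e in exps.items()}

    def predict(self, features: List[str]) -> str:
        probs = self.predict_proba(features)
        return max(probs, key=probs.get)

    def fit(self, data: List[Tuple[List[str], str]], epochs: int = 200, lr: float = 0.5) -> None:
        for _ in range(epochs):
            grad: Dict[Tuple[str, str], float] = defaultdict(float)
            for features, gold in data:
                probs = self.predict_proba(features)
                for f in features:
                    for y in self.labels:
                        # Gradient of the log-likelihood:
                        # observed feature count minus expected feature count.
                        grad[(f, y)] += (1.0 if y == gold else 0.0) - probs[y]
            for key, g in grad.items():
                self.weights[key] += lr * g / len(data)


# Toy usage with made-up training examples.
train = [
    (["polarity=good"], "positive"),
    (["polarity=great"], "positive"),
    (["polarity=bad"], "negative"),
    (["polarity=good", "negated=good"], "negative"),
]
clf = MaxEntClassifier(["positive", "negative"])
clf.fit(train)
```

Because the negation feature carries its own weight, the same polarity word can contribute to different predictions depending on context, which is the kind of interaction the tendency-related features are meant to expose.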
The function of the entity extraction module 20 is shown in Fig. 2 and comprises the following steps. First, preprocessing: the document set is represented in a form that is easy for a computer to process, according to the classification model adopted. Second, calculation of term weights: the importance of each term in a document is represented using a suitable weight calculation method. Third, a learning model is built from the preprocessed training set (documents whose classes are already known) and a classifier is constructed. Finally, the performance of the constructed classifier is tested on the test-set documents with a chosen test method, and the classifier's performance is continually improved through feedback and learning until it [...].
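The train-then-test loop described above can be sketched with a held-out test set and an accuracy measure. The word-vote classifier below is a deliberately simple stand-in and not the classifier of the patent; `train_word_vote`, `accuracy` and the example data are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Example = Tuple[List[str], str]  # (tokens, label)


def train_word_vote(train: List[Example]) -> Callable[[List[str]], str]:
    """Toy classifier: each word votes for the labels it appeared with in training."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    label_counts: Counter = Counter()
    for tokens, label in train:
        label_counts[label] += 1
        for tok in tokens:
            counts[tok][label] += 1
    default = label_counts.most_common(1)[0][0]  # fall back to the majority label

    def predict(tokens: List[str]) -> str:
        votes: Counter = Counter()
        for tok in tokens:
            votes.update(counts.get(tok, {}))
        return votes.most_common(1)[0][0] if votes else default

    return predict


def accuracy(predict: Callable[[List[str]], str], test: List[Example]) -> float:
    """Fraction of test documents whose predicted label matches the known label."""
    correct = sum(1 for tokens, label in test if predict(tokens) == label)
    return correct / len(test)


# Hypothetical labelled documents, split into a training set and a test set.
train_set = [
    (["good", "phone"], "positive"),
    (["great", "screen"], "positive"),
    (["bad", "battery"], "negative"),
]
test_set = [
    (["good", "screen"], "positive"),
    (["bad", "battery", "life"], "negative"),
]
predict = train_word_vote(train_set)
score = accuracy(predict, test_set)
```

In a feedback loop of the kind described, the score would be checked after each change to the preprocessing, weighting or model, and the change kept only if the score improves.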
The function of the feature extraction module 30 is shown in Fig. 3 and comprises: extracting the feature words in the text by keyword extraction or feature extraction; then vectorizing the documents with a vector space model; and finally calculating the similarity between documents and selecting an appropriate algorithm for clustering.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent replacement or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
Claims (7)
1. A text sentiment tendency analysis system, characterized in that it comprises:
A sample training module, for receiving the texts to be analyzed and training on samples to obtain a discrimination template;
An entity extraction module, for extracting entities from the texts to be discriminated and filtering out texts that contain no entity;
A feature extraction module, for extracting the tendency-related features in the texts;
A sentiment tendency identification module, for discriminating the tendency of the texts using the maximum entropy method.
2. The text sentiment tendency analysis system according to claim 1, characterized in that the tendency-related features comprise: polarity words, dimension words, qualifiers, and negation words.
3. The text sentiment tendency analysis system according to claim 1, characterized in that an entity dictionary, a polarity dictionary, a dimension dictionary, a qualifier dictionary and other related dictionaries are established before tendency analysis is performed on the texts.
4. The text sentiment tendency analysis system according to claim 1, characterized in that the entity extraction module is specifically used for:
Preprocessing;
Calculating term weights;
Building a learning model from the preprocessed training set and constructing a classifier;
Testing the performance of the constructed classifier on the test-set documents with a chosen test method, and continually feeding back and learning to improve the classifier's performance until it [...].
5. The text sentiment tendency analysis system according to claim 4, characterized in that the preprocessing is specifically: representing the document set in a form that is easy for a computer to process, according to the classification model adopted.
6. The text sentiment tendency analysis system according to claim 4, characterized in that the calculation of term weights is specifically: representing the importance of each term in a document using a suitable weight calculation method.
7. The text sentiment tendency analysis system according to claim 1, characterized in that the feature extraction module is specifically used for:
Extracting the feature words in the text by keyword extraction or feature extraction;
Vectorizing the documents with a vector space model;
Calculating the similarity between documents, and selecting an appropriate algorithm for clustering.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410537881.0A CN104281694A (en) | 2014-10-13 | 2014-10-13 | Analysis system of emotional tendency of text |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410537881.0A CN104281694A (en) | 2014-10-13 | 2014-10-13 | Analysis system of emotional tendency of text |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104281694A (en) | 2015-01-14 |
Family
ID=52256567
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410537881.0A Pending CN104281694A (en) | 2014-10-13 | 2014-10-13 | Analysis system of emotional tendency of text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104281694A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106407177A (en) * | 2016-08-26 | 2017-02-15 | 西南大学 | Emergency online group behavior detection method based on clustering analysis |
CN107102984A (en) * | 2017-04-21 | 2017-08-29 | 中央民族大学 | A kind of Tibetan language microblog emotional sentiment classification method and system |
CN107194739A (en) * | 2017-05-25 | 2017-09-22 | 上海耐相智能科技有限公司 | A kind of intelligent recommendation system based on big data |
CN107526831A (en) * | 2017-09-04 | 2017-12-29 | 华为技术有限公司 | A kind of natural language processing method and apparatus |
CN109165298A (en) * | 2018-08-15 | 2019-01-08 | 上海文军信息技术有限公司 | A kind of text emotion analysis system of autonomous upgrading and anti-noise |
CN109783800A (en) * | 2018-12-13 | 2019-05-21 | 北京百度网讯科技有限公司 | Acquisition methods, device, equipment and the storage medium of emotion keyword |
WO2019174423A1 (en) * | 2018-03-16 | 2019-09-19 | 北京国双科技有限公司 | Entity sentiment analysis method and related apparatus |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101894102A (en) * | 2010-07-16 | 2010-11-24 | 浙江工商大学 | Method and device for analyzing emotion tendentiousness of subjective text |
CN102663046A (en) * | 2012-03-29 | 2012-09-12 | 中国科学院自动化研究所 | Sentiment analysis method oriented to micro-blog short text |
CN104182387A (en) * | 2014-07-21 | 2014-12-03 | 安徽华贞信息科技有限公司 | Text emotional tendency analysis system |
Non-Patent Citations (1)
Title |
---|
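邓时滔 (Deng Shitao): "Research on Sentiment Tendency Classification of Chinese Text" (中文文本情感倾向性分类研究), China Master's Theses Full-text Database, Information Science and Technology * |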
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106407177A (en) * | 2016-08-26 | 2017-02-15 | 西南大学 | Emergency online group behavior detection method based on clustering analysis |
CN107102984A (en) * | 2017-04-21 | 2017-08-29 | 中央民族大学 | A kind of Tibetan language microblog emotional sentiment classification method and system |
CN107194739A (en) * | 2017-05-25 | 2017-09-22 | 上海耐相智能科技有限公司 | A kind of intelligent recommendation system based on big data |
CN107194739B (en) * | 2017-05-25 | 2018-10-26 | 广州百奕信息科技有限公司 | A kind of intelligent recommendation system based on big data |
CN107526831A (en) * | 2017-09-04 | 2017-12-29 | 华为技术有限公司 | A kind of natural language processing method and apparatus |
US11630957B2 (en) | 2017-09-04 | 2023-04-18 | Huawei Technologies Co., Ltd. | Natural language processing method and apparatus |
WO2019174423A1 (en) * | 2018-03-16 | 2019-09-19 | 北京国双科技有限公司 | Entity sentiment analysis method and related apparatus |
CN109165298A (en) * | 2018-08-15 | 2019-01-08 | 上海文军信息技术有限公司 | A kind of text emotion analysis system of autonomous upgrading and anti-noise |
CN109165298B (en) * | 2018-08-15 | 2022-11-15 | 上海五节数据科技有限公司 | Text emotion analysis system capable of achieving automatic upgrading and resisting noise |
CN109783800A (en) * | 2018-12-13 | 2019-05-21 | 北京百度网讯科技有限公司 | Acquisition methods, device, equipment and the storage medium of emotion keyword |
CN109783800B (en) * | 2018-12-13 | 2024-04-12 | 北京百度网讯科技有限公司 | Emotion keyword acquisition method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104281694A (en) | Analysis system of emotional tendency of text | |
Wen et al. | Emotion classification in microblog texts using class sequential rules | |
CN104598535B (en) | A kind of event extraction method based on maximum entropy | |
CN104331506A (en) | Multiclass emotion analyzing method and system facing bilingual microblog text | |
CN103617157A (en) | Text similarity calculation method based on semantics | |
CN104268160A (en) | Evaluation object extraction method based on domain dictionary and semantic roles | |
CN103336766A (en) | Short text garbage identification and modeling method and device | |
CN103744953A (en) | Network hotspot mining method based on Chinese text emotion recognition | |
CN104361037B (en) | Microblogging sorting technique and device | |
CN104317965A (en) | Establishment method of emotion dictionary based on linguistic data | |
CN106682123A (en) | Hot event acquiring method and device | |
US9652997B2 (en) | Method and apparatus for building emotion basis lexeme information on an emotion lexicon comprising calculation of an emotion strength for each lexeme | |
CN108959329A (en) | A kind of file classification method, device, medium and equipment | |
CN108090178A (en) | A kind of text data analysis method, device, server and storage medium | |
CN104008187A (en) | Semi-structured text matching method based on the minimum edit distance | |
CN103617245A (en) | Bilingual sentiment classification method and device | |
CN104794209B (en) | Chinese microblogging mood sorting technique based on Markov logical network and system | |
CN103927342A (en) | Vertical search engine system on basis of big data | |
CN104268214B (en) | A kind of user's gender identification method and system based on microblog users relation | |
Campbell et al. | Content+ context networks for user classification in twitter | |
CN103744958A (en) | Webpage classification algorithm based on distributed computation | |
CN102541935A (en) | Novel Chinese Web document representing method based on characteristic vectors | |
CN114065749A (en) | Text-oriented Guangdong language recognition model and training and recognition method of system | |
CN105243095A (en) | Microblog text based emotion classification method and system | |
CN113626604A (en) | Webpage text classification system based on maximum interval criterion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20150114 |