CN113705238A - Method and model for aspect-level emotion analysis based on BERT and an aspect feature localization model
- Publication number: CN113705238A (application CN202110670846.6A)
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06F40/30 — Semantic analysis
- G06F40/279 — Recognition of textual entities
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
- G06N3/02 — Neural networks
- G06N3/08 — Learning methods
Abstract
The invention relates to an aspect-level emotion analysis method and model based on BERT and an aspect feature localization model. The method comprises the following steps: first, a BERT model is used to obtain high-quality context information representations and aspect information representations, preserving the integrity of the text information; then an attention encoder based on a multi-head attention mechanism is constructed to learn the interaction between the aspect representation and the context representation, integrating the relationship between the aspect words and the context so as to distinguish the contributions of different sentences and aspect words to the classification result; next, an aspect feature localization model is constructed to capture aspect information during sentence modeling and to integrate the complete aspect information into the interactive semantics, reducing the influence of interference words irrelevant to the aspect words and improving the integrity of the aspect word information; finally, the target-related context and the important target information are fused, and an emotion predictor is used to predict the probabilities of the different emotion polarities on the basis of the fused information. The method better models the implicit relationships within the context, makes better use of the aspect word information, and reduces the interference of information unrelated to the aspect words, thereby achieving higher accuracy and macro-F1.
Description
Technical Field
The invention belongs to the technical field of aspect-level emotion analysis, and particularly relates to an aspect-level emotion analysis method and model (ALM-BERT) based on BERT and an aspect feature localization model.
Background
Electronic commerce is a rapidly developing industry whose importance to the global economy grows by the day. In particular, with the rapid development of social media and the continuous spread of social networking platforms, more and more users express emotionally charged comments on various online platforms. These reviews reflect the emotions of users and consumers and provide sellers and governments with a wealth of valuable feedback on the quality of goods or services. For example, before purchasing an item, a user may browse many reviews of it on an e-commerce platform to decide whether it is worth buying. Likewise, governments and enterprises can collect large volumes of public comments directly from the Internet, analyze users' opinions and satisfaction, and better meet their needs. Emotion analysis has therefore attracted a great deal of attention from both academia and industry as a fundamental and critical task in natural language processing.
However, common emotion analysis tasks (e.g., sentence-level emotion analysis) can only determine the user's emotion polarity (e.g., positive, negative, or neutral) toward a product or event from the sentence as a whole, and cannot determine the emotion polarity of a particular aspect mentioned in the sentence. In contrast, aspect-level emotion analysis is a finer-grained classification task that can identify the emotion polarity of each aspect in a sentence. For example, FIG. 9 provides some examples of sentence-level and aspect-based emotion analysis (a consumer review with three aspect words). From the review text "It does not have any accompanying software installed other than Windows Media, but for the price I am very satisfied with its condition and the overall product", we can see that the emotion polarity of the aspect word "software" is negative, that of "Windows Media" is neutral, and that of "price" is positive.
In prior studies, researchers have proposed various methods for the aspect-level emotion analysis task. Most are based on supervised machine learning algorithms and achieve a certain degree of success. However, these statistical methods require carefully designed manual features over large-scale datasets, incurring significant labor and time costs. Given that neural network models can automatically learn low-dimensional representations of aspects and contexts from comment text without relying on manual feature engineering, neural networks have received increasing attention for aspect-level emotion analysis in recent years.
Unfortunately, most existing methods directly use a recurrent neural network (RNN) or a convolutional neural network (CNN) to independently model and express the semantic information of the aspect words and their contexts, ignoring the fact that these networks lack sensitivity to the position of critical components. In practice, researchers have shown that the emotion polarity of aspect words is highly correlated with both aspect word information and word order information, which means that the emotion polarity of an aspect word is more strongly influenced by context words that are closer to it. In addition, it is difficult for such neural networks to capture long-term dependencies between aspect words and context, resulting in the loss of valuable information.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an aspect-level emotion analysis method based on BERT and an aspect feature localization model, which makes better use of aspect word information and reduces the interference of information irrelevant to the aspect words, thereby achieving higher accuracy and macro-F1, together with a system based on this method.
In order to solve the technical problems, the invention adopts the following technical scheme:
The invention provides an aspect-level emotion analysis method based on BERT and an aspect feature localization model, comprising the following steps:
S1. Using a BERT model to obtain high-quality context information representations and aspect information representations, so as to preserve the integrity of the text information;
S2. Constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect representation and the context representation, integrating the relationship between the aspect words and the context, and thereby distinguishing the contributions of different sentences and aspect words to the classification result;
S3. Constructing an aspect feature localization model to capture aspect information during sentence modeling, and integrating the complete aspect information into the interactive semantics, so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information;
S4. Fusing the target-related context and the important target information, and predicting the probabilities of the different emotion polarities with the emotion predictor on the basis of the fused information.
Further, "obtaining high-quality context information representations and aspect information representations using a BERT model" means using a pre-trained BERT model as the text vectorization mechanism to generate high-quality text feature vector representations, where BERT is a pre-trained language representation model and the text vectorization mechanism maps each word to a high-dimensional vector space. Specifically, the BERT model generates text representations with a deep multi-layer bidirectional transformer encoder; it divides a given word sequence into segments by adding special tokens at the beginning and end of the input sequence, generates token embeddings, segment embeddings, and position embeddings for the segments, and finally converts the comment text and the aspect words separately to obtain the context information representation and the aspect information representation.
This conversion step, in which the given word sequence is divided into segments by the special tokens and the comment text and aspect words are converted separately into the context information representation and the aspect information representation, specifically comprises:
the BERT model adds special word segmentation marks [ CLS ] at the beginning and the end of an input sequence respectively]And [ SEP ]]Dividing a given word sequence into different segments, generating mark embedding, segment embedding and position embedding for different segments, enabling the embedded representation of the input sequence to contain all the information of the three embedding, and finally respectively converting the annotation text and the aspect words into 'CLS' in a BERT model]+ annotate text + [ SEP]"and" [ CLS]+ target + [ SEP]"get context representation EcAnd aspect represents Ea:
Ec={we[CLS],we1,we2,...,we[SEP]};
Ea={ae[CLS],ae1,ae2,...,ae[SEP]};
Wherein we[CLS],ae[CLS]Indicates a Classification marker [ CLS]Vector of (2), we[SEP]And ae[SEP]Representation delimiter [ SEP]The vector of (2).
Further, "constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect representation and the context representation and integrate the relationship between the aspect words and the context" means extracting the important features for aspect-level emotion analysis based on the multi-head attention mechanism, i.e., extracting the important information of the context and the target. Specifically: first, a transformer encoder is introduced, which is a feature extractor based on a multi-head attention mechanism and a position feed-forward network that can learn different important information in different feature representation subspaces and directly capture long-term correlations in a sequence; then, interactive semantics are extracted by the transformer encoder from the aspect information representation and the context information representation generated by the BERT model, determining the context words most important for characterizing the emotion of the aspect words; meanwhile, the long-term dependency information and the context-aware information of the context are used as input to the position feed-forward network to generate hidden states, and after a mean pooling operation the final interactive hidden state of the context and the final interactive hidden state of the context and aspect words are obtained.
This extraction of interactive semantics by the transformer encoder specifically comprises:
S201. In the transformer encoder, the multiple self-attention mechanisms that form the multi-head attention mechanism map the aspect information representation and the context information representation generated by the BERT model to a query sequence (Q) and a series of key (K)-value (V) pairs that capture different important information in parallel subspaces;
S202. The attention score of each captured piece of important information is computed with the attention score function f_s(Q,K,V) = σ(f_e(Q,K))V, where σ(·) denotes the normalized exponential function and f_e(Q,K) is an energy function that learns the correlation features between K and Q;
S203. The context representation and the aspect representation are input into the multi-head attention score function f_mh(Q,K,V) = [a_1; a_2; ...; a_i; ...; a_n-head]W_d to obtain, respectively, the long-term dependency information of the context c_cc = f_mh(E_c, E_c) and the context-aware information t_ca = f_mh(E_c, E_a), capturing the long-term dependencies of the context and determining which context words are most important for characterizing the emotion of the aspect words; here a_i denotes the attention score of the i-th captured piece of important information, [a_1; a_2; ...; a_i; ...; a_n-head] denotes their concatenation, and W_d is an attention weight matrix;
S204. The transformer encoder takes c_cc and t_ca as input to a position feed-forward network to generate the hidden states h_c and h_a; the position feed-forward network PFN(h) is a variant of the multi-layer perceptron, and h_c and h_a are defined as follows:
h_c = PFN(c_cc)
h_a = PFN(t_ca)
PFN(h) = ζ(hW_1 + b_1)W_2 + b_2
where ζ(hW_1 + b_1) is a rectified linear unit, b_1 and b_2 are bias values, and W_1 and W_2 are learnable weight parameters;
S205. After a mean pooling operation on the hidden states h_c and h_a, the final interactive hidden state of the context h_cm and the final interactive hidden state of the context and aspect words h_am are obtained.
Further, the aspect feature localization model works as follows (Algorithm 1):
Specifically, according to the position and length of the aspect words, the feature localization algorithm extracts the most important aspect-related information af from the context representation E_c; max pooling is then applied to af to obtain the most important feature AF, a dropout operation is performed on AF, and the important feature h_af of the aspect words within the context representation E_c is obtained.
Further, "fusing the target-related context and the important target information, and predicting the probabilities of the different emotion polarities with the emotion predictor on the basis of the fused information" specifically comprises:
S401. Concatenating h_cm, h_am, and h_af by vector splicing to obtain the overall feature r:
r = [h_cm; h_am; h_af];
S402. Preprocessing r with a linear function, namely:
x = W_u r + b_u, where W_u is a weight matrix and b_u is a bias value;
S403. Computing with the softmax function the probability Pr(a = p) that the emotion polarity of aspect word a in the sentence is p:
Pr(a = p) = exp(x_p) / Σ_{i=1}^{C} exp(x_i)
where p denotes a candidate emotion polarity and C is the number of emotion polarity categories.
Further, the aspect-level emotion analysis method based on BERT and the aspect feature localization model further comprises: training with cross entropy and L2 regularization as the loss function, defined as:
L(θ) = −Σ_{j∈D} Σ_{i=1}^{C} ŷ_j^i · log(y_j^i) + λ‖θ‖²
where D denotes all the training data, j and i index the training samples and emotion classes respectively, λ denotes the factor for L2 regularization, θ denotes the parameter set of the model, y denotes the predicted emotion polarity, and ŷ denotes the correct emotion polarity.
The invention also provides an aspect-level emotion analysis model, comprising:
a text vectorization mechanism, which uses a BERT model to obtain high-quality context information representations and aspect information representations so as to preserve the integrity of the text information;
a feature extraction model for aspect-level emotion analysis, which learns the interaction between the aspect representation and the context representation, integrates the relationship between the aspect words and the context to distinguish the contributions of different sentences and aspect words to the classification result, captures aspect information during sentence modeling, and integrates the complete aspect information into the interactive semantics so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information; and
an emotion predictor, which fuses the target-related context and the important target information and predicts the probabilities of the different emotion polarities on the basis of the fused information.
Further, the BERT model is a pre-trained language representation model that generates text representations with a deep multi-layer bidirectional transformer encoder; it divides a given word sequence into segments by adding special tokens at the beginning and end of the input sequence, generates token embeddings, segment embeddings, and position embeddings for the segments, and finally converts the comment text and the aspect words separately to obtain the context information representation and the aspect information representation.
The feature extraction model for aspect-level emotion analysis comprises an important feature extraction model and an aspect feature localization model. The important feature extraction model is an attention encoder based on a multi-head attention mechanism, used to learn the interaction between the aspect representation and the context representation and to integrate the relationship between the aspect words and the context so as to distinguish the contributions of different sentences and aspect words to the classification result. The aspect feature localization model captures aspect information during sentence modeling and integrates the complete aspect information into the interactive semantics, so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information.
The emotion predictor concatenates, by vector splicing, the final interactive hidden state of the context, the final interactive hidden state of the context and aspect words, and the important feature of the aspect words to obtain the overall feature, then preprocesses the overall feature with a linear function, and finally computes with the softmax function the probability that the emotion polarity of an aspect word in the sentence is each candidate emotion polarity.
The invention has the following beneficial effects:
With the above technical scheme, the transformer encoder better models the implicit relationships within the context, and the aspect feature localization model makes better use of the aspect word information and reduces the interference of information irrelevant to the aspect words, thereby achieving higher accuracy and macro-F1 (on sentences of different lengths, the average accuracy and macro-F1 are 3.1% and 6.56% higher, respectively, than those of AEN). The results also verify the feasibility and effectiveness of the BERT model and of aspect information in the aspect-level emotion analysis task.
Drawings
FIG. 1 is a flow chart of an embodiment of the aspect-level emotion analysis method based on BERT and the aspect feature localization model according to the present invention;
FIG. 2 is a schematic structural diagram of an embodiment of the aspect-level emotion analysis system based on BERT and the aspect feature localization model according to the present invention;
FIG. 3 is a graph of experimental results of drop-rate parameter optimization in an evaluation experiment of the aspect-level emotion analysis method based on BERT and the aspect feature localization model according to the present invention;
FIG. 4 is a graph of experimental results of learning-rate parameter optimization in an evaluation experiment of the aspect-level emotion analysis method based on BERT and the aspect feature localization model according to the present invention;
FIG. 5 is a graph of experimental results of L2 regularization parameter optimization in an evaluation experiment of the aspect-level emotion analysis method based on BERT and the aspect feature localization model according to the present invention;
FIG. 6 is a graph of ROUGE scores (ROUGE-1) on source texts of different lengths in a validation experiment comparing the method with TD-LSTM;
FIG. 7 is a graph of ROUGE scores (ROUGE-2) on source texts of different lengths in a validation experiment comparing the method with TD-LSTM;
FIG. 8 is a graph of ROUGE scores (ROUGE-L) on source texts of different lengths in a validation experiment comparing the method with TD-LSTM;
FIG. 9 is a prior-art example of aspect-level emotion analysis.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, an aspect-level emotion analysis method based on BERT and an aspect feature localization model according to an embodiment of the present invention comprises the following steps:
S1. Using a BERT model to obtain high-quality context information representations and aspect information representations, so as to preserve the integrity of the text information. Specifically, a pre-trained BERT model is used as the text vectorization mechanism to generate high-quality text feature vector representations; BERT is a pre-trained language representation model, and the text vectorization mechanism maps each word to a high-dimensional vector space. The BERT model generates text representations with a deep multi-layer bidirectional transformer encoder; it divides a given word sequence into segments by adding special tokens at the beginning and end of the input sequence, generates token embeddings, segment embeddings, and position embeddings for the segments, and finally converts the comment text and the aspect words separately to obtain the context information representation and the aspect information representation.
S2. Constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect representation and the context representation, integrating the relationship between the aspect words and the context, and thereby distinguishing the contributions of different sentences and aspect words to the classification result. Specifically, the important features for aspect-level emotion analysis are extracted based on the multi-head attention mechanism, i.e., the important information of the context and the target: first, a transformer encoder is introduced, which is a feature extractor based on a multi-head attention mechanism and a position feed-forward network that can learn different important information in different feature representation subspaces and directly capture long-term correlations in a sequence; then, interactive semantics are extracted by the transformer encoder from the aspect information representation and the context information representation generated by the BERT model, determining the context words most important for characterizing the emotion of the aspect words; meanwhile, the long-term dependency information and the context-aware information of the context are used as input to the position feed-forward network to generate hidden states, and after a mean pooling operation the final interactive hidden state of the context and the final interactive hidden state of the context and aspect words are obtained.
S3. Constructing an aspect feature localization model to capture aspect information during sentence modeling, and integrating the complete aspect information into the interactive semantics, so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information. The aspect feature localization module is built on a max pooling function: the extracted aspect words and their hidden context features are divided into several regions, and the maximum value in each region is selected to represent that region, thereby locating the core features. In operation, according to the position and length of the aspect words, the feature localization algorithm extracts the most important aspect-related information af from the context representation E_c; max pooling is applied to af to obtain the most important feature AF, a dropout operation is performed on AF, and the important feature h_af of the aspect words within the context representation E_c is obtained.
S4. Fusing the target-related context and the important target information, and predicting the probabilities of the different emotion polarities with the emotion predictor on the basis of the fused information. Specifically: the final interactive hidden state of the context, the final interactive hidden state of the context and aspect words, and the important feature of the aspect words are concatenated by vector splicing to obtain the overall feature, the overall feature is preprocessed with a linear function, and finally the probability that the emotion polarity of an aspect word in the sentence is each candidate emotion polarity is computed with the softmax function.
As shown in FIG. 2, the present invention further provides an aspect-level emotion analysis model, which comprises a text vectorization mechanism 100, a feature extraction model 200 for aspect-level emotion analysis, and an emotion predictor 300.
The text vectorization mechanism 100 is a multi-angle text vectorization mechanism that uses a BERT model to obtain high-quality context information representations and aspect information representations so as to preserve the integrity of the text information. The BERT model is a pre-trained language representation model that generates text representations with a deep multi-layer bidirectional transformer encoder; it divides a given word sequence into segments by adding special tokens at the beginning and end of the input sequence, generates token embeddings, segment embeddings, and position embeddings for the segments, and finally converts the comment text and the aspect words separately to obtain the context information representation and the aspect information representation.
The feature extraction model 200 for aspect-level emotion analysis learns the interaction between the aspect representation and the context representation, integrates the relationship between the aspect words and the context to distinguish the contributions of different sentences and aspect words to the classification result, captures aspect information during sentence modeling, and integrates the complete aspect information into the interactive semantics. It comprises an important feature extraction model and an aspect feature localization model. The important feature extraction model is an attention encoder based on a multi-head attention mechanism, used to learn the interaction between the aspect representation and the context representation and to integrate the relationship between the aspect words and the context so as to distinguish the contributions of different sentences and aspect words to the classification result. The aspect feature localization model captures aspect information during sentence modeling and integrates the complete aspect information into the interactive semantics; this reduces the influence of interference words irrelevant to the aspect words and improves the integrity of the aspect word information.
The emotion predictor 300 fuses the target-related context and the important target information and predicts the probabilities of the different emotion polarities on the basis of the fused information. Specifically, the final interactive hidden state of the context, the final interactive hidden state of the context and aspect words, and the important feature of the aspect words are concatenated by vector splicing to obtain the overall feature, the overall feature is preprocessed with a linear function, and finally the probability that the emotion polarity of an aspect word in the sentence is each candidate emotion polarity is computed with the softmax function.
In general, aspect-level emotion analysis takes a sentence and some predefined aspect words as input data and outputs the emotion polarity of each aspect word in the sentence. Here we use some practical review examples to illustrate the aspect-level emotion analysis task.
As shown in Table 1, each example sentence contains two aspect words, and each aspect word takes one of four emotion polarities: positive, neutral, negative, and conflict. Aspect-level emotion analysis is then defined as follows:
table 1 some examples of aspect level sentiment analysis
Definition 1: Formally, given a comment sentence S = {w_1, w_2, ..., w_n}, where n is the total number of words in S, and an aspect word list A = {a_1, ..., a_i, ..., a_m} of length m, where a_i denotes the i-th aspect word in A and A is a subsequence of S; P = {p_1, ..., p_j, ..., p_C} denotes the candidate emotion polarities, where C denotes the number of emotion polarity categories and p_j denotes the j-th emotion polarity.
Problem: The goal of the aspect-level emotion analysis model is to predict the most likely emotion polarity for a particular aspect word, which can be expressed as:
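The prediction equation itself is rendered only as an image in the source; a reconstruction consistent with the definitions above (the exact notation is assumed) is:

```latex
\hat{p}(a_i) = \arg\max_{p_j \in P} \phi(a_i, p_j, S) \qquad (1)
```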
where φ denotes a function quantifying the degree of match between the aspect word a_i and the emotion polarity p_j in sentence S. Finally, the model outputs the emotion polarity with the highest matching degree as the classification result. Table 2 summarizes the symbols used in the model and their descriptions.
TABLE 2 symbols used and their description
The aspect-level emotion analysis method based on BERT and the aspect feature localization model of the present invention proceeds as follows: first, a pre-trained BERT model generates high-quality sequence word vectors, providing effective support for the subsequent steps; then, in the feature extraction method for aspect-level emotion analysis, an important feature extraction module based on a multi-head attention mechanism extracts the important information of the context and the target; next, an aspect feature localization model is provided, which comprehensively considers the important features of the target words to obtain target-related features; finally, the target-related context and the important target information are fused, and the emotion predictor predicts the probabilities of the different emotion polarities on the basis of the fused information. The specific method and principles are as follows:
1. multi-angle text vectorization mechanism
The text vectorization mechanism essentially maps each word to a high-dimensional vector space. Two context-based word embedding models, Word2vec and GloVe, are widely applied to text vectorization and achieve strong performance in aspect-level emotion analysis tasks. However, research has shown that these two word embedding models cannot capture enough of the information in the text, which leads to insufficient classification accuracy and reduced performance. A high-quality word embedding model therefore has an important influence on improving the accuracy of the classification result.
The key to aspect-level emotion analysis is effective natural language understanding, which normally depends heavily on large-scale, high-quality labeled text. Fortunately, the BERT model is a language pre-training model that can effectively exploit unlabeled text: by randomly masking part of the vocabulary, it uses a deep multi-layer bidirectional transformer encoder to learn a general language model from massive unlabeled text, which is then fine-tuned with a small amount of labeled data, so that high-quality text feature vector representations can be generated. Inspired by this, in the ALM-BERT method proposed by the present invention, for a given word sequence the special tokens [CLS] and [SEP] are added at the beginning and end of the input sequence respectively, dividing the sequence into segments; the word embedding input thus contains the token embeddings, segment embeddings, and position embeddings generated for the segments. Specifically, the comment text and the aspect words are converted into "[CLS] + comment text + [SEP]" and "[CLS] + target + [SEP]" respectively, giving the context representation E_c and the aspect representation E_a:
E_c = {w_e[CLS], w_e1, w_e2, ..., w_e[SEP]}   (2)
E_a = {a_e[CLS], a_e1, a_e2, ..., a_e[SEP]}   (3)
where w_e[CLS] and a_e[CLS] denote the vectors of the classification token [CLS], and w_e[SEP] and a_e[SEP] denote the vectors of the separator token [SEP].
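As an illustration of this dual-input scheme, the sketch below builds the two BERT inputs and representations with the HuggingFace transformers library; the library choice, the bert-base-uncased checkpoint, and the example sentence are assumptions for illustration, not part of the patent.

```python
# Minimal sketch of the multi-angle text vectorization step (assumptions:
# HuggingFace transformers, bert-base-uncased checkpoint).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

comment = "the food was great but the service was slow"   # comment text
aspect = "service"                                         # aspect word (target)

# The tokenizer inserts [CLS]/[SEP] itself, producing the token, segment, and
# position inputs for "[CLS] + comment text + [SEP]" and "[CLS] + target + [SEP]".
ctx_inputs = tokenizer(comment, return_tensors="pt")
asp_inputs = tokenizer(aspect, return_tensors="pt")

with torch.no_grad():
    E_c = bert(**ctx_inputs).last_hidden_state   # context representation E_c, eq. (2)
    E_a = bert(**asp_inputs).last_hidden_state   # aspect representation E_a, eq. (3)

print(E_c.shape, E_a.shape)   # e.g., torch.Size([1, 12, 768]) torch.Size([1, 3, 768])
```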
2. Feature extraction method for aspect-level emotion analysis
In order to extract the hidden features of the aspect words and their context, with particular attention to the auxiliary information contained in the aspect words, a transformer encoder is introduced and an aspect word feature localization module is proposed. The basic idea is to model the context and the target words interactively so as to fully integrate the information of the aspect words and the context. In addition, obtaining the feature information of the aspect words within the context can improve the accuracy of emotion classification.
2.1 important feature extraction model
A transformer encoder is a feature extractor based on a multi-head attention mechanism and a position feed-forward network. It can learn different important information in different feature representation subspaces. Moreover, the transformer encoder can directly capture long-term correlations in a sequence, is easier to parallelize than recurrent and convolutional neural networks, and greatly reduces training time. The invention extracts interactive semantics from the aspect information representation and the context information representation generated by the BERT model through the transformer encoder, determines the context words most important for characterizing the emotion of the aspect words, uses the long-term dependency information and the context-aware information of the context as input to the position feed-forward network to generate hidden states, and obtains, after a mean pooling operation, the final interactive hidden state of the context and the final interactive hidden state of the context and aspect words.
Intuitively, a multi-head attention mechanism is composed of multiple self-attention mechanisms, which map the input to a query sequence (Q) and a series of key (K)-value (V) pairs that capture different important information in parallel subspaces. The attention score function f_s(·) in the self-attention mechanism is computed as follows:
f_s(Q,K,V) = σ(f_e(Q,K))V   (4)
where σ(·) denotes the normalized exponential function and f_e(·) is an energy function that learns the correlation features between K and Q, calculated with the following formula:
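Equation (5) appears only as an image in the source; the scaled dot-product form commonly paired with transformer encoders is one plausible reconstruction (the exact form used by the patent, including the scaling by the key dimension d_k, is an assumption):

```latex
f_e(Q, K) = \frac{Q K^{\top}}{\sqrt{d_k}} \qquad (5)
```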
The attention score function f_mh(·) of the multi-head attention mechanism concatenates the attention scores of the self-attention mechanisms:
f_mh(Q,K,V) = [a_1; a_2; ...; a_i; ...; a_n-head]W_d   (6)
where a_i denotes the attention score of the i-th captured piece of important information, [a_1; a_2; ...; a_i; ...; a_n-head] denotes the concatenation vector, and W_d is the attention weight matrix.
As shown in equations (8)-(9) below, the context representation and the aspect representation are input into the multi-head attention mechanism to capture the long-term dependencies of the context and determine which context words are most important for characterizing the emotion of the aspect words:
c_cc = f_mh(E_c, E_c)   (8)
t_ca = f_mh(E_c, E_a)   (9)
where c_cc and t_ca are the long-term dependency information and the context-aware information of the context, respectively.
Then, the transformer encoder takes c_cc and t_ca as input to a position feed-forward network to generate the hidden states h_c and h_a. In particular, the position feed-forward network PFN(h) is a variant of the multi-layer perceptron. Formally, PFN, h_c, and h_a are defined as follows:
h_c = PFN(c_cc)   (10)
h_a = PFN(t_ca)   (11)
PFN(h) = ζ(hW_1 + b_1)W_2 + b_2   (12)
where ζ(hW_1 + b_1) is a rectified linear unit, b_1 and b_2 are bias values, and W_1 and W_2 are learnable weight parameters.
After a mean pooling operation on h_c and h_a, the final interactive hidden state of the context h_cm and the final interactive hidden state of the context and aspect words h_am are obtained.
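A minimal PyTorch sketch of this important feature extraction step is given below. The use of nn.MultiheadAttention in place of equations (4)-(6), the query/key assignment for f_mh(E_c, E_a), and the layer sizes are assumptions for illustration:

```python
import torch
import torch.nn as nn

class AttentionEncoder(nn.Module):
    """Sketch of the important feature extraction model (assumed layout)."""
    def __init__(self, dim: int = 768, n_head: int = 12):
        super().__init__()
        self.mha_cc = nn.MultiheadAttention(dim, n_head, batch_first=True)
        self.mha_ca = nn.MultiheadAttention(dim, n_head, batch_first=True)
        # Position feed-forward network PFN(h) = ReLU(h W1 + b1) W2 + b2, eq. (12)
        self.pfn_c = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.pfn_a = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, E_c: torch.Tensor, E_a: torch.Tensor):
        c_cc, _ = self.mha_cc(E_c, E_c, E_c)   # c_cc = f_mh(E_c, E_c), eq. (8)
        t_ca, _ = self.mha_ca(E_c, E_a, E_a)   # t_ca = f_mh(E_c, E_a), eq. (9)
        h_c = self.pfn_c(c_cc)                 # h_c = PFN(c_cc), eq. (10)
        h_a = self.pfn_a(t_ca)                 # h_a = PFN(t_ca), eq. (11)
        # Mean pooling over the sequence dimension yields h_cm and h_am
        return h_c.mean(dim=1), h_a.mean(dim=1)

encoder = AttentionEncoder()
h_cm, h_am = encoder(torch.randn(1, 20, 768), torch.randn(1, 3, 768))
print(h_cm.shape, h_am.shape)   # torch.Size([1, 768]) torch.Size([1, 768])
```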
2.2 aspect feature localization model
The transformer encoder captures the long-term dependencies of the context and generates the semantic information of the interaction between the aspect words and the context. To highlight the importance of different aspect words, the invention builds an aspect word feature localization model. Its main idea is to select aspect-word-related information from the context feature representation and to better integrate the aspect information by capturing the feature representation vectors that contain it, thereby improving the accuracy of aspect-level emotion classification. The working process of the aspect feature localization model is shown as Algorithm 1:
Specifically, according to the position and length of the aspect words, the feature localization algorithm extracts the most important aspect-related information af from the context representation E_c; the most important feature AF is then obtained from af with max pooling, as follows:
AF = Maxpooling(af, dim = 0)   (13)
After that, a dropout operation is performed on the most important feature AF, yielding the important feature h_af of the aspect words within the context representation E_c.
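A minimal sketch of this localization step follows, assuming the aspect span is given as a token start index and length into E_c (the function name and dropout rate are illustrative):

```python
import torch
import torch.nn.functional as F

def aspect_feature_localization(E_c: torch.Tensor, start: int, length: int,
                                p_drop: float = 0.5) -> torch.Tensor:
    """Sketch of Algorithm 1: slice the aspect span out of the context
    representation E_c, max-pool it, and apply dropout."""
    af = E_c[start:start + length]        # aspect-related rows of E_c
    AF = torch.max(af, dim=0).values      # AF = Maxpooling(af, dim=0), eq. (13)
    h_af = F.dropout(AF, p=p_drop, training=True)
    return h_af                           # important aspect feature h_af

E_c = torch.randn(20, 768)                # context representation (n x d)
h_af = aspect_feature_localization(E_c, start=5, length=2)
print(h_af.shape)                         # torch.Size([768])
```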
3. Emotion predictor
First, h_cm, h_am, and h_af are concatenated by vector splicing to obtain the overall feature r:
r = [h_cm; h_am; h_af]   (14)
Then, a linear function is used to preprocess r, namely:
x = W_u r + b_u   (15)
where W_u is a weight matrix and b_u is a bias value.
Finally, the probability Pr(a = p) that the emotion polarity of aspect word a in the sentence is p is computed with the softmax function:
Pr(a = p) = exp(x_p) / Σ_{i=1}^{C} exp(x_i)   (16)
where p denotes a candidate emotion polarity and C is the number of emotion polarity categories.
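The following PyTorch sketch mirrors equations (14)-(16); the hidden size and class count are illustrative assumptions:

```python
import torch
import torch.nn as nn

class EmotionPredictor(nn.Module):
    """Sketch of the emotion predictor: concatenate, project, softmax."""
    def __init__(self, dim: int = 768, n_classes: int = 3):
        super().__init__()
        self.linear = nn.Linear(3 * dim, n_classes)   # x = W_u r + b_u, eq. (15)

    def forward(self, h_cm, h_am, h_af):
        r = torch.cat([h_cm, h_am, h_af], dim=-1)     # r = [h_cm; h_am; h_af], eq. (14)
        x = self.linear(r)
        return torch.softmax(x, dim=-1)               # Pr(a = p), eq. (16)

predictor = EmotionPredictor()
probs = predictor(torch.randn(1, 768), torch.randn(1, 768), torch.randn(1, 768))
print(probs.sum().item())   # probabilities over the C emotion polarities sum to 1
```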
In summary, the aspect-level emotion analysis method based on BERT and the aspect feature localization model of the present invention is an end-to-end process. Furthermore, to optimize the parameters of the method so as to minimize the loss between the predicted emotion polarity y and the correct emotion polarity ŷ, the method further comprises: training with cross entropy and L2 regularization as the loss function, defined as:
L(θ) = −Σ_{j∈D} Σ_{i=1}^{C} ŷ_j^i · log(y_j^i) + λ‖θ‖²   (17)
where D denotes all the training data, j and i index the training samples and emotion classes respectively, λ denotes the factor for L2 regularization, θ denotes the parameter set of the model, y denotes the predicted emotion polarity, and ŷ denotes the correct emotion polarity.
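A minimal training-step sketch under these definitions follows; supplying the λ‖θ‖² term through the optimizer's weight_decay argument and the stand-in model are implementation assumptions (the learning rate and L2 factor match the values reported in the evaluation below):

```python
import torch
import torch.nn as nn

model = nn.Linear(3 * 768, 3)      # stand-in for the full ALM-BERT model
criterion = nn.CrossEntropyLoss()  # cross-entropy part of eq. (17)
# lambda * ||theta||^2 is supplied via weight_decay (an implementation assumption)
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5, weight_decay=0.01)

r = torch.randn(16, 3 * 768)           # a batch of fused features r
y_true = torch.randint(0, 3, (16,))    # correct emotion polarities

optimizer.zero_grad()
loss = criterion(model(r), y_true)
loss.backward()
optimizer.step()
print(loss.item())
```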
4. Evaluation test
In order to evaluate the rationality and effectiveness of the aspect-level emotion analysis method and model based on BERT and the aspect feature localization model, the following evaluation experiments were carried out.
4.1 data set and evaluation index
We constructed the evaluation experiments on three published English review datasets, detailed in Table 3. The Restaurant and Laptop datasets are provided by SemEval (reference: Pontiki M., Galanis D., Pavlopoulos J., et al. SemEval-2014 Task 4: Aspect Based Sentiment Analysis. Proceedings of the International Workshop on Semantic Evaluation, 2014.); each contains aspect words and the corresponding emotion polarities, labeled as positive, negative, neutral, and conflict. The Twitter dataset consists of user comments on Twitter collected by Dong et al. (reference: Dong L., Wei F., Tan C., et al. Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification. Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2014.); its emotion polarities are labeled as positive, negative, and neutral. These three datasets are currently popular comment datasets and are widely used in aspect-level emotion analysis tasks.
TABLE 3 statistical information of data sets
In addition, in order to objectively evaluate the performance of the aspect-level emotion analysis method and model based on BERT and the aspect feature localization model, the evaluation indices commonly used in aspect-level emotion analysis tasks are adopted, namely macro-F1 and accuracy (Acc). Accuracy is defined as:
Acc = SC / N   (18)
where SC denotes the number of correctly classified samples and N denotes the total number of samples. In general, the higher the accuracy, the better the performance of the model.
In addition, macro-F1 is used to reflect the performance of the model more faithfully; it averages, over all classes, the per-class F1 score, which is the harmonic mean of precision and recall. For each emotion polarity i, T is the number of samples correctly classified as polarity i, FP is the number of samples misclassified as polarity i, FN is the number of samples of polarity i misclassified as other polarities, C is the number of emotion polarity categories, P_i is the precision of polarity i, and R_i is the recall of polarity i. In our experiments, to evaluate the performance of the model more comprehensively, we classify the emotion polarities as 3C = {positive, neutral, negative} and 4C = {positive, neutral, negative, conflict}.
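The macro-F1 formulas appear only as images in the source; the standard definitions consistent with the surrounding text are:

```latex
P_i = \frac{T_i}{T_i + FP_i}, \qquad
R_i = \frac{T_i}{T_i + FN_i}, \qquad
\text{macro-}F1 = \frac{1}{C} \sum_{i=1}^{C} \frac{2\, P_i R_i}{P_i + R_i}
```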
4.2 parameter optimization
During model training, we use the BERT model to generate vector representations of the context and aspect words. Specifically, we use the standard BERT_BASE configuration of the BERT model to complete the training, in which the number of transformer blocks, the number of hidden neurons, and the number of self-attention heads are 12, 768, and 12, respectively. Furthermore, to analyze the optimal hyper-parameter settings, we provide several important hyper-parameter examples.
First, the drop rate (dropout) is the probability of dropping some neurons during neural network training, used to address overfitting and enhance the generalization ability of the model. We initialize dropout to 0.3 and then search for the best value at intervals of 0.1. As the experimental results in FIG. 3 show, when dropout is 0.5, the accuracy and F1 of the aspect-level emotion analysis method and model based on BERT and the aspect feature localization model are best on all three datasets.
Second, the learning rate determines whether and when the objective function converges to a local minimum. In our experiments, we use the Adam optimization algorithm to update the parameters of the model and explore the optimal learning rate in the range [10^-5, 0.1]. As shown in FIG. 4, the performance of the method and model is best when the learning rate is 2 × 10^-5.
Finally, the L2 regularization parameter is a hyper-parameter that can prevent the model from overfitting. As shown in FIG. 5, the performance of the method and model is best when the L2 regularization parameter is set to 0.01. Meanwhile, the weights of the model are initialized with the Glorot parameter initialization method, the batch size is set to 16, and training runs for 10 epochs in total.
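Collected in one place, the reported settings amount to the following configuration sketch (the dictionary layout and key names are illustrative, not from the patent):

```python
# Best hyper-parameters reported for ALM-BERT (key names are illustrative).
ALM_BERT_CONFIG = {
    "bert_variant": "BERT_BASE",  # 12 transformer blocks, 768 hidden units, 12 heads
    "dropout": 0.5,               # best value, searched from 0.3 in steps of 0.1
    "learning_rate": 2e-5,        # Adam optimizer, searched within [1e-5, 0.1]
    "l2_regularization": 0.01,
    "weight_init": "glorot",
    "batch_size": 16,
    "epochs": 10,
}
```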
4.3 comparison Algorithm
In order to verify the effectiveness of the aspect-level emotion analysis method and model based on BERT and the aspect feature localization model, they are compared with many popular aspect-level emotion analysis models, as follows:
TD-LSTM is a classical classification model, which integrates related information of the aspect words and their contexts into the LSTM-based classification model, improving the classification accuracy.
ATAE-LSTM is a classification model that inputs the embedded representation of the aspect words as an embedded representation of the sentence into the model, and then applies an attention mechanism to compute weights to achieve high-precision emotion classification.
MemNet is a data-driven classification model that uses multiple attention-based models to capture the importance of each context word to complete emotion classification.
IAN is an interactive attention network that models the aspect words and their contexts, respectively, and generates an associative representation of the target and context.
RAM builds a framework based on the multi-attention mechanism to capture distant features in the text, enhancing the representation ability of the model.
TNet generates hidden representations of context and aspect words using bi-directional LSTM. The CNN layer is used instead of the attention mechanism to extract important features from the hidden representation.
Cabasc utilizes two attention-enhancing mechanisms, focusing on the aspect words and the context separately, and comprehensively considering the context and the correlation between the aspect words.
AOA constructs a dual attention module that links emotion words to aspect words; the module automatically generates mutual attention weights from aspect to text and from text to aspect.
MGAN is a multi-granularity attention model that captures the interaction information between aspect words and context from coarse to fine.
AEN-BERT is a model based on attention mechanism and BERT, showing good performance in the aspect-level sentiment analysis task.
BERT-base is a pre-trained BERT based aspect-level sentiment analysis model with complete connectivity layers and softmax layers for classification tasks.
To measure the performance of the models more accurately, we extended the AOA, IAN, and MemNet models by replacing their embedding layers with the BERT model, yielding the AOA-BERT, IAN-BERT, and MemNet-BERT models. The rest of each model's structure is unchanged.
4.4 evaluation test analysis
Table 4 below shows the emotion classification results when the number of emotion polarity categories C is 3. We can easily observe from the table that the accuracy and macro-F1 of the BERT-based methods (aspect-level emotion analysis methods based on BERT pre-training) are significantly higher than those of the models based on GloVe and word2vec. In particular, on the Restaurant dataset, the accuracy and macro-F1 of the aspect-level emotion analysis method and model based on BERT and the aspect feature localization model are 12.77% and 30.97% higher, respectively, than those of the classical IAN model. This shows that BERT better expresses the semantic and grammatical features of the text, and the proposed method and model achieve the best classification performance on the three datasets. Specifically, on the Restaurant dataset, the accuracy and macro-F1 of the method are 4.2% and 8.81% higher, respectively, than those of the AEN method. In addition, on the Laptop dataset, the classification accuracy and macro-F1 of the method are 3.29% and 3.15% higher, respectively, than those of the BERT-base model, which shows that the aspect feature localization module of the invention plays a positive role in aspect-level emotion analysis.
TABLE 4 Experimental evaluation results for various comparative methods
From the perspective of capturing long-term dependency relationships in comment texts, a series of verification experiments are constructed on texts with different lengths.
As shown in FIGS. 6-8, the aspect-level emotion analysis method and model based on BERT and the aspect feature localization model generally achieve higher accuracy and macro-F1 than TD-LSTM, which means that the transformer encoder we build can model the implicit relationships within the context better than LSTM-based encoders. Furthermore, as shown in FIG. 7, we also note that the average accuracy and macro-F1 of the ALM-BERT model on sentences of different lengths are 3.1% and 6.56% higher, respectively, than those of AEN, because the proposed method and model make better use of the information of the aspect words than AEN and reduce the interference of information unrelated to the aspect words.
In conclusion, the experiments show that the BERT and aspect feature positioning model-based aspect-level emotion analysis method and model can obtain higher accuracy and macro F1, and further verify the feasibility and effectiveness of the BERT model and aspect information in aspect-level emotion analysis tasks.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.
Claims (10)
1. An aspect level sentiment analysis method based on BERT and an aspect feature localization model is characterized by comprising the following steps:
s1, obtaining high-quality context information representation and aspect information representation by using a BERT model so as to keep the integrity of text information;
S2, constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect and context representations, integrating the relationship between the aspect words and the context, and further distinguishing the contributions of different sentences and aspect words to the classification result;
s3, constructing an aspect feature positioning model to capture aspect information during sentence modeling, and integrating complete information of aspects into interactive semantics so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information;
and S4, fusing context related to the target and important target information, and predicting the probability of different emotion polarities by using the emotion prediction factor on the basis of the fused information.
2. The method according to claim 1, wherein "obtaining high-quality context information representation and aspect information representation using the BERT model" means generating high-quality text feature vector representations with a pre-trained BERT model serving as the text vectorization mechanism, where BERT is a pre-trained language representation model and the text vectorization mechanism maps each word into a high-dimensional vector space, specifically: the BERT model generates text representations with a deep bidirectional Transformer encoder; it divides a given word sequence into segments by adding special segmentation markers at the beginning and end of the input sequence respectively, generates token embeddings, segment embeddings, and position embeddings for the segments, and finally converts the comment text and the aspect words separately to obtain the context information representation and the aspect information representation.
3. The method according to claim 2, wherein "constructing an attention encoder based on a multi-head attention mechanism to learn the interaction between the aspect and context representations, and integrate the relationship between the aspect words and the context" means extracting the important features for aspect-level sentiment analysis, namely the important information of the context and the target, based on the multi-head attention mechanism, specifically: first, a Transformer encoder is introduced, which is a feature extractor based on a multi-head attention mechanism and a position feed-forward network that can learn different important information in different feature representation subspaces and directly capture long-term dependencies in a sequence; then, interactive semantics are extracted by the Transformer encoder from the aspect information representation and the context information representation generated by the BERT model, the contexts most important for the sentiment characterization of the aspect words are determined, the long-term dependency information and the context-aware information of the context are used as the input of the position feed-forward network to generate hidden states respectively, and the final interactive hidden state of the context interaction and the final interactive hidden state of the context and the aspect words are obtained after a mean pooling operation.
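For illustration only (not part of the claimed subject matter), the multi-head attention scoring described above might be sketched in PyTorch as follows; the scaled dot-product energy function f_e and all layer names are assumptions, since the claim does not fix their exact form:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """Minimal multi-head attention in the spirit of this claim; the
    scaled dot-product energy function f_e is an assumption."""

    def __init__(self, d_model: int, n_head: int):
        super().__init__()
        assert d_model % n_head == 0
        self.n_head, self.d_k = n_head, d_model // n_head
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_d = nn.Linear(d_model, d_model)  # attention weight matrix W_d

    def forward(self, query: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
        b = query.size(0)
        # Project into n_head parallel subspaces: (batch, n_head, len, d_k).
        q = self.w_q(query).view(b, -1, self.n_head, self.d_k).transpose(1, 2)
        k = self.w_k(key).view(b, -1, self.n_head, self.d_k).transpose(1, 2)
        v = self.w_v(key).view(b, -1, self.n_head, self.d_k).transpose(1, 2)
        # f_s(Q, K, V) = softmax(f_e(Q, K)) V, one attention score a_i per head.
        energy = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        a = F.softmax(energy, dim=-1) @ v
        # Concatenate the heads [a_1; ...; a_n-head] and apply W_d.
        a = a.transpose(1, 2).contiguous().view(b, -1, self.n_head * self.d_k)
        return self.w_d(a)
```

With such a helper, c_cc = mha(E_c, E_c) and t_ca = mha(E_c, E_a) would correspond to the two inputs described in claim 5 below.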
4. The method according to claim 3, wherein "dividing a given word sequence into different segments by adding special segmentation markers at the beginning and end of the input sequence respectively, generating token embedding, segment embedding and position embedding for the different segments, and finally converting the comment text and the aspect words respectively to obtain the context information representation and the aspect information representation" specifically comprises:
the BERT model adds special word segmentation marks [ CLS ] at the beginning and the end of an input sequence respectively]And [ SEP ]]Dividing a given word sequence into different segments, generating mark embedding, segment embedding and position embedding for different segments, enabling the embedded representation of the input sequence to contain all the information of the three embedding, and finally respectively converting the annotation text and the aspect words into 'CLS' in a BERT model]+ annotate text + [ SEP]"and" [ CLS]+ target + [ SEP]"get context representation EcAnd aspect represents Ea:
Ec={we[CLS],we1,we2,...,we[SEP]};
Ea={ae[CLS],ae1,ae2,...,ae[SEP]};
Wherein we[CLS],ae[CLS]Indicates a Classification marker [ CLS]Vector of (2), we[SEP]And ae[SEP]Representation delimiter [ SEP]The vector of (2).
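As an illustrative sketch only, this conversion step can be reproduced with the HuggingFace transformers library; the checkpoint name "bert-base-uncased" and the example sentence are assumptions, not taken from the patent:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

text = "The food was great but the service was slow"  # comment text
aspect = "service"                                     # target / aspect word

# The tokenizer inserts the [CLS] and [SEP] markers itself, producing
# "[CLS] + comment text + [SEP]" and "[CLS] + target + [SEP]"; token,
# segment and position embeddings are built inside the model.
ctx = tokenizer(text, return_tensors="pt")
asp = tokenizer(aspect, return_tensors="pt")

with torch.no_grad():
    E_c = bert(**ctx).last_hidden_state  # context representation E_c
    E_a = bert(**asp).last_hidden_state  # aspect representation E_a

print(E_c.shape, E_a.shape)  # e.g. (1, len_c, 768) and (1, len_a, 768)
```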
5. The method according to claim 4, wherein "extracting interactive semantics from the aspect information representation and the context information representation generated by the BERT model through the Transformer encoder, determining the contexts most important for the sentiment characterization of the aspect words, generating hidden states by using the long-term dependency information and context-aware information of the context as the input of the position feed-forward network, and obtaining the final interactive hidden state of the context interaction and the final interactive hidden state of the context and the aspect words after the mean pooling operation" specifically comprises:
S201, a query sequence Q and a series of key (K)-value (V) pairs for capturing different important information in parallel subspaces are mapped from the aspect information representation and the context information representation generated by the BERT model, through the multiple self-attention mechanisms that constitute the multi-head attention mechanism in the Transformer encoder;
S202, an attention score is calculated for each piece of captured important information through the attention score function f_s(Q, K, V) = σ(f_e(Q, K))·V, where σ(·) denotes the normalized exponential (softmax) function and f_e(Q, K) is an energy function for learning the correlation features between K and Q;
S203, the context representation and the aspect representation are input into the multi-head attention function f_mh(Q, K, V) = [a_1; a_2; ...; a_i; ...; a_n-head]·W_d to obtain, respectively, the long-term dependency information c_cc of the context and the context-aware information t_ca, so as to capture the long-term dependencies of the context and determine which contexts are most important for the sentiment characterization of the aspect words; where a_i denotes the attention score of the i-th piece of captured important information, [a_1; a_2; ...; a_i; ...; a_n-head] denotes the concatenated vector of all heads, W_d is the attention weight matrix, c_cc = f_mh(E_c, E_c), and t_ca = f_mh(E_c, E_a);
S204, the Transformer encoder takes c_cc and t_ca as the input of the position feed-forward network to generate the hidden states h_c and h_a; the position feed-forward network PFN is a variant of the multi-layer perceptron, and h_c and h_a are defined as follows:

h_c = PFN(c_cc);

h_a = PFN(t_ca);

PFN(h) = ζ(h·W_1 + b_1)·W_2 + b_2;

where ζ(h·W_1 + b_1) is a rectified linear unit, b_1 and b_2 are bias values, and W_1 and W_2 are learnable weight parameters;
S205, after a mean pooling operation on the hidden states h_c and h_a, the final interactive hidden state h_cm of the context interaction and the final interactive hidden state h_am of the context and the aspect words are obtained.
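A minimal sketch of steps S204-S205, assuming PyTorch; the hidden width 3072 and all names are illustrative assumptions:

```python
import torch
import torch.nn as nn

class PFN(nn.Module):
    """Position feed-forward network: PFN(h) = ReLU(h W1 + b1) W2 + b2."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)  # W_1, b_1
        self.fc2 = nn.Linear(d_hidden, d_model)  # W_2, b_2

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.fc2(torch.relu(self.fc1(h)))  # ζ is the rectified linear unit

# S204-S205 with the sketch above:
# pfn = PFN(d_model=768, d_hidden=3072)
# h_cm = pfn(c_cc).mean(dim=1)  # final interactive hidden state of the context
# h_am = pfn(t_ca).mean(dim=1)  # final interactive hidden state of context and aspect
```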
6. The method of claim 5, wherein the aspect feature localization model works as the following algorithm 1:
Algorithm 1: aspect feature localization algorithm
Specifically, the aspect feature localization algorithm extracts the most important information af related to the aspect word from the context representation E_c according to the position and length of the aspect word; max pooling is then applied to af to obtain its most important feature AF, a dropout operation is performed on AF, and the important feature h_af of the aspect word within the context representation E_c is obtained.
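A hedged sketch of Algorithm 1 as just described, assuming PyTorch tensors; the function name and dropout rate are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def aspect_feature_localization(E_c: torch.Tensor, start: int, length: int,
                                p_drop: float = 0.1,
                                training: bool = True) -> torch.Tensor:
    """Slice the aspect span out of the context representation according to
    its position and length, max-pool it, then apply dropout.

    E_c: (batch, seq_len, d_model) context representation."""
    af = E_c[:, start:start + length, :]               # aspect-related information af
    AF, _ = af.max(dim=1)                              # max pooling -> most important feature AF
    h_af = F.dropout(AF, p=p_drop, training=training)  # dropout on AF
    return h_af                                        # important aspect feature h_af
```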
7. The method according to claim 6, wherein "fusing the context related to the target with the important target information and predicting the probabilities of different sentiment polarities using the sentiment predictor on the basis of the fused information" specifically comprises:
S401, h_cm, h_am and h_af are concatenated by vector splicing to obtain the overall feature r:

r = [h_cm; h_am; h_af];

S402, data preprocessing is performed on r with a linear function, namely:

x = W_u·r + b_u, where W_u is a weight matrix and b_u is a bias value;

S403, the probability Pr(a = p) that the sentiment polarity of the aspect word a in the sentence is p is calculated with the softmax function:
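Steps S401-S403 might be sketched as follows in PyTorch (illustrative only; the class name and the number of polarity classes are assumptions):

```python
import torch
import torch.nn as nn

class SentimentPredictor(nn.Module):
    """Concatenate h_cm, h_am and h_af, apply x = W_u r + b_u,
    then softmax over the C candidate polarities."""

    def __init__(self, d_model: int, n_classes: int = 3):
        super().__init__()
        self.linear = nn.Linear(3 * d_model, n_classes)  # W_u, b_u

    def forward(self, h_cm, h_am, h_af):
        r = torch.cat([h_cm, h_am, h_af], dim=-1)  # r = [h_cm; h_am; h_af]
        x = self.linear(r)                         # x = W_u r + b_u
        return torch.softmax(x, dim=-1)            # Pr(a = p) for each polarity p
```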
8. The method of any of claims 1-7, further comprising: training with cross entropy and L2 regularization as the loss function, defined as:
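The formula itself is not reproduced here; the following sketch therefore assumes the standard form of cross entropy with an L2 penalty over the model parameters, and the coefficient lam is an assumption:

```python
import torch
import torch.nn.functional as F

def loss_fn(logits, labels, params, lam: float = 1e-5):
    """Cross entropy plus L2 regularization over the model parameters.
    Note: F.cross_entropy expects pre-softmax logits."""
    ce = F.cross_entropy(logits, labels)
    l2 = sum((p ** 2).sum() for p in params)
    return ce + lam * l2
```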
9. An aspect-level sentiment analysis model, comprising:
the text vectorization mechanism obtains high-quality context information representation and aspect information representation by using a BERT model so as to keep the integrity of text information;
the feature extraction model for aspect-level sentiment analysis, which learns the interaction between the aspect representation and the context representation, integrates the relationship between the aspect words and the context to distinguish the contributions of different sentences and aspect words to the classification result, captures aspect information during sentence modeling, and integrates the complete information of the aspects into the interactive semantics, so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information;
and the sentiment predictor, which fuses the context related to the target with the important target information and predicts the probabilities of different sentiment polarities on the basis of the fused information.
10. The aspect level emotion analysis model of claim 9,
the BERT model is a pre-trained language representation model that generates text representations with a deep bidirectional Transformer encoder; it divides a given word sequence into segments by adding special segmentation markers at the beginning and end of the input sequence, generates token embeddings, segment embeddings, and position embeddings for the segments, and finally converts the comment text and the aspect words separately to obtain the context information representation and the aspect information representation;
the feature extraction model for aspect-level sentiment analysis comprises an important feature extraction model and an aspect feature localization model; the important feature extraction model is an attention encoder based on a multi-head attention mechanism, which learns the interaction between the aspect representation and the context representation and integrates the relationship between the aspect words and the context, so as to distinguish the contributions of different sentences and aspect words to the classification result; the aspect feature localization model captures aspect information during sentence modeling and integrates the complete information of the aspects into the interactive semantics, so as to reduce the influence of interference words irrelevant to the aspect words and improve the integrity of the aspect word information;
the sentiment predictor concatenates the final interactive hidden state of the context interaction, the final interactive hidden state of the context and the aspect words, and the important features of the aspect words by vector splicing to obtain the overall feature, then performs data preprocessing on the overall feature with a linear function, and finally calculates with the softmax function the probability that the sentiment polarity of each aspect word in the sentence is a given candidate polarity.
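For orientation only, the hypothetical helpers sketched after the method claims above could compose into a single forward pass as follows; every name here comes from those sketches, not from the patent itself:

```python
import torch

def predict_polarity(bert, tokenizer, mha, pfn, predictor,
                     text: str, aspect: str, start: int, length: int):
    # BERT vectorization of comment text and aspect words (claims 2 and 4).
    ctx = tokenizer(text, return_tensors="pt")
    asp = tokenizer(aspect, return_tensors="pt")
    E_c = bert(**ctx).last_hidden_state
    E_a = bert(**asp).last_hidden_state
    # Interactive semantics and hidden states (claim 5, S203-S205).
    c_cc, t_ca = mha(E_c, E_c), mha(E_c, E_a)
    h_cm, h_am = pfn(c_cc).mean(dim=1), pfn(t_ca).mean(dim=1)
    # Aspect feature localization (claim 6) and sentiment prediction (claim 7).
    h_af = aspect_feature_localization(E_c, start, length, training=False)
    return predictor(h_cm, h_am, h_af)
```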