CN115659995B - Text emotion analysis method and device

Info

Publication number: CN115659995B
Application number: CN202211718347.0A
Authority: CN
Inventors: 许辉鹏
Original assignee: Honor Device Co Ltd
Current assignee: Honor Device Co Ltd
Priority date: 2022-12-30
Filing date: 2022-12-30
Publication date: 2023-05-23
Anticipated expiration: 2042-12-30
Also published as: CN115659995A

Abstract

The embodiments of the present application provide a text emotion analysis method and device, relating to the field of machine learning algorithms, which can improve the accuracy of emotion type analysis results. The method comprises: obtaining comment information and source information corresponding to the comment information; and inputting the comment information and the corresponding source information into an emotion analysis model to obtain an emotion analysis result of the comment information output by the emotion analysis model. The emotion analysis model is obtained by training a preset model on a training data set, wherein the training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information, and the preset model comprises a BERT layer, an expert neural network layer, a classifier layer and a normalized exponential function (softmax) layer.

Description

Text emotion analysis method and device

Technical Field

The present application relates to the field of machine learning algorithms, and in particular, to a text emotion analysis method and apparatus.

Background

At present, in order to improve products and service quality, enterprises can assess the emotional tendency of massive Internet public opinion data (users' comments on the Internet), using consumer comments to learn the strengths and weaknesses of their products and thereby guide product and service improvements.

In the related art, user comments can be processed by a method based on a long short-term memory (LSTM) network and an attention mechanism: a self-attention-based model is built from the text sequence, and the user comments to be analyzed are fed into the model to complete the emotional tendency analysis of the target text. However, this method requires a large amount of supervised data, and when little supervised data is available the accuracy of the emotion type analysis results is low.

Disclosure of Invention

The embodiment of the application provides a text emotion analysis method and device, which can improve the accuracy of emotion type analysis results.

In a first aspect, an embodiment of the present application provides a text emotion analysis method, including: obtaining comment information and source information corresponding to the comment information; and inputting the comment information and the corresponding source information into an emotion analysis model to obtain an emotion analysis result of the comment information output by the emotion analysis model. The emotion analysis model is obtained by training a preset model on a training data set, wherein the training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information, and the preset model comprises a BERT layer, an expert neural network layer, a classifier layer and a normalized exponential function (softmax) layer.

According to the text emotion analysis method provided by the embodiments of the present application, the source information of the comment information is taken into account when the emotion analysis model analyzes the comment information, so the emotion analysis result of the comment information can be more accurate. Compared with the prior art, the emotion analysis model can effectively improve the accuracy of emotion analysis (for example, by about 15 percent) and provides a service with more accurate, real-time emotion analysis capability. By accurately analyzing the emotion tendency of comment information in real time, the recall of negative comments can be improved, the efficiency of entering recalled problems into work orders is improved, and the efficiency with which the business intervenes in negative-comment problems is further improved, thereby avoiding damage to the image of the company (enterprise), loss of customers, declining sales, and the like.

In one possible implementation, inputting the comment information and the corresponding source information into the emotion analysis model to obtain the emotion analysis result of the comment information output by the emotion analysis model includes: inputting the comment information into the BERT layer to obtain a first text vector output by the BERT layer; inputting the source information into the BERT layer to obtain an embedded vector of the source information output by the BERT layer; inputting the first text vector and the embedded vector of the source information into the expert neural network layer to obtain a second text vector output by the expert neural network layer; inputting the second text vector into the classifier layer to obtain a plurality of logit values (logits) corresponding to the comment information output by the classifier layer, wherein each logit value indicates the score of one emotion tendency corresponding to the comment information; inputting the plurality of logit values into the softmax layer to obtain the probability of at least one emotion tendency output by the softmax layer; and determining the emotion analysis result of the comment information according to the probability of the at least one emotion tendency, wherein the emotion analysis result is the emotion tendency with the highest probability.

The expert neural network layer can promote learning of emotion analysis models by utilizing the relation between text characters corresponding to comment information of different sources so as to improve emotion analysis accuracy. The classifier layer can keep the feature independence among the comment information of different sources, so that emotion analysis of the comment information of different sources can be more accurate.

In one possible implementation, inputting the first text vector and the embedded vector of the source information into the expert neural network layer to obtain the second text vector output by the expert neural network layer includes: inputting the first text vector into each of a plurality of expert neural networks included in the expert neural network layer to obtain a third text vector output by each of the expert neural networks; calculating a routing weight for each expert neural network according to the embedded vector of the source information and the third text vector output by that expert neural network; and computing a weighted average of the third text vectors output by the expert neural networks according to their routing weights to obtain the second text vector.

In one possible implementation, the embedded vector of the source information satisfies the following formula:

$$e_{source} = \frac{1}{\mathrm{count}(source)} \sum_{i=1}^{\mathrm{count}(source)} e_i$$

where $e_{source}$ denotes the embedded vector of the source information, $e_i$ denotes the embedded vector corresponding to the $i$-th text character included in the source information, and count(source) denotes the number of text characters corresponding to the source information.
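Assuming the formula is the mean of the per-character embedding vectors, as the definition of count(source) suggests, the computation can be sketched as follows. This is only an illustration: the function name `source_embedding` and the toy 3-dimensional vectors are hypothetical, and in the method itself the per-character vectors would come from the BERT layer.

```python
from typing import List, Sequence


def source_embedding(char_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-character embedding vectors of one source string."""
    if not char_vectors:
        raise ValueError("source information must contain at least one character")
    count = len(char_vectors)  # plays the role of count(source)
    dim = len(char_vectors[0])
    return [sum(vec[d] for vec in char_vectors) / count for d in range(dim)]


# Hypothetical 3-dimensional embeddings for a two-character source name.
example_vectors = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(source_embedding(example_vectors))
```

Averaging gives every source a vector of the same dimension regardless of how many characters its name contains.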

It should be noted that the Embedding vector corresponding to the source information may be calculated from the Embedding vector corresponding to the comment information, and that training the emotion analysis model based on the comment information and the Embedding vector corresponding to the source information can speed up training of the emotion analysis model.

In one possible implementation, the routing weight of each expert neural network and the second text vector satisfy the following formula:

$$h^{m} = \sum_{i=1}^{K} g_i^{m}\, h_i^{m}$$

where $g_i^{m}$ denotes the routing weight of the $i$-th expert neural network, calculated from the embedded vector of the source information $e_{source}$ and $h_i^{m}$; $h^{m}$ denotes the second text vector; $m$ indicates that the comment information is the $m$-th comment information to be processed; $K$ denotes the number of expert neural networks; and $h_i^{m}$ denotes the third text vector output by the $i$-th expert neural network.
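The routing and weighted-average step can be sketched as below. The text only states that each routing weight is calculated from the source embedding and the expert's output, so the scoring rule used here (a softmax over dot products) is an assumption made for illustration, as are the names `routing_weights` and `combine_experts` and the toy vectors.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def routing_weights(source_vec: Vector, expert_outputs: Sequence[Vector]) -> List[float]:
    """ASSUMED scoring rule: softmax over dot(source_vec, expert_output)."""
    scores = [dot(source_vec, h) for h in expert_outputs]
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(source_vec: Vector, expert_outputs: Sequence[Vector]) -> List[float]:
    """Weighted average of the expert outputs, i.e. the second text vector."""
    weights = routing_weights(source_vec, expert_outputs)
    dim = len(expert_outputs[0])
    return [sum(w * h[d] for w, h in zip(weights, expert_outputs)) for d in range(dim)]


# Hypothetical values: K = 2 experts, 2-dimensional vectors.
source_vec = [1.0, 0.0]
third_text_vectors = [[1.0, 0.0], [0.0, 1.0]]
second_text_vector = combine_experts(source_vec, third_text_vectors)
print(second_text_vector)
```

Because the weights are normalized to sum to 1, the result is a true weighted average, and the expert whose output aligns best with the source embedding contributes most.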

The expert neural networks in the expert neural network layer can promote learning of the emotion analysis model by utilizing the relation between text characters corresponding to comment information of different sources so as to improve emotion analysis accuracy.

In one possible implementation, the classifier layer includes a plurality of classifiers, each of which is configured to process comment information corresponding to one kind of source information. Inputting the second text vector into the classifier layer to obtain the plurality of logit values corresponding to the comment information output by the classifier layer includes: inputting the second text vector into a first classifier among the plurality of classifiers to obtain the logit values corresponding to the comment information output by the first classifier, wherein the first classifier corresponds to the source information of the comment information.

Each classifier in the classifier layer can keep the feature independence among comment information of different sources, so that emotion analysis of the comment information of different sources can be more accurate.
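A minimal sketch of the per-source classifier layer follows, assuming one small linear classifier per source. The source names, weights and helper names (`make_linear_classifier`, `CLASSIFIERS`, `logits_for`) are hypothetical and serve only to show how the classifier is selected by source.

```python
from typing import Callable, Dict, List, Sequence

Classifier = Callable[[Sequence[float]], List[float]]


def make_linear_classifier(weights: List[List[float]], bias: List[float]) -> Classifier:
    """Build a linear classifier returning one score per emotion tendency."""
    def classify(vec: Sequence[float]) -> List[float]:
        return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]
    return classify


# Hypothetical sources and parameters; each source gets its own classifier.
CLASSIFIERS: Dict[str, Classifier] = {
    "e-commerce": make_linear_classifier(
        [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
    "questionnaire": make_linear_classifier(
        [[0.5, 0.5], [0.0, 0.0], [-0.5, -0.5]], [0.1, 0.0, -0.1]),
}


def logits_for(source: str, second_text_vector: Sequence[float]) -> List[float]:
    """Route the second text vector to the classifier matching its source."""
    if source not in CLASSIFIERS:
        raise KeyError(f"no classifier registered for source {source!r}")
    return CLASSIFIERS[source](second_text_vector)


print(logits_for("e-commerce", [2.0, 1.0]))
```

Keeping a separate set of parameters per source is what preserves the feature independence described above: the same text vector can receive different scores depending on where the comment came from.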

In one possible implementation, the probability of at least one emotion tendency output by the softmax layer satisfies the following formula:

$$p_i = \begin{cases} \dfrac{e^{z_i}}{\sum_{j \in S} e^{z_j}}, & i \in S \\ c, & i \notin S \end{cases} \qquad i = 1, \dots, C$$

where $p_i$ denotes the probability output by the softmax layer for the $i$-th category, $z_i$ denotes the corresponding logit value, $k$ denotes the number of selected logit values (those whose probability is greater than or equal to a preset threshold), $S$ denotes the set of the first $k$ logit values, $c$ is a constant, and $C$ denotes the number of classification categories.

The softmax layer may set the probability of every logit value other than the selected logit values to a constant c (which may be 0), so as to prevent the emotion analysis model from overfitting to the probabilities of weakly related classes and to enhance the generalization capability of the emotion analysis model (i.e., its ability to give appropriate emotion analysis results for comment information from different sources).
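The behaviour described above can be sketched as a top-k sparse softmax. This sketch assumes the selected set consists of the k largest scores; the function name `sparse_softmax` and the example scores are hypothetical.

```python
import math
from typing import List, Sequence


def sparse_softmax(logits: Sequence[float], k: int, c: float = 0.0) -> List[float]:
    """Softmax over the k largest scores; every other class gets the constant c."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of classes")
    # Indices of the k largest scores form the selected set S.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    selected = set(order[:k])
    shift = max(logits)  # subtract the maximum for numerical stability
    denom = sum(math.exp(logits[i] - shift) for i in selected)
    return [
        math.exp(z - shift) / denom if i in selected else c
        for i, z in enumerate(logits)
    ]


probs = sparse_softmax([2.0, 1.0, -3.0], k=2)
print(probs)
```

With c = 0 the selected probabilities still sum to 1, while the least relevant class contributes nothing to the output and therefore nothing to the loss.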

In a second aspect, an embodiment of the present application provides a training method for emotion analysis model, including: constructing a training data set, wherein the training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information; training a preset model based on a training data set to obtain an emotion analysis model, wherein the emotion analysis model is used for performing emotion analysis on comment information, and the preset model comprises a BERT layer, an expert neural network layer, a classifier layer and a normalized exponential function softmax layer.

Based on the method provided by the embodiment of the application, the preset model can be trained based on the training data set to obtain the emotion analysis model. As the training data set comprises the source information of the comment information, the emotion analysis result of the comment information by the emotion analysis model obtained by training according to the training data set is more accurate. Compared with the prior art, the emotion analysis model can effectively improve the accuracy rate of emotion analysis (for example, can be improved by about 15 percent) and provides more accurate and real-time emotion analysis capability for a service.

In one possible implementation, training the preset model based on the training data set to obtain the emotion analysis model includes: selecting one piece of sample comment information from the training data set as target comment information; processing the target comment information with the preset model to obtain the preset model's emotion classification result for the target comment information; calculating a loss value of the preset model based on the difference between that emotion classification result and the labeled emotion analysis result of the target comment information; and adjusting the parameters of the preset model based on the loss value and returning to the step of selecting one piece of sample comment information from the training data set, until the preset model converges, whereupon the trained preset model is used as the emotion analysis model.

Based on the method provided by the embodiments of the present application, the loss value of the preset model can be calculated from the difference between the preset model's emotion classification result for the target comment information and the labeled emotion analysis result of the target comment information, and the parameters of the preset model are adjusted according to the loss value. Because the training data set includes not only a plurality of pieces of sample comment information but also the different source information corresponding to the different pieces of sample comment information, the emotion analysis model trained on this data set can better process comment information from different sources and obtain more accurate emotion analysis results.
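The select, predict, compute loss, adjust loop can be sketched with a toy stand-in for the preset model. The sketch below swaps in a plain linear softmax classifier trained by stochastic gradient descent on cross-entropy loss; the model, the loss choice, the convergence test (mean loss below `tol`) and all names are assumptions, not the BERT-based model of the application.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, index of the labeled emotion)


def softmax(logits: Sequence[float]) -> List[float]:
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights: List[List[float]], features: Sequence[float]) -> List[float]:
    """Toy model: linear scores followed by softmax."""
    return softmax([sum(w * x for w, x in zip(row, features)) for row in weights])


def mean_loss(weights: List[List[float]], samples: Sequence[Sample]) -> float:
    """Average cross-entropy between predictions and labels."""
    return sum(-math.log(forward(weights, x)[y]) for x, y in samples) / len(samples)


def train(samples: Sequence[Sample], num_classes: int, lr: float = 0.5,
          tol: float = 0.05, max_steps: int = 5000, seed: int = 0) -> List[List[float]]:
    rng = random.Random(seed)
    dim = len(samples[0][0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(max_steps):
        if mean_loss(weights, samples) < tol:  # assumed convergence criterion
            break
        features, label = rng.choice(samples)  # select one target sample
        probs = forward(weights, features)     # model's classification result
        for c, row in enumerate(weights):      # gradient of the cross-entropy loss
            grad = probs[c] - (1.0 if c == label else 0.0)
            for d in range(dim):
                row[d] -= lr * grad * features[d]
    return weights


def predict(weights: List[List[float]], features: Sequence[float]) -> int:
    probs = forward(weights, features)
    return max(range(len(probs)), key=probs.__getitem__)


toy_samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
model = train(toy_samples, num_classes=2)
print([predict(model, x) for x, _ in toy_samples])
```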

In a third aspect, the present application provides a computer program product which, when run on a computer, causes the computer to perform the method of the first aspect, the second aspect and any one of its possible designs.

In a fourth aspect, embodiments of the present application provide an emotion analysis device, including a processor, the processor being coupled to a memory, the memory storing program instructions that, when executed by the processor, cause the device to implement the method of any one of the first aspect, the second aspect, and any one of the possible designs thereof. The apparatus may be an electronic device or a server device; or may be an integral part of an electronic device or server device, such as a chip.

In a fifth aspect, an embodiment of the present application provides an emotion analysis device, where the device may be functionally divided into different logic units or modules, and each unit or module performs a different function, so that the device performs the method described in any one of the foregoing first aspect, second aspect, and any one of possible design manners thereof.

In a sixth aspect, the present application provides a chip system comprising one or more interface circuits and one or more processors. The interface circuit and the processor are interconnected by a wire. The chip system described above may be applied to an electronic device including a communication module and a memory. The interface circuit is for receiving signals from a memory of the electronic device and transmitting the received signals to the processor, the signals including computer instructions stored in the memory. When executed by a processor, the electronic device may perform the method as described in the first aspect, the second aspect and any one of its possible designs.

In a seventh aspect, the present application provides a computer-readable storage medium comprising computer instructions. When the computer instructions are run on an electronic device or server, the electronic device or server is caused to perform the method as described in the first aspect, the second aspect and any one of its possible designs.

It will be appreciated that the benefits achieved by the computer program product according to the third aspect, the apparatus according to the fourth aspect, the apparatus according to the fifth aspect, the chip system according to the sixth aspect, and the computer readable storage medium according to the seventh aspect provided above may refer to the benefits as in the first aspect, the second aspect, and any possible design manners thereof, and are not described herein again.

Drawings

FIG. 1 is a schematic diagram of comment information provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of a system architecture according to an embodiment of the present disclosure;

FIG. 3 is a schematic hardware structure of an apparatus according to an embodiment of the present application;

FIG. 4 is a schematic flow chart of a text emotion analysis method according to an embodiment of the present application;

FIG. 5 is a schematic diagram of an emotion analysis model according to an embodiment of the present application;

FIG. 6 is a schematic diagram of an emotion analysis flow provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of an embedded vector corresponding to comment information provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of an embedded vector of source information according to an embodiment of the present disclosure;

FIG. 9 is a schematic output diagram of a sparse softmax provided by an embodiment of the present application;

FIG. 10 is a schematic flow chart of a training method for emotion analysis model according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a chip system according to an embodiment of the present application.

Detailed Description

For clarity and conciseness in the description of the embodiments below, a brief introduction to related concepts or technologies is first given:

(1) Encoding-decoding model (encoder-decoder model)

Encoding converts an input sequence into a fixed-length vector; decoding converts the previously generated fixed-length vector back into an output sequence. The encoding-decoding model is applied to seq2seq problems. Put simply, a seq2seq problem uses an input sequence x to generate an output sequence y. seq2seq has many applications, such as translation tasks, question-answering tasks and emotion analysis tasks. For example, in a translation task, the input sequence is the text to be translated and the output sequence is the translated text; in a question-answering task, the input sequence is the question posed and the output sequence is the answer; in an emotion analysis task, the input sequence is comment information (public opinion information) and the output sequence is the emotion tendency (emotion analysis result) of the comment information.

(2) Transformer model

The Transformer model is essentially an Encoder-Decoder architecture and can be divided into two parts: an encoding component and a decoding component. The encoding component consists of multiple layers of encoders (Encoder), and the decoding component consists of the same number of layers of decoders (Decoder). Each encoder consists of two sublayers: a self-attention layer (Self-Attention layer) and a position-wise feed-forward network (Position-wise Feed Forward Network, Position-wise FFN). All encoders have the same structure but use different weight parameters. Each decoder also contains these two sublayers, with an additional attention layer between them (Encoder-Decoder Attention) that helps the decoder attend to the relevant parts of the input sentence.

(3) BERT (Bidirectional Encoder Representations from Transformer) model

The BERT model is an encoder based on the Transformer; its main structure is a stack of Transformer encoder layers.

BERT is a pre-trained model. Put simply, pre-training works as follows: given a training set for task A, the network is first pre-trained on it, and the network parameters learned on task A are saved for later use. When a new task (task B) arrives, the same network structure is adopted; the parameters learned on task A are loaded when the network parameters are initialized, the other high-level parameters are randomly initialized, and the network is then trained on the training data of task B. If the loaded parameters are kept unchanged, this is called "frozen"; if the loaded parameters continue to change as training on task B proceeds, this is called "fine-tuning", i.e., the parameters are adjusted to better suit the current task B. Thus, when task B has little training data and it would be difficult to train a good network from scratch, starting from the parameters learned on task A gives better results than training on the task B data alone.

The BERT model converts each character/word in the text into a one-dimensional vector by looking it up in a word vector table and takes these vectors as the model input; the model output is, for each input character/word, a vector representation that incorporates the semantic information of the full text. In addition to the word vector, the model input may contain two other parts:

1. Text vector: the value of this vector is learned automatically during model training; it describes the global semantic information of the text and is fused with the semantic information of the individual characters/words.

2. Position vector: because characters/words appearing at different positions in a text carry different semantic information (e.g., "I love you" versus "you love me"), the BERT model attaches a different vector to characters/words at different positions to distinguish them.
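How the three parts combine can be sketched as follows. In BERT the word (token), text (segment) and position vectors are summed element-wise to form the input for each token; the lookup tables below are hypothetical toy values, whereas in a real model they are learned parameters.

```python
from typing import Dict, List, Sequence

DIM = 3

# Hypothetical lookup tables; in a real BERT model these are learned parameters.
TOKEN_TABLE: Dict[str, List[float]] = {
    "I": [1.0, 0.0, 0.0],
    "love": [0.0, 1.0, 0.0],
    "you": [0.0, 0.0, 1.0],
    "me": [1.0, 0.0, 1.0],
}
SEGMENT_VECTOR: List[float] = [0.5, 0.5, 0.5]  # text vector for a single text
POSITION_TABLE: List[List[float]] = [[0.1 * p] * DIM for p in range(8)]


def input_vectors(tokens: Sequence[str]) -> List[List[float]]:
    """Sum the word, text and position vectors for each token."""
    rows = []
    for position, token in enumerate(tokens):
        tok, pos = TOKEN_TABLE[token], POSITION_TABLE[position]
        rows.append([t + s + p for t, s, p in zip(tok, SEGMENT_VECTOR, pos)])
    return rows


first = input_vectors(["I", "love", "you"])
second = input_vectors(["you", "love", "me"])
# "you" is the same word in both texts, but its input vector depends on position.
print(first[2], second[0])
```

The word "love" sits at the same position in both texts and so receives the same input vector, while "you" does not.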

The BERT model contains multiple layers, and the data of each layer reflects a different aspect of the input data, such as the syntax or semantics of the input text; the specific content is not limited herein.

(4) Embedding

Embedding is a way to convert discrete variables into continuous vector representations. In neural networks, embedding can reduce the spatial dimension of a discrete variable while still representing the variable meaningfully.

In short, Embedding represents an object, which may be a word, a commodity, or the like, by a low-dimensional vector. A property of Embedding vectors is that objects whose vectors are close to each other have similar meanings; for example, the distance between the embeddings of "pattern" and "grass" is very small, whereas the distance between the embeddings of "pattern" and "dog" is large.
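The distance property can be illustrated with a few hand-picked vectors. The 2-dimensional values below are hypothetical and chosen only so that the first two words lie close together; real embeddings are learned and have many more dimensions.

```python
import math
from typing import Dict, List

# Hypothetical 2-dimensional embeddings, chosen only to illustrate the idea.
EMBEDDINGS: Dict[str, List[float]] = {
    "pattern": [1.0, 1.0],
    "grass": [1.0, 2.0],
    "dog": [5.0, 5.0],
}


def distance(a: str, b: str) -> float:
    """Euclidean distance between the embeddings of two words."""
    return math.dist(EMBEDDINGS[a], EMBEDDINGS[b])


print(distance("pattern", "grass"), distance("pattern", "dog"))
```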

(5) softmax function: also called the normalized exponential function, it presents multi-class classification results in the form of probabilities. Probabilities have two properties: 1) each predicted probability is non-negative; 2) the probabilities of all prediction results sum to 1. The softmax function transforms prediction results ranging from negative infinity to positive infinity into probabilities in the following two steps.

First, softmax applies the exponential function to each prediction result of the model; because the range of the exponential function is zero to positive infinity, the non-negativity of the probabilities is guaranteed. Second, the exponentiated results are normalized: each exponentiated result is divided by the sum of all exponentiated results, which can be understood as its share of the total. This yields the probabilities and guarantees that the probabilities of all prediction results sum to 1.
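The two steps can be written out directly. This is a minimal sketch; the subtraction of the maximum score is a common numerical-stability detail that the text does not mention and that leaves the result unchanged.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    # Step 1: exponentiate, which maps any real score to a positive number.
    # Subtracting the maximum first avoids overflow for large scores.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    # Step 2: normalize by the sum so that the outputs add up to 1.
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([-1.0, 0.0, 3.0])
print(probs)
```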

Currently, users comment on (or evaluate) products (e.g., electronic products) in diverse ways. For example, a user may comment on various platforms on the Internet; FIG. 1 shows an example of an online comment (a negative comment). A user may also give feedback on a consumer product to the enterprise through an offline store, or describe the experience of using the product by filling out a questionnaire. Enterprises (e.g., equipment manufacturers) can perform emotion analysis on user comments: for example, according to defined text emotion tendency categories (positive, neutral and negative comments), a text multi-classification method can be used to judge, in a timely and accurate manner, the emotion tendency of massive Internet public opinion data (users' comments on the Internet), store feedback and questionnaire data. Consumer comments are used to learn the strengths and weaknesses of products so as to guide enterprises in improving their products and service quality.

In the related art, a method based on a long short-term memory (LSTM) network with an added attention mechanism may be adopted: the target text is processed, a self-attention-based emotion analysis model is built from the text sequence, and the text sequence is fed into the model to complete the emotion tendency analysis of the target text. This method requires a large amount of supervised data, and when little supervised data is available the accuracy of the emotion type analysis results for product comment texts is low.

In the related art, emotion analysis may also be performed on the text comment information of a product by a BERT-based emotion analysis method, in which comment information from a single source is simply preprocessed and the open-source BERT weights are then directly fine-tuned to obtain the final model. Emotion analysis is performed on the text comment information of the product with this final model. Such a model generalizes poorly and cannot adapt to text comment information from multiple sources.

The embodiment of the application provides a text emotion analysis method, which can accurately analyze emotion of comment information from different sources through an emotion analysis model. The emotion analysis model has strong generalization capability and can adapt to the situation of comment information with multiple sources.

Referring to FIG. 2, an embodiment of the present application provides an emotion analysis system architecture, including a data acquisition device 210, a database 220, a training device 230, an execution device 240, a data storage system 250, and the like. The data acquisition device 210 is configured to collect training samples, which may include a plurality of pieces of sample comment information and the source information corresponding to each piece of sample comment information. The collected training samples may be stored in the database 220. The training device 230 trains the emotion analysis model based on the training samples in the database 220. The trained emotion analysis model may be stored in the processor of the execution device 240. The emotion analysis model may be used to perform emotion analysis on comment information, for example to determine whether the comment information is a positive comment or a negative comment. The execution device 240 may be deployed in a cloud server or in a user client. The execution device 240 may invoke data, code, etc. in the data storage system 250 and may store output data in the data storage system 250. The data storage system 250 may be disposed in the execution device 240, disposed independently, or disposed in another network entity, and there may be one or more data storage systems.

Fig. 2 is merely a schematic diagram of a system architecture according to an embodiment of the present application, and the positional relationship among devices, modules, etc. shown in the figure does not constitute any limitation. For example, in FIG. 2, data storage system 250 is external memory to execution device 240, and in some cases, data storage system 250 may be located within execution device 240.

In this embodiment of the present application, the data acquisition device 210, the training device 230, and the execution device 240 may be separate physical devices (e.g., servers), or may be located on the same physical device or a device cluster, which is not limited in this application.

As shown in FIG. 3, taking the server 200 as an example of the hardware structure of the data acquisition device 210, the training device 230 or the execution device 240, the server 200 includes at least one processor 201, a communication line 202, a memory 203 and at least one communication interface 204.

The processor 201 may be a general purpose central processing unit (central processing unit, CPU), microprocessor, application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs in accordance with aspects of the present application.

Communication line 202 may include a pathway to transfer information between the aforementioned components.

The communication interface 204 uses any transceiver-like apparatus to communicate with other devices or communication networks, such as Ethernet, a radio access network (radio access network, RAN), or a wireless local area network (wireless local area networks, WLAN).

The memory 203 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a compact disc read-only memory (compact disc read-only memory, CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be stand-alone and coupled to the processor via the communication line 202. The memory may also be integrated with the processor.

The memory 203 is used for storing the computer-executable instructions for executing the embodiments of the present application, and their execution is controlled by the processor 201. The processor 201 is configured to execute the computer-executable instructions stored in the memory 203, thereby implementing the text emotion analysis method provided in the following embodiments of the present application.

Alternatively, the computer-executable instructions in the embodiments of the present application may be referred to as application program codes, which are not specifically limited in the embodiments of the present application.

In a particular implementation, as one embodiment, processor 201 may include one or more CPUs, such as CPU0 and CPU1 of FIG. 3.

In a particular implementation, as one embodiment, the server 200 may include multiple processors, such as processor 201 and processor 207 in FIG. 3. Each of these processors may be a single-core (single-CPU) processor or may be a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).

Optionally, the server 200 may also include an output device 205 and an input device 206. The output device 205 communicates with the processor 201 and may display information in a variety of ways. For example, the output device 205 may be a liquid crystal display (liquid crystal display, LCD), a light emitting diode (light emitting diode, LED) display device, a Cathode Ray Tube (CRT) display device, or a projector (projector), or the like. The input device 206 is in communication with the processor 201 and may receive user input in a variety of ways. For example, the input device 206 may be a mouse, a keyboard, a touch screen device, a sensing device, or the like.

The server 200 may be a general purpose device or a special purpose device. In particular implementations, server 200 may be a desktop, laptop, web server, palmtop (personal digital assistant, PDA), mobile handset, tablet, wireless terminal device, embedded device, or device having a similar structure as in fig. 3. The embodiments of the present application are not limited to the type of server 200.

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. In the description of the present application, unless otherwise indicated, "at least one" means one or more, and "a plurality" means two or more. In addition, in order to clearly describe the technical solutions of the embodiments of the present application, the words "first", "second", and the like are used to distinguish between identical or similar items having substantially the same function and effect. It will be appreciated by those of skill in the art that the words "first", "second", and the like do not limit the quantity or the order of execution, and do not necessarily indicate that the items differ.

For ease of understanding, the methods provided in the embodiments of the present application are specifically described below with reference to the accompanying drawings.

As shown in fig. 4, an embodiment of the present application provides a text emotion analysis method, applied to an execution device, including:

S101: Obtain comment information and source information corresponding to the comment information.

The source indicated by the source information corresponding to the comment information (i.e., the source of the comment information) may be one of a plurality of sources. The plurality of sources may include online sources and offline sources. Online sources may include websites (e.g., e-commerce websites, social websites, news websites, search engine websites, etc.), forums (e.g., technical forums, business forums, social forums, etc.), post bars (e.g., Baidu Tieba), applications (e.g., shopping applications, social applications, news applications, etc.), and the like. Offline sources may include store feedback, offline questionnaires, and the like. For example, comment (evaluation) information on a product (commodity) posted by a consumer over the Internet can be obtained, as can comment information on the product submitted by a user through an offline questionnaire.

By way of example, the product reviewed by the user may be an electronic product, such as a mobile phone, a tablet computer, a notebook computer, a television, a watch, a bracelet, a sound box, a weight scale, an electric rice cooker, etc., which is not particularly limited in this application.

By way of example, a piece of comment information may include a title (subject) of a comment and a specific description of the comment. For example, the title (subject) of the comment may be a product name (e.g., XX phone), and the specific description of the comment may be performance, appearance description, usage experience, logistic service description, after-sales service description, etc. for XX phone.

It should be appreciated that comment information from different sources typically has different characteristics. For example, comment information from a shopping application (e.g., Taobao) is usually given based on evaluation indexes provided by the application. Taking a mobile phone as the product, Taobao may provide evaluation indexes such as battery life, photographing effect, and running speed, and a user can evaluate the mobile phone against those indexes. Moreover, most evaluations on shopping applications concern products that have just been purchased, and fewer describe the experience after a period of use. For another example, comment information from an enterprise forum (e.g., the Honor forum) is generally related to the habitual expressions and areas of interest of the users active in that forum, and also to the style of the posts that receive more views in the forum. It can be seen that comment information from different sources has different characteristics, so the source of each piece of comment information needs to be considered when performing emotion analysis. The method provided by the application can effectively capture the characteristics of comment information from different sources.

S102, inputting comment information and source information into the emotion analysis model to obtain emotion analysis results of the comment information output by the emotion analysis model.

The emotion analysis model is a model obtained by training a preset model through a training data set. The training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information, and the preset model can comprise at least one of a BERT layer, an expert neural network layer, a classifier layer and a normalized exponential function (softmax) layer. The BERT layer is a BERT model, and the BERT model may include a text Embedding module (i.e., an embedding layer). The expert neural network layer may also be referred to as a multi-expert automatic routing module, the classifier layer may also be referred to as a multi-tower classifier module, and the softmax layer may also be referred to as a sparse softmax module.

As shown in fig. 5, first, the comment information and the source information may be input into the BERT model and initialized by the text Embedding module of the BERT model, to obtain an Embedding vector of the comment information and an Embedding vector of the source information. The BERT model then processes the Embedding vector of the comment information to obtain a first text vector, which is a semantic analysis result of the comment information. The BERT model may output the first text vector and the Embedding vector of the source information. Next, the first text vector and the Embedding vector of the source information can be input into the multi-expert automatic routing module, which can perform deep text representation on the first text vector according to the Embedding vector of the source information to obtain a second text vector. The semantics of the second text vector are more accurate than those of the first text vector, i.e., the second text vector represents the comment information more clearly than the first text vector does. Then, the second text vector may be input into the multi-tower classifier module to obtain a plurality of logic values (logits) corresponding to the comment information, where each logic value indicates a score of one emotion tendency corresponding to the comment information (i.e., the multi-tower classifier module assigns emotion tendency scores to the second text vector corresponding to the comment information). The sparse softmax module can perform sparse normalization on the emotion tendency scores of the comment to obtain a sparse probability distribution of the comment information over the emotion tendencies. Finally, the emotion analysis result of the comment information is determined according to the emotion tendency with the maximum probability.

Illustratively, as shown in FIG. 6, assume that the comment information is: "Honor 50: power consumption has increased significantly since the update; I don't know whether it is just my unit or whether others see it too." The source information of the comment information is microblog. After the comment information is input into the emotion analysis model, the emotion analysis result output by the emotion analysis model can be a negative review. For another example, assume that the comment content is: "Honor 80: the new phone has arrived; it looks great, the price is affordable, and it is well worth buying." The comment comes from Taobao. After the comment is input into the emotion analysis model, the emotion analysis result output by the emotion analysis model can be a positive review.

The following describes the process by which the text Embedding module converts comment information and source information into Embedding vectors:

It will be appreciated that comment information is made up of discrete text characters, and that converting comment information into an Embedding vector converts the discrete text characters into a dense numeric vector.

The comment information and the source information can both be represented by text characters, so the Embedding vectors converted from the comment information and the source information can coexist in the same representation space. Training the emotion analysis model based on the Embedding vectors corresponding to the comment information and the source information can speed up the training of the emotion analysis model.

The text corresponding to the comment information and the source information can be input into the text Embedding module to obtain the Embedding vectors corresponding to the comment information and the source information. Since the text corresponding to the source information is generally covered by the text corresponding to the comment information, the Embedding vector corresponding to the source information may be calculated from the Embedding vectors corresponding to the comment information, as shown in formula (1).

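$$e_{\text{source}} = \frac{\sum_{j \in \text{source}} e_{j}}{\text{count}(\text{source})} \qquad \text{Formula (1)}$$

Where $e_{\text{source}}$ represents the Embedding vector of the source information, $e_{j}$ represents the Embedding vector of the $j$-th text character corresponding to the source information, and count(source) may represent the number of text characters corresponding to the source information. Of course, count(source) may be a preset value, or may be proportional to the number of text characters corresponding to the source, which is not limited in this application.

It should be noted that defining $e_{\text{source}}$ as the quotient of the summed character vectors and count(source) keeps its magnitude comparable to that of a single character's Embedding vector. This avoids the problem that a comment source with many text characters yields an excessively large $e_{\text{source}}$, which would affect the accuracy of the BERT model.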

The Embedding vector of each text character corresponding to the source information may be an open-source Embedding vector; that is, the Embedding vector of the source information may be determined from open-source Embedding vectors. Alternatively, the Embedding vector of each text character corresponding to the source information may be self-defined, and the Embedding vector of the source information may be determined from the self-defined Embedding vectors. Alternatively, the Embedding vector of each text character corresponding to the source information and the Embedding vector of the source information may be obtained by pre-training using algorithms such as Word2Vec and GloVe, or may be obtained by training in a Transformer, which is not limited in this application.

FIG. 7 shows an example of an Embedding vector representation of text characters, where h represents the dimension of the open-source vector and m represents the number of text characters. The text characters may be individual Chinese characters (e.g., the characters for "good", "micro", and "blog", and the two characters of the brand name "Honor", as shown in fig. 6). Alternatively, a text character may be a word or phrase, or a word composed of a plurality of Chinese characters, which is not limited in this application.

FIG. 8 shows an example of an Embedding vector representation corresponding to the source information, where h represents the dimension of the Embedding vector corresponding to the source information, and n represents the number of sources. The Embedding vector corresponding to the source information may be determined according to the Embedding vectors corresponding to the comment information and formula (1).

For example, assuming that the source information is a microblog (i.e., the source of the comment is a microblog), the Embedding vector representation of the microblog may be determined according to equation (2).

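$$e_{\text{microblog}} = \frac{e_{\text{micro}} + e_{\text{blog}}}{2} \qquad \text{Formula (2)}$$

That is, the Embedding vector of the microblog, $e_{\text{microblog}}$, is the quotient of the sum of the Embedding vector of the text character "micro" and the Embedding vector of the text character "blog" and the number of text characters corresponding to "microblog" (i.e., 2).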

As shown in fig. 7, the text character "micro" has an Embedding vector of (0.212, 0.677, …, 0.546, 0.282) and the text character "blog" has an Embedding vector of (0.342, 0.233, …, 0.313, -0.821). As shown in fig. 8, the Embedding vector of "microblog" obtained according to formula (2) is (0.277, 0.455, …, 0.429, -0.269).
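The averaging in formulas (1) and (2) can be checked with a short sketch. The function name `source_embedding` is hypothetical, and only the four vector components printed in fig. 7 are used; the elided components are left out rather than guessed.

```python
def source_embedding(char_vectors):
    """Average the per-character Embedding vectors, as described for formula (1)."""
    count = len(char_vectors)      # count(source): number of text characters
    dim = len(char_vectors[0])     # h: dimension of each Embedding vector
    return [sum(vec[d] for vec in char_vectors) / count for d in range(dim)]


# Only the components shown in fig. 7; elided components are omitted.
e_micro = [0.212, 0.677, 0.546, 0.282]
e_blog = [0.342, 0.233, 0.313, -0.821]

e_microblog = source_embedding([e_micro, e_blog])
print([round(x, 4) for x in e_microblog])
```

The result is approximately (0.277, 0.455, 0.4295, -0.2695), which agrees with the values quoted from fig. 8 once the last two components are truncated to three decimals.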

The following describes a process of the multi-expert automatic routing module obtaining a second text vector according to the first text vector and an Embedding vector corresponding to the source information:

The multi-expert automatic routing module can be used as a multi-source information sharing module, and can utilize the relation between text characters corresponding to comment information of different sources to promote the learning of the emotion analysis model so as to improve emotion analysis accuracy.

Wherein the multi-expert automatic routing module may comprise n experts (the experts may also be referred to as expert neural networks), n being greater than 1. All sample comment information can be mapped to n experts, so that for each sample comment information, the routing weights thereof on different experts can be obtained. I.e. for each sample review information there is a corresponding routing weight on n experts.

As shown in formula (3), the routing weight may be calculated by using the Embedding of the source information and the text vectors (the third text vector, which may also be referred to as text representation) output by a plurality of experts, and the expert corresponding to the maximum weight is the master expert for processing the first text vector. Then, as shown in formula (4), the text vectors output by the experts can be weighted and averaged by using the routing weights to obtain a second text vector commonly decided by the experts.

Formula (3)

Formula (4)

Where K represents the number of experts; m indicates that the current comment information is the m-th piece of comment information to be processed; and the remaining symbols in formulas (3) and (4) represent, respectively, the source representation of the comment information, the text vector output by the i-th expert, and the second text vector jointly decided by the multiple experts.
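The routing step can be illustrated with a minimal sketch. The exact expression of formula (3) is not reproduced in this text, so the sketch assumes one common choice: the routing weights are a softmax over the dot products between the source Embedding vector and each expert's output. That assumption, the function names, and the toy values are all hypothetical; only the weighted averaging of the expert outputs and the "maximum weight is the master expert" rule come from the description.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def routing_weights(source_vec, expert_outputs):
    """ASSUMED form of formula (3): softmax over source-vs-expert-output scores."""
    scores = [dot(source_vec, e) for e in expert_outputs]
    peak = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]            # weights sum to 1


def second_text_vector(weights, expert_outputs):
    """Formula (4) as described: routing-weighted average of the expert outputs."""
    dim = len(expert_outputs[0])
    return [sum(w * e[d] for w, e in zip(weights, expert_outputs))
            for d in range(dim)]


# Hypothetical toy values: K = 3 experts, 2-dimensional vectors.
source_vec = [1.0, 0.0]
expert_outputs = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # third text vectors

weights = routing_weights(source_vec, expert_outputs)
master = max(range(len(weights)), key=weights.__getitem__)  # master expert index
fused = second_text_vector(weights, expert_outputs)
```

Because the assumed weights sum to 1, the weighted sum in `second_text_vector` is also a weighted average. In this toy case the first expert receives the largest weight and is therefore the master expert.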

The multi-expert automatic routing module utilizes the relation between texts corresponding to comment information of different sources to promote the learning of the emotion analysis model, and can improve emotion analysis accuracy. This avoids the problem that, because comment information from different sources has different characteristics, the characteristics of each source cannot be effectively captured. It also avoids the problem that arises when comment information from each single source is modeled independently: the networks cannot learn behavior characteristics common to the sources, which leads to poor emotion analysis accuracy. Moreover, when the number of sources is relatively large, modeling and maintaining each source independently consumes significant labor and resources.

The following describes a process in which the multi-tower classifier obtains a plurality of logical values corresponding to comment information (each logical value indicating a score of one emotion tendency corresponding to comment information) according to the second text vector:

A text vector jointly decided by the multiple experts (i.e., the second text vector, which may also be referred to as a classification vector) may be input to the multi-tower classifier. The multi-tower classifier may include multiple towers, each of which may be considered a classifier (classification model). Each tower corresponds to one kind of source information and the comment information associated with that source information; that is, each classifier is used for processing comment information corresponding to one kind of source information. The multi-tower classifier keeps the features of comment information from different sources independent of one another, so that the emotion analysis of comment information from different sources can be more accurate.
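The one-tower-per-source idea can be sketched as a lookup from source name to classifier. This is only an illustration: each tower is assumed to be a single linear layer, and the source names, weights, and function names are made-up values, not details from the application.

```python
def make_tower(weight_rows, biases):
    """Return a linear classifier: text vector -> one logic value per emotion."""
    def tower(text_vec):
        return [sum(w * x for w, x in zip(row, text_vec)) + b
                for row, b in zip(weight_rows, biases)]
    return tower


# Hypothetical towers, one per source; all weights are illustration values.
towers = {
    "microblog": make_tower([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    "forum": make_tower([[0.5, 0.5], [-0.5, -0.5]], [0.1, -0.1]),
}


def classify(second_text_vec, source):
    """Send the second text vector to the tower that matches its source."""
    return towers[source](second_text_vec)


logits = classify([2.0, 1.0], "microblog")   # one score per emotion tendency
```

Keeping a separate tower per source means the parameters that score a microblog comment are never updated by forum comments, which is one way to read the feature independence described above.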

The following describes the process by which the sparse softmax (sparse-softmax) module outputs the probabilities of the emotion analysis result.

As shown in fig. 9, the logical value of the output of a certain tower (e.g., tower 1) in the multi-tower classifier may be input into the sparse softmax module. The sparse softmax module may output probabilities (e.g., 0.02 and 0.98, respectively) corresponding to the first k (e.g., k is 2) logical values.

As shown in formula (5), the sparse softmax module has two hyperparameters. The first is k, indicating that the first k (top-k) logic values are selected for the normal softmax calculation. The probabilities corresponding to the first k logic values are greater than or equal to a preset threshold (e.g., the preset threshold may be 0.01). The second hyperparameter is a constant c, indicating that the probabilities corresponding to the remaining logic values (those other than the first k) are uniformly set to the constant c. The probabilities corresponding to the logic values other than the first k are smaller than the preset threshold. For example, the probabilities of the remaining logic values can be set directly to 0 (the default constant c is 0) to truncate them, which prevents the emotion analysis model from overfitting to the probabilities of irrelevant classes and improves its generalization capability (that is, the capability of the model to give suitable emotion analysis results for comment information from different sources).

$$p_i = \begin{cases} \dfrac{e^{z_i}}{\sum_{j \in S} e^{z_j}}, & i \in S \\ c, & i \notin S \end{cases} \qquad i = 1, \dots, C \qquad \text{Formula (5)}$$

Where $p_i$ represents the probability output by the sparse softmax module for the $i$-th category, $z_i$ represents the $i$-th logic value, k represents the number of logic values selected for the normal softmax calculation, S represents the set of the first k logic values, c represents the probability assigned to the remaining logic values (e.g., c may be 0), and C represents the number of categories of classification. For example, the categories of classification may include at least two of: positive review, neutral review, and negative review.
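A minimal sketch of the sparse softmax described above: an ordinary softmax restricted to the top k logic values, with every other category assigned the constant c. The function name and the example logic values are hypothetical; the values were chosen so that the output resembles the 0.02 and 0.98 example given for fig. 9.

```python
import math


def sparse_softmax(logits, k, c=0.0):
    """Softmax over the top-k logic values; every other category gets c."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    top_k = order[:k]                                   # index set S
    peak = max(logits[i] for i in top_k)                # for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top_k}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else c for i in range(len(logits))]


# Hypothetical logic values for three categories, with k = 2 and c = 0.
probs = sparse_softmax([-1.5, 2.4, -4.0], k=2)
print([round(p, 2) for p in probs])
```

With c = 0 the truncated category contributes nothing, so the remaining probabilities still sum to 1. A nonzero c would make the output no longer sum to 1 unless it is renormalized, which the description does not address.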

It should be noted that the emotion analysis result may include the probabilities that the comment information is positive and negative. Alternatively, the emotion analysis result may include the probabilities that the comment information is positive, negative, and neutral. For example, if in the emotion analysis result of one piece of comment information the probability of positive is 90%, the probability of negative is 0%, and the probability of neutral is 10%, the comment information can be determined to be a positive review.
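The final decision is a simple arg-max over the probabilities. The following sketch uses the 90% / 0% / 10% example above; the function name and dictionary keys are hypothetical.

```python
def emotion_result(probabilities):
    """Return the emotion tendency with the highest probability."""
    return max(probabilities, key=probabilities.get)


example = {"positive": 0.90, "negative": 0.00, "neutral": 0.10}
result = emotion_result(example)
print(result)
```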

The emotion analysis model based on the multi-expert routing module and the multi-tower classifier module keeps the common features and the private features of multi-source comment information clearly separated, and can therefore better cope with multi-source comment information. Obtaining the emotion tendency probabilities with the customized sparse softmax module gives the model better generalization capability. The emotion analysis model provided by the application can improve the emotion analysis accuracy by about 15%, and provides more accurate, real-time emotion analysis capability for the business. By accurately analyzing the emotion tendencies of comment information in real time, the recall of negative reviews can be improved, the efficiency of entering the recalled problems into work orders is improved, the efficiency with which the business intervenes in negative-review problems is further improved, and problems such as damage to the image of the company (enterprise), loss of customers, and declining sales are avoided.

The above description takes as an example the use of the emotion analysis model to perform emotion analysis on comment information about a product from different sources. In practical applications, the application scenarios of the emotion analysis model include, but are not limited to, emotion analysis related to a commodity/product (for example, a certain electronic product). The model can also be applied to scenarios such as emotion analysis of search engine results (whether the searched content is negative, neutral, positive, etc.), emotion analysis of music, emotion analysis of video (movies, television dramas, and various shows), and emotion analysis of stocks. That is, the object of emotion analysis in the various application scenarios can be a commodity, a video, a piece of music, a stock, etc., which is not limited in this application.

As shown in fig. 10, the embodiment of the present application further provides an emotion analysis model training method, which is applied to training equipment, and includes the following steps:

S1: Construct a training data set, wherein the training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information.

Optionally, a plurality of pieces of comment information related to the product can be crawled from forums, websites, post bars, applications and the like through web crawler technology to serve as sample comment information, and the source information corresponding to each piece of sample comment information can be obtained at the same time. For descriptions of the comment information and the source information, refer to step S101; details are not repeated here.

Optionally, the plurality of pieces of sample data may be generated within a specified period of time, which may be, for example, one month or one week, and the present application is not particularly limited.

S2: Train the preset model based on the training data set to obtain the emotion analysis model.

In the embodiment of the application, the emotion analysis model is used for performing emotion analysis on the input comment information so as to determine an emotion analysis result of the comment information.

The preset model may include a BERT layer, an expert neural network layer, a classifier layer, and a softmax layer, among others.

The BERT layer may be a BERT model, and the BERT model may be replaced by another deep neural network (deep neural networks, DNN) model, a convolutional neural network (convolutional neural networks, CNN) model, or a recurrent neural network (recurrent neural network, RNN) model, which is not specifically limited in this application. Alternatively, the BERT model may be a model that has been pre-trained.

In one possible design, the training process for emotion analysis models may include: selecting one piece of sample comment information in the training data set, and taking the sample comment information as target comment information; processing the target comment information through a preset model to obtain an emotion classification result of the preset model on the target comment information; calculating a loss value of the preset model based on the difference between the emotion classification result of the preset model on the target comment information and the tag emotion analysis result of the target comment information; and adjusting parameters of the preset model based on the loss value, and returning to the step of selecting one piece of sample comment information in the training data set until the preset model converges, wherein the preset model obtained through training is used as an emotion analysis model.

The model convergence condition may be set according to actual requirements. For example, the model convergence condition may be that the loss value is smaller than a preset loss threshold, or that the number of training iterations reaches a preset number.
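The select, predict, compute-loss, adjust, and repeat loop with the two convergence conditions can be sketched as follows. A tiny linear softmax classifier stands in for the preset model, and the cross-entropy loss, the smoothed loss used for the threshold check, the learning rate, and the toy data are all assumptions for illustration; the application does not specify them.

```python
import math
import random


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, num_classes, dim, loss_threshold=0.05,
          max_steps=5000, lr=0.5, seed=0):
    """Train a stand-in model until the loss or the step limit is reached."""
    rng = random.Random(seed)
    weights = [[0.0] * dim for _ in range(num_classes)]
    avg_loss = float("inf")
    for step in range(1, max_steps + 1):            # limit: preset number of steps
        features, label = rng.choice(samples)       # select one target sample
        logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
        probs = softmax(logits)
        loss = -math.log(probs[label])              # assumed cross-entropy loss
        for cls in range(num_classes):              # adjust parameters
            grad = probs[cls] - (1.0 if cls == label else 0.0)
            for d in range(dim):
                weights[cls][d] -= lr * grad * features[d]
        # Smoothed loss (an assumption) so one easy sample cannot end training.
        avg_loss = loss if step == 1 else 0.9 * avg_loss + 0.1 * loss
        if avg_loss < loss_threshold:               # limit: loss below threshold
            break
    return weights, step, avg_loss


# Hypothetical toy data: (feature vector, label) with label 0 or 1.
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights, steps, final_loss = train(samples, num_classes=2, dim=2)
```

On this separable toy data the loop stops on the loss threshold well before the step limit. On harder data the step limit acts as the fallback stopping condition.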

Embodiments of the present application also provide a chip system, as shown in fig. 11, comprising at least one processor 1101 and at least one interface circuit 1102. The processor 1101 and interface circuit 1102 may be interconnected by wires. For example, interface circuit 1102 may be used to receive signals from other devices (e.g., a memory of a server). For another example, the interface circuit 1102 may be used to send signals to other devices (e.g., the processor 1101).

For example, the interface circuit 1102 may read instructions stored in a memory in the server and send the instructions to the processor 1101. The instructions, when executed by the processor 1101, may cause a server (such as the server 200 shown in fig. 3) to perform the steps of the above-described embodiments.

Of course, the chip system may also include other discrete devices, which are not specifically limited in this embodiment of the present application.

Embodiments of the present application also provide a computer-readable storage medium including computer instructions that, when executed on a server (such as server 200 shown in fig. 3), cause server 200 to perform the functions or steps performed by the execution device or training device in the above-described method embodiments.

Embodiments of the present application also provide a computer program product which, when run on a computer, causes the computer to perform the various functions or steps performed by the execution device or training device in the method embodiments described above.

The embodiment of the application also provides an emotion analysis device, which can be divided into different logic units or modules according to functions, and each unit or module executes different functions, so that the emotion analysis device executes each function or step executed by the execution device or training device in the embodiment of the method.

From the description of the above embodiments, it will be apparent to those skilled in the art that the above functional allocation may be performed by different functional modules, i.e., the internal structure of the apparatus is divided into different functional modules, as needed, to perform all or part of the functions described above.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a readable storage medium. Based on such understanding, the technical solution of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a device (which may be a single-chip microcomputer, a chip or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

The foregoing is merely a specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of text emotion analysis, comprising:

acquiring comment information and source information corresponding to the comment information;

inputting the comment information and source information corresponding to the comment information into an emotion analysis model to obtain emotion analysis results of the comment information output by the emotion analysis model;

the emotion analysis model is a model obtained by training a preset model through a training data set, the training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information, and the preset model comprises a BERT layer, an expert neural network layer, a classifier layer and a normalized exponential function softmax layer;

inputting the comment information and source information corresponding to the comment information into an emotion analysis model, and obtaining emotion analysis results of the comment information output by the emotion analysis model comprises the following steps:

inputting the comment information into the BERT layer to obtain a first text vector output by the BERT layer;

inputting the source information into the BERT layer to obtain an embedded vector of the source information output by the BERT layer;

the embedded vector of the source information satisfies the following formula:

$$e_{\text{source}} = \frac{\sum_{j \in \text{source}} e_{j}}{\text{count}(\text{source})}$$

wherein $e_{\text{source}}$ represents the embedded vector of said source information, $e_{j}$ represents the embedded vector corresponding to the $j$-th text character included in the source information, and count(source) represents the number of text characters corresponding to the source information;

inputting the first text vector and the embedded vector of the source information into the expert neural network layer to obtain a second text vector output by the expert neural network layer;

inputting the second text vector into the classifier layer to obtain a plurality of logic values corresponding to the comment information output by the classifier layer, wherein each logic value indicates a score of one emotion tendency corresponding to the comment information;

inputting the logic values into the softmax layer to obtain the probability of at least one emotion tendency output by the softmax layer;

and determining the emotion analysis result of the comment information according to the probability of the at least one emotion tendency, wherein the emotion analysis result of the comment information is the emotion tendency with the highest probability in the probability of the at least one emotion tendency.

2. The method of claim 1, wherein inputting the first text vector into the expert neural network layer to obtain a second text vector output by the expert neural network layer comprises:

respectively inputting the first text vector into a plurality of expert neural networks included in the expert neural network layer to obtain a third text vector respectively output by each expert neural network in the expert neural networks;

calculating the routing weight of each expert neural network according to the embedded vector of the source information and the third text vector respectively output by each expert neural network in the plurality of expert neural networks;

and carrying out weighted average on the third text vector output by each expert neural network according to the routing weight of each expert neural network to obtain the second text vector.
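By way of non-limiting illustration, the routing and weighted averaging of claim 2 may be sketched in Python as follows. The scoring rule is an assumption: the sketch takes a softmax over the dot products of the embedded vector of the source information with each third text vector, and it assumes that these vectors share one dimension. The names dot, routing_weights and combine_experts and the two toy experts are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def routing_weights(source_embedding, expert_outputs):
    """Assumed scoring rule: softmax over dot products with each expert output."""
    scores = [dot(source_embedding, out) for out in expert_outputs]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(first_text_vector, source_embedding, experts):
    """Run every expert, weight each output, and sum into the second text vector."""
    third_vectors = [expert(first_text_vector) for expert in experts]
    weights = routing_weights(source_embedding, third_vectors)
    dim = len(third_vectors[0])
    second = [
        sum(w * vec[d] for w, vec in zip(weights, third_vectors))
        for d in range(dim)
    ]
    return second, weights


# Hypothetical toy experts standing in for the expert neural networks.
experts = [
    lambda v: [2.0 * x for x in v],
    lambda v: [x + 1.0 for x in v],
]
second_vector, weights = combine_experts([1.0, 2.0], [0.0, 0.0], experts)
```

Because the weights produced by the softmax sum to one, the weighted sum computed here is a weighted average of the third text vectors.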

3. The method according to claim 1 or 2, wherein the routing weight of each expert neural network and the second text vector satisfy the following formula:

[formula not reproduced in the source text]

wherein the symbols in the formula represent, respectively, the routing weight of the ith expert neural network, the second text vector, and the embedded vector of the source information; m indicates that the comment information is the mth comment information to be processed; K represents the number of expert neural networks; and E_i represents the third text vector output by the ith expert neural network.

4. The method according to claim 1 or 2, wherein the classifier layer comprises a plurality of classifiers, each classifier of the plurality of classifiers being used for processing comment information corresponding to one kind of source information;

inputting the second text vector into the classifier layer to obtain a plurality of logic values corresponding to the comment information output by the classifier layer comprises:

inputting the second text vector into a first classifier in the plurality of classifiers to obtain a logic value corresponding to the comment information output by the first classifier, wherein the first classifier corresponds to the source information corresponding to the comment information.
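By way of non-limiting illustration, the selection of a classifier according to the source information may be sketched in Python as follows. The sketch assumes that each classifier is a single linear layer and that source information can be reduced to a lookup key. The source names "forum" and "app_store", the weights and the function names are hypothetical.

```python
def make_linear_classifier(weight_rows, biases):
    """Build a classifier that maps a vector to one logic value per emotion tendency."""
    def classify(vector):
        return [
            sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weight_rows, biases)
        ]
    return classify


# One classifier per kind of source information (hypothetical sources and weights).
classifiers = {
    "forum": make_linear_classifier([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    "app_store": make_linear_classifier([[0.5, 0.5], [1.0, -1.0]], [0.1, 0.0]),
}


def logits_for(second_text_vector, source_name):
    """Route the second text vector to the classifier matching its source."""
    return classifiers[source_name](second_text_vector)


logits = logits_for([2.0, 4.0], "app_store")
```

Keeping one classifier per source lets the shared layers learn common features while each classifier adapts to the wording typical of its own source.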

5. The method of claim 1 or 2, wherein the probability of at least one emotional tendency output by the softmax layer satisfies the following formula:

[formula not reproduced in the source text]

wherein the formula gives the probability output by the softmax layer; k represents the number of selected logic values, the probability of each selected logic value being greater than or equal to a preset threshold; S represents the set of the first k logic values; C is a constant, C representing the number of classification categories; j ∈ S; z_i represents the ith logic value; and z_j represents the jth logic value.
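By way of non-limiting illustration, one possible reading of the thresholded softmax of claim 5 may be sketched in Python as follows. The exact formula is not available in this text, so the sketch is an assumption: it keeps the logic values whose ordinary softmax probability reaches the preset threshold (the set S), renormalizes over S, and assigns zero to the rest. The role of the constant C is not modeled. The name thresholded_softmax and the fallback behavior are hypothetical.

```python
import math


def thresholded_softmax(logits, threshold):
    """Softmax restricted to logic values whose probability is >= threshold.

    Assumed interpretation; the constant C of the claim is not modeled here.
    """
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    full = [e / total for e in exps]

    # S: indices of the selected logic values.
    selected = {i for i, p in enumerate(full) if p >= threshold}
    if not selected:
        # Fallback chosen for this sketch: keep the ordinary softmax.
        return full

    kept_total = sum(exps[i] for i in selected)
    return [
        exps[i] / kept_total if i in selected else 0.0
        for i in range(len(logits))
    ]


# Hypothetical logic values: the third falls below the threshold and is dropped.
probs = thresholded_softmax([2.0, 2.0, -5.0], threshold=0.1)
```

Restricting the normalization to S removes probability mass from unlikely emotion tendencies, so the remaining probabilities are sharper than under an ordinary softmax.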

6. A method for training an emotion analysis model, comprising:

constructing a training data set, wherein the training data set comprises a plurality of pieces of sample comment information and source information corresponding to each piece of sample comment information;

training a preset model based on the training data set to obtain an emotion analysis model, wherein the emotion analysis model is used for performing emotion analysis on the comment information according to comment information and source information corresponding to the comment information, and the preset model comprises a BERT layer, an expert neural network layer, a classifier layer and a normalized exponential function softmax layer;

the emotion analysis model is used for performing emotion analysis on the comment information according to the comment information and source information corresponding to the comment information, and comprises the following steps:

the embedded vector of the source information satisfies the following formula:

[formula not reproduced in the source text]

wherein the formula defines the embedded vector representing said source information.

7. The method of claim 6, wherein training a preset model based on the training data set comprises:

selecting one piece of sample comment information in the training data set, and taking the sample comment information as target comment information;

processing the target comment information through the preset model to obtain an emotion classification result of the preset model on the target comment information;

calculating a loss value of the preset model based on the difference between the emotion classification result of the preset model on the target comment information and the labeled emotion analysis result of the target comment information;

and adjusting parameters of the preset model based on the loss value, and returning to the step of selecting one piece of sample comment information in the training data set until the preset model converges, wherein the preset model obtained through training is used as the emotion analysis model.
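By way of non-limiting illustration, the training loop of claim 7 may be sketched in Python as follows. The sketch replaces the preset model with a small linear softmax classifier over fixed feature vectors, uses cross-entropy as the loss value, and treats a smoothed loss below a tolerance as convergence. All of these choices, together with the names train and predict and the toy samples, are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, num_classes, lr=0.5, tol=0.05, max_steps=5000, seed=0):
    """samples: list of (feature_vector, label_index) pairs."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    smoothed = None
    for _ in range(max_steps):
        # Select one sample as the target comment information.
        features, label = rng.choice(samples)
        # Process it through the stand-in model.
        logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
        probs = softmax(logits)
        # Loss value: cross-entropy between the prediction and the label.
        loss = -math.log(probs[label])
        # Adjust parameters with one gradient step.
        for c in range(num_classes):
            scale = probs[c] - (1.0 if c == label else 0.0)
            for d in range(dim):
                weights[c][d] -= lr * scale * features[d]
        # Assumed convergence test: smoothed loss below a tolerance.
        smoothed = loss if smoothed is None else 0.9 * smoothed + 0.1 * loss
        if smoothed < tol:
            break
    return weights


def predict(weights, features):
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical toy data: two feature vectors with labels 0 and 1.
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights = train(samples, num_classes=2)
```

In the claimed method the parameters adjusted in each iteration would be those of the BERT layer, the expert neural network layer and the classifier layer rather than a single weight matrix.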

8. A computer-readable storage medium comprising computer instructions;

the computer instructions, when run on an electronic device, cause the electronic device to perform the method of any one of claims 1-6.

9. An emotion analysis device comprising a processor coupled to a memory, the memory storing program instructions that when executed by the processor cause the device to implement the method of any one of claims 1-6.