CN111949791A - Text classification method, device and equipment
- Publication number: CN111949791A (application number CN202010735443.0A)
- Authority: CN (China)
- Legal status: Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the specification relate to the technical field of artificial intelligence data processing, and disclose a text classification method, apparatus, and device. The method includes: performing feature extraction on target question text information to obtain a first feature representation of the target question text information; generating word vectors for each vocabulary in the specified answer text information corresponding to the target question text information, to obtain answer word vectors corresponding to each vocabulary in the specified answer text information; for any answer word vector, taking the first feature representation as constraint information and performing weighting processing on the answer word vector by using an attention model, to obtain a weighted word vector corresponding to the corresponding answer word vector; encoding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information; and determining a classification result of the specified answer text information by using the second feature representation. In this way, the accuracy of effective answer screening can be further improved.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence data processing technologies, and in particular, to a text classification method, apparatus, and device.
Background
In application scenarios such as software release evaluation or service question answering, the platform can pre-configure a series of questions, and users can answer them. Alternatively, a user may post a question on the platform, which can then be answered by other users or by platform service personnel. By analyzing the answers corresponding to different questions, the platform can obtain user feedback on a given service or software application. A single question may receive many answers, some of which do not address the question at all or carry little meaning. The platform therefore usually needs to examine the answers to each question first and screen out the more effective ones, so as to understand user feedback more accurately and quickly.
Currently, the validity of each answer is usually determined by analyzing the answer directly, or by concatenating the answer with the question and analyzing the two together; another common approach compares each answer against reference answers in a pre-built reference answer library. In practice, however, an answer takes its meaning from the question it responds to, and it is difficult to evaluate validity by analyzing the answer text alone. Meanwhile, given the limited coverage of a reference answer library and the complex, variable ways in which users phrase their answers, answers that are actually valid but worded very differently from the reference answers are easily excluded, which affects the accuracy of valid-answer determination. Deep learning algorithms that model contextual semantic information are mostly applied to dialog generation, because they require the context to be strongly self-correlated, and are therefore difficult to transfer directly to a question-answering scenario. A more accurate and efficient method for classifying question-and-answer texts is thus urgently needed in this technical field.
Disclosure of Invention
The embodiment of the specification aims to provide a text classification method, a text classification device and text classification equipment, which can further improve the accuracy of effective answer screening.
The specification provides a text classification method, a text classification apparatus, and text classification equipment, which are implemented in the following manners:
a text classification method is applied to a server, and comprises the following steps:
performing feature extraction on target question text information to obtain a first feature representation corresponding to the target question text information;
generating word vectors for each vocabulary in the specified answer text information corresponding to the target question text information, to obtain answer word vectors corresponding to each vocabulary in the specified answer text information;
for any answer word vector, taking the first feature representation as constraint information, and performing weighting processing on the answer word vector by using an attention model to obtain a weighted word vector corresponding to the corresponding answer word vector;
coding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and determining a classification result of the specified answer text information by using the second feature representation.
In other embodiments of the method provided in this specification, the extracting features of the target question text information includes:
generating word vectors for each vocabulary in the target question text information to obtain question word vectors corresponding to each vocabulary in the target question text information;
and encoding each question word vector corresponding to the target question text information to obtain a first feature representation corresponding to the target question text information.
In other embodiments of the method provided in this specification, the weighting processing on the answer word vector by taking the first feature representation as constraint information and using an attention model includes:
using the first feature representation as constraint information of an attention model and the answer word vector as a value of the attention model, and inputting both into the attention model to obtain a correlation coefficient of the answer word vector relative to the first feature representation;
and calculating the product of the answer word vector and the correlation coefficient corresponding to the corresponding answer word vector to obtain the weighted word vector corresponding to the corresponding answer word vector.
In still other embodiments of the method provided in this specification, the determining a classification result of the text information of the specified answer using the second feature representation includes:
and inputting the second feature representation into a pre-constructed classification model to obtain a classification result of the text information of the specified answer, wherein the classification model is constructed by adopting a classification algorithm.
In other embodiments of the method provided in this specification, the encoding processing on each weighted word vector corresponding to the text information of the specified answer includes:
and coding each weighted word vector corresponding to the text information of the specified answer by using an LSTM algorithm.
In other embodiments of the method provided in this specification, before generating the word vector for each vocabulary in the target question text information, the method further includes:
and performing word segmentation on the target question text information and on the specified answer text information corresponding to the target question text information, to obtain one or more words corresponding to the target question text information and one or more words corresponding to the specified answer text information.
In another aspect, the embodiments of this specification also provide a text classification apparatus applied to a server, the apparatus including:
a feature extraction module, configured to perform feature extraction on target question text information to obtain a first feature representation corresponding to the target question text information;
a word vector generation module, configured to perform word vector generation on each vocabulary in the specified answer text information corresponding to the target question text information, and obtain an answer word vector corresponding to each vocabulary in the specified answer text information;
a weighting processing module, configured to, for any answer word vector, take the first feature representation as constraint information and perform weighting processing on the answer word vector by using an attention model, to obtain a weighted word vector corresponding to the corresponding answer word vector;
the coding processing module is used for coding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and the classification module is used for determining a classification result of the text information of the specified answer by using the second feature representation.
In other embodiments of the apparatus provided in this specification, the feature extraction module includes:
a word vector generating unit, configured to generate word vectors for each vocabulary in the target question text information to obtain question word vectors corresponding to each vocabulary in the target question text information;
and an encoding processing unit, configured to encode each question word vector corresponding to the target question text information to obtain a first feature representation corresponding to the target question text information.
In other embodiments of the apparatus provided in this specification, the weighting processing module is configured to use the first feature representation as constraint information of an attention model and the answer word vector as a value of the attention model, input both into the attention model to obtain a correlation coefficient of the answer word vector relative to the first feature representation, and calculate the product of the answer word vector and the correlation coefficient corresponding to the corresponding answer word vector to obtain the weighted word vector corresponding to the corresponding answer word vector.
In another aspect, an embodiment of the present specification further provides a text classification device applied to a server, where the device includes at least one processor and a memory for storing processor-executable instructions, and the instructions, when executed by the processor, implement steps including any one or more of the above-mentioned methods.
The text classification method, the text classification device, and the text classification equipment provided in one or more embodiments of the present specification may obtain a first feature representation of semantic information representing question text information, and an answer word vector corresponding to each vocabulary in any answer text information corresponding to the question text information. Then, by using an attention mechanism, the first feature representation is used as constraint information of the attention model, each answer word vector is used as a value of the attention model, the correlation of each answer word vector relative to the first feature representation is analyzed, and the correlation is applied to the corresponding answer word vector to obtain a weighted word vector corresponding to each answer word vector. Accordingly, the weighted word vector incorporates the semantic information of the question text information. Then, encoding processing may be performed on each weighted word vector to obtain a second feature representation corresponding to the answer text information, so as to perform answer classification processing by using the second feature representation. Therefore, by using the embodiments of the specification, the logical relationship between the question and the answer and the semantic influence of the question on the answer can be effectively considered, and the answer representation fused with the question semantic information is obtained. And then, classifying the answer text information by using the second feature representation fused with the question semantic information, so that the accuracy of effective answer screening can be further improved.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments described in the present specification, and that those skilled in the art can obtain other drawings from them without any creative effort. In the drawings:
fig. 1 is a schematic flowchart of an embodiment of a text classification method provided in this specification;
fig. 2 is a schematic diagram of a text classification flow in an embodiment provided in this specification;
fig. 3 is a schematic block diagram of an embodiment of a text classification apparatus provided in this specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in one or more embodiments of the present specification will be clearly and completely described below with reference to the drawings in one or more embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the specification, and not all embodiments. All other embodiments obtained by a person skilled in the art based on one or more embodiments of the present specification without making any creative effort shall fall within the protection scope of the embodiments of the present specification.
In a scenario example provided by the embodiment of the present specification, the text classification method may be applied to a server that performs question-answer type text classification in an application scenario such as software release evaluation or service question-answer. The server may refer to one server or a server cluster composed of a plurality of servers.
For a certain target question and one or more answers corresponding to the target question, the server may perform feature extraction on target question text information to obtain a first feature representation corresponding to the target question text information. Then, word vector generation may be performed on the specified answer text information corresponding to the target question text information, so as to obtain word vectors corresponding to words in the specified answer text information. The first feature representation of semantic information representing the question text information and answer word vectors corresponding to words in any answer text information corresponding to the question text information can be obtained. Then, by using an attention mechanism, the first feature representation is used as constraint information of the attention model, each answer word vector is used as a value of the attention model, the correlation of each answer word vector relative to the first feature representation is analyzed, and the correlation is applied to the corresponding answer word vector to obtain a weighted word vector corresponding to each answer word vector. Accordingly, the weighted word vector incorporates the semantic information of the question text information. Then, encoding processing may be performed on each weighted word vector to obtain a second feature representation corresponding to the answer text information, so as to perform answer classification processing by using the second feature representation.
Therefore, by using the embodiments of the specification, the logical relationship between the question and the answer and the semantic influence of the question on the answer can be effectively considered, and the answer representation fused with the question semantic information is obtained. And then, classifying the answer text information by using the second feature representation fused with the question semantic information, so that the accuracy of effective answer screening can be further improved.
Fig. 1 is a schematic flow chart of an embodiment of the text classification method provided in this specification. Although the present specification provides the method steps or apparatus structures as shown in the following examples or figures, more or less steps or modules may be included in the method or apparatus structures based on conventional or non-inventive efforts. In the case of steps or structures which do not logically have the necessary cause and effect relationship, the execution order of the steps or the block structure of the apparatus is not limited to the execution order or the block structure shown in the embodiments or the drawings of the present specification. When the described method or module structure is applied to a device, a server or an end product in practice, the method or module structure according to the embodiment or the figures may be executed sequentially or in parallel (for example, in a parallel processor or multi-thread processing environment, or even in an implementation environment including distributed processing and server clustering). Fig. 1 shows a specific embodiment, and in an embodiment of the text classification method provided in this specification, the method may be applied to the server, and the method may include the following steps:
s20: and performing feature extraction on the target problem text information to obtain a first feature representation corresponding to the target problem text information.
For a certain target question and one or more answers corresponding to it, the server may obtain the target question text information corresponding to the target question. For example, the target question text information may be "An XXX problem has occurred with the X credit card; how should it be handled?".
Then, the server may perform feature extraction on the target question text information to obtain a first feature representation corresponding to the target question text information. The server can map the target question text information into a numerical semantic space to obtain its representation in that space, so that it can be processed by a computer. Correspondingly, the first feature representation corresponding to the target question text information is a numerical representation carrying the semantic features of the target question text information.
In some embodiments, word vector generation may be performed on each vocabulary in the target question text information to obtain a question word vector corresponding to each vocabulary in the target question text information. Then, the question word vectors may be encoded to obtain a first feature representation corresponding to the target question text information.
In some embodiments, the server may perform word segmentation on the target question text information to obtain one or more words corresponding to it. For example, jieba segmentation, SnowNLP, THULAC, NLPIR, or similar tools may be used. In other embodiments, stop-word removal and similar processing can be performed at the same time to reduce interference information.
For each word obtained by the division, the server may further generate a corresponding word vector. A word vector is the numerical representation of a vocabulary, i.e., each vocabulary is mapped into a digitized semantic space for computer processing. Word vectors can be generated using statistical methods or language-model methods, for example Skip-gram, CBOW, LBL, NNLM, C&W, GloVe, and the like; for instance, 300-dimensional GloVe vectors may be used. For convenience of expression, word vectors corresponding to the target question text information are described as question word vectors, and word vectors corresponding to the answer text information are described as answer word vectors.
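For example, a minimal sketch of the segmentation and word-vector steps, assuming jieba for Chinese word segmentation and a pre-trained 300-dimensional GloVe-style table loaded through gensim, may look as follows; the file path and the stop-word list are illustrative placeholders rather than values fixed by this specification:

```python
import jieba
from gensim.models import KeyedVectors

# Hypothetical path to a pre-trained 300-dimensional table in word2vec text format.
vectors = KeyedVectors.load_word2vec_format("glove.zh.300d.word2vec.txt")
stop_words = {"的", "了", "吗", "呢"}  # illustrative stop-word list

def to_word_vectors(text):
    """Segment the text, drop stop words, and look up a 300-d vector per word."""
    words = [w for w in jieba.cut(text) if w.strip() and w not in stop_words]
    return [vectors[w] for w in words if w in vectors]
```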
Then, the server may encode the question word vectors to obtain a first feature representation corresponding to the target question text information. For example, the extracted word vectors may be input into a sequence encoder LSTM (Long Short-Term Memory network) for semantic compression, and the output of the last hidden layer of the LSTM taken as the first feature representation corresponding to the target question text information. When the word vectors are encoded with an LSTM, the encoded vector output at each moment depends not only on the input at the current moment but also on the state of the model at the previous moment; through this historical dependency, the first feature representation obtained after encoding can more effectively represent the context-dependency information of each word in the target question text information, and hence the semantic information the text expresses. Of course, in practical application other algorithms, such as an RNN (Recurrent Neural Network), may also be used for the encoding.
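As one concrete illustration, this encoding step may be sketched in PyTorch as follows, assuming a single-layer LSTM whose last hidden state serves as the first feature representation; the 300-dimensional input and 128-dimensional hidden size are illustrative assumptions:

```python
import torch
import torch.nn as nn

question_encoder = nn.LSTM(input_size=300, hidden_size=128, batch_first=True)

def encode_question(question_word_vectors: torch.Tensor) -> torch.Tensor:
    """question_word_vectors: (seq_len, 300) -> first feature representation q: (128,)."""
    _, (h_n, _) = question_encoder(question_word_vectors.unsqueeze(0))  # add batch dim
    return h_n[-1, 0]  # output of the last hidden layer, taken as q
```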
In the above embodiment, by first dividing the text into words and then determining the feature representation corresponding to the target question text information based on the word vector of each vocabulary, the importance of each vocabulary in the answer text information relative to the target question text information can be determined more simply and conveniently.
S22: generating word vectors for each vocabulary in the specified answer text information corresponding to the target question text information, to obtain answer word vectors corresponding to each vocabulary in the specified answer text information.
For any one of the one or more answers corresponding to a target question, the server may obtain the answer text information of that answer. The answer text information may be, for example, "Got it", "It means XX", or "XXX processing should be performed on the X credit card". Among these, "Got it" and "It means XX" are answers that neither solve the question nor address it, while "XXX processing should be performed on the X credit card" is a valid answer to the question. Correspondingly, the specified answer text information may be the text information of any answer, of a category yet to be determined, among the one or more answers corresponding to the target question.
Then, the server may generate word vectors corresponding to the words in the specified answer text information, obtaining answer word vectors corresponding to each vocabulary in the specified answer text information. The generation may be implemented with reference to the method for generating word vectors for the vocabulary of the target question text information. In some embodiments, the word-vector representation space of the specified answer text information may be kept consistent with that of the target question text information, e.g., 300-dimensional GloVe word vectors may likewise be used. Keeping the two representation spaces at the same dimension facilitates data processing.
S24: for any answer word vector, taking the first feature representation as constraint information, and performing weighting processing on the answer word vector by using an attention model, to obtain a weighted word vector corresponding to the corresponding answer word vector.
For any answer word vector, the server may use the first feature representation as constraint information and perform weighting processing on the answer word vector by using the attention model, to obtain a weighted word vector corresponding to that answer word vector. The attention model may refer to a model constructed using an attention (Attention) mechanism.
In some embodiments, for any answer word vector, the server may input the attention model with the first feature representation as its constraint information and the answer word vector as its value, to obtain a correlation coefficient of the answer word vector with respect to the first feature representation. Then, the product of the answer word vector and its corresponding correlation coefficient may be calculated to obtain the weighted word vector corresponding to that answer word vector. Alternatively, after the product is calculated, the product value may further be normalized to obtain the weighted word vector. The calculation of the correlation coefficient inside the attention model follows standard attention-mechanism processing, which is not repeated here.
The correlation coefficient represents the correlation of each vocabulary in the answer text information relative to the target question text information: the larger the coefficient, the stronger the correlation between that vocabulary and the semantic information of the target question; the smaller the coefficient, the weaker the correlation. By computing this correlation and using the coefficient as the weight of the corresponding word vector during the semantic encoding of the answer, the feature representations of vocabularies strongly correlated with the question semantics are highlighted and those weakly correlated are attenuated, so that the feature representation of the answer effectively fuses the semantic information of the target question. In other words, the logical relationship between the target question and the specified answer, and the semantic influence of the question on the answer, are effectively taken into account in the answer's feature representation.
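A minimal sketch of this weighting step is given below. Dot-product scoring followed by a softmax is assumed as one common instantiation of the correlation coefficient, and the linear projection that aligns the 300-dimensional answer word vectors with the 128-dimensional representation q is likewise an assumption, since the specification does not fix the score function:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

proj = nn.Linear(300, 128)  # assumed projection aligning answer words with q

def attention_weighting(q: torch.Tensor, answer_vectors: torch.Tensor) -> torch.Tensor:
    """q: (128,), answer_vectors: (m, 300) -> weighted word vectors: (m, 300)."""
    scores = proj(answer_vectors) @ q            # one correlation score per answer word
    coeffs = F.softmax(scores, dim=0)            # normalized correlation coefficients
    return answer_vectors * coeffs.unsqueeze(1)  # scale each word vector by its coefficient
```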
The validity of the answer is then classified using this answer feature representation fused with the semantic information of the target question, so that the same answer can receive different classification results under different questions: invalid answers are accurately excluded, while valid answers strongly relevant to the question are retained as accurately as possible.
S26: encoding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information.
The server may encode the weighted word vectors obtained by the above conversion to obtain a second feature representation corresponding to the specified answer text information. For example, the weighted word vectors may be encoded using an LSTM, an RNN, or the like, with reference to the encoding of the target question text information.
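This second pass can be sketched in the same way as the question encoder, under the same illustrative dimensions:

```python
import torch
import torch.nn as nn

answer_encoder = nn.LSTM(input_size=300, hidden_size=128, batch_first=True)

def encode_answer(weighted_word_vectors: torch.Tensor) -> torch.Tensor:
    """weighted_word_vectors: (m, 300) -> second feature representation e: (128,)."""
    _, (h_n, _) = answer_encoder(weighted_word_vectors.unsqueeze(0))
    return h_n[-1, 0]
```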
S28: determining a classification result of the specified answer text information by using the second feature representation.
The server may determine a classification result of the specified answer text information by using the second feature representation. The classification result may be "valid answer" and "invalid answer", or probability values of the answer being valid or invalid, and so on. A valid answer corresponds to the category of answer text information that needs to be selected, and an invalid answer to the category that does not. Of course, other classifications may exist in actual application scenarios, which is not limited here.
For example, the server may compare the second feature representation corresponding to the specified answer text information with the feature representations of the reference answers in a reference answer library, and determine the probability that the specified answer text information belongs to a valid answer. The reference answers in the library are pre-configured valid answers for each question; for example, service personnel may pre-configure a valid answer as the reference answer according to the actual application scenario of each question. In practical application, valid answers determined by the scheme of the above embodiments may then be dynamically added to the reference answer library, so as to update and optimize the library, enrich it, and improve the accuracy of answer classification.
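As an illustration of this comparison, the following sketch assumes cosine similarity between e and pre-computed reference-answer features; the specification leaves the concrete similarity measure open:

```python
import torch
import torch.nn.functional as F

def valid_answer_score(e: torch.Tensor, reference_features: torch.Tensor) -> float:
    """e: (128,), reference_features: (k, 128) -> highest similarity to any reference."""
    sims = F.cosine_similarity(reference_features, e.unsqueeze(0), dim=1)
    return sims.max().item()
```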
In some embodiments, the server may instead input the second feature representation corresponding to the answer text information into a pre-constructed classification model and calculate classification probabilities, obtaining the probability that the answer text information belongs to a given category. The classification model can be constructed with a classification algorithm such as a multilayer perceptron. Performing the answer classification through a constructed classification model can further improve the efficiency of the classification.
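A minimal sketch of such a classification head, assuming a small multilayer perceptron with two output classes and illustrative layer sizes, is:

```python
import torch.nn as nn

# Logits for the two classes: valid answer vs. invalid answer.
classifier = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
)
```

A softmax over the two logits then yields the probability values mentioned above.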
Fig. 2 shows a flowchart of a text classification processing method. As shown in fig. 2, in an implementation scenario example of this specification, a question-and-answer text classification model may be built with the following steps. Sample data may be pre-processed first: the sample data set may be divided into a training set and a test set, and for the questions and corresponding answer sentences in both sets, jieba segmentation may be used for word segmentation, with stop words removed.
Suppose that performing word segmentation on any question text information and its corresponding answer text information in the sample data yields question words w1, w2, w3, ..., wn corresponding to the question text information, and answer words p1, p2, p3, ..., pm corresponding to the question text information.
Then, the question word vectors corresponding to the question words w1, w2, w3, ..., wn can be generated, i.e., the output of the embedding layer (embed) on the left side of fig. 2; 300-dimensional GloVe word vectors may be adopted. The question word vectors can be input into a sequence encoder LSTM to semantically compress the question text information, and the output of the last hidden layer of the LSTM is taken as the first feature representation vector q of the question text information.
Then, the answer word vectors corresponding to the words p1, p2, p3, ..., pm in the answer text information can be generated, i.e., the output of the embedding layer (embed) on the right side of fig. 2. The answer word-vector representation space is consistent with the question word-vector space, likewise using 300-dimensional GloVe word vectors.
Taking the first feature representation vector q as the signal of the attention mechanism and the answer word vectors corresponding to the answer text information as its values, the correlation coefficient of each answer word vector relative to q is first calculated; the output of the Softmax is the correlation coefficient of each answer word vector. Then, the product of each correlation coefficient and the corresponding answer word vector is calculated, yielding the weighted word vector for that answer word vector, i.e., the output of Att (Attention) in fig. 2.
Each weighted word vector is then input into the decoder LSTM and semantically encoded, obtaining the second feature representation vector e corresponding to the answer text information.
The second feature representation vector e may be used as the input of the classification algorithm, with the classification label of the answer text information as the output, in order to train the classifier. For example, a cross-entropy loss can be used to compare the algorithm's classification result with the true label. With a mini-batch training method, after the loss is obtained, the model gradients are updated with an SGD optimizer; these steps are repeated until the loss has not decreased for 10 consecutive epochs, which yields the final model and parameters, i.e., the trained question-and-answer text classification model.
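A minimal sketch of this training procedure is given below, assuming a PyTorch `model` object that wraps the pipeline above and a `train_loader` that yields mini-batches; both names and the learning rate are illustrative assumptions:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()                         # cross-entropy loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # SGD optimizer; model is assumed

best_loss, stale_epochs = float("inf"), 0
while stale_epochs < 10:  # stop once the loss has not decreased for 10 consecutive epochs
    epoch_loss = 0.0
    for questions, answers, labels in train_loader:       # assumed mini-batch loader
        logits = model(questions, answers)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    if epoch_loss < best_loss:
        best_loss, stale_epochs = epoch_loss, 0
    else:
        stale_epochs += 1
```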
For a given target question, the target question text information and the corresponding text information of the answer to be classified can be input into the trained question-and-answer text classification model; under the constraint of the first feature representation corresponding to the target question text information, the second feature representation is constructed and the answer classification result is determined from it, yielding the classification result of the answer text information to be classified.
In the scheme provided by the above one or more embodiments, the question and the answer serve respectively as the signal and the values of an attention mechanism, whose mapping unifies the semantic spaces of question and answer and produces an answer-sentence representation fused with the question's semantic information. The question information is thereby effectively fused into a new representation space for the answer, and an initial semantic representation is provided for the question-answer pair to participate in other task training. Because the contextual logical relationship between question and answer is fully considered, the same answer can receive different classification results under different questions, improving the accuracy of answer classification. Compared with traditional text classification methods, the scheme of this embodiment has richer classification features and better adaptability. Meanwhile, within this framework, which fuses the question information into the answer representation through an attention mechanism, any deep-learning representation method, such as CNN, RNN, GRU, or LSTM, can be used at each encoding stage according to the characteristics of the corpus, making the approach more flexible to use.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. For details, reference may be made to the description of the related embodiments of the related processing, and details are not repeated herein.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The text classification method provided in one or more embodiments of the present specification may obtain a first feature representation of semantic information representing question text information, and an answer word vector corresponding to each vocabulary in any answer text information corresponding to the question text information. Then, by using an attention mechanism, the first feature representation is used as constraint information of the attention model, each answer word vector is used as a value of the attention model, the correlation of each answer word vector relative to the first feature representation is analyzed, and the correlation is applied to the corresponding answer word vector to obtain a weighted word vector corresponding to each answer word vector. Accordingly, the weighted word vector incorporates the semantic information of the question text information. Then, encoding processing may be performed on each weighted word vector to obtain a second feature representation corresponding to the answer text information, so as to perform answer classification processing by using the second feature representation. Therefore, by using the embodiments of the specification, the logical relationship between the question and the answer and the semantic influence of the question on the answer can be effectively considered, and the answer representation fused with the question semantic information is obtained. And then, classifying the answer text information by using the second feature representation fused with the question semantic information, so that the accuracy of effective answer screening can be further improved.
Based on the text classification method, one or more embodiments of the present specification further provide a text classification apparatus. The apparatus may include systems, software (applications), modules, components, servers, etc. that use the methods described in the embodiments of this specification, combined with any necessary hardware. Based on the same innovative conception, the embodiments of this specification provide an apparatus as described in the following embodiments. Since the implementation scheme by which the apparatus solves the problem is similar to that of the method, the specific implementation of the apparatus in the embodiments of this specification may refer to the implementation of the foregoing method, and repeated details are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated. Specifically, fig. 3 is a schematic diagram of the module structure of an embodiment of a text classification apparatus provided in this specification; as shown in fig. 3, the apparatus is applied to a server and may include:
the feature extraction module 102 may be configured to perform feature extraction on the target problem text information to obtain a first feature representation corresponding to the target problem text information.
The word vector generating module 104 may be configured to perform word vector generation on each vocabulary in the specified answer text information corresponding to the target question text information, and obtain an answer word vector corresponding to each vocabulary in the specified answer text information.
The weighting processing module 106 may be configured to perform weighting processing on any answer word vector by using the first feature representation as constraint information and using an attention model to obtain a weighted word vector corresponding to the corresponding answer word vector.
The encoding processing module 108 may be configured to perform encoding processing on each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information.
The classification module 110 may be configured to determine a classification result of the text information of the specified answer by using the second feature representation.
In other embodiments, the feature extraction module 102 may include:
and the word vector generating unit may be configured to perform word vector generation on each vocabulary in the target problem text information, and obtain a problem word vector corresponding to each vocabulary in the target problem text information.
And the encoding processing unit may be configured to perform encoding processing on each question word vector corresponding to the target question text information to obtain a first feature representation corresponding to the target question text information.
In other embodiments, the weighting processing module 106 may be configured to use the first feature representation as constraint information of an attention model and the answer word vector as a value of the attention model, input both into the attention model to obtain a correlation coefficient of the answer word vector with respect to the first feature representation, and then calculate the product of the answer word vector and its corresponding correlation coefficient to obtain the weighted word vector corresponding to the corresponding answer word vector.
It should be noted that the above-described apparatus may also include other embodiments according to the description of the method embodiment. The specific implementation manner may refer to the description of the related method embodiment, and is not described in detail herein.
The text classification device provided in one or more embodiments of the present specification may obtain a first feature representation of semantic information representing question text information, and an answer word vector corresponding to each vocabulary in any answer text information corresponding to the question text information. Then, by using an attention mechanism, the first feature representation is used as constraint information of the attention model, each answer word vector is used as a value of the attention model, the correlation of each answer word vector relative to the first feature representation is analyzed, and the correlation is applied to the corresponding answer word vector to obtain a weighted word vector corresponding to each answer word vector. Accordingly, the weighted word vector incorporates the semantic information of the question text information. Then, encoding processing may be performed on each weighted word vector to obtain a second feature representation corresponding to the answer text information, so as to perform answer classification processing by using the second feature representation. Therefore, by using the embodiments of the specification, the logical relationship between the question and the answer and the semantic influence of the question on the answer can be effectively considered, and the answer representation fused with the question semantic information is obtained. And then, classifying the answer text information by using the second feature representation fused with the question semantic information, so that the accuracy of effective answer screening can be further improved.
The present specification also provides a text classification apparatus that may be applied to a single text classification system, or to a variety of computer data processing systems. The system may be a single server, or may include a server cluster, a system (including a distributed system), software (applications), an actual operating device, a logic gate device, a quantum computer, etc. using one or more of the methods or one or more of the example devices of the present specification, in combination with a terminal device implementing hardware as necessary. In some embodiments, an apparatus may include at least one processor and a memory storing processor-executable instructions that, when executed by the processor, perform steps comprising a method as in any one or more of the embodiments described above.
The memory may include physical means for storing information, typically by digitizing the information for storage on a medium using electrical, magnetic or optical means. The storage medium may include: devices that store information using electrical energy, such as various types of memory, e.g., RAM, ROM, etc.; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, bubble memories, and usb disks; devices that store information optically, such as CDs or DVDs. Of course, there are other ways of storing media that can be read, such as quantum memory, graphene memory, and so forth.
It should be noted that the above-mentioned device may also include other implementation manners according to the description of the method or apparatus embodiment, and specific implementation manners may refer to the description of the related method embodiment, which is not described in detail herein.
The text classification device according to the above embodiment may obtain the first feature representation of the semantic information representing the question text information, and the answer word vector corresponding to each vocabulary in any answer text information corresponding to the question text information. Then, by using an attention mechanism, the first feature representation is used as constraint information of the attention model, each answer word vector is used as a value of the attention model, the correlation of each answer word vector relative to the first feature representation is analyzed, and the correlation is applied to the corresponding answer word vector to obtain a weighted word vector corresponding to each answer word vector. Accordingly, the weighted word vector incorporates the semantic information of the question text information. Then, encoding processing may be performed on each weighted word vector to obtain a second feature representation corresponding to the answer text information, so as to perform answer classification processing by using the second feature representation. Therefore, by using the embodiments of the specification, the logical relationship between the question and the answer and the semantic influence of the question on the answer can be effectively considered, and the answer representation fused with the question semantic information is obtained. And then, classifying the answer text information by using the second feature representation fused with the question semantic information, so that the accuracy of effective answer screening can be further improved.
It should be noted that the embodiments of the present specification are not limited to cases that necessarily comply with a standard data model/template or with the description of the embodiments themselves. Implementations modified slightly on the basis of certain industry standards, or using custom modes or examples, may also achieve the same, equivalent, or similar effects as the above embodiments, or other contemplated implementations. Embodiments adopting such modified or transformed data acquisition, storage, judgment, and processing may still fall within the scope of the alternative embodiments of this specification.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment. In the description of the specification, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.
Claims (10)
1. A text classification method is applied to a server, and the method comprises the following steps:
performing feature extraction on target question text information to obtain a first feature representation corresponding to the target question text information;
generating word vectors for each vocabulary in the specified answer text information corresponding to the target question text information, to obtain answer word vectors corresponding to each vocabulary in the specified answer text information;
for any answer word vector, taking the first feature representation as constraint information, and performing weighting processing on the answer word vector by using an attention model to obtain a weighted word vector corresponding to the corresponding answer word vector;
coding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and determining a classification result of the specified answer text information by using the second feature representation.
2. The method of claim 1, wherein the extracting features of the target question text information comprises:
generating word vectors for each vocabulary in the target question text information to obtain question word vectors corresponding to each vocabulary in the target question text information;
and encoding each question word vector corresponding to the target question text information to obtain a first feature representation corresponding to the target question text information.
3. The method according to claim 1, wherein the weighting processing on the answer word vector by using the attention model with the first feature representation as constraint information comprises:
using the first feature representation as constraint information of an attention model and the answer word vector as a value of the attention model, and inputting both into the attention model to obtain a correlation coefficient of the answer word vector relative to the first feature representation;
and calculating the product of the answer word vector and the correlation coefficient corresponding to the corresponding answer word vector to obtain the weighted word vector corresponding to the corresponding answer word vector.
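The claim fixes neither the scoring function inside the attention model nor its parameters. Below is a sketch of one common realization, additive (Bahdanau-style) scoring; W_q, W_k, and v are illustrative stand-ins for whatever a trained attention model would hold.

```python
import numpy as np

def attention_weighting(first_feature, answer_word_vectors, W_q, W_k, v):
    """One way to realize claim 3: score each answer word vector against
    the question feature (the constraint), normalize, then weight."""
    # Project constraint and values into a shared space and score them.
    hidden = np.tanh(first_feature @ W_q + answer_word_vectors @ W_k)
    scores = hidden @ v
    exp = np.exp(scores - scores.max())
    coeffs = exp / exp.sum()                           # coefficients sum to 1
    weighted = answer_word_vectors * coeffs[:, None]   # claim 3's product step
    return coeffs, weighted

# Toy shapes: 5 answer words, 8-dim word vectors, 16-dim attention space.
rng = np.random.default_rng(1)
coeffs, weighted = attention_weighting(
    rng.normal(size=8), rng.normal(size=(5, 8)),
    W_q=rng.normal(size=(8, 16)), W_k=rng.normal(size=(8, 16)),
    v=rng.normal(size=16))
```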
4. The method of claim 1, wherein determining the classification result of the specified answer text information by using the second feature representation comprises:
inputting the second feature representation into a pre-constructed classification model to obtain the classification result of the specified answer text information, wherein the classification model is constructed by using a classification algorithm.
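The claim names only "a classification algorithm". As one hypothetical instance, a linear softmax layer over the second feature representation; W and b stand in for trained parameters.

```python
import numpy as np

def classify(second_feature, W, b):
    """Map the answer's second feature representation to class probabilities."""
    logits = second_feature @ W + b
    e = np.exp(logits - logits.max())
    probs = e / e.sum()
    return int(np.argmax(probs)), probs  # predicted class and its distribution
```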
5. The method of claim 1, wherein encoding the weighted word vectors corresponding to the specified answer text information comprises:
encoding the weighted word vectors corresponding to the specified answer text information by using an LSTM algorithm.
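One plausible reading of this claim, sketched with PyTorch (the framework is an assumption; the claim names only an LSTM): run the weighted word vectors through an LSTM and keep the final hidden state as the second feature representation. Dimensions are illustrative.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

# One answer of 5 words, each an 8-dim weighted word vector (toy values).
weighted_vectors = torch.randn(1, 5, 8)   # (batch, answer length, vector dim)
_, (h_n, _) = lstm(weighted_vectors)      # h_n: (num_layers, batch, hidden)
second_feature = h_n[-1]                  # (batch, hidden) encoding of the answer
```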
6. The method of claim 2, further comprising, before generating the word vector for each vocabulary in the target question text information:
performing word segmentation on the target question text information and on the specified answer text information corresponding to the target question text information, to obtain one or more vocabularies corresponding to the target question text information and one or more vocabularies corresponding to the specified answer text information.
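The claim does not name a segmenter. For Chinese text, the open-source jieba tokenizer is one common choice; the sentences below are purely illustrative.

```python
import jieba  # pip install jieba

question = "这款产品的电池续航怎么样？"    # target question text (illustrative)
answer = "充满一次电可以用两天，很满意。"  # specified answer text (illustrative)

question_words = jieba.lcut(question)  # list of segmented vocabularies
answer_words = jieba.lcut(answer)
# Each vocabulary would then be mapped to its word vector before encoding.
```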
7. A text classification apparatus, applied to a server, the apparatus comprising:
a feature extraction module, configured to perform feature extraction on target question text information to obtain a first feature representation corresponding to the target question text information;
a word vector generation module, configured to generate a word vector for each vocabulary in specified answer text information corresponding to the target question text information, to obtain answer word vectors corresponding to the vocabularies in the specified answer text information;
a weighting module, configured to, for any answer word vector, take the first feature representation as constraint information and perform weighting on the answer word vector by using an attention model, to obtain a weighted word vector corresponding to the answer word vector;
an encoding module, configured to encode the weighted word vectors corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and a classification module, configured to determine a classification result of the specified answer text information by using the second feature representation.
8. The apparatus of claim 7, wherein the feature extraction module comprises:
a word vector generation unit, configured to generate a word vector for each vocabulary in the target question text information to obtain question word vectors corresponding to the vocabularies in the target question text information;
and an encoding unit, configured to encode the question word vectors corresponding to the target question text information to obtain the first feature representation corresponding to the target question text information.
9. The apparatus of claim 7, wherein the weighting module is configured to input the first feature representation into the attention model as its constraint information and the answer word vector as its value, to obtain a correlation coefficient of the answer word vector relative to the first feature representation; and to calculate the product of the answer word vector and its correlation coefficient to obtain the weighted word vector corresponding to the answer word vector.
10. A text classification device, applied to a server, the device comprising at least one processor and a memory storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010735443.0A CN111949791B (en) | 2020-07-28 | 2020-07-28 | Text classification method, device and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111949791A (en) | 2020-11-17
CN111949791B CN111949791B (en) | 2024-01-30 |
Family ID=73339649
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010735443.0A Active CN111949791B (en) | 2020-07-28 | 2020-07-28 | Text classification method, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111949791B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108959246A (en) * | 2018-06-12 | 2018-12-07 | 北京慧闻科技发展有限公司 | Answer selection method, device and electronic equipment based on improved attention mechanism |
CN109670029A (en) * | 2018-12-28 | 2019-04-23 | 百度在线网络技术(北京)有限公司 | For determining the method, apparatus, computer equipment and storage medium of problem answers |
CN111241244A (en) * | 2020-01-14 | 2020-06-05 | 平安科技(深圳)有限公司 | Big data-based answer position acquisition method, device, equipment and medium |
CN111382232A (en) * | 2020-03-09 | 2020-07-07 | 联想(北京)有限公司 | Question and answer information processing method and device and computer equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111949791B (en) | 2024-01-30 |
Similar Documents
Publication | Title |
---|---|
KR102071582B1 | Method and apparatus for classifying a class to which a sentence belongs by using deep neural network |
Latif et al. | Survey of deep representation learning for speech emotion recognition |
US20230027526A1 | Method and apparatus for classifying document based on attention mechanism and semantic analysis |
CN111966812B | Automatic question answering method based on dynamic word vector and storage medium |
CN108563624A | A kind of spatial term method based on deep learning |
CN117349675B | Multi-mode large model construction system for multiple information sources |
CN113314100B | Method, device, equipment and storage medium for evaluating and displaying results of spoken language test |
CN115662435B | Virtual teacher simulation voice generation method and terminal |
Cabada et al. | Mining of educational opinions with deep learning |
CN111401105B | Video expression recognition method, device and equipment |
CN115935969A | Heterogeneous data feature extraction method based on multi-mode information fusion |
CN118093834A | AIGC large model-based language processing question-answering system and method |
CN114528835A | Semi-supervised specialized term extraction method, medium and equipment based on interval discrimination |
CN114416948A | One-to-many dialog generation method and device based on semantic perception |
CN118245602A | Emotion recognition model training method, device, equipment and storage medium |
CN111949791B | Text classification method, device and equipment |
CN111310847B | Method and device for training element classification model |
CN116010563A | Multi-round dialogue data analysis method, electronic equipment and storage medium |
CN116991982B | Interactive dialogue method, device, equipment and storage medium based on artificial intelligence |
CN118504556B | Method, equipment and medium for mining figure speaking views aiming at news |
CN117828072B | Dialogue classification method and system based on heterogeneous graph neural network |
CN117668562B | Training and using method, device, equipment and medium of text classification model |
CN118349922B | Context feature-based music emotion recognition method, device, equipment and medium |
CN118761404A | Machine reading understanding method, device, equipment and computer readable medium based on neuron activation |
Karpagam et al. | Multimodal Fusion for Precision Personality Trait Analysis: A Comprehensive Model Integrating Video, Audio, and Text Inputs |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |