CN111949791B - Text classification method, device and equipment - Google Patents

Text classification method, device and equipment

Info

Publication number
CN111949791B
CN111949791B (application CN202010735443.0A)
Authority
CN
China
Prior art keywords
answer
text information
word vector
classification
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010735443.0A
Other languages
Chinese (zh)
Other versions
CN111949791A (en)
Inventor
孔繁爽
李琦
梁莉娜
王小红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd (ICBC)
Priority to CN202010735443.0A
Publication of CN111949791A
Application granted
Publication of CN111949791B
Legal status: Active

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G06F16/353 - Clustering; Classification into predefined classes
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the specification relate to the technical field of artificial intelligence data processing and disclose a text classification method, apparatus, and device. The method includes: performing feature extraction on target question text information to obtain a first feature representation of the target question text information; generating word vectors for the words in the specified answer text information corresponding to the target question text information to obtain an answer word vector for each word in the specified answer text information; for any answer word vector, taking the first feature representation as constraint information and weighting the answer word vector with an attention model to obtain the weighted word vector corresponding to that answer word vector; encoding the weighted word vectors corresponding to the specified answer text information to obtain a second feature representation of the specified answer text information; and determining a classification result of the specified answer text information using the second feature representation. The accuracy of valid-answer screening is thereby further improved.

Description

Text classification method, device and equipment
Technical Field
The present disclosure relates to the field of artificial intelligence data processing technologies, and in particular, to a text classification method, apparatus, and device.
Background
In application scenarios such as software release assessment or service question answering, a platform may pre-configure a series of questions for users to answer. Alternatively, a user may post a question on the platform, which other users or platform service personnel may answer. By analyzing the answers to different questions, the platform can obtain user feedback on a given service or software application. A single question may have multiple answers, some of which fail to address the question or carry little reference value. The platform therefore usually needs to review the answers to each question first and screen out the more effective ones, so that user feedback can be understood more accurately and quickly.
Currently, the validity of each answer is usually determined by analyzing the answer directly, or by concatenating the answer with its question. In practice, however, an answer is tied to its question, and it is difficult to evaluate validity from the answer alone. Concatenating the answer and the question does associate the two, but given the limited reference answer library and the highly variable ways users phrase answers, answers that are actually valid yet differ greatly in form from the reference answers are easily discarded during processing, which hurts the accuracy of identifying valid answers. Deep learning algorithms that consider contextual semantic information are mostly applied to dialogue generation, where the relevant context is strongly self-correlated, and are therefore difficult to migrate directly to question-answer scenarios. Hence, the art needs a more accurate and efficient classification method for question-answer text.
Disclosure of Invention
An objective of the embodiments of this specification is to provide a text classification method, apparatus, and device that can further improve the accuracy of valid-answer screening.
The text classification method, apparatus, and device provided in this specification are implemented as follows:
a text classification method applied to a server, the method comprising:
extracting features of the target question text information to obtain a first feature representation corresponding to the target question text information;
generating word vectors for each word in the specified answer text information corresponding to the target question text information to obtain an answer word vector corresponding to each word in the specified answer text information;
for any answer word vector, taking the first feature representation as constraint information and weighting the answer word vector with an attention model to obtain a weighted word vector corresponding to that answer word vector;
encoding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and determining a classification result of the specified answer text information by using the second feature representation.
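Wired end to end, the above steps can be illustrated with the following minimal PyTorch sketch; all dimensions, layer sizes, the two-class output, and the names used are assumptions for illustration rather than part of the disclosure:

```python
import torch
import torch.nn as nn

class QATextClassifier(nn.Module):
    """Minimal sketch of the described pipeline; dimensions are assumptions."""
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)   # shared word vectors
        self.q_encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.a_encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, emb_dim)       # align q with answer vectors
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, question_ids, answer_ids):
        # First feature representation of the question (last LSTM hidden state).
        _, (q, _) = self.q_encoder(self.embed(question_ids))
        q = self.proj(q.squeeze(0))                      # (batch, emb_dim)
        # Answer word vectors weighted under the question's constraint.
        a = self.embed(answer_ids)                       # (batch, m, emb_dim)
        coeffs = torch.softmax(torch.bmm(a, q.unsqueeze(2)).squeeze(2), dim=1)
        weighted = a * coeffs.unsqueeze(2)               # weighted word vectors
        # Second feature representation and classification.
        _, (e, _) = self.a_encoder(weighted)
        return self.classifier(e.squeeze(0))             # logits per class
```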
In other embodiments of the method provided in the present specification, the feature extraction of the target question text information includes:
generating word vectors for each word in the target question text information to obtain question word vectors corresponding to each word in the target question text information;
and carrying out coding processing on each question word vector corresponding to the target question text information to obtain a first characteristic representation corresponding to the target question text information.
In other embodiments of the method provided in this specification, the weighting of the answer word vector with an attention model, using the first feature representation as constraint information, includes the following steps:
taking the first feature representation as constraint information of an attention model, taking the answer word vector as a value of the attention model, and inputting the attention model to obtain a correlation coefficient of the answer word vector relative to the first feature representation;
and calculating the product of the answer word vector and its correlation coefficient to obtain the weighted word vector corresponding to that answer word vector.
In other embodiments of the method provided in the present specification, the determining of the classification result of the specified answer text information using the second feature representation includes:
inputting the second feature representation into a pre-constructed classification model to obtain the classification result of the specified answer text information, wherein the classification model is constructed with a classification algorithm.
In other embodiments of the method provided in the present disclosure, the encoding the weighted word vectors corresponding to the text information of the specified answer includes:
and encoding each weighted word vector corresponding to the text information of the specified answer by using an LSTM algorithm.
In other embodiments of the method provided in the present specification, before generating the word vectors for each word in the target question text information, the method further includes:
performing word segmentation processing on the target question text information and the specified answer text information corresponding to the target question text information to obtain one or more words corresponding to the target question text information and one or more words corresponding to the specified answer text information.
On the other hand, the embodiment of the specification also provides a text classification device, which is applied to a server, and comprises:
the feature extraction module is used for extracting features of the target question text information to obtain a first feature representation corresponding to the target question text information;
the word vector generation module is used for generating word vectors for each word in the specified answer text information corresponding to the target question text information to obtain an answer word vector corresponding to each word in the specified answer text information;
the weighting processing module is used for weighting any answer word vector with an attention model, using the first feature representation as constraint information, to obtain a weighted word vector corresponding to that answer word vector;
the encoding processing module is used for encoding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and the classification module is used for determining a classification result of the specified answer text information by using the second feature representation.
In other embodiments of the apparatus provided herein, the feature extraction module includes:
the word vector generation unit is used for generating word vectors for each word in the target question text information to obtain question word vectors corresponding to each word in the target question text information.
and the encoding processing unit is used for encoding each question word vector corresponding to the target question text information to obtain a first characteristic representation corresponding to the target question text information.
In other embodiments of the apparatus provided in the present disclosure, the weighting processing module is configured to take the first feature representation as constraint information of the attention model, take the answer word vector as a value of the attention model, and input the attention model to obtain the correlation coefficient of the answer word vector relative to the first feature representation; and to calculate the product of the answer word vector and its correlation coefficient, obtaining the weighted word vector corresponding to that answer word vector.
In another aspect, embodiments of the present disclosure further provide a text classification device, applied to a server, where the device includes at least one processor and a memory for storing processor-executable instructions, which when executed by the processor implement steps including any one or more of the methods described above.
According to the text classification method, apparatus, and device provided by one or more embodiments of this specification, a first feature representation characterizing the semantics of the question text information, and the answer word vectors for each word of any answer text information corresponding to the question, can be obtained. Then, using the attention mechanism, the first feature representation serves as the constraint information of the attention model and each answer word vector as a value of the model; the correlation of each answer word vector relative to the first feature representation is analyzed and applied to the corresponding answer word vector to obtain its weighted word vector. Correspondingly, the weighted word vectors fuse the semantic information of the question text information. Each weighted word vector can then be encoded to obtain the second feature representation corresponding to the answer text information, which is used for answer classification. The embodiments of this specification can thus effectively take into account the logical relationship between questions and answers and the semantic influence of the question on the answer, yielding an answer representation fused with the question semantics. Classifying the answer text information with this second feature representation, which fuses the question semantics, can further improve the accuracy of valid-answer screening.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some of the embodiments described in the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. In the drawings:
fig. 1 is a schematic flow chart of an embodiment of a text classification method provided in the present specification;
FIG. 2 is a schematic diagram of a text classification flow in one embodiment provided herein;
fig. 3 is a schematic block diagram of another text classification apparatus provided in the present specification.
Detailed Description
In order that those skilled in the art will better understand the technical solutions in this specification, a clear and complete description of the technical solutions in one or more embodiments of this specification will be provided below with reference to the accompanying drawings in one or more embodiments of this specification, and it is apparent that the described embodiments are only some embodiments of the specification and not all embodiments. All other embodiments, which may be made by one or more embodiments of the disclosure without undue effort by one of ordinary skill in the art, are intended to be within the scope of the embodiments of the disclosure.
In one scenario example provided in the embodiment of the present disclosure, the text classification method may be applied to a server that performs question-answer text classification in an application scenario such as software release assessment or service question-answer. The server may refer to a server or a server cluster formed by a plurality of servers.
For a certain target question and the one or more answers corresponding to it, the server can perform feature extraction on the target question text information to obtain the corresponding first feature representation. Word vectors are then generated for the specified answer text information corresponding to the target question text information, yielding a word vector for each word in the specified answer text information. In this way, the first feature representation characterizing the semantics of the question text information and the answer word vectors for each word of any corresponding answer text information can be obtained. Then, using the attention mechanism, the first feature representation serves as the constraint information of the attention model and each answer word vector as a value of the model; the correlation of each answer word vector relative to the first feature representation is analyzed and applied to the corresponding answer word vector to obtain its weighted word vector. Correspondingly, the weighted word vectors fuse the semantic information of the question text information. Each weighted word vector can then be encoded to obtain the second feature representation corresponding to the answer text information, which is used for answer classification.
The embodiments of this specification can thus effectively take into account the logical relationship between questions and answers and the semantic influence of the question on the answer, yielding an answer representation fused with the question semantics. Classifying the answer text information with this second feature representation, which fuses the question semantics, can further improve the accuracy of valid-answer screening.
Fig. 1 is a schematic flow chart of an embodiment of the text classification method provided in this specification. Although this specification provides method steps or apparatus structures as shown in the following embodiments or figures, the methods or apparatuses may include more or fewer steps or modular units, whether by convention or without inventive effort. For steps or structures with no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to that shown in the embodiments or drawings of this specification. In practice, the described methods or module structures may be executed sequentially or in parallel in an apparatus, server, or end product (for example, in parallel-processor or multi-threaded environments, or even in distributed-processing or server-cluster implementations). In a specific embodiment, as shown in fig. 1, one embodiment of the text classification method provided in this specification may be applied to the above server and may include the following steps:
S20: extracting features of the target question text information to obtain a first feature representation corresponding to the target question text information.
For a certain target question and the one or more answers corresponding to it, the server can acquire the target question text information corresponding to the target question. For example, the target question text information may be "An XXX problem has occurred with the X credit card; how should it be handled?".
Then, the server can perform feature extraction on the target question text information to obtain the corresponding first feature representation. The server can map the target question text information into a numeric semantic space to obtain its representation in that space, which facilitates computer processing of the target question text information. Correspondingly, the first feature representation of the target question text information is a numerical representation carrying its semantic features.
In some embodiments, word vector generation may be performed for each word in the target question text information to obtain the question word vector corresponding to each word. The question word vectors can then be encoded to obtain the first feature representation corresponding to the target question text information.
In some embodiments, the server may first perform word segmentation on the target question text information to obtain its one or more words. The segmentation may be performed with tools such as jieba, SnowNLP, THULAC, or NLPIR. In other embodiments, stop-word removal and similar processing can be performed at the same time to reduce interference.
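A minimal sketch of this pre-processing step, assuming jieba as the segmentation tool; the stop-word set shown is illustrative, and a real list would be loaded from a stop-word file:

```python
import jieba

def tokenize(text, stopwords):
    # Cut the sentence into words and drop stop words to reduce noise.
    return [w for w in jieba.lcut(text) if w.strip() and w not in stopwords]

stopwords = {"的", "了", "吗", "请问"}  # illustrative stop-word set
question = "X信用卡出现XXX问题，请问应该怎么处理"
print(tokenize(question, stopwords))
```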
For each word obtained by segmentation, the server can further generate the corresponding word vector, i.e. the numerical representation of that word, mapping each word into a digitized semantic space for computer processing. Word vectors can be generated with statistical methods or language-model methods; for example, skip-gram, CBOW, LBL, NNLM, C&W, or GloVe can be used to generate the word vectors for the words of the target question text information, such as a 300-dimensional GloVe vector per word. For convenience of expression, a word vector for the target question text information is described below as a question word vector, and a word vector for the answer text information corresponding to the target question text information as an answer word vector.
Then, the server may encode the question word vectors to obtain the first feature representation corresponding to the target question text information. For example, the extracted word vectors may be input into a sequence encoder LSTM (Long Short-Term Memory network) for semantic compression, taking the output of the last hidden layer of the LSTM as the first feature representation of the target question text information. When an LSTM encodes the word vectors, the encoding output at each time step depends not only on the current input but also on the model state at the previous step; through this history dependency, the first feature representation obtained after encoding more effectively captures the contextual dependencies among the words of the target question text information, and therefore the semantics it expresses. Of course, other algorithms, such as an RNN (Recurrent Neural Network), may also be used for encoding in practice.
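A minimal sketch of this question encoder under the stated choices (300-dimensional embeddings, last hidden state taken as q); the 128-unit hidden size and the class name are assumptions:

```python
import torch
import torch.nn as nn

class QuestionEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=128):
        super().__init__()
        # Embedding table; in practice its weights would be loaded from GloVe.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):          # token_ids: (batch, seq_len)
        vectors = self.embed(token_ids)    # (batch, seq_len, 300)
        _, (h_n, _) = self.lstm(vectors)   # h_n: (1, batch, hidden_dim)
        return h_n.squeeze(0)              # first feature representation q
```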
In the above embodiments, by first segmenting the text into words and then determining the feature representation of the target question text information from the word vectors of those words, the importance of each word in the answer text information relative to the target question text information can be determined more simply and conveniently.
S22: generating word vectors for each word in the specified answer text information corresponding to the target question text information to obtain the answer word vector corresponding to each word in the specified answer text information.
For any one of the one or more answers corresponding to a certain target question, the server may obtain the answer text information of that answer. The answer text information may be, for example, "learned", "what is the meaning of XX", or "XXX processing should be performed on the X credit card". Among these, "learned" and "what is the meaning of XX" are answers that fail to address the question or are irrelevant to it, whereas "XXX processing should be performed on the X credit card" is a valid answer to the question. Correspondingly, the specified answer text information can be the text information of any answer to be classified among the one or more answers corresponding to the target question.
The server can then generate the word vector corresponding to each word in the specified answer text information to obtain the answer word vector corresponding to each word. The word vectors can be generated in the same way as for the words of the target question text information. In some embodiments, the word vector representation space of the specified answer text information may be set consistent with that of the target question text information, for example both extracted as 300-dimensional GloVe word vectors. Keeping the two representation spaces in the same dimension makes the data processing more convenient.
S24: for any answer word vector, taking the first feature representation as constraint information and weighting the answer word vector with an attention model to obtain the weighted word vector corresponding to that answer word vector.
For any answer word vector, the server can use the first feature representation as constraint information and weight the answer word vector with an attention model to obtain the corresponding weighted word vector. The attention model may refer to a model constructed with an attention mechanism.
In some embodiments, for any answer word vector, the server may use the first feature representation as the constraint information of the attention model, use the answer word vector as a value of the attention model, and input the attention model to obtain the correlation coefficient of the answer word vector relative to the first feature representation. The product of the answer word vector and its correlation coefficient then gives the corresponding weighted word vector. Alternatively, after the product is computed, processing such as normalization can be applied to the product value to obtain the weighted word vector. The calculation of the correlation coefficient in the attention model can follow standard attention-mechanism processing and is not detailed here.
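The sketch below assumes q and the answer word vectors already share a common dimension d (for example after a linear projection) and uses dot-product scores normalized with softmax as the correlation coefficients; these choices are illustrative, not fixed by the disclosure:

```python
import torch

def attention_weighting(q, answer_vectors):
    # q: (batch, d) first feature representation;
    # answer_vectors: (batch, m, d) answer word vectors, assumed same dimension d.
    scores = torch.bmm(answer_vectors, q.unsqueeze(2)).squeeze(2)  # (batch, m)
    coeffs = torch.softmax(scores, dim=1)    # correlation coefficients
    weighted = answer_vectors * coeffs.unsqueeze(2)  # weighted word vectors
    return weighted, coeffs
```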
The correlation coefficient characterizes the relevance of each word in the answer text information to the target question text information: the larger the coefficient, the stronger the semantic correlation between that word of the answer and the target question; the smaller the coefficient, the weaker the correlation. By computing this correlation and using the coefficient as the weight of the corresponding word vector during semantic encoding of the answer, the feature representations of words strongly correlated with the question semantics are highlighted and those of weakly correlated words are suppressed, so that the answer's feature representation effectively fuses the semantic information of the target question. That is, the answer's feature representation can effectively account for the logical relationship between the target question and the specified answer and for the semantic influence of the question on the answer.
Classifying answers with feature representations fused with the target question's semantics then means that the same answer can receive different classification results under different questions, so invalid answers are removed accurately while valid answers strongly related to the question are retained as far as possible.
S26: encoding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information.
The server can encode the weighted word vectors to obtain the second feature representation corresponding to the specified answer text information. For example, the weighted word vectors may be encoded with LSTM, RNN, or the like, as for the target question text information.
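The sketch below uses a stand-in tensor for the weighted word vectors; the 300-dimensional input and 128-unit hidden size are assumptions carried over from the earlier sketches:

```python
import torch
import torch.nn as nn

answer_lstm = nn.LSTM(input_size=300, hidden_size=128, batch_first=True)
weighted = torch.randn(4, 12, 300)   # stand-in: 4 answers, 12 weighted word vectors each
_, (h_n, _) = answer_lstm(weighted)  # last hidden state summarizes the sequence
e = h_n.squeeze(0)                   # (4, 128): second feature representation e
```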
S28: determining the classification result of the specified answer text information by using the second feature representation.
The server may determine the classification result of the specified answer text information using the second feature representation. The classification result may include a valid answer and an invalid answer, or probability values for the valid and invalid answer classes. Valid answers are the category of answer text information to be selected by the screening; invalid answers are the category not to be selected. Of course, other classes may exist in practical application scenarios, which is not limited here.
For example, the server may compare the second feature representation of the specified answer text information with the feature representations of the reference answers in a reference answer library, and determine the probability that the specified answer text information is a valid answer. The reference answers in the library are pre-configured valid answers to the questions; for instance, service personnel may pre-configure a valid answer as the reference answer according to the actual application scenario of each question. In practical application, the valid answers determined by the scheme of this embodiment can then be dynamically stored into the reference answer library to update and enrich it, improving the accuracy of answer classification.
In some embodiments, the server may instead input the second feature representation of the answer text information into a pre-constructed classification model and compute the classification probabilities, obtaining the probability that the answer text information belongs to a given class. The classification model can be built with a classification algorithm such as a multi-layer perceptron. Performing the answer classification with a pre-built classification model can further improve the processing efficiency.
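The sketch below assumes a small multi-layer perceptron over a 128-dimensional second feature representation; the layer sizes and the two-class output are illustrative:

```python
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Linear(128, 64),  # second feature representation e as input
    nn.ReLU(),
    nn.Linear(64, 2),    # two classes: valid answer / invalid answer
)

e = torch.randn(4, 128)                      # stand-in second feature representations
probs = torch.softmax(classifier(e), dim=1)  # probability value for each class
```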
Fig. 2 shows a flow chart of the text classification processing. As shown in fig. 2, in one implementation scenario of this specification, a question-answer text classification model may be constructed and applied with the following steps. The sample data may be pre-processed first: for example, the sample data set may be divided into a training set and a test set, and the questions and corresponding answer sentences in both sets may be segmented with jieba, with stop words removed at the same time.
Assume that, for any question text information and its corresponding answer text information in the sample data, word segmentation yields the question words w_1, w_2, w_3, ..., w_n and the answer words p_1, p_2, p_3, ..., p_m.
Question word vectors can then be generated for w_1, w_2, w_3, ..., w_n, i.e. the output of the left-hand encoding in fig. 2, where each word vector may be a 300-dimensional GloVe vector. The question word vectors can be input into a sequence encoder LSTM for semantic compression of the question text information, taking the output of the last hidden layer of the LSTM as the first feature representation vector q of the question text information.
Next, the answer word vector for each word p_1, p_2, p_3, ..., p_m in the corresponding answer text information can be generated, i.e. the output of the right-hand encoding in fig. 2. The answer word vector representation space is consistent with the question word vector representation space, likewise 300-dimensional GloVe vectors.
The first feature representation vector q is taken as the signal of the attention mechanism, and the answer word vectors of the answer text information are taken as its values; the correlation coefficient of each answer word vector relative to q is computed first, the output of the Softmax being the correlation coefficient of each answer word vector. The product of each correlation coefficient and its answer word vector is then computed to obtain the corresponding weighted word vector, i.e. the output of att (Attention) in fig. 2.
Each weighted word vector is input into a decoder LSTM and semantically encoded to obtain the second feature representation vector e corresponding to the answer text information.
The second feature representation vector e can then be used as the input of the classification algorithm, with the classification label of the corresponding answer text information as the output, to train the classifier. For example, the loss between the algorithm's classification result and the true label can be computed with a cross-entropy loss. With minibatch training, the loss is computed and the model gradients are updated with an SGD (stochastic gradient descent) optimizer; these steps are repeated until the training loss no longer decreases for 10 consecutive epochs, yielding the final model and parameters and thus the trained question-answer text classification model.
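The sketch below assumes the model returns class logits and that loader yields (question_ids, answer_ids, labels) minibatches; both names are illustrative, not from the original disclosure:

```python
import torch
import torch.nn as nn

def train(model, loader, lr=0.01, patience=10, max_epochs=200):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()  # cross-entropy vs. the true labels
    best, stale = float("inf"), 0
    for epoch in range(max_epochs):
        total = 0.0
        for question_ids, answer_ids, labels in loader:  # minibatch training
            optimizer.zero_grad()
            logits = model(question_ids, answer_ids)
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()           # SGD gradient update
            total += loss.item()
        if total < best:
            best, stale = total, 0
        else:
            stale += 1
        if stale >= patience:          # loss flat for 10 consecutive epochs: stop
            break
    return model
```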
For a certain target question, the target question text information and the corresponding answer text information to be classified are input into the trained question-answer text classification model; a second feature representation is constructed under the constraint of the first feature representation corresponding to the target question text information, and the answer classification result is determined from that second feature representation, giving the classification result of the answer text information to be classified.
In the scheme provided by the above embodiments, the question and the answer are used as the signal and the values of the attention mechanism respectively, and the attention mapping unifies their semantic spaces, producing an answer sentence representation that fuses the question semantics. The question information is thereby effectively integrated into a new representation space for the answer, providing an initial semantic representation with which question answers can participate in other task training. Because the contextual logical relationship between question and answer is fully considered, the same answer can be classified differently under different questions, improving the accuracy of answer classification. Compared with traditional text classification methods, the scheme offers richer classification features and better adaptability. Moreover, since the attention mechanism fuses the question information into the framework of the answer representation, any deep learning representation method, such as CNN, RNN, GRU, or LSTM, can be used in each encoding stage according to the corpus characteristics, making the approach more flexible.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. Specific reference may be made to the foregoing description of related embodiments of the related process, which is not described herein in detail.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
According to the text classification method provided by one or more embodiments of this specification, a first feature representation characterizing the semantics of the question text information, and the answer word vectors for each word of any answer text information corresponding to the question, can be obtained. Then, using the attention mechanism, the first feature representation serves as the constraint information of the attention model and each answer word vector as a value of the model; the correlation of each answer word vector relative to the first feature representation is analyzed and applied to the corresponding answer word vector to obtain its weighted word vector. Correspondingly, the weighted word vectors fuse the semantic information of the question text information. Each weighted word vector can then be encoded to obtain the second feature representation corresponding to the answer text information, which is used for answer classification. The embodiments of this specification can thus effectively take into account the logical relationship between questions and answers and the semantic influence of the question on the answer, yielding an answer representation fused with the question semantics. Classifying the answer text information with this second feature representation, which fuses the question semantics, can further improve the accuracy of valid-answer screening.
Based on the above text classification method, one or more embodiments of this specification further provide a text classification apparatus. The apparatus may include systems, software (applications), modules, components, servers, etc. that use the methods described in the embodiments of this specification together with any necessary hardware. Based on the same innovative concept, the apparatus provided in one or more embodiments is described below. Because the apparatus solves the problem in a manner similar to the method, the implementation of the apparatus may refer to the implementation of the foregoing method, and repeated parts are not described again. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Specifically, fig. 3 shows a schematic block diagram of an embodiment of the text classification apparatus provided in this specification; as shown in fig. 3, applied to the server, the apparatus may include:
the feature extraction module 102 may be configured to perform feature extraction on the target question text information, and obtain a first feature representation corresponding to the target question text information.
The word vector generation module 104 may be configured to perform word vector generation on each word in the specified answer text information corresponding to the target question text information, so as to obtain an answer word vector corresponding to each word in the specified answer text information.
The weighting processing module 106 may be configured to perform weighting processing on any answer word vector by using the first feature representation as constraint information and using an attention model to obtain a weighted word vector corresponding to the corresponding answer word vector.
The encoding processing module 108 may be configured to perform encoding processing on each weighted word vector corresponding to the specified answer text information, so as to obtain a second feature representation corresponding to the specified answer text information.
The classification module 110 may be configured to determine a classification result of the text information of the specified answer using the second feature representation.
In other embodiments, the feature extraction module 102 may include:
the word vector generation unit may be configured to generate word vectors for each word in the target question text information, so as to obtain question word vectors corresponding to each word in the target question text information.
The encoding processing unit can be used for encoding each question word vector corresponding to the target question text information to obtain a first characteristic representation corresponding to the target question text information.
In other embodiments, the weighting processing module 106 may be configured to take the first feature representation as constraint information of the attention model, take the answer word vector as a value of the attention model, and input the attention model to obtain the correlation coefficient of the answer word vector relative to the first feature representation; and then to calculate the product of the answer word vector and its correlation coefficient, obtaining the weighted word vector corresponding to that answer word vector.
It should be noted that the above description of the apparatus according to the method embodiment may also include other implementations. Specific implementation may refer to descriptions of related method embodiments, which are not described herein in detail.
According to the text classification apparatus provided by one or more embodiments of this specification, a first feature representation characterizing the semantics of the question text information, and the answer word vectors for each word of any answer text information corresponding to the question, can be obtained. Then, using the attention mechanism, the first feature representation serves as the constraint information of the attention model and each answer word vector as a value of the model; the correlation of each answer word vector relative to the first feature representation is analyzed and applied to the corresponding answer word vector to obtain its weighted word vector. Correspondingly, the weighted word vectors fuse the semantic information of the question text information. Each weighted word vector can then be encoded to obtain the second feature representation corresponding to the answer text information, which is used for answer classification. The embodiments of this specification can thus effectively take into account the logical relationship between questions and answers and the semantic influence of the question on the answer, yielding an answer representation fused with the question semantics. Classifying the answer text information with this second feature representation, which fuses the question semantics, can further improve the accuracy of valid-answer screening.
This specification also provides a text classification device that may be used in a standalone text classification system or in a variety of computer data processing systems. The system may be a standalone server, or may include a server cluster, a system (including a distributed system), software (an application), an actual operating device, a logic-gate device, a quantum computer, etc., combined with terminal devices implementing the necessary hardware, that uses one or more of the methods or one or more of the embodiment apparatuses of this specification. In some embodiments, the device may include at least one processor and a memory for storing processor-executable instructions that, when executed by the processor, implement steps comprising the method of any one or more of the embodiments described above.
The memory may include a physical means for storing information, typically by digitizing the information before storing it in an electrical, magnetic, or optical medium. The storage medium may include: means for storing information using electrical energy, such as various memories (RAM, ROM, etc.); means for storing information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic-core memory, bubble memory, and USB flash drives; and means for storing information optically, such as CDs or DVDs. Of course, other forms of readable storage media exist, such as quantum memory or graphene memory.
It should be noted that the description of the above apparatus according to the method or apparatus embodiment may further include other embodiments, and specific implementation manner may refer to the description of the related method embodiment, which is not described herein in detail.
The text classification device in the above embodiment may obtain the first feature representation characterizing the semantics of the question text information, and the answer word vectors for each word of any answer text information corresponding to the question. Then, using the attention mechanism, the first feature representation serves as the constraint information of the attention model and each answer word vector as a value of the model; the correlation of each answer word vector relative to the first feature representation is analyzed and applied to the corresponding answer word vector to obtain its weighted word vector. Correspondingly, the weighted word vectors fuse the semantic information of the question text information. Each weighted word vector can then be encoded to obtain the second feature representation corresponding to the answer text information, which is used for answer classification. The embodiments of this specification can thus effectively take into account the logical relationship between questions and answers and the semantic influence of the question on the answer, yielding an answer representation fused with the question semantics. Classifying the answer text information with this second feature representation, which fuses the question semantics, can further improve the accuracy of valid-answer screening.
It should be noted that the embodiments of this specification are not limited to implementations that comply with a standard data model/template or that are exactly as described in the embodiments of this specification. Implementations slightly modified on the basis of certain industry standards, or of the described embodiments, in a customized manner can also achieve the same, equivalent, similar, or predictable effects. Examples of data acquisition, storage, judgment, processing, etc. applying these modifications or variations still fall within the scope of alternative embodiments of this specification.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments refer to one another, and each embodiment mainly describes its differences from the others. In particular, the system embodiments are substantially similar to the method embodiments, so their description is relatively simple, and reference may be made to the description of the method embodiments where relevant. In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of this specification. Schematic uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and those skilled in the art may combine the different embodiments or examples described in this specification, and their features, provided they do not contradict each other.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (9)

1. A text classification method, applied to a server, the method comprising:
extracting features of the target question text information to obtain a first feature representation corresponding to the target question text information;
generating word vectors for each word in the specified answer text information corresponding to the target question text information to obtain an answer word vector corresponding to each word in the specified answer text information;
for any answer word vector, taking the first feature representation as constraint information and weighting the answer word vector with an attention model to obtain a weighted word vector corresponding to that answer word vector;
encoding each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
determining a classification result of the specified answer text information by using the second feature representation;
wherein the determining of the classification result of the specified answer text information by using the second feature representation includes:
inputting the second feature representation into a pre-constructed classification model to obtain the classification result of the specified answer text information, wherein the classification model is constructed with a classification algorithm; the classification result comprises a valid answer and an invalid answer, or the classification result comprises a probability value of a valid answer and a probability value of an invalid answer.
2. The method of claim 1, wherein the feature extraction of the target question text information comprises:
generating word vectors for each word in the target question text information to obtain question word vectors corresponding to each word in the target question text information;
and carrying out coding processing on each question word vector corresponding to the target question text information to obtain a first characteristic representation corresponding to the target question text information.
3. The method of claim 1, wherein weighting the answer word vector with an attention model, using the first feature representation as constraint information, comprises:
taking the first feature representation as constraint information of the attention model, taking the answer word vector as a value of the attention model, and inputting the attention model to obtain a correlation coefficient of the answer word vector relative to the first feature representation;
and calculating the product of the answer word vector and its correlation coefficient to obtain the weighted word vector corresponding to that answer word vector.
4. The method of claim 1, wherein the encoding each weighted word vector corresponding to the text information of the specified answer comprises:
and encoding each weighted word vector corresponding to the text information of the specified answer by using an LSTM algorithm.
5. The method of claim 2, wherein before generating the word vector for each word in the target question text information, further comprising:
performing word segmentation processing on the target question text information and the specified answer text information corresponding to the target question text information to obtain one or more words corresponding to the target question text information and one or more words corresponding to the specified answer text information.
6. A text classification apparatus applied to a server, the apparatus comprising:
a feature extraction module, configured to extract features of the target question text information to obtain a first feature representation corresponding to the target question text information;
a word vector generation module, configured to generate a word vector for each word in the specified answer text information corresponding to the target question text information, to obtain an answer word vector corresponding to each word in the specified answer text information;
a weighting module, configured to, for any answer word vector, take the first feature representation as constraint information and weight the answer word vector with an attention model, to obtain a weighted word vector corresponding to that answer word vector;
an encoding module, configured to encode each weighted word vector corresponding to the specified answer text information to obtain a second feature representation corresponding to the specified answer text information;
and a classification module, configured to determine a classification result of the specified answer text information by using the second feature representation;
wherein the classification module is specifically configured to input the second feature representation into a pre-constructed classification model to obtain the classification result of the specified answer text information, the classification model being constructed by a classification algorithm, and the classification result comprising a valid answer and an invalid answer, or comprising a probability value of the valid answer and a probability value of the invalid answer.
7. The apparatus of claim 6, wherein the feature extraction module comprises:
a word vector generation unit, configured to generate a word vector for each word in the target question text information, to obtain a question word vector corresponding to each word in the target question text information;
and an encoding unit, configured to encode each question word vector corresponding to the target question text information to obtain the first feature representation corresponding to the target question text information.
8. The apparatus of claim 6, wherein the weighting module is specifically configured to input the answer word vector into the attention model as its value, with the first feature representation as the constraint information of the attention model, to obtain a correlation coefficient of the answer word vector relative to the first feature representation; and to compute the product of the correlation coefficient and the corresponding answer word vector, to obtain the weighted word vector corresponding to that answer word vector.
9. A text classification device applied to a server, the device comprising at least one processor and a memory storing processor-executable instructions which, when executed by the processor, implement the steps of the method according to any one of claims 1-5.
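To show how the modules of claims 6-9 compose end to end, the sketch below reuses the hypothetical components defined in the earlier snippets (QuestionEncoder, attention_weight, answer_encoder, AnswerClassifier); every name and dimension is an illustrative assumption, not the patented implementation.

```python
import torch

encoder = QuestionEncoder()                      # feature extraction module (claims 2/7)
classifier = AnswerClassifier(feature_dim=128)   # classification module (claims 1/6)

q_ids = torch.tensor([[5, 17, 42, 8]])           # segmented question as token ids
a_vectors = torch.randn(12, 128)                 # answer word vectors (12 words, dummy data)

first_feature = encoder(q_ids).squeeze(0)            # (128,) constraint information
weighted = attention_weight(first_feature, a_vectors)
_, (h_n, _) = answer_encoder(weighted.unsqueeze(0))  # LSTM encoding (claim 4)
probs = classifier(h_n.squeeze(0))                   # [P(invalid), P(valid)]
```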
CN202010735443.0A (filed 2020-07-28, priority 2020-07-28): Text classification method, device and equipment. Status: Active. Granted as CN111949791B.

Priority Applications (1)

Application Number: CN202010735443.0A
Priority Date: 2020-07-28
Filing Date: 2020-07-28
Title: Text classification method, device and equipment


Publications (2)

CN111949791A (en), published 2020-11-17
CN111949791B (en), published 2024-01-30

Family ID: 73339649

Family Applications (1)

Application Number: CN202010735443.0A (Active)
Priority Date: 2020-07-28
Filing Date: 2020-07-28
Title: Text classification method, device and equipment

Country Status (1)

CN: CN111949791B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party

CN108959246A *, priority date 2018-06-12, published 2018-12-07, Beijing Huiwen Technology Development Co., Ltd.: Answer selection method, device and electronic equipment based on an improved attention mechanism
CN109670029A *, priority date 2018-12-28, published 2019-04-23, Baidu Online Network Technology (Beijing) Co., Ltd.: Method, apparatus, computer device and storage medium for determining answers to questions
CN111241244A *, priority date 2020-01-14, published 2020-06-05, Ping An Technology (Shenzhen) Co., Ltd.: Answer position acquisition method, device, equipment and medium based on big data
CN111382232A *, priority date 2020-03-09, published 2020-07-07, Lenovo (Beijing) Co., Ltd.: Question-and-answer information processing method and device, and computer equipment


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant