CN111159358A - Multi-intention recognition training and using method and device - Google Patents
Multi-intention recognition training and using method and device
- Publication number
- CN111159358A CN111159358A CN201911421640.9A CN201911421640A CN111159358A CN 111159358 A CN111159358 A CN 111159358A CN 201911421640 A CN201911421640 A CN 201911421640A CN 111159358 A CN111159358 A CN 111159358A
- Authority
- CN
- China
- Prior art keywords
- sentence
- intention
- representative
- sentence vector
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a multi-intention recognition training and using method and device. The multi-intention recognition training method comprises the following steps: converting original intention labeling data into sentence pair data, wherein the sentence pair data at least comprises sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence; encoding the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence; and after splicing the target sentence vector and the representative sentence vector, inputting the spliced vector into a classifier to judge whether the sentence pair is similar, and training the sentence vector encoder and the classifier.
Description
Technical Field
The invention belongs to the technical field of semantic understanding, and particularly relates to a multi-intention recognition training and using method and device.
Background
The following techniques exist in the prior art: intention recognition based on multi-class classification, intention recognition based on few-shot learning, and intention recognition based on one-vs-all multi-label classification.
In the multi-classification-based intention recognition method, each user sentence is labeled with a single intention; for example, "send an express" is labeled with the intention "sending an express". The intention recognition problem in a dialogue system is thus converted into a multi-class classification problem, and a multi-class classifier is trained to recognize the intention using a classical classification algorithm such as Support Vector Machines (SVM), random forests, gradient boosting trees, or deep learning.
In the one-vs-all multi-label-classification-based intention recognition method, each user sentence is labeled with all reasonable intentions; for example, "yes, send an express" is labeled with the two intentions "confirmation" and "sending an express". The intention recognition problem in a dialogue system is thus converted into a multi-label classification problem, and multiple binary classifiers are trained in one-vs-all form using a classical classification algorithm to recognize the intentions.
The few-shot-learning-based intention recognition method mainly addresses the low accuracy of intention recognition when labeled data are scarce. In the training stage, the data set is divided into a number of meta-tasks and the model learns to generalize when the intention categories change; in the prediction stage, for new intention categories (generally with little data), only a small amount of labeled data is needed to learn to classify samples of those categories.
In the process of implementing the present application, the inventor found that the existing schemes have at least the following defects:
the multi-classification-based intention recognition method assumes that each user sentence carries only one intention. In an actual dialogue system, however, the user often expresses multiple intentions at once. For example, in express-delivery customer service, when the robot asks the user what service is needed, the user may answer "I want to send an express, but I want to check the delivery cost first". The user expresses the two intentions "sending an express" and "inquiring the delivery cost" at the same time; a dialogue system with a good user experience needs to run the cost-inquiry flow first and only enter the order-placing flow after the inquiry is completed. This cannot be handled by the multi-classification-based intention recognition method.
The one-vs-all multi-label-classification-based intention recognition method maps the input text directly into the intention space; it uses neither extra data information nor the relations between different pieces of intention labeling data, so its recognition accuracy is low.
The few-shot-learning-based intention recognition method currently rests on the same assumption as the multi-classification-based method, namely that each user sentence carries only one intention, and therefore has the same defect.
Disclosure of Invention
The embodiments of the invention provide a multi-intention recognition training and using method and device, so as to solve at least one of the above technical problems.
In a first aspect, an embodiment of the present invention provides a multi-intention recognition training method, including: converting original intention labeling data into sentence pair data, wherein the sentence pair data at least comprises sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence; encoding the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence; and after splicing the target sentence vector and the representative sentence vector, inputting the spliced vector into a classifier to judge whether the sentence pair is similar, and training the sentence vector encoder and the classifier.
In a second aspect, an embodiment of the present invention provides a multi-intention recognition using method, including: predicting the intention of a text to be detected in real time based on a sentence vector encoder and a classifier trained by the method of the first aspect; converting the text to be detected into a sentence vector through the trained sentence vector encoder, and splicing the obtained sentence vector with the sentence vectors of the intention categories one by one; and sending the spliced vectors to the classifier to judge whether they are similar.
In a third aspect, an embodiment of the present invention provides a multi-intention recognition training apparatus, including: a data conversion module configured to convert original intention labeling data into sentence pair data, wherein the sentence pair data at least includes sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence; a sentence vector encoding module configured to encode the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence; and a similarity judgment training module configured to splice the target sentence vector and the representative sentence vector, input the spliced vector into a classifier to judge whether the sentence pair is similar, and train the sentence vector encoder and the classifier.
In a fourth aspect, an embodiment of the present invention provides a multi-intention recognition using apparatus, including: a real-time prediction module configured to predict the intention of a text to be detected in real time based on a sentence vector encoder and a classifier trained by the method of any embodiment of the first aspect; a conversion and splicing module configured to convert the text to be detected into a sentence vector through the trained sentence vector encoder and then splice the obtained sentence vector with the sentence vectors of the intention categories one by one; and a classification module configured to send the spliced vectors to the classifier to judge whether they are similar.
In a fifth aspect, an electronic device is provided, comprising: the system comprises at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the multi-intent recognition training or using method of any of the embodiments of the present invention.
In a sixth aspect, the present invention further provides a computer program product, which includes a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions, which, when executed by a computer, cause the computer to perform the steps of the multi-intent recognition training or using method according to any one of the embodiments of the present invention.
The method and device provided by the present application form sentence pair data from original intention labeling data, which is then used to train the sentence vector encoder and the classifier in the multi-intention recognition training apparatus, so that a network capable of recognizing multiple intentions can be trained.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a flowchart of a multi-intent recognition training method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a multi-intention recognition using method according to an embodiment of the present invention;
FIG. 3 is a flow chart of a training phase according to an embodiment of the present invention;
FIG. 4 is a flow chart of an inference phase of an embodiment of the present invention;
FIG. 5 is a block diagram of a multi-intent recognition training apparatus according to an embodiment of the present invention;
FIG. 6 is a block diagram of a multi-intention recognition using apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, which shows a flowchart of an embodiment of the multi-intention recognition training method of the present application, the method of this embodiment may be applied to scenarios that require recognizing and semantically understanding user intentions, including understanding user intentions during speech recognition and semantic understanding.
As shown in fig. 1, in step 101, original intention labeling data is converted into sentence pair data;
in step 102, a sentence vector encoder encodes the target sentence and the representative sentence to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence;
in step 103, after the target sentence vector and the representative sentence vector are spliced, the spliced vector is input into a classifier to determine whether the sentence pair is similar, and the sentence vector encoder and the classifier are trained.
In this embodiment, in step 101, the multi-intention recognition training device converts original intention labeling data into sentence pair data, where the sentence pair data at least includes sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence. In this way, a large amount of training text can be formed from the original intention labeling data, which can be collected from users or prepared by developers, and is therefore widely available and easy to obtain.
Then, in step 102, the multi-intention recognition training device encodes the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence. Finally, in step 103, the multi-intention recognition training device splices the target sentence vector and the representative sentence vector, inputs the spliced vector into a classifier to judge whether the sentence pair is similar, and trains the sentence vector encoder and the classifier.
The method of the embodiment of the application can form sentence pair data based on original intention labeling data, and further is used for training a sentence vector encoder and a classifier in a multi-intention recognition training device, so that a network capable of recognizing multiple intentions can be trained.
In some optional embodiments, the training of the sentence vector encoder and the classifier comprises: composing positive samples from each piece of intention labeling data and the representative sentences of the intention categories it contains, and composing negative samples from each piece of intention labeling data and the representative sentences of the intention categories it does not contain; and training the sentence vector encoder and the classifier based on the positive samples and the negative samples. By forming positive and negative samples from the original intention labeling data and using them for network training, a more robust multi-intention recognition system can be trained.
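The patent text itself contains no code; purely as an illustrative aid, the following Python sketch shows one way the positive/negative sentence-pair construction described above could look. The data layout, the function name build_sentence_pairs, and the toy values are assumptions, not part of the disclosed implementation.

```python
def build_sentence_pairs(labeled_data, representative_sentences):
    """labeled_data: list of (sentence, set of intention labels).
    representative_sentences: dict mapping intention label -> list of sentences.
    Returns (target sentence, representative sentence, 1/0) triples,
    where 1 marks a positive sample and 0 a negative sample."""
    pairs = []
    for sentence, intentions in labeled_data:
        for intention, reps in representative_sentences.items():
            label = 1 if intention in intentions else 0  # contained intention -> positive
            for rep in reps:
                pairs.append((sentence, rep, label))
    return pairs

# Toy example: two intention categories with two representative sentences each.
reps = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
data = [("q1", {"A", "B"}), ("q2", {"B"})]
for pair in build_sentence_pairs(data, reps):
    print(pair)
```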
In a further optional embodiment, the sentence vector encoder comprises a bidirectional long short-term memory (BiLSTM) network sentence vector encoder, and the training of the sentence vector encoder and the classifier is completed using a back propagation algorithm. With a BiLSTM network and back propagation, a better multi-intention recognition system can be trained, which better recognizes the user's multiple intentions.
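As a further illustration, below is a minimal PyTorch sketch of a BiLSTM sentence-vector encoder and a sentence-pair classifier of the kind described above. The layer sizes, the use of the final hidden states, and the two-layer fully connected classifier are assumptions made for the sketch rather than the patent's exact architecture.

```python
import torch
import torch.nn as nn

class BiLSTMSentenceEncoder(nn.Module):
    """Encodes a batch of token-id sequences into sentence vectors."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, token_ids):                    # (batch, seq_len)
        embedded = self.embedding(token_ids)         # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)            # h_n: (2, batch, hidden_dim)
        # Concatenate final forward and backward hidden states -> (batch, 2 * hidden_dim)
        return torch.cat([h_n[0], h_n[1]], dim=-1)

class PairClassifier(nn.Module):
    """Judges whether a (target, representative) sentence-vector pair is similar."""
    def __init__(self, sent_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * sent_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, target_vec, rep_vec):
        # Vector splicing (concatenation) of the two sentence vectors, then a similarity logit.
        return self.net(torch.cat([target_vec, rep_vec], dim=-1)).squeeze(-1)
```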
Please refer to fig. 2, which illustrates a multi-intention recognition using method according to an embodiment of the present application. The method may be applied to devices that need to recognize user intentions, including smart voice devices such as smart speakers; repeated details are not described here again.
As shown in fig. 2, in step 201, a sentence vector encoder and a classifier trained based on the method described in any of the foregoing embodiments predict the intention of a text to be detected in real time;
in step 202, the text to be detected is converted into a sentence vector through the trained sentence vector encoder, and the obtained sentence vector is then spliced with the sentence vectors of the intention categories one by one;
in step 203, the spliced vectors are sent to the classifier to determine whether the vectors are similar.
In the method of this embodiment, the text to be detected is processed by the sentence vector encoder and the classifier trained in the previous embodiment: the sentence vector encoder converts the text to be detected into a sentence vector, which is spliced with the sentence vectors of the intention categories, and the spliced vectors are then sent to the classifier to judge whether they are similar. Every intention category judged similar is recognized, so multiple intentions can be identified.
In some optional embodiments, the sentence vectors of the intention categories are obtained by converting all intention categories into sentence vectors through the sentence vector encoder. Because the parameters of the sentence vector encoder are fixed after training, this conversion can be performed offline, avoiding excessive consumption of real-time resources.
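A sketch of how this offline conversion might be done, reusing the hypothetical BiLSTMSentenceEncoder from the earlier sketch; the tokenize helper (text to a tensor of token ids) is an assumed utility, not something defined by the patent.

```python
import torch

@torch.no_grad()
def precompute_intent_vectors(encoder, intention_category_names, tokenize):
    """Run once offline after training: the encoder parameters are fixed,
    so the sentence vectors of all intention categories can be cached."""
    encoder.eval()
    cache = {}
    for name in intention_category_names:
        token_ids = tokenize(name).unsqueeze(0)      # (1, seq_len)
        cache[name] = encoder(token_ids).squeeze(0)  # (sentence_dim,)
    return cache
```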
The following description is provided to enable those skilled in the art to better understand the present disclosure by describing some of the problems encountered by the inventors in implementing the present disclosure and by describing one particular embodiment of the finally identified solution.
The inventors found in the course of implementing the present application that the above drawbacks in the related art are mainly caused by the following: the multi-classification-based and few-shot-learning-based intention recognition methods rest on an inappropriate assumption, while the one-vs-all multi-label-classification-based intention recognition method does not make full use of the information in the data set and of external data.
In view of the above drawbacks of the prior art, those skilled in the art would readily conceive of the following solutions:
1. Increase the amount of labeled data. This is the most commonly used approach; it is simple but labor- and time-consuming.
2. Apply the idea of few-shot-learning-based intention recognition to the one-vs-all multi-label-classification-based intention recognition method, so as to improve recognition accuracy when labeled data are scarce. However, the current mainstream few-shot-learning-based intention recognition methods do not match the implementation form of the one-vs-all multi-label classification method, so the technical difficulty is high.
3. Fine-tune a pretrained model such as BERT (Bidirectional Encoder Representations from Transformers), which is trained on massive unlabeled data. However, BERT has a huge number of parameters and high computational resource requirements, so the deployment cost is high in a dialogue system with strict real-time requirements.
The most intuitive way to solve the intention recognition problem is to draw on the research results of the text classification task in the field of natural language understanding. Sentence similarity calculation is usually studied as another independent task in natural language understanding, namely similar-sentence matching, and applying it to the intention recognition problem is not readily conceivable.
In a specific example of an embodiment of the present application, a set of representative sentences is prepared for each intention category, and whether the user expressed an intention is determined by calculating the similarity between the sentence spoken by the user and the representative sentences. For example, for "I want to send an express, but I want to check the delivery cost first", the similarity to the representative sentences of "sending an express" and "inquiring the delivery cost" is high, so the sentence contains those two intentions, while the similarity to the representative sentences of other categories is low, so no other intentions are contained. Assuming there are k intention categories in total and each category has b representative sentences, this method converts one labeled text into about k·b sentence-pair samples, which both adds a large number of labeled samples and makes full use of the relational information between the texts in the data set.
The representative sentences are written by humans based on their understanding, which introduces external knowledge.
Referring to fig. 3, a flow chart of the training phase is shown.
Fig. 3 depicts the whole training phase; the main flow can be divided into the following steps:
1. The original intention labeling data is converted into sentence pair data of the form (target sentence, representative sentence, 0/1), where 1 denotes that the target sentence and the representative sentence are similar (called a positive sample) and 0 denotes that they are dissimilar (called a negative sample). The target sentence is a sentence in the original intention labeling data, and representative sentences come in three types:
1) Intention category names, such as "sending an express" and "inquiring the delivery cost".
2) Manually written sentences expressing the intention, such as "I want to send an express" and "I want to place an order".
3) Sentences randomly sampled from the intention labeling data; for each intention category, sentences containing only that intention are selected as representative sentences as far as possible.
After the representative sentences of each category are prepared, each piece of intention labeling data is paired one by one with the representative sentences of the intention categories it contains to form positive samples, and with the representative sentences of the intention categories it does not contain to form negative samples. Assume two intention categories, A and B, each with two representative sentences, as shown in Table 1. With two pieces of intention labeling data, q1 and q2, where q1 contains the intentions A and B and q2 contains only B, the sentence pair data in Table 2 can be constructed.
TABLE 1 Intention categories and representative sentences

Intention category | Representative sentences |
---|---|
A | a1, a2 |
B | b1, b2 |
Table 2 Example of sentence pair data

Sentence pair | Similar or not |
---|---|
q1, a1 | 1 |
q1, a2 | 1 |
q1, b1 | 1 |
q1, b2 | 1 |
q2, a1 | 0 |
q2, a2 | 0 |
q2, b1 | 1 |
q2, b2 | 1 |
After the sentence pair data is generated, the target sentence and the representative sentence are encoded with a bidirectional long short-term memory network (BiLSTM) and converted into two d-dimensional sentence vectors, namely the target sentence vector and the representative sentence vector.
Finally, the target sentence vector and the representative sentence vector are spliced, a classifier judges whether the sentence pair is similar, and the training of the sentence vector encoder and the classifier is completed using a back propagation algorithm.
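A minimal training-step sketch under the same assumptions as the model sketches above, using binary cross-entropy on the 0/1 similarity label; batching, padding, and the optimizer choice are illustrative and not specified by the patent.

```python
import torch
import torch.nn as nn

def train_step(encoder, classifier, optimizer, target_ids, rep_ids, labels):
    """One step on a batch of sentence pairs.
    target_ids, rep_ids: (batch, seq_len) token ids; labels: (batch,) floats in {0, 1}."""
    encoder.train(); classifier.train()
    target_vec = encoder(target_ids)             # target sentence vectors
    rep_vec = encoder(rep_ids)                   # representative sentence vectors (shared encoder)
    logits = classifier(target_vec, rep_vec)     # splicing + similarity judgment
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()                              # back propagation through classifier and encoder
    optimizer.step()
    return loss.item()
```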
Referring to fig. 4, which shows the inference phase flow provided by an embodiment of the present application, the main flow can be divided into the following steps:
1. all intention class names are converted into sentence vectors through the sentence vector encoder, and since parameters of the sentence vector encoder are fixed after training is completed, the step can be completed off line.
2. The intention of the sentence spoken by the user is predicted in real time: the text to be predicted is converted into a sentence vector through the sentence vector encoder, the obtained sentence vector is spliced one by one with the pre-computed sentence vectors of all intention category names, the spliced vectors are sent to the classifier to judge whether they are similar, and the recognized intention categories are output according to the judgment results.
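A sketch of the real-time prediction step, combining the cached intention-category vectors from the offline sketch with the encoded user sentence; the 0.5 sigmoid threshold and the helper names are assumptions for illustration.

```python
import torch

@torch.no_grad()
def predict_intentions(encoder, classifier, intent_vector_cache, text, tokenize, threshold=0.5):
    """Returns every intention category whose representative vector the classifier
    judges similar to the sentence vector of the input text."""
    encoder.eval(); classifier.eval()
    target_vec = encoder(tokenize(text).unsqueeze(0))              # (1, sentence_dim)
    recognized = []
    for intention, rep_vec in intent_vector_cache.items():
        score = torch.sigmoid(classifier(target_vec, rep_vec.unsqueeze(0)))
        if score.item() >= threshold:
            recognized.append(intention)
    return recognized
```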
Verification experiment
The experimental data come from a customer service dialogue system in the express-delivery field and comprise 48 intentions in total, with a training set of 38452 samples and a test set of 6025 samples. The one-vs-all multi-label-classification-based intention recognition method is adopted as the baseline. The sentence vector encoder in both the baseline method and the proposed method uses one BiLSTM layer, the classifier uses a two-layer fully connected neural network, and the F1 score is used as the evaluation metric. The results on the test set are as follows:
TABLE 3 Performance comparison

Method | F1 | Inference speed |
---|---|---|
Baseline method | 89.39 | 22 ms/text |
Text-similarity-based method | 89.81 | 34 ms/text |
From the experimental results, it can be seen that, owing to the design of the new training task, the recognition performance of the text-similarity-based method is improved over the baseline method. Because the sentence vector of the input text has to be computed in real time in the inference stage, the overall inference speed drops only slightly, which still meets the real-time requirement of intention recognition in a dialogue system.
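The description does not state how the F1 score was aggregated over the 48 intentions; purely as an illustration, a micro-averaged F1 over multi-label predictions could be computed as sketched below.

```python
def micro_f1(gold_sets, pred_sets):
    """gold_sets, pred_sets: parallel lists of intention-label sets, one per utterance."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_sets, pred_sets):
        tp += len(gold & pred)   # correctly recognized intentions
        fp += len(pred - gold)   # spurious intentions
        fn += len(gold - pred)   # missed intentions
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Example: micro_f1([{"A", "B"}, {"B"}], [{"A"}, {"B"}]) == 0.8
```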
Referring to fig. 5, a block diagram of a multi-intent recognition training apparatus according to an embodiment of the invention is shown.
As shown in fig. 5, the multi-intention recognition training apparatus 500 includes a data conversion module 510, a sentence vector encoding module 520, and a similarity judgment training module 530.
The data conversion module 510 is configured to convert original intention labeling data into sentence pair data, where the sentence pair data at least includes sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence; the sentence vector encoding module 520 is configured to encode the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence; and the similarity judgment training module 530 is configured to splice the target sentence vector and the representative sentence vector, input the spliced vector into a classifier to judge whether the sentence pair is similar, and train the sentence vector encoder and the classifier.
In some optional embodiments, the similarity judgment training module is further configured to: compose positive samples from each piece of intention labeling data and the representative sentences of the intention categories it contains, and compose negative samples from each piece of intention labeling data and the representative sentences of the intention categories it does not contain; and train the sentence vector encoder and the classifier based on the positive samples and the negative samples.
Referring to fig. 6, a block diagram of a multi-intention recognition using apparatus according to an embodiment of the invention is shown.
As shown in fig. 6, the multi-intention recognition using apparatus 600 includes a real-time prediction module 610, a conversion and splicing module 620, and a classification module 630.
The real-time prediction module 610 is configured to predict the intention of a text to be detected in real time based on a sentence vector encoder and a classifier trained by the method of any one of claims 1-3; the conversion and splicing module 620 is configured to convert the text to be detected into a sentence vector through the trained sentence vector encoder and then splice the obtained sentence vector with the sentence vectors of the intention categories one by one; and the classification module 630 is configured to send the spliced vectors to the classifier to judge whether they are similar.
It should be understood that the modules depicted in fig. 5 and 6 correspond to various steps in the methods described with reference to fig. 1 and 2. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 5 and 6, and are not described again here.
It should be noted that the modules in the embodiments of the present application do not limit the scheme of the present application; for example, the feature extraction module may be described as a module that, in response to a received text sequence, performs feature extraction on the text sequence to obtain a text feature sequence. In addition, the related functional modules may also be implemented by a hardware processor; for example, the word segmentation module may also be implemented by a processor, which will not be described here again.
In other embodiments, the present invention further provides a non-transitory computer storage medium storing computer-executable instructions that can perform the multi-intent recognition training and using method of any of the above method embodiments;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
converting original intention labeling data into sentence pair data, wherein the sentence pair data at least comprises sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence;
encoding the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence;
and after splicing the target sentence vector and the representative sentence vector, inputting the spliced vector into a classifier to judge whether the sentence pair is similar, and training the sentence vector encoder and the classifier.
The non-volatile computer-readable storage medium may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created from the use of the multi-intention recognition training and using device, and the like. Further, the non-volatile computer-readable storage medium may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device. In some embodiments, the non-volatile computer-readable storage medium optionally includes memory located remotely from the processor, which may be connected over a network to the multi-intention recognition training and using device. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the above methods of multi-intent recognition training and use.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 7, the electronic device includes one or more processors 710 and a memory 720, with one processor 710 illustrated in fig. 7. The device for the multi-intention recognition training and using method may further include an input device 730 and an output device 740. The processor 710, the memory 720, the input device 730, and the output device 740 may be connected by a bus or in other ways; connection by a bus is illustrated in fig. 7. The memory 720 is a non-volatile computer-readable storage medium as described above. The processor 710 executes various functional applications and data processing of the server by running the non-volatile software programs, instructions, and modules stored in the memory 720, i.e., implements the multi-intention recognition training and using method of the above method embodiments. The input device 730 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the multi-intention recognition training and using device. The output device 740 may include a display device such as a display screen.
The above product can execute the method provided by the embodiments of the present invention, and has the corresponding functional modules and beneficial effects of executing the method. For technical details not described in detail in this embodiment, reference may be made to the method provided by the embodiments of the present invention.
As an embodiment, the electronic device is applied to a multi-intent recognition training and using device, and includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:
converting original intention labeling data into sentence pair data, wherein the sentence pair data at least comprises sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence;
encoding the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence;
and after splicing the target sentence vector and the representative sentence vector, inputting the spliced vector into a classifier to judge whether the sentence pair is similar, and training the sentence vector encoder and the classifier.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterized by mobile communication capabilities and are mainly aimed at providing voice and data communication. Such terminals include smart phones (e.g., the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.
(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players (e.g., the iPod), handheld game consoles, electronic books, smart toys, and portable car navigation devices.
(4) Servers: similar in architecture to general-purpose computers, but with higher requirements on processing capability, stability, reliability, security, scalability, manageability, and the like, because highly reliable services must be provided.
(5) And other electronic devices with data interaction functions.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A multi-intent recognition training method, comprising:
converting original intention labeling data into sentence pair data, wherein the sentence pair data at least comprises sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence;
encoding the target sentence and the representative sentence with a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence;
and after splicing the target sentence vector and the representative sentence vector, inputting the spliced vector into a classifier to judge whether the sentence pair is similar, and training the sentence vector encoder and the classifier.
2. The method of claim 1, wherein the training of the sentence vector encoder and the classifier comprises:
composing positive samples from each piece of intention labeling data and the representative sentences of the intention categories it contains, and composing negative samples from each piece of intention labeling data and the representative sentences of the intention categories it does not contain;
training the sentence vector encoder and the classifier based on the positive samples and the negative samples.
3. The method of claim 1 or 2, wherein the sentence vector encoder comprises a bidirectional long short-term memory (BiLSTM) network sentence vector encoder;
the training of the sentence vector encoder and the classifier comprises completing the training of the sentence vector encoder and the classifier using a back propagation algorithm.
4. A multi-intention recognition using method, comprising:
predicting the intention of a text to be detected in real time based on a sentence vector encoder and a classifier trained by the method of any one of claims 1-3;
converting the text to be detected into a sentence vector through the trained sentence vector encoder, and splicing the obtained sentence vector with the sentence vectors of the intention categories one by one;
and sending the spliced vectors to the classifier to judge whether the vectors are similar.
5. The method of claim 4, wherein the sentence vectors of the intention categories comprise sentence vectors obtained by:
all intention categories are converted into sentence vectors by the sentence vector encoder.
6. A multi-intent recognition training device, comprising:
a data conversion module configured to convert original intention labeling data into sentence pair data, wherein the sentence pair data at least includes sentence pairs, each consisting of a target sentence from the original intention labeling data and a representative sentence of an intention category contained in the intention labeling data, together with the similarity between the target sentence and the representative sentence;
a sentence vector encoding module configured to encode the target sentence and the representative sentence by a sentence vector encoder to form a target sentence vector corresponding to the target sentence and a representative sentence vector corresponding to the representative sentence;
and a similarity judgment training module configured to splice the target sentence vector and the representative sentence vector, input the spliced vector into a classifier to judge whether the sentence pair is similar, and train the sentence vector encoder and the classifier.
7. The apparatus of claim 6, wherein the similarity determination training module is further configured to:
compose positive samples from each piece of intention labeling data and the representative sentences of the intention categories it contains, and compose negative samples from each piece of intention labeling data and the representative sentences of the intention categories it does not contain;
train the sentence vector encoder and the classifier based on the positive samples and the negative samples.
8. A multi-intention recognition using apparatus, comprising:
a real-time prediction module configured to predict an intention of a text to be detected in real time based on a sentence vector encoder and a classifier trained by the method of any one of claims 1-3;
a conversion and splicing module configured to convert the text to be detected into a sentence vector through the trained sentence vector encoder and then splice the obtained sentence vector with the sentence vectors of the intention categories one by one;
and the classification module is configured to send the spliced vectors to the classifier to judge whether the vectors are similar.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1 to 5.
10. A storage medium having stored thereon a computer program, characterized in that the program, when being executed by a processor, is adapted to carry out the steps of the method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911421640.9A CN111159358A (en) | 2019-12-31 | 2019-12-31 | Multi-intention recognition training and using method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911421640.9A CN111159358A (en) | 2019-12-31 | 2019-12-31 | Multi-intention recognition training and using method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111159358A true CN111159358A (en) | 2020-05-15 |
Family
ID=70560674
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911421640.9A Withdrawn CN111159358A (en) | 2019-12-31 | 2019-12-31 | Multi-intention recognition training and using method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111159358A (en) |
-
2019
- 2019-12-31 CN CN201911421640.9A patent/CN111159358A/en not_active Withdrawn
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111666755A (en) * | 2020-06-24 | 2020-09-15 | 深圳前海微众银行股份有限公司 | Method and device for recognizing repeated sentences |
CN111738017A (en) * | 2020-06-24 | 2020-10-02 | 深圳前海微众银行股份有限公司 | Intention identification method, device, equipment and storage medium |
CN112347760A (en) * | 2020-11-16 | 2021-02-09 | 北京京东尚科信息技术有限公司 | Method and device for training intention recognition model and method and device for recognizing intention |
CN112541079A (en) * | 2020-12-10 | 2021-03-23 | 杭州远传新业科技有限公司 | Multi-intention recognition method, device, equipment and medium |
CN112507704A (en) * | 2020-12-15 | 2021-03-16 | 中国联合网络通信集团有限公司 | Multi-intention recognition method, device, equipment and storage medium |
CN112507704B (en) * | 2020-12-15 | 2023-10-03 | 中国联合网络通信集团有限公司 | Multi-intention recognition method, device, equipment and storage medium |
CN112765356A (en) * | 2021-01-29 | 2021-05-07 | 苏州思必驰信息科技有限公司 | Training method and system of multi-intention recognition model |
CN114357973A (en) * | 2021-12-10 | 2022-04-15 | 马上消费金融股份有限公司 | Intention recognition method and device, electronic equipment and storage medium |
CN115359786A (en) * | 2022-08-19 | 2022-11-18 | 思必驰科技股份有限公司 | Multi-intention semantic understanding model training and using method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112685565B (en) | Text classification method based on multi-mode information fusion and related equipment thereof | |
CN111159358A (en) | Multi-intention recognition training and using method and device | |
CN112100349B (en) | Multi-round dialogue method and device, electronic equipment and storage medium | |
CN110516253B (en) | Chinese spoken language semantic understanding method and system | |
CN112732911A (en) | Semantic recognition-based conversational recommendation method, device, equipment and storage medium | |
JP6677419B2 (en) | Voice interaction method and apparatus | |
CN114357973B (en) | Intention recognition method and device, electronic equipment and storage medium | |
US11947920B2 (en) | Man-machine dialogue method and system, computer device and medium | |
CN111081280B (en) | Text-independent speech emotion recognition method and device and emotion recognition algorithm model generation method | |
WO2021134417A1 (en) | Interactive behavior prediction method, intelligent device, and computer readable storage medium | |
CN112541068A (en) | Method, system, device and storage medium for recognizing intention of voice conversation | |
CN112632244A (en) | Man-machine conversation optimization method and device, computer equipment and storage medium | |
CN113505198A (en) | Keyword-driven generating type dialogue reply method and device and electronic equipment | |
CN113342948A (en) | Intelligent question and answer method and device | |
CN110597958B (en) | Text classification model training and using method and device | |
CN112632248A (en) | Question answering method, device, computer equipment and storage medium | |
CN114817478A (en) | Text-based question and answer method and device, computer equipment and storage medium | |
CN116522905B (en) | Text error correction method, apparatus, device, readable storage medium, and program product | |
CN112765356B (en) | Training method and system of multi-intention recognition model | |
CN111680514A (en) | Information processing and model training method, device, equipment and storage medium | |
CN116051151A (en) | Customer portrait determining method and system based on machine reading understanding and electronic equipment | |
CN110399615B (en) | Transaction risk monitoring method and device | |
CN115168544A (en) | Information extraction method, electronic device and storage medium | |
CN115222047A (en) | Model training method, device, equipment and storage medium | |
CN114357164A (en) | Emotion-reason pair extraction method, device and equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
CB02 | Change of applicant information |
Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province Applicant after: Sipic Technology Co.,Ltd. Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province Applicant before: AI SPEECH Ltd. |
WW01 | Invention patent application withdrawn after publication | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20200515 |