CN113221644A - Slot position word recognition method and device, storage medium and electronic equipment - Google Patents

Slot position word recognition method and device, storage medium and electronic equipment

Info

Publication number
CN113221644A
Authority
CN
China
Prior art keywords
word
text
feature vector
slot position
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110367820.4A
Other languages
Chinese (zh)
Inventor
薛闯
陈子鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Yuanguang Mobile Interconnection Technology Co ltd
Yuanguang Software Co Ltd
Original Assignee
Zhuhai Yuanguang Mobile Interconnection Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Yuanguang Mobile Interconnection Technology Co., Ltd.
Priority to CN202110367820.4A
Publication of CN113221644A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the application discloses a slot word recognition method and device, a storage medium and an electronic device, belonging to the field of computer technology. The method comprises the following steps: marking a preset slot word in a first text; obtaining a second text and calculating a feature vector for each character in the second text; performing word segmentation on the second text to obtain a plurality of words; calculating the feature vector of each word from the feature vectors of the characters it contains; calculating the similarity between the feature vector of each word and the feature vector of the preset slot word; and identifying the word with the greatest similarity as the target slot word in the second text. With this method, slot words can be identified without retraining a neural network model on a large amount of sample data, which reduces the complexity of recognition and improves its efficiency.

Description

Slot position word recognition method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of computers, and in particular, to a method and an apparatus for identifying slot position words, a storage medium, and an electronic device.
Background
During a human-machine conversation, the machine needs to understand the meaning of what the user says. At present, the semantic information of a user utterance is generally represented by a structured representation of intents and slots, so accurately extracting the slot words in an utterance is a focus of current research. In the related art, slot information in a conversation is generally identified with a deep-learning-based method: a neural network model is trained on a large amount of sample data, and the trained model is then used to identify the slot information. The slot word extraction methods in the related art therefore have the following problems: a neural network model trained with insufficient sample data has a poor recognition effect; the model can only recognize slot types specified in advance; and whenever new slot information is added, a new neural network model must be trained in advance, so a great deal of time is consumed in training.
Disclosure of Invention
The slot word recognition method and device, storage medium and electronic device provided by the embodiments of the application can solve the problems that, when sample data is lacking, a trained neural network model recognizes slot words poorly and training the neural network model is time-consuming. The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a method for identifying a slot position word, where the method includes:
marking a preset slot position word in the first text;
acquiring a second text and calculating a feature vector of each character in the second text;
performing word segmentation processing on the second text to obtain a plurality of words;
calculating the feature vector of each word according to the feature vector of the character contained in the word;
similarity calculation is carried out on the feature vector of each word and the feature vector of the preset slot position word;
and identifying the word with the maximum similarity as the target slot position word in the second text.
In a second aspect, an embodiment of the present application provides an apparatus for identifying a slot position word, where the apparatus includes:
the marking unit is used for marking a preset slot position word in the first text;
the calculation unit is used for acquiring a second text and calculating a feature vector of each character in the second text;
the word segmentation unit is used for carrying out word segmentation processing on the second text to obtain a plurality of words;
the calculation unit is further used for calculating the feature vector of each word according to the feature vector of the character contained in the word;
the calculation unit is further configured to perform similarity calculation on the feature vectors of the words and the feature vectors of the preset slot position words;
and the identification unit is used for identifying the word with the maximum similarity as the target slot position word in the second text.
In a third aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides an electronic device, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
when a new slot word needs to be identified, a reference text is set and a preset slot word is marked in it; the feature vector of each character in the text to be recognized is calculated according to context information; the text to be recognized is segmented into a plurality of words; similarity is calculated between the feature vector of each word and the feature vector of the preset slot word; and the word with the greatest similarity is taken as the target slot word in the text to be recognized. This solves the problem that, in the related art, a complicated and tedious model-training process and a large amount of sample data are needed before a new slot word can be recognized. The recognition method of the application does not depend on a model-training process and consumes little time; only a reference text with the preset slot word marked needs to be provided in advance, so the required amount of data is small. The slot word recognition method therefore has the advantages of a simple recognition process and high efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic interface diagram of a human-machine conversation provided by an embodiment of the present application;
fig. 2 is a schematic flowchart of a slot position word recognition method according to an embodiment of the present application;
fig. 3 is another schematic flow chart of a slot position word identification method provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of computing feature vectors of characters according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a slot position word recognition device provided in the present application;
fig. 6 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a schematic diagram of an interface for a user to perform a human-computer conversation.
The embodiment of the application provides a slot word recognition method, which can be applied to an electronic device. The electronic device may be a smartphone, a tablet computer, a gaming device, an AR (Augmented Reality) device, an automobile, a data storage device, an audio or video playback device, a notebook, a desktop computing device, or a wearable device such as an electronic watch, electronic glasses, an electronic helmet, an electronic bracelet, an electronic necklace or electronic clothing. The electronic device may obtain the text to be recognized through an input device, for example: the electronic device collects voice data uttered by the user through a voice collection device and converts the voice data into the text to be recognized, or the electronic device obtains the text to be recognized input by the user through a keyboard.
The following description will be made by taking an example in which the electronic device performs a man-machine conversation through the voice collecting apparatus:
when the display screen of the electronic device is off, the electronic device collects the voice uttered by the user 100 and converts it into voice data, extracts acoustic features from the voice data, and inputs the acoustic features into a voiceprint model. The voiceprint model performs voiceprint wake-up on the acoustic features to identify whether the user 100 is a preset user. If so, the device further judges whether the voice data contains a wake-up word; if it does, the voice control function is switched to the activated state, the activated state is kept for a preset duration, and the display screen is switched on.
In the activated state, the electronic device 101 may receive a control voice from the user 100, convert it into a control instruction, and then perform the corresponding operation, for example: calling contact XX, querying the weather, playing music, starting an application and so on. The electronic device 101 may also convert the voice data into text data and display it on the display screen.
The electronic device 101 may set a wake-up word, for example "XX Genie". The user 100 utters a segment of speech; the electronic device collects it to obtain voice data and converts the voice data into the text data "XX Genie". The electronic device determines that the text data contains the preset wake-up word, then extracts the acoustic features of the voice data, inputs them into the voiceprint model, and identifies from the acoustic features that the user 100 who uttered the speech is a preset user. It then activates the voice control function and displays the voice control interface 102, which contains a microphone icon 103. While the voice control function is in the activated state, the microphone icon 103 switches from a static to a dynamic display state; the dynamically displayed icon 103 prompts the user that the voice control function of the electronic device is active. The electronic device 101 keeps the voice control function activated for a preset duration; after the preset duration elapses, it switches the voice control function from the activated state to the dormant state, switches the display screen off, and displays the microphone icon 103 statically again. In the screen-off state, if the user 100 needs the voice control function, it must be reactivated in the manner described above; if the electronic device 101 is in the screen-on state, the user 100 may tap the microphone icon 103 to switch the voice control function to the activated state.
The electronic device 101 may also have various communication client applications installed on it, such as voice interaction applications, video recording applications, search applications, instant messaging tools, mailbox clients and social platform software.
The electronic device 101 may be hardware or software. When it is hardware, it may be any of various electronic devices having a display screen, including but not limited to smartphones, tablet computers, laptop portable computers and desktop computers. When it is software, it may be installed in the electronic devices listed above and implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not particularly limited herein.
When the electronic device 101 is hardware, a display device may further be installed on it. The display device may be any device capable of implementing a display function, such as a cathode ray tube display (CRT), a light-emitting diode display (LED), an electronic ink panel, a liquid crystal display (LCD) or a plasma display panel (PDP). A user may use the display device on the electronic device 101 to view displayed information such as text, pictures and videos.
The slot word recognition method provided by the embodiments of the present application will be described in detail below with reference to fig. 2 to 3. The slot word recognition device in the embodiments of the present application may be the electronic device described above.
Please refer to fig. 2, which is a flowchart illustrating a slot position word recognition method according to an embodiment of the present disclosure. As shown in fig. 2, the method of the embodiment of the present application may include the steps of:
s201, marking a preset slot position word in the first text.
The recognition device sets a first text, which may be set according to the actual scene, and the user marks a preset slot word in the first text. The first text comprises a plurality of characters, and a slot word is composed of one or more characters (tokens). The type of the preset slot word may be a place name, a date, the weather, a time and so on, which is not limited in this application. The user can mark a preset slot word by adding marks in the first text, and the number of preset slot words marked by the user may be one or more.
In a possible implementation, the marking a preset slot word in the first text includes:
generating a first text according to an actual scene;
adding a slot position word start mark and a slot position word end mark in the first text based on the marking operation of a user;
and determining a preset slot position word based on the characters corresponding to the slot position word start mark and the slot position word end mark.
The slot word start mark and the slot word end mark may be different marks. The electronic device determines the characters between the two marks to be the preset slot word, and does not need to identify or detect the type of the slot word.
For example, the actual scene is booking a ticket, and the first text generated according to this scene is: "Help me book a train ticket to Beijing". The user marks the preset slot word in the first text by adding a slot word start mark and a slot word end mark, for example: the start mark is "{" and the end mark is "}", i.e. the preset slot word in the first text is labeled with "{ }". The first text with the preset slot word marked is then represented as: "Help me book a train ticket to {Beijing}".
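The marking scheme of S201 can be sketched in a few lines of Python; the function name and the "{ }" marker characters below follow the example above, but everything else is an illustrative assumption rather than part of the disclosure:

```python
def extract_slot_words(marked_text, start_mark="{", end_mark="}"):
    """Strip the marks from a marked first text and collect the
    character spans enclosed by each start/end mark pair."""
    slot_words, plain, buf = [], [], None
    for ch in marked_text:
        if ch == start_mark:
            buf = []                      # a slot word span begins
        elif ch == end_mark:
            if buf is not None:
                slot_words.append("".join(buf))  # span complete
                buf = None
        else:
            if buf is not None:
                buf.append(ch)            # character inside the span
            plain.append(ch)              # all non-mark characters
    return "".join(plain), slot_words

text, slots = extract_slot_words("Help me book a train ticket to {Beijing}")
# text == "Help me book a train ticket to Beijing", slots == ["Beijing"]
```

Multiple marked spans are supported, matching the statement that the user may mark one or more preset slot words.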
S202, acquiring a second text and calculating a feature vector of each character in the second text.
The electronic device obtains the second text whose slot word is to be identified. The electronic device may obtain the second text by receiving text input by the user through a keyboard, or by collecting the user's voice data through a microphone and converting the voice data into text. The second text is composed of a plurality of characters, and the electronic device then calculates a feature vector for each character in the second text. The feature vectors can be calculated using a semantic representation model (e.g. BERT, Bidirectional Encoder Representations from Transformers) together with a basic semantic model.
For example, the second text to be recognized is: "Book a ticket to Shanghai" (nine characters in the original Chinese). The electronic device calculates the feature vectors of the characters in the second text as y1, y2, y3, y4, y5, y6, y7, y8 and y9.
Further, calculating a feature vector of each character in the second text, including:
processing the first text through a basic semantic model to obtain a first basic semantic feature vector;
processing the second text through the basic semantic model to obtain a second basic semantic feature vector;
and processing the first basic semantic feature vector, the second basic semantic feature vector, the first text and the second text through a semantic representation model to obtain feature vectors of all characters in the second text.
The basic semantic model may be a syntactic parsing model, a dependency parsing model or the like, and is used to distinguish the features of two words in the text that look similar but have different semantics. The semantic representation model is used to extract the context information of each character and may be BERT, ERNIE or the like. According to this calculation method, the feature vector of each character carries context information, which lays a good foundation for subsequently identifying the slot word accurately.
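The two-stage computation described above (a basic semantic model followed by a semantic representation model) can be illustrated with simple stand-ins. The deterministic per-character embedding below is a hypothetical placeholder for the real models (a parser-informed model and BERT/ERNIE); it is only meant to show how context enters each character's vector:

```python
def base_semantic_vectors(text, dim=8):
    # Stand-in for the basic semantic model: a deterministic
    # per-character vector (hypothetical; a real system would use a
    # parser-informed model to separate look-alike words).
    return [[((ord(ch) * (k + 3)) % 17) / 17.0 for k in range(dim)]
            for ch in text]

def contextual_vectors(base_vecs):
    # Stand-in for the semantic representation model (e.g. BERT/ERNIE):
    # average each character's base vector with its neighbours so the
    # resulting vector carries context information.
    n, dim = len(base_vecs), len(base_vecs[0])
    out = []
    for i in range(n):
        window = [base_vecs[j] for j in (i - 1, i, i + 1) if 0 <= j < n]
        out.append([sum(v[k] for v in window) / len(window)
                    for k in range(dim)])
    return out

# The same character gets different vectors in different contexts:
a = contextual_vectors(base_semantic_vectors("ab"))
b = contextual_vectors(base_semantic_vectors("cb"))
# a[1] != b[1] even though both positions hold the character "b"
```

The point carried over from the patent is only the data flow: per-character base vectors go in, context-aware per-character vectors come out.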
And S203, performing word segmentation processing on the second text to obtain a plurality of words.
The electronic device performs word segmentation on the second text to obtain a plurality of words. The word segmentation algorithm may be a maximum matching algorithm, a shortest-path word segmentation algorithm, or any other self-defined word segmentation algorithm. For example, segmenting the second text obtained in S202 yields 6 words: "book", "a", "to", "Shanghai", "de" (的) and "ticket".
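The forward maximum matching algorithm mentioned above can be sketched as follows; the toy vocabulary is an illustrative assumption, and on the running Chinese sentence ("Book a ticket to Shanghai") it reproduces the six-word segmentation:

```python
def forward_max_match(text, vocab, max_len=4):
    # Greedy forward maximum matching: at each position take the
    # longest dictionary word; fall back to a single character.
    words, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            if length == 1 or text[i:i + length] in vocab:
                words.append(text[i:i + length])
                i += length
                break
    return words

vocab = {"上海", "机票", "一张"}  # toy dictionary for the example
print(forward_max_match("订一张去上海的机票", vocab))
# ['订', '一张', '去', '上海', '的', '机票']
```

The reverse maximum matching variant would scan right to left with the same fallback rule.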
And S204, calculating the feature vector of each word according to the feature vector of the character contained in each word.
The electronic device determines the characters included in each word obtained by the word segmentation in S203. Since the feature vectors of the characters have already been calculated in S202, the feature vector of a word can be calculated from the feature vectors of the characters it contains, for example using a weighted average, an arithmetic average or another algorithm.
For example, following the example of S203, the feature vectors of the 6 words are calculated as M1, M2, M3, M4, M5 and M6. The word "a" (一张) contains two characters whose feature vectors are y2 and y3, so its feature vector M2 is obtained as their average: M2 = (y2 + y3) / 2.
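The per-word averaging of S204 is a one-liner; this sketch uses the arithmetic average from the example above (a weighted variant would multiply each character vector by a weight before summing), with hypothetical character vectors:

```python
def word_vector(char_vectors):
    # Feature vector of a word = arithmetic mean of the feature
    # vectors of the characters the word contains.
    dim = len(char_vectors[0])
    n = len(char_vectors)
    return [sum(v[k] for v in char_vectors) / n for k in range(dim)]

# Hypothetical character vectors y2 and y3 for the word "一张":
y2, y3 = [1.0, 0.0], [0.0, 1.0]
m2 = word_vector([y2, y3])
# m2 == [0.5, 0.5], i.e. M2 = (y2 + y3) / 2
```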
And S205, carrying out similarity calculation on the feature vector of each word and the feature vector of the preset slot position word.
The feature vector of the preset slot word in the first text is determined in the same way, i.e. based on the semantic representation model, and similarity is then calculated between the feature vector of each word obtained in S204 and the feature vector of the preset slot word. Following the example of S204, similarity is calculated between the feature vectors of the 6 words and the feature vector of the preset slot word "Beijing", yielding 6 similarities. Similarity algorithms include, but are not limited to: Euclidean distance, cosine similarity, the Pearson correlation coefficient, or any other similarity algorithm.
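Using cosine similarity (one of the algorithms listed), the scoring of S205 and the selection of S206 can be sketched together; the 2-d vectors below are fabricated for illustration only:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_slot_word(words, word_vecs, slot_vec):
    # S205/S206: score every candidate word against the preset slot
    # word's feature vector and return the most similar word.
    sims = [cosine_similarity(v, slot_vec) for v in word_vecs]
    best = max(range(len(words)), key=lambda i: sims[i])
    return words[best], sims[best]

words = ["book", "a", "to", "Shanghai", "de", "ticket"]
# Hypothetical 2-d feature vectors for the candidates and for "Beijing":
vecs = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.9, 0.2], [0.1, 0.6], [0.5, 0.5]]
beijing = [1.0, 0.1]
word, score = best_slot_word(words, vecs, beijing)
# word == "Shanghai" (the city name is closest to the slot word "Beijing")
```

Euclidean distance or the Pearson correlation coefficient could be substituted for `cosine_similarity` without changing the selection logic.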
And S206, identifying the word with the maximum similarity as the target slot word in the second text.
The greater the similarity, the more similar the semantics of the two words. The word with the greatest of the similarities calculated in S205 is determined and taken as the target slot word in the second text. For example, the similarity between the feature vector of the word "Shanghai" in the second text and that of the preset slot word "Beijing" is the greatest, so "Shanghai" is the target slot word in the second text.
When the scheme of the embodiment of the application is executed and a new slot word needs to be recognized, a reference text is set and a preset slot word is marked in it; the feature vector of each character in the text to be recognized is calculated according to context information; the text to be recognized is segmented into a plurality of words; similarity is calculated between the feature vector of each word and the feature vector of the preset slot word; and the word with the greatest similarity is taken as the target slot word in the text to be recognized. This solves the problems of the related art, which requires a complex and tedious model-training process and a large amount of sample data before a new slot word can be recognized. The recognition method of the application does not depend on a model-training process and consumes little time; only a reference text with the preset slot word marked needs to be provided in advance, so the required amount of data is small. In addition, the feature vectors calculated by the application carry context information. The slot word recognition method therefore has the advantages of a simple recognition process and high efficiency.
Please refer to fig. 3, which is a flowchart illustrating a slot position word recognition method according to an embodiment of the present disclosure. The present embodiment is exemplified by applying the slot word recognition method to the electronic device. The slot position word recognition method can comprise the following steps:
s301, generating a first text according to the actual scene.
The actual scene is the same as the scene of the second text to be recognized later; actual scenes include, but are not limited to, scenarios such as booking tickets. The electronic device can generate the first text based on an input operation of the user, who may input it through a microphone or a keyboard. For example, the first text generated by the electronic device is "Help me book a China Southern Airlines ticket to Beijing".
S302, adding a slot word start mark and a slot word end mark in the first text based on the marking operation of the user.
The slot word start mark and the slot word end mark may be different marks. The electronic device determines the characters between the two marks to be the preset slot word, and does not need to identify or detect the type of the slot word.
For example, the actual scene is booking a ticket, and the first text generated according to this scene is: "Help me book a China Southern Airlines ticket to Beijing". The user marks the preset slot word in the first text by adding a slot word start mark and a slot word end mark, for example: the start mark is "[" and the end mark is "]", i.e. the preset slot word in the first text is labeled with "[ ]". The first text with the preset slot word marked is then represented as: "Help me book a [China Southern Airlines] ticket to Beijing".
And S303, determining a preset slot position word based on the characters corresponding to the slot position word start mark and the slot position word end mark.
The word composed of the one or more characters enclosed between the slot word start mark and the slot word end mark is the preset slot word; for example, the preset slot word in S302 is "China Southern Airlines".
And S304, acquiring voice data of the user through a voice acquisition unit.
The voice data may be generated when the user reads specific words, for example a particular string or number. After the user speaks, the electronic device converts the voice into an analog voice signal through an audio collection device, which may be a single microphone or a microphone array composed of multiple microphones. The electronic device then preprocesses the analog voice signal to obtain digital voice data; the preprocessing includes, but is not limited to, filtering, amplification, sampling, analog-to-digital conversion and format conversion. The voice data may be in a lossless format, such as CD, WAV (waveform file) or FLAC (Free Lossless Audio Codec).
S305, performing text conversion on the voice data to obtain a second text.
The electronic device converts the voice data into the second text to be recognized based on a speech-to-text model. Further, the electronic device can filter the collected voice data using preset noise template data, which may be generated by collecting environmental noise data of a preset duration; this can improve the accuracy of the speech-to-text conversion.
For example: the second text acquired by the electronic device is "order a national line ticket to go to shanghai tomorrow".
S306, processing the first text through the basic semantic model to obtain a first basic semantic feature vector.
The basic semantic model is used to distinguish the features of two words in the text that look similar but have different semantics, and may be a syntactic parsing model, a dependency parsing model or the like.
S307, processing the second text through the basic semantic model to obtain a second basic semantic feature vector.
S308, processing the first basic semantic feature vector, the second basic semantic feature vector, the first text and the second text through a semantic representation model to obtain feature vectors of all characters in the second text.
The semantic representation model is used to extract the context information of each character and may be BERT, ERNIE or the like. According to this calculation method, the feature vector of each character carries context information, which provides a foundation for subsequently identifying the slot word accurately.
Referring to fig. 4, which is a schematic diagram of the method for calculating the feature vector of each character in the second text: the first text is input into the basic semantic model to obtain the first basic semantic feature vector, and the second text is input into the basic semantic model to obtain the second basic semantic feature vector; then the first text, the second text, the first basic semantic feature vector and the second basic semantic feature vector are input into the semantic representation model to obtain the feature vector of each character in the second text.
For example: for example: the second text to be recognized is: and (3) ordering a state line ticket going to Shanghai, wherein the second text consists of 11 characters, and the electronic equipment calculates the feature vectors of all the characters in the second text as follows: y1, y2, y3, y4, y5, y6, y7, y8, y9, y10 and y 11.
S309, performing word segmentation processing on the second text based on a word segmentation algorithm matched with the text to obtain a plurality of words.
Text-matching-based word segmentation algorithms include: the forward maximum matching method (scanning left to right), the reverse maximum matching method (scanning right to left), the minimum segmentation method (minimizing the number of words cut from each sentence), and so on. Furthermore, a generative model may also be adopted for word segmentation; generative models mainly include the n-gram model, the hidden Markov model (HMM), naive Bayes classification, and the like, of which the n-gram and HMM models are the most widely applied to word segmentation. When solving the sequence labeling problem, the HMM model assumes two sequences: an observation sequence X, i.e., the sentence that is observed, and a hidden state sequence Y, i.e., the sequence of labels, with the causal relationship Y -> X. Therefore, to obtain the labeling result Y, the probabilities of X and Y and P(X|Y) must be calculated, i.e., a probability distribution model of P(X, Y) must be established.
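As one illustration, the forward maximum matching method named above can be sketched in a few lines; the dictionary here is hypothetical, since the patent only names the algorithm and does not give an implementation:

```python
# Minimal forward maximum matching segmenter (left-to-right greedy scan).

def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, greedily taking the longest dictionary word;
    fall back to a single character when no dictionary word matches."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if size == 1 or chunk in dictionary:
                words.append(chunk)
                i += size
                break
    return words

dictionary = {"一张", "上海", "国航", "机票"}
print(forward_max_match("订一张去上海的国航机票", dictionary))
# → ['订', '一张', '去', '上海', '的', '国航', '机票']
```

With this hypothetical dictionary the output matches the 7-word example of S309.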
For example: according to the example of S308, the second text is segmented into 7 words: 订 (book), 一张 (one), 去 (go to), 上海 (Shanghai), 的 (of), 国航 (Air China), and 机票 (air ticket).
S310, determining the feature vectors of a plurality of characters contained in the word.
S311, carrying out weighted average on the feature vectors of the characters to obtain the feature vector of the word.
The electronic device determines the characters included in each word obtained by the word segmentation processing. Since the feature vector of each character has already been calculated in S308, the present application may calculate the feature vector of a word from the feature vectors of the characters it contains, for example by a weighted average, an arithmetic average, or another algorithm.
For example, according to the above example, the feature vectors of the 7 words are calculated as M1, M2, M3, M4, M5, M6, and M7. The word "国航" (Air China) contains the two characters "国" and "航", whose feature vectors are y8 and y9 respectively. The feature vector M6 of the word is obtained by the weighted average M6 = α·y8 + β·y9, where α > 0, β > 0, and α + β = 1. Here α and β are weighting coefficients whose values can be determined according to actual needs; the present application is not limited in this respect.
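The weighted average of S311 can be sketched as follows; the character vectors and the equal weights α = β = 0.5 are hypothetical, since the patent leaves the coefficients open:

```python
# Sketch of S311: a word vector as the weighted average of its character
# vectors. Vectors and weights below are illustrative placeholders.

def weighted_average(vectors, weights):
    """Element-wise weighted sum; weights are assumed to sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)]

y8 = [1.0, 0.0, 2.0]   # hypothetical vector for the character "国"
y9 = [0.0, 2.0, 0.0]   # hypothetical vector for the character "航"
m6 = weighted_average([y8, y9], [0.5, 0.5])  # M6 = α·y8 + β·y9
print(m6)  # → [0.5, 1.0, 1.0]
```

An arithmetic average is the special case where all weights equal 1/n.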
And S312, performing similarity calculation on the feature vector of each word and the feature vector of the preset slot word.
The feature vector of the preset slot word in the first text is determined first; its calculation may likewise be based on the semantic representation model. Similarity calculation is then performed between the feature vector of each word calculated above and the feature vector of the preset slot word. According to the above example, the feature vectors of the 7 words are each compared with the feature vector of the preset slot word "南航" (China Southern Airlines), yielding 7 similarities. Similarity algorithms include, but are not limited to: Euclidean distance, cosine similarity, the Pearson correlation coefficient, or any other similarity algorithm.
And S313, identifying the word with the maximum similarity as the target slot word in the second text.
The greater the similarity, the more similar the semantics of the two words. The word with the maximum similarity is determined from the multiple similarities calculated in S312 and taken as the target slot word in the second text. For example: the similarity between the feature vector of the word "国航" (Air China) in the second text and that of the preset slot word "南航" (China Southern Airlines) is the largest, so the word "国航" is the target slot word in the second text.
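Steps S312 and S313 can be sketched with cosine similarity, one of the algorithms the text names; all feature vectors below are hypothetical:

```python
# Sketch of S312-S313: compare each word vector with the preset slot word
# vector and take the word with the maximum similarity.
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

slot_vec = [1.0, 1.0, 0.0]                # hypothetical vector for "南航"
word_vecs = {"上海": [0.0, 0.2, 1.0],     # hypothetical word vectors
             "国航": [0.9, 1.1, 0.1],
             "机票": [0.3, 0.0, 0.8]}
target = max(word_vecs, key=lambda w: cosine(word_vecs[w], slot_vec))
print(target)  # → 国航
```

Euclidean distance or the Pearson correlation coefficient could be substituted; with a distance measure, the minimum rather than the maximum would be taken.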
When the scheme of the embodiments of the present application is executed, whenever a new slot word needs to be identified, a reference text is set and a preset slot word is marked in it; the feature vector of each character in the text to be recognized is calculated according to its context information; the text to be recognized is segmented into multiple words; similarity calculation is performed between the feature vectors of the words and the feature vector of the preset slot word; and the word with the maximum similarity is taken as the target slot word in the text to be recognized. This solves the problems in the related art of requiring a complicated and tedious model training process and a large amount of sample data before a new slot word can be recognized. Since the recognition method does not depend on a model training process, it consumes little time; meanwhile, only a reference text marked with the preset slot word needs to be provided in advance, so the required amount of data is small. The slot word recognition method therefore has the advantages of a simple recognition process and high efficiency.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 5, a schematic structural diagram of a slot word recognition device according to an exemplary embodiment of the present application is shown, and hereinafter referred to as the recognition device 5. The identification means 5 may be implemented as all or part of a terminal, in software, hardware or a combination of both. The recognition means 5 comprises: a labeling unit 501, a calculating unit 502, a word segmentation unit 503 and a recognition unit 504.
A labeling unit 501, configured to label a preset slot word in a first text;
a calculating unit 502, configured to obtain a second text and calculate a feature vector of each character in the second text;
a word segmentation unit 503, configured to perform word segmentation processing on the second text to obtain multiple words;
the calculating unit 502 is further configured to calculate a feature vector of each word according to the feature vector of the character included in the word;
the calculating unit 502 is further configured to perform similarity calculation on the feature vector of each word and the feature vector of the preset slot word;
The identifying unit 504 is configured to identify the word with the largest similarity as the target slot word in the second text.
In one or more possible embodiments, the marking of the preset slot word in the first text includes: generating the first text according to an actual scene;
adding a slot position word start mark and a slot position word end mark in the first text based on the marking operation of a user;
and determining a preset slot position word based on the characters corresponding to the slot position word start mark and the slot position word end mark.
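The marker-based determination above can be sketched as follows; the "&lt;slot&gt;"/"&lt;/slot&gt;" marker tokens are an assumption, since the patent does not define the format of the start and end marks:

```python
# Sketch: determine the preset slot word from the characters between a slot
# start mark and a slot end mark added by the user's marking operation.
# The marker tokens below are hypothetical.

def extract_slot_word(marked_text, start="<slot>", end="</slot>"):
    """Return the characters between the start and end marks."""
    i = marked_text.index(start) + len(start)
    j = marked_text.index(end, i)
    return marked_text[i:j]

first_text = "订一张去北京的<slot>南航</slot>机票"
print(extract_slot_word(first_text))  # → 南航
```

In practice the marks would be inserted by the labeling unit in response to the user's marking operation, and the extracted span becomes the preset slot word.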
In one or more possible embodiments, the calculating the feature vector of each character in the second text includes:
processing the first text through a basic semantic model to obtain a first basic semantic feature vector;
processing the second text through the basic semantic model to obtain a second basic semantic feature vector;
and processing the first basic semantic feature vector, the second basic semantic feature vector, the first text and the second text through a semantic representation model to obtain feature vectors of all characters in the second text.
In one or more possible embodiments, the calculating the feature vector of each word according to the feature vector of the character included in the word includes:
determining a feature vector of a plurality of characters contained in a word;
and carrying out weighted average on the feature vectors of the characters to obtain the feature vector of the word.
In one or more possible embodiments, the performing word segmentation processing on the second text to obtain a plurality of words includes:
and performing word segmentation processing on the second text based on a word segmentation algorithm matched with the text to obtain a plurality of words.
In one or more possible embodiments, the obtaining the second text includes:
acquiring voice data of a user through a voice acquisition unit;
and performing text conversion on the voice data to obtain a second text.
In one or more possible embodiments, the acquiring, by the voice acquisition unit, the voice data of the user includes:
and filtering the voice data through preset noise template data.
It should be noted that, when the slot word recognition method is executed by the recognition apparatus 5 according to the above embodiment, only the division of the above function modules is taken as an example, and in practical applications, the function allocation may be completed by different function modules according to needs, that is, the internal structure of the device is divided into different function modules, so as to complete all or part of the functions described above. In addition, the slot position word recognition device provided by the above embodiments and the slot position word recognition method embodiments belong to the same concept, and the detailed implementation process is shown in the method embodiments and will not be described herein.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a plurality of instructions, where the instructions are suitable for being loaded by a processor and executing the method steps in the embodiments shown in fig. 2 to fig. 3, and a specific execution process may refer to specific descriptions of the embodiments shown in fig. 2 to fig. 3, which is not described herein again.
The present application further provides a computer program product storing at least one instruction, which is loaded and executed by the processor to implement the slot word identification method according to the above embodiments.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 6, the electronic device 600 may include: at least one processor 601, at least one network interface 604, a user interface 603, a memory 605, at least one communication bus 602.
Wherein a communication bus 602 is used to enable the connection communication between these components.
The user interface 603 may include a display screen (Display) and a camera (Camera); optionally, the user interface 603 may also include a standard wired interface and a wireless interface.
The network interface 604 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface).
Processor 601 may include one or more processing cores. The processor 601 connects various parts throughout the terminal 600 using various interfaces and lines, and performs various functions of the terminal 600 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 605 and invoking data stored in the memory 605. Optionally, the processor 601 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 601 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU is used for rendering and drawing the content to be displayed on the display screen; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 601, but may instead be implemented by a separate chip.
The Memory 605 may include a Random Access Memory (RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 605 includes a non-transitory computer-readable medium. The memory 605 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 605 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data and the like referred to in the above respective method embodiments. The memory 605 may optionally be at least one storage device located remotely from the processor 601. As shown in fig. 6, the memory 605, which is one type of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a voiceprint wake application.
In the electronic device 600 shown in fig. 6, the user interface 603 is mainly used for providing an input interface for a user to obtain data input by the user; the processor 601 may be configured to call the slot word recognition application stored in the memory 605, and specifically performs the following operations:
marking a preset slot position word in the first text;
acquiring a second text and calculating a feature vector of each character in the second text;
performing word segmentation processing on the second text to obtain a plurality of words;
calculating the feature vector of each word according to the feature vector of the character contained in the word;
similarity calculation is carried out on the feature vector of each word and the feature vector of the preset slot position word;
and identifying the word with the maximum similarity as the target slot position word in the second text.
In one or more embodiments, the processor 601 executes the marking of the preset slot word in the first text, including:
generating a first text according to an actual scene;
adding a slot position word start mark and a slot position word end mark in the first text based on the marking operation of a user;
and determining a preset slot position word based on the characters corresponding to the slot position word start mark and the slot position word end mark.
In one or more embodiments, processor 601 performs the calculating the feature vectors of the characters in the second text, including:
processing the first text through a basic semantic model to obtain a first basic semantic feature vector;
processing the second text through the basic semantic model to obtain a second basic semantic feature vector;
and processing the first basic semantic feature vector, the second basic semantic feature vector, the first text and the second text through a semantic representation model to obtain feature vectors of all characters in the second text.
In one or more embodiments, the processor 601 performs the calculation of the feature vector of each word according to the feature vector of the character included in the word, including:
determining a feature vector of a plurality of characters contained in a word;
and carrying out weighted average on the feature vectors of the characters to obtain the feature vector of the word.
In one or more embodiments, the processor 601 performs the word segmentation processing on the second text to obtain a plurality of words, including:
and performing word segmentation processing on the second text based on a word segmentation algorithm matched with the text to obtain a plurality of words.
In one or more embodiments, processor 601 performs the obtaining the second text, including:
acquiring voice data of a user through a voice acquisition unit;
and performing text conversion on the voice data to obtain a second text.
In one or more embodiments, the processor 601 executing the acquiring of the voice data of the user through the voice collecting unit includes:
and filtering the voice data through preset noise template data.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above disclosure is only for the purpose of illustrating the preferred embodiments of the present application and is not to be construed as limiting the scope of the present application, so that the present application is not limited thereto, and all equivalent variations and modifications can be made to the present application.

Claims (10)

1. A method for identifying slot position words, which is characterized by comprising the following steps:
marking a preset slot position word in the first text;
acquiring a second text and calculating a feature vector of each character in the second text;
performing word segmentation processing on the second text to obtain a plurality of words;
calculating the feature vector of each word according to the feature vector of the character contained in the word;
similarity calculation is carried out on the feature vector of each word and the feature vector of the preset slot position word;
and identifying the word with the maximum similarity as the target slot position word in the second text.
2. The method of claim 1, wherein the marking of the preset slot word in the first text comprises:
generating a first text according to an actual scene;
adding a slot position word start mark and a slot position word end mark in the first text based on the marking operation of a user;
and determining a preset slot position word based on the characters corresponding to the slot position word start mark and the slot position word end mark.
3. The method of claim 1, wherein computing the feature vector for each character in the second text comprises:
processing the first text through a basic semantic model to obtain a first basic semantic feature vector;
processing the second text through the basic semantic model to obtain a second basic semantic feature vector;
and processing the first basic semantic feature vector, the second basic semantic feature vector, the first text and the second text through a semantic representation model to obtain feature vectors of all characters in the second text.
4. The method according to claim 1, 2 or 3, wherein the calculating the feature vector of each word according to the feature vector of the character contained in the word comprises:
determining a feature vector of a plurality of characters contained in a word;
and carrying out weighted average on the feature vectors of the characters to obtain the feature vector of the word.
5. The method of claim 1, 2 or 3, wherein the tokenizing the second text to obtain a plurality of words comprises:
and performing word segmentation processing on the second text based on a word segmentation algorithm matched with the text to obtain a plurality of words.
6. The method of claim 1, 2 or 3, wherein the obtaining the second text comprises:
acquiring voice data of a user through a voice acquisition unit;
and performing text conversion on the voice data to obtain a second text.
7. The method of claim 6, wherein the obtaining voice data of the user by the voice capture unit comprises:
and filtering the voice data through preset noise template data.
8. An apparatus for slot word recognition, the apparatus comprising:
the marking unit is used for marking a preset slot position word in the first text;
the calculation unit is used for acquiring a second text and calculating a feature vector of each character in the second text;
the word segmentation unit is used for carrying out word segmentation processing on the second text to obtain a plurality of words;
the calculation unit is further used for calculating the feature vector of each word according to the feature vector of the character contained in the word;
the calculation unit is further configured to perform similarity calculation on the feature vectors of the words and the feature vectors of the preset slot position words;
and the identification unit is used for identifying the word with the maximum similarity as the target slot position word in the second text.
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to carry out the method steps according to any one of claims 1 to 7.
10. An electronic device, comprising: a processor, a memory, and a microphone; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 7.
CN202110367820.4A 2021-04-06 2021-04-06 Slot position word recognition method and device, storage medium and electronic equipment Pending CN113221644A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110367820.4A CN113221644A (en) 2021-04-06 2021-04-06 Slot position word recognition method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110367820.4A CN113221644A (en) 2021-04-06 2021-04-06 Slot position word recognition method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113221644A true CN113221644A (en) 2021-08-06

Family

ID=77086396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110367820.4A Pending CN113221644A (en) 2021-04-06 2021-04-06 Slot position word recognition method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113221644A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918680A (en) * 2019-03-28 2019-06-21 腾讯科技(上海)有限公司 Entity recognition method, device and computer equipment
CN111061840A (en) * 2019-12-18 2020-04-24 腾讯音乐娱乐科技(深圳)有限公司 Data identification method and device and computer readable storage medium
CN111369980A (en) * 2020-02-27 2020-07-03 网易有道信息技术(北京)有限公司江苏分公司 Voice detection method and device, electronic equipment and storage medium
CN112036186A (en) * 2019-06-04 2020-12-04 腾讯科技(深圳)有限公司 Corpus labeling method and device, computer storage medium and electronic equipment
CN112069828A (en) * 2020-07-31 2020-12-11 飞诺门阵(北京)科技有限公司 Text intention identification method and device
CN112149414A (en) * 2020-09-23 2020-12-29 腾讯科技(深圳)有限公司 Text similarity determination method, device, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20210729

Address after: 519000 Guangdong Zhuhai science and technology innovation coastal high beam Software Park

Applicant after: YGSOFT Inc.

Applicant after: Zhuhai Yuanguang Mobile Interconnection Technology Co.,Ltd.

Address before: 519000 room 105-4675, No. 6, Baohua Road, Hengqin new area, Xiangzhou District, Zhuhai City, Guangdong Province (centralized office area)

Applicant before: Zhuhai Yuanguang Mobile Interconnection Technology Co.,Ltd.

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20210806