CN113935337A - Dialogue management method, system, terminal and storage medium - Google Patents
Dialogue management method, system, terminal and storage medium
- Publication number
- CN113935337A (application CN202111235000.6A)
- Authority
- CN
- China
- Prior art keywords
- conversation
- taskflow
- node
- flow
- flow chart
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Machine Translation (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses a dialogue management method, a dialogue management system, a terminal and a storage medium. The method comprises the following steps: drawing a TaskFlow flowchart according to the dialogue logic and storing the TaskFlow flowchart in a database, the TaskFlow flowchart being composed of API interface nodes, SLOTS slot-filling nodes, SCRIPT script nodes, NLG reply nodes and JUDGE judgment nodes and covering at least one user intention within a dialogue flow; during a human-machine dialogue, recognizing the user intention according to the user's voice stream data and acquiring the corresponding TaskFlow flowchart according to the user intention; parsing the TaskFlow flowchart to obtain the dialogue flow; and executing the dialogue logic according to the dialogue flow. The embodiment of the invention improves the flexibility with which TaskFlow flowcharts can be applied, greatly reduces development and maintenance difficulty, and can handle more complex dialogue management scenarios.
Description
Technical Field
The present invention relates to the field of voice dialog system technology, and in particular, to a dialog management method, system, terminal, and storage medium.
Background
In recent years, with the continuous progress of related technologies such as intelligent voice and natural language processing, the performance and user experience of dialogue management systems have improved greatly. However, most existing dialogue management systems are rule-based, and a reasonably universal rule-programming framework and platform is still lacking. The dialogue scenarios such a system can express are usually designed by domain experts, and the management rules are either implemented directly in code logic or hidden inside the dialogue tree structure and dialogue framework.
A currently popular voice interaction approach is VoiceXML (a markup language for voice browsing), which mainly comprises a voice browser, speech recognition, speech synthesis, a VoiceXML gateway and the like; web-based voice applications and services can be built with it. However, this approach has poor portability and flexibility, makes developing a real system difficult, and makes writing and debugging a dialogue flow cumbersome.
Disclosure of Invention
The invention provides a dialogue management method, system, terminal and storage medium, aiming to solve the technical problems of the existing voice interaction approach, such as poor portability and flexibility, high difficulty in developing a real system, and cumbersome writing and debugging of the dialogue flow.
In order to solve the above technical problems, the technical solution adopted by the invention is as follows:
a dialog management method comprising:
drawing a TaskFlow flowchart according to the dialogue logic, and storing the TaskFlow flowchart in a database; the TaskFlow flowchart is composed of API interface nodes, SLOTS slot-filling nodes, SCRIPT script nodes, NLG reply nodes and JUDGE judgment nodes, and covers at least one user intention within a dialogue flow;
during a human-machine dialogue, recognizing the user intention according to the user's voice stream data, and acquiring the corresponding TaskFlow flowchart according to the user intention;
parsing the TaskFlow flowchart to obtain a dialogue flow;
and executing the dialogue logic according to the dialogue flow, and returning a reply utterance corresponding to the user intention.
The technical solution adopted by the embodiment of the invention further comprises: the drawing of the TaskFlow flowchart according to the dialogue logic comprises:
placing and configuring the API interface node, the SLOTS slot-filling node, the SCRIPT script node, the NLG reply node and the JUDGE judgment node at specified positions, and connecting the nodes at different positions according to the dialogue logic to obtain the drawn TaskFlow flowchart;
the API interface node is used for acquiring service information through a remote call during the dialogue flow;
the SLOTS slot-filling node is used for collecting slot information and filling the slots during execution of the dialogue flow;
the SCRIPT script node is used for acquiring, controlling and modifying the dialogue state information through an embedded Groovy script during execution of the dialogue flow;
the NLG reply node is used for generating reply utterances in a templated manner during execution of the dialogue flow;
and the JUDGE judgment node is used for controlling the direction of the dialogue flow according to configured conditional expressions during execution of the dialogue flow.
The technical solution adopted by the embodiment of the invention further comprises: the drawing of the TaskFlow flowchart according to the dialogue logic further comprises:
dividing the TaskFlow flowchart into a flowchart covering a complete dialogue flow and sub-flowcharts each covering at least one user intention.
The technical solution adopted by the embodiment of the invention further comprises: the storing of the TaskFlow flowchart in the database specifically comprises:
storing the TaskFlow flowchart into the database in JSON format.
The technical solution adopted by the embodiment of the invention further comprises: the recognizing of the user intention according to the user's voice stream data comprises:
acquiring voice stream data of a user;
transcribing the voice stream data into text through automatic speech recognition to obtain the corresponding text data;
and processing the text data through a natural language understanding algorithm to obtain the user intention.
The technical solution adopted by the embodiment of the invention further comprises: the parsing of the TaskFlow flowchart comprises:
parsing the JSON-format TaskFlow flowchart with Jackson to generate Java objects of all the nodes and obtain the dialogue flow.
The technical solution adopted by the embodiment of the invention further comprises: the executing of the dialogue logic according to the dialogue flow and the returning of the reply utterance corresponding to the user intention comprise:
starting from the first node in the TaskFlow flowchart, first pushing the root node of the TaskFlow flowchart onto a stack and beginning execution from the element at the top of the stack; judging whether the current node is a non-leaf node, and if so, continuing to push its child nodes onto the stack; if the current node is a leaf node, executing the operation of the current node and returning the dialogue state information;
judging whether the current node needs to wait for a user reply; if not, returning the execution state information; if so, creating the user input state information and filling the corresponding slot after recognizing the user's reply;
and pushing the next triggered node in the TaskFlow flowchart onto the stack, executing the dialogue logic, and clearing the executed nodes from the stack.
Another technical solution adopted by the embodiment of the invention is a dialogue management system, comprising:
a flow drawing module: used for drawing a TaskFlow flowchart according to the dialogue logic and storing the TaskFlow flowchart in a database; the TaskFlow flowchart is composed of API interface nodes, SLOTS slot-filling nodes, SCRIPT script nodes, NLG reply nodes and JUDGE judgment nodes, and covers at least one user intention within a dialogue flow;
a flow acquisition module: used for recognizing the user intention according to the user's voice stream data during a human-machine dialogue, and acquiring the corresponding TaskFlow flowchart according to the user intention;
a flow parsing module: used for parsing the TaskFlow flowchart to obtain a dialogue flow;
a flow execution module: used for executing the dialogue logic according to the dialogue flow.
Another technical solution adopted by the embodiment of the invention is a terminal comprising a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the above-described dialog management method;
the processor is configured to execute the program instructions stored in the memory to perform the dialogue management operations.
Another technical solution adopted by the embodiment of the invention is a storage medium having stored thereon program instructions executable by a processor to perform the above-described dialogue management method.
The beneficial effects of the invention are as follows: in the dialogue management method, system, terminal and storage medium, the pre-drawn TaskFlow flowchart is stored in the dialogue management module, and during a dialogue the corresponding TaskFlow flowchart or sub-flow is loaded and the dialogue flow is executed according to the user intention, which improves the flexibility of dialogue management. Meanwhile, dedicated nodes are designed for drawing the TaskFlow flowchart; during drawing, each node is dragged to a specified position and the nodes at different positions are connected according to the dialogue logic, which greatly improves drawing efficiency. The dialogue flow can also be divided into several sub-flows according to different user intentions, with a corresponding TaskFlow flowchart drawn for each sub-flow; the sub-flowcharts can be shared and reused, which reduces the coupling between the TaskFlow flowcharts and other modules, improves the flexibility with which TaskFlow flowcharts can be applied, greatly reduces development and maintenance difficulty, and makes it possible to handle more complex dialogue management scenarios.
Drawings
FIG. 1 is a flowchart of a dialogue management method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a dialogue management method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a dialogue management system according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a storage medium structure according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second" and "third" in the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise. All directional indicators (such as up, down, left, right, front, rear, etc.) in the embodiments of the present invention are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indicator is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Please refer to FIG. 1, which is a flowchart of a dialogue management method according to a first embodiment of the present invention. The dialogue management method of the first embodiment of the present invention includes the following steps:
s10: drawing a taskflow flow chart according to the dialogue logic, and storing the taskflow flow chart in a database;
In this step, the TaskFlow flowchart is the pre-drawn dialogue logic. In the embodiment of the present application, in order to improve drawing efficiency, five node types implementing different functions are designed: the API interface node, the SLOTS slot-filling node, the SCRIPT script node, the NLG reply node and the JUDGE judgment node. The function of each node is as follows:
API interface node: used for acquiring service information through a remote call during the dialogue flow. An API interface node can be created by configuring the URL (Uniform Resource Locator) of a remote service together with the interface's input parameters (in key-value format) and its output parameters. In the embodiment of the invention, when the TaskFlow flowchart is drawn, the API interface node hides the details of the remote call behind an API gateway service, so the differences between remote callees need not be considered, which greatly improves the user experience and the editing efficiency of the dialogue flow.
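A minimal Java sketch of such an API interface node is given below purely as an illustration; the class structure, the gateway URL and the way the result is written back into the dialogue state are assumptions and are not prescribed by this embodiment.
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Illustrative API interface node: configured with a remote URL, key-value
// input parameters and the name under which the output is stored.
public final class ApiNodeSketch {
    private final String url;
    private final Map<String, String> inputParams;
    private final String outputKey;

    public ApiNodeSketch(String url, Map<String, String> inputParams, String outputKey) {
        this.url = url;
        this.inputParams = inputParams;
        this.outputKey = outputKey;
    }

    // Calls the remote service (here as a simple GET through the gateway URL)
    // and writes the response body into the dialogue state.
    public void execute(Map<String, Object> dialogueState) throws Exception {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> e : inputParams.entrySet()) {
            query.append(query.length() == 0 ? "?" : "&")
                 .append(e.getKey()).append('=')
                 .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        HttpRequest request = HttpRequest.newBuilder(URI.create(url + query)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        dialogueState.put(outputKey, response.body());
    }
}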
SLOTS slot-filling node: used for collecting slot information (i.e., the key information that must be gathered from the user) during execution of the dialogue flow, with slot filling driven by NLU. During execution of the dialogue flow, the slot-filling node continuously traverses all slots; when it finds an unfilled slot, the clarification utterance corresponding to that slot is output to TTS (Text To Speech), the user provides information under the guidance of the TTS prompt, and after the corresponding entity is extracted from the user's input through NLU, the unfilled slot is filled by the slot-filling node.
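A minimal Java sketch of the slot-traversal behaviour described above; the record and method names are illustrative assumptions rather than details of this embodiment.
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Illustrative SLOTS slot-filling step: traverse the configured slots and
// return the clarification prompt of the first unfilled slot, to be sent to TTS.
public final class SlotFillingSketch {
    record Slot(String name, String clarificationPrompt) {}

    public static Optional<String> nextPrompt(List<Slot> slots, Map<String, Object> filledValues) {
        for (Slot slot : slots) {
            if (!filledValues.containsKey(slot.name())) {
                return Optional.of(slot.clarificationPrompt()); // ask the user for this slot
            }
        }
        return Optional.empty(); // all slots filled, the dialogue flow can continue
    }
}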
SCRIPT script node: used for acquiring, controlling and modifying the dialogue state information through an embedded Groovy script during execution of the dialogue flow, so as to meet customization requirements of the dialogue flow. That is, the general dialogue-management engine execution logic is developed in the static Java language and edited through the TaskFlow, while the domain-specific business logic is handled by the Groovy script embedded in the SCRIPT script node, so that system operation is decoupled from the business domain. Specifically, the Groovy script can read the dialogue state information with a session.get('key') statement and, after processing, write the updated dialogue state information back with a session.set('key', value) statement. Taking the recognition of the user's gender state information as an example:
def g = session.get('gender')
session.set('gender', g == '0' ? 'female' : 'male') // example mapping; the concrete value logic depends on the business rules
NLG reply node: used for generating the reply utterance by filling dynamically changing data into a template during execution of the dialogue flow. A sentence template consists of several short phrases containing variables; the variables are kept up to date with the data information and are produced by the relevant business rules, and the phrases are finally spliced into a complete, well-formed sentence.
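A minimal Java sketch of such templated reply generation, assuming a simple ${variable} placeholder syntax; the concrete template syntax is not specified in this embodiment.
import java.util.Map;

// Illustrative NLG reply node: replaces ${name} placeholders in the sentence
// template with dynamically changing values taken from the dialogue state.
public final class NlgTemplateSketch {
    public static String render(String template, Map<String, Object> dialogueState) {
        String reply = template;
        for (Map.Entry<String, Object> e : dialogueState.entrySet()) {
            reply = reply.replace("${" + e.getKey() + "}", String.valueOf(e.getValue()));
        }
        return reply;
    }

    public static void main(String[] args) {
        String reply = render("Your order ${orderId} will arrive on ${date}.",
                Map.of("orderId", "A1024", "date", "Friday"));
        System.out.println(reply); // Your order A1024 will arrive on Friday.
    }
}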
JUDGE judgment node: used for controlling the direction of the dialogue flow according to configured conditional expressions during execution of the dialogue flow. The JUDGE judgment node supports configuring multiple conditional expressions with priorities; the variables available to the expressions are provided by the current dialogue state information. When the node runs, the conditional expressions are evaluated in priority order, and if an expression evaluates to true, the dialogue flow jumps to the corresponding branch and continues execution.
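A minimal Java sketch of the prioritized condition evaluation described above; representing each conditional expression as a predicate over the dialogue state is an assumption made only for illustration.
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative JUDGE judgment node: conditions are evaluated in priority order
// and the first one that is true decides which branch the dialogue flow takes.
public final class JudgeNodeSketch {
    record Branch(int priority, Predicate<Map<String, Object>> condition, String nextNodeId) {}

    public static String decide(List<Branch> branches, Map<String, Object> dialogueState, String defaultBranch) {
        return branches.stream()
                .sorted(Comparator.comparingInt(Branch::priority))
                .filter(b -> b.condition().test(dialogueState))
                .map(Branch::nextNodeId)
                .findFirst()
                .orElse(defaultBranch);
    }

    public static void main(String[] args) {
        List<Branch> branches = List.of(
                new Branch(1, s -> "female".equals(s.get("gender")), "femaleFlow"),
                new Branch(2, s -> s.containsKey("gender"), "genericFlow"));
        System.out.println(decide(branches, Map.of("gender", "female"), "fallbackFlow")); // femaleFlow
    }
}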
In the embodiment of the application, the TaskFlow flowchart is divided, according to the complexity of the dialogue task, into a flowchart covering a complete dialogue flow and sub-flowcharts each covering at least one user intention; a flowchart covering a complete dialogue flow is intended for dialogue tasks with relatively simple business logic and few flowchart nodes. In practice, a task-oriented dialogue often needs to collect more information over multiple rounds of interaction, and the more complicated the business logic is, the more nodes the drawn TaskFlow flowchart will contain, which makes the flowchart harder to develop. Therefore, in the embodiment of the application, for a dialogue task with complex business logic, the dialogue flow is divided into several (at least two) sub-flows according to user intention, each sub-flow corresponds to at least one user intention, and a sub-flowchart is drawn for each sub-flow. When a sub-flow is needed, the sub-flowchart corresponding to the user intention is dragged onto the editing interface and shared or reused, which avoids meaningless repetitive work and greatly reduces development and maintenance difficulty.
Further, the TaskFlow flowchart (including the flowchart and the sub-flowcharts) in the embodiment of the application is drawn as follows: the five node types are dragged to specified positions on an editing interface, each node is configured, and the nodes at different positions are connected according to the dialogue logic to obtain the drawn TaskFlow flowchart; the drawn TaskFlow flowchart is then stored into the database of the dialogue management module in JSON (JavaScript Object Notation, a lightweight data-interchange format) format, so that it can be read by the dialogue management module when a dialogue task is executed.
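As an illustration only, a stored TaskFlow flowchart might be serialized along the following lines; every field name and value below is an assumption made for the sake of the example and is not prescribed by this embodiment.
{
  "flowId": "order_query",
  "intents": ["query_order_status"],
  "nodes": [
    { "id": "n1", "type": "SLOTS", "slots": ["orderId"], "clarify": "Please tell me your order number." },
    { "id": "n2", "type": "API", "url": "https://gateway.example.com/order/status", "input": { "orderId": "${orderId}" }, "output": "orderStatus" },
    { "id": "n3", "type": "JUDGE", "conditions": [ { "priority": 1, "expr": "orderStatus == 'shipped'", "next": "n4" } ] },
    { "id": "n4", "type": "NLG", "template": "Your order ${orderId} has shipped." }
  ],
  "edges": [ ["n1", "n2"], ["n2", "n3"], ["n3", "n4"] ]
}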
S11: during man-machine conversation, recognizing the user intention according to the voice stream data of the user, and acquiring a taskflow flow chart according to the user intention;
In this step, after the user's voice stream data is obtained, it is transcribed into text through ASR (Automatic Speech Recognition) to obtain the corresponding text data, and the text data is processed by an NLU (Natural Language Understanding) algorithm to obtain the user intention.
S12: analyzing the taskflow flow chart to obtain a conversation flow;
In this step, the TaskFlow flowchart is parsed with the Jackson tool to obtain the dialogue flow.
S13: executing the dialogue logic according to the dialogue flow;
In this step, the dialogue-logic execution process is specifically as follows: the root node of the TaskFlow flowchart is first pushed onto a stack, and execution then begins: the element at the top of the stack is executed first, and it is judged whether the current node is a non-leaf node (a control node); if it is a non-leaf node, its child nodes continue to be pushed onto the stack; if it is a leaf node, the specific operation of the node is executed and the dialogue state information is returned. At the same time, it is judged whether the current node needs to wait for a user reply; if not, the execution state information is returned; if so, a user-input state object is created and its slot is filled after the user input is recognized. After the input is completed, the next triggered node in the TaskFlow flowchart is pushed onto the task stack, the above process is executed again for that node, and the executed nodes are cleared from the stack. This process is repeated until the dialogue task is completed.
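A minimal Java sketch of this stack-driven traversal; the node interface and its method names are assumptions made only to illustrate the execution order described above.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative stack-driven TaskFlow execution: the root node is pushed first,
// control nodes push their children, leaf nodes run their concrete operation,
// and executed nodes are cleared from the stack.
public final class TaskFlowExecutorSketch {
    interface Node {
        List<Node> children();
        boolean isLeaf();
        void execute();            // the node's concrete operation (API call, slot filling, ...)
        boolean waitsForUser();    // whether execution must pause for a user reply
    }

    public static void run(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node current = stack.pop();          // executed nodes are removed from the stack
            if (!current.isLeaf()) {
                // control node: push its children so they are executed next
                List<Node> children = current.children();
                for (int i = children.size() - 1; i >= 0; i--) {
                    stack.push(children.get(i));
                }
                continue;
            }
            current.execute();
            if (current.waitsForUser()) {
                // in the real engine the flow pauses here until ASR/NLU deliver the
                // user's reply and the corresponding slot has been filled
                break;
            }
        }
    }
}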
Based on the above, the dialogue management method according to the first embodiment of the present invention stores the pre-drawn TaskFlow flowchart in the dialogue management module and, during a dialogue, loads the corresponding TaskFlow flowchart and executes the dialogue flow according to the user intention, which improves the flexibility of dialogue management and greatly improves drawing efficiency.
Please refer to FIG. 2, which is a flowchart of a dialogue management method according to a second embodiment of the present application. The dialogue management method of the second embodiment of the present application includes the following steps:
S20: acquiring the user's voice stream data through the dialogue engine;
S21: transcribing the voice stream data into text through ASR (Automatic Speech Recognition) to obtain the corresponding text data, processing the text data through an NLU (Natural Language Understanding) algorithm to obtain the user intention, and passing the user intention to the dialogue management module;
S22: the dialogue management module loads the TaskFlow flowchart according to the user intention, and parses the TaskFlow flowchart with the Jackson tool to obtain the dialogue flow;
In this step, the TaskFlow flowchart is the pre-drawn dialogue logic. In the embodiment of the present application, in order to improve drawing efficiency, five node types implementing different functions are designed: the API interface node, the SLOTS slot-filling node, the SCRIPT script node, the NLG reply node and the JUDGE judgment node. The function of each node is as follows:
API interface node: used for acquiring service information through a remote call during the dialogue flow. An API interface node can be created by configuring the URL (Uniform Resource Locator) of a remote service together with the interface's input parameters (in key-value format) and its output parameters. In the embodiment of the invention, when the TaskFlow flowchart is drawn, the API interface node hides the details of the remote call behind an API gateway service, so the differences between remote callees need not be considered, which greatly improves the user experience and the editing efficiency of the dialogue flow.
SLOTS slot-filling node: used for collecting slot information (i.e., the key information that must be gathered from the user) during execution of the dialogue flow, with slot filling driven by NLU. During execution of the dialogue flow, the slot-filling node continuously traverses all slots; when it finds an unfilled slot, the clarification utterance corresponding to that slot is output to TTS (Text To Speech), the user provides information under the guidance of the TTS prompt, and after the corresponding entity is extracted from the user's input through NLU, the unfilled slot is filled by the slot-filling node.
SCRIPT script node: used for acquiring, controlling and modifying the dialogue state information through an embedded Groovy script during execution of the dialogue flow, so as to meet customization requirements of the dialogue flow. That is, the general dialogue-management engine execution logic is developed in the static Java language and edited through the TaskFlow, while the domain-specific business logic is handled by the Groovy script embedded in the SCRIPT script node, so that system operation is decoupled from the business domain. Specifically, the Groovy script can read the dialogue state information with a session.get('key') statement and, after processing, write the updated dialogue state information back with a session.set('key', value) statement. Taking the recognition of the user's gender state information as an example:
def g = session.get('gender')
session.set('gender', g == '0' ? 'female' : 'male') // example mapping; the concrete value logic depends on the business rules
NLG reply node: used for generating the reply utterance by filling dynamically changing data into a template during execution of the dialogue flow. A sentence template consists of several short phrases containing variables; the variables are kept up to date with the data information and are produced by the relevant business rules, and the phrases are finally spliced into a complete, well-formed sentence.
JUDGE judgment node: used for controlling the direction of the dialogue flow according to configured conditional expressions during execution of the dialogue flow. The JUDGE judgment node supports configuring multiple conditional expressions with priorities; the variables available to the expressions are provided by the current dialogue state information. When the node runs, the conditional expressions are evaluated in priority order, and if an expression evaluates to true, the dialogue flow jumps to the corresponding branch and continues execution.
In the embodiment of the application, the TaskFlow flowchart is divided, according to the complexity of the dialogue task, into a flowchart covering a complete dialogue flow and sub-flowcharts each covering at least one user intention; a flowchart covering a complete dialogue flow is intended for dialogue tasks with relatively simple business logic and few flowchart nodes. In practice, a task-oriented dialogue often needs to collect more information over multiple rounds of interaction, and the more complicated the business logic is, the more nodes the drawn TaskFlow flowchart will contain, which makes the flowchart harder to develop. Therefore, in the embodiment of the application, for a dialogue task with complex business logic, the dialogue flow is divided into several (at least two) sub-flows according to user intention, each sub-flow corresponds to at least one user intention, and a sub-flowchart is drawn for each sub-flow. When a sub-flow is needed, the sub-flowchart corresponding to the user intention is dragged onto the editing interface and shared or reused, which avoids meaningless repetitive work and greatly reduces development and maintenance difficulty.
Further, the TaskFlow flowchart (including the flowchart and the sub-flowcharts) in the embodiment of the application is drawn as follows: the five node types are dragged to specified positions on the editing interface, each node is configured, and the nodes at different positions are connected according to the dialogue logic to obtain the drawn TaskFlow flowchart; the drawn TaskFlow flowchart is then stored into the database of the dialogue management module in JSON (JavaScript Object Notation, a lightweight data-interchange format) format, so that it can be read by the dialogue management module when a dialogue task is executed.
The dialogue management module acquires the TaskFlow flowchart from the database and needs to parse it into an execution object that the dialogue management module can recognize. In the embodiment of the application, the JSON-format TaskFlow flowchart is parsed with the Jackson tool to generate the Java objects of all nodes, yielding the dialogue flow used by the dialogue management module, and the dialogue flow is stored as a tree structure. The node information of the tree structure is shown in Table 1:
Table 1: Node information of the tree structure
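A minimal Java sketch of parsing the JSON-format flowchart with Jackson into node objects; the class and field names here are assumptions, and only the use of Jackson's ObjectMapper reflects what the embodiment describes.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.ArrayList;
import java.util.List;

// Illustrative Jackson parsing step: read the JSON TaskFlow flowchart stored
// in the database and turn each node entry into a Java object.
public final class TaskFlowParserSketch {
    record FlowNode(String id, String type) {}

    public static List<FlowNode> parse(String taskFlowJson) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(taskFlowJson);
        List<FlowNode> nodes = new ArrayList<>();
        for (JsonNode n : root.path("nodes")) {
            nodes.add(new FlowNode(n.path("id").asText(), n.path("type").asText()));
        }
        return nodes;
    }

    public static void main(String[] args) throws Exception {
        String json = "{\"nodes\":[{\"id\":\"n1\",\"type\":\"SLOTS\"},{\"id\":\"n2\",\"type\":\"NLG\"}]}";
        System.out.println(parse(json)); // [FlowNode[id=n1, type=SLOTS], FlowNode[id=n2, type=NLG]]
    }
}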
S23: executing the dialogue logic according to the dialogue flow of the TaskFlow flowchart: starting from the first node in the TaskFlow flowchart, pushing each node onto the stack in turn and executing it;
In this step, the dialogue-logic execution process is specifically as follows: the root node of the TaskFlow flowchart is first pushed onto a stack, and execution then begins: the element at the top of the stack is executed first, and it is judged whether the current node is a non-leaf node (a control node); if it is a non-leaf node, its child nodes continue to be pushed onto the stack; if it is a leaf node, the specific operation of the node is executed and the dialogue state information is returned. At the same time, it is judged whether the current node needs to wait for a user reply; if not, the execution state information is returned; if so, a user-input state object is created and its slot is filled after the user input is recognized. After the input is completed, the next triggered node in the TaskFlow flowchart is pushed onto the task stack, the above process is executed again for that node, and the executed nodes are cleared from the stack. This process is repeated until the dialogue task is completed.
S24: judging whether execution has reached a SLOTS slot-filling node; if so, executing S25; otherwise, re-executing S23;
S25: pausing execution of the dialogue logic and returning, through the dialogue engine, a reply utterance corresponding to the user intention;
S26: converting the reply utterance into voice stream data through TTS and outputting it to the user through the telephony platform;
S27: judging whether the dialogue task is finished; if not, executing S20 again; otherwise, executing S280;
S280: the dialogue is ended.
Based on the above, the dialogue management method according to the second embodiment of the present invention stores the pre-drawn TaskFlow flowchart in the dialogue management module and, during a dialogue, loads the corresponding TaskFlow flowchart or sub-flowchart according to the user intention and executes the dialogue flow, which improves the flexibility of dialogue management. Meanwhile, dedicated nodes are designed for drawing the TaskFlow flowchart; during drawing, each node is dragged to a specified position and the nodes at different positions are connected according to the dialogue logic, which greatly improves drawing efficiency. In the embodiment of the application, the dialogue flow can be divided into several sub-flows according to different user intentions and a corresponding sub-flowchart is drawn for each sub-flow; the sub-flowcharts can be shared and reused, which reduces their coupling with other modules, improves the flexibility with which TaskFlow flowcharts can be applied, greatly reduces development and maintenance difficulty, and makes it possible to handle more complex dialogue management scenarios.
In an alternative embodiment, the result of the dialogue management method may also be uploaded to a blockchain.
Specifically, the corresponding digest information is obtained from the result of the dialogue management method; in particular, the digest information is obtained by hashing the result of the dialogue management method, for example with the SHA-256 algorithm. Uploading the digest information to the blockchain ensures its security and its fairness and transparency for the user. The user can download the digest information from the blockchain to verify whether the result of the dialogue management method has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks associated by cryptographic methods; each data block contains the information of a batch of network transactions, which is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
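A minimal Java sketch of computing such a digest with the JDK's MessageDigest; the string serialization of the dialogue-management result is an assumption, and only the SHA-256 hashing step itself is illustrated.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

// Illustrative digest computation for the dialogue-management result before it
// is uploaded to the blockchain.
public final class DigestSketch {
    public static String sha256Hex(String dialogueResult) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] hash = md.digest(dialogueResult.getBytes(StandardCharsets.UTF_8));
        return HexFormat.of().formatHex(hash);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sha256Hex("{\"intent\":\"query_order_status\",\"reply\":\"...\"}"));
    }
}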
Please refer to FIG. 3, which is a schematic structural diagram of a dialogue management system according to an embodiment of the present invention. The dialogue management system 40 according to the embodiment of the present invention includes:
the flow drawing module 41: used for drawing a TaskFlow flowchart according to the dialogue logic and storing the TaskFlow flowchart in a database; the TaskFlow flowchart is composed of API interface nodes, SLOTS slot-filling nodes, SCRIPT script nodes, NLG reply nodes and JUDGE judgment nodes, and covers at least one user intention within a dialogue flow;
the flow acquisition module 42: used for recognizing the user intention according to the user's voice stream data during a human-machine dialogue, and acquiring the corresponding TaskFlow flowchart according to the user intention;
the flow parsing module 43: used for parsing the TaskFlow flowchart to obtain the dialogue flow;
the flow execution module 44: used for executing the dialogue logic according to the dialogue flow.
Fig. 4 is a schematic diagram of a terminal structure according to an embodiment of the present invention. The terminal 50 comprises a processor 51, a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the dialog management method described above.
The processor 51 is configured to execute the program instructions stored in the memory 52 to perform the dialogue management operations.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip having signal processing capabilities. The processor 51 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
By executing, through the processor, the program instructions stored in the memory, the terminal of the embodiment of the application carries out the dialogue management method stored in the memory: it quantizes the degree to which each acoustic feature expresses each emotion tag, then, when an emotion tag changes, calculates from this quantization index the sensitivity of each acoustic feature to the emotion-tag change, filters out the acoustic features whose sensitivity is smaller than the sensitivity threshold, and performs dialogue management according to the filtered acoustic features. The embodiment of the invention takes application flexibility into account, can improve the accuracy of dialogue management, and at the same time reduces the workload in practical application scenarios.
Fig. 5 is a schematic structural diagram of a storage medium according to an embodiment of the invention. The storage medium of the embodiment of the present invention stores a program file 61 capable of implementing all of the methods described above. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, or terminal devices such as a computer, a server, a mobile phone or a tablet.
Through the stored program instructions executed by a processor, the storage medium of the embodiment of the application carries out the dialogue management method: it quantizes the degree to which each acoustic feature expresses each emotion tag, then, when an emotion tag changes, calculates from this quantization index the sensitivity of each acoustic feature to the emotion-tag change, filters out the acoustic features whose sensitivity is smaller than the sensitivity threshold, and performs dialogue management according to the filtered acoustic features. The embodiment of the invention takes application flexibility into account, can improve the accuracy of dialogue management, and at the same time reduces the workload in practical application scenarios.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A dialogue management method, comprising:
drawing a TaskFlow flowchart according to the dialogue logic, and storing the TaskFlow flowchart in a database; the TaskFlow flowchart is composed of API interface nodes, SLOTS slot-filling nodes, SCRIPT script nodes, NLG reply nodes and JUDGE judgment nodes, and covers at least one user intention within a dialogue flow;
during a human-machine dialogue, recognizing the user intention according to the user's voice stream data, and acquiring the corresponding TaskFlow flowchart according to the user intention;
parsing the TaskFlow flowchart to obtain a dialogue flow;
and executing the dialogue logic according to the dialogue flow.
2. The dialogue management method of claim 1, wherein the drawing of the TaskFlow flowchart according to the dialogue logic comprises:
placing and configuring the API interface node, the SLOTS slot-filling node, the SCRIPT script node, the NLG reply node and the JUDGE judgment node at specified positions, and connecting the nodes at different positions according to the dialogue logic to obtain the drawn TaskFlow flowchart;
the API interface node is used for acquiring service information through a remote call during the dialogue flow;
the SLOTS slot-filling node is used for collecting slot information and filling the slots during execution of the dialogue flow;
the SCRIPT script node is used for acquiring, controlling and modifying the dialogue state information through an embedded Groovy script during execution of the dialogue flow;
the NLG reply node is used for generating reply utterances in a templated manner during execution of the dialogue flow;
and the JUDGE judgment node is used for controlling the direction of the dialogue flow according to configured conditional expressions during execution of the dialogue flow.
3. The dialogue management method of claim 2, wherein the drawing of the TaskFlow flowchart according to the dialogue logic further comprises:
dividing the TaskFlow flowchart into a flowchart covering a complete dialogue flow and sub-flowcharts each covering at least one user intention.
4. The dialogue management method according to claim 3, wherein the storing of the TaskFlow flowchart in the database specifically comprises:
storing the TaskFlow flowchart into the database in JSON format.
5. The dialogue management method according to claim 1, wherein the recognizing of the user intention according to the user's voice stream data comprises:
acquiring voice stream data of a user;
transcribing the voice stream data into text through automatic speech recognition to obtain the corresponding text data;
and processing the text data through a natural language understanding algorithm to obtain the user intention.
6. The dialogue management method of claim 3, wherein the parsing of the TaskFlow flowchart comprises:
parsing the JSON-format TaskFlow flowchart with Jackson to generate Java objects of all the nodes and obtain the dialogue flow.
7. The dialogue management method according to any one of claims 1 to 6, wherein the executing of the dialogue logic according to the dialogue flow comprises:
starting from the first node in the TaskFlow flowchart, first pushing the root node of the TaskFlow flowchart onto a stack and beginning execution from the element at the top of the stack; judging whether the current node is a non-leaf node, and if so, continuing to push its child nodes onto the stack; if the current node is a leaf node, executing the operation of the current node and returning the dialogue state information;
judging whether the current node needs to wait for a user reply; if not, returning the execution state information; if so, creating the user input state information and filling the corresponding slot after recognizing the user's reply;
and pushing the next triggered node in the TaskFlow flowchart onto the stack, executing the dialogue logic, and clearing the executed nodes from the stack.
8. A dialogue management system, comprising:
a flow drawing module: used for drawing a TaskFlow flowchart according to the dialogue logic and storing the TaskFlow flowchart in a database; the TaskFlow flowchart is composed of API interface nodes, SLOTS slot-filling nodes, SCRIPT script nodes, NLG reply nodes and JUDGE judgment nodes, and covers at least one user intention within a dialogue flow;
a flow acquisition module: used for recognizing the user intention according to the user's voice stream data during a human-machine dialogue, and acquiring the corresponding TaskFlow flowchart according to the user intention;
a flow parsing module: used for parsing the TaskFlow flowchart to obtain a dialogue flow;
a flow execution module: used for executing the dialogue logic according to the dialogue flow.
9. A terminal, comprising a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the dialogue management method of any one of claims 1 to 7;
the processor is configured to execute the program instructions stored in the memory to perform the dialogue management method.
10. A storage medium having stored thereon program instructions executable by a processor to perform the dialogue management method of any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111235000.6A CN113935337A (en) | 2021-10-22 | 2021-10-22 | Dialogue management method, system, terminal and storage medium |
PCT/CN2022/089566 WO2023065629A1 (en) | 2021-10-22 | 2022-04-27 | Dialogue management method and system, and terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111235000.6A CN113935337A (en) | 2021-10-22 | 2021-10-22 | Dialogue management method, system, terminal and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113935337A true CN113935337A (en) | 2022-01-14 |
Family
ID=79283877
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111235000.6A Pending CN113935337A (en) | 2021-10-22 | 2021-10-22 | Dialogue management method, system, terminal and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113935337A (en) |
WO (1) | WO2023065629A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114582314A (en) * | 2022-02-28 | 2022-06-03 | 江苏楷文电信技术有限公司 | ASR-based human-computer audio-video interaction logic model design method |
WO2023065629A1 (en) * | 2021-10-22 | 2023-04-27 | 平安科技(深圳)有限公司 | Dialogue management method and system, and terminal and storage medium |
CN117251553A (en) * | 2023-11-15 | 2023-12-19 | 知学云(北京)科技股份有限公司 | Intelligent learning interaction method based on custom plug-in and large language model |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116628141B (en) * | 2023-07-24 | 2023-12-01 | 科大讯飞股份有限公司 | Information processing method, device, equipment and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102447513B1 (en) * | 2016-01-22 | 2022-09-27 | 한국전자통신연구원 | Self-learning based dialogue apparatus for incremental dialogue knowledge, and method thereof |
CN108763568A (en) * | 2018-06-05 | 2018-11-06 | 北京玄科技有限公司 | The management method of intelligent robot interaction flow, more wheel dialogue methods and device |
CN110472030A (en) * | 2019-08-08 | 2019-11-19 | 网易(杭州)网络有限公司 | Man-machine interaction method, device and electronic equipment |
CN110704594A (en) * | 2019-09-27 | 2020-01-17 | 北京百度网讯科技有限公司 | Task type dialogue interaction processing method and device based on artificial intelligence |
CN113935337A (en) * | 2021-10-22 | 2022-01-14 | 平安科技(深圳)有限公司 | Dialogue management method, system, terminal and storage medium |
-
2021
- 2021-10-22 CN CN202111235000.6A patent/CN113935337A/en active Pending
-
2022
- 2022-04-27 WO PCT/CN2022/089566 patent/WO2023065629A1/en active Application Filing
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023065629A1 (en) * | 2021-10-22 | 2023-04-27 | 平安科技(深圳)有限公司 | Dialogue management method and system, and terminal and storage medium |
CN114582314A (en) * | 2022-02-28 | 2022-06-03 | 江苏楷文电信技术有限公司 | ASR-based human-computer audio-video interaction logic model design method |
CN117251553A (en) * | 2023-11-15 | 2023-12-19 | 知学云(北京)科技股份有限公司 | Intelligent learning interaction method based on custom plug-in and large language model |
CN117251553B (en) * | 2023-11-15 | 2024-02-27 | 知学云(北京)科技股份有限公司 | Intelligent learning interaction method based on custom plug-in and large language model |
Also Published As
Publication number | Publication date |
---|---|
WO2023065629A1 (en) | 2023-04-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code |
Ref country code: HK Ref legal event code: DE Ref document number: 40062820 Country of ref document: HK |