US20190057298A1 - Mapping actions and objects to tasks - Google Patents
Mapping actions and objects to tasks
- Publication number
- US20190057298A1 (application US16/105,671)
- Authority
- US
- United States
- Prior art keywords
- task
- user
- virtual assistant
- action
- conversation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/008—Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/50—Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers ; Centralised arrangements for recording messages
- H04M3/527—Centralised call answering arrangements not requiring operator intervention
Definitions
- a growing number of users are using smart devices, such as smart phones, tablet computers, and so on, to interact with virtual assistants.
- the users may communicate with virtual assistants to perform a desired task, such as searching for content, checking into a flight, setting a calendar appointment, and so on.
- the virtual assistants often incorrectly determine a task that the users are requesting. Accordingly, there is an increasing need to accurately identify a task to be performed by the virtual assistant for a user.
- FIG. 1 illustrates an example architecture in which techniques described herein may be implemented.
- FIG. 2 illustrates details of an example virtual assistant service.
- FIG. 3 illustrates an example process to determine a task to be performed by a virtual assistant.
- FIG. 4 illustrates an example user interface to enable a user to customize task preferences of a virtual assistant.
- FIGS. 5A-5B illustrate an example process to determine a task to be performed by a virtual assistant.
- FIG. 6 illustrates an example process to configure a task map of a virtual assistant.
- a user may interact with a virtual assistant on a smart device by providing input to the virtual assistant and/or receiving information from the virtual assistant. While interacting with the virtual assistant, the user may provide input that requests or otherwise facilitates a task to be performed by the virtual assistant.
- the virtual assistant may process the input to determine an action (e.g., verb) and an object (e.g., noun). For example, if the user inputs “listen to music,” the virtual assistant may identify the term “listen” as corresponding to the action and the term “music” as corresponding to the object.
- the virtual assistant may then identify a task to be performed by the virtual assistant.
- the virtual assistant may reference a task map.
- the task map may map action-object pairs to tasks of the virtual assistant.
- a task may include any type of operation that is performed at least in part by a computing device.
- the virtual assistant may determine that the action of “listen” and the object of “music” are associated with a task of playing a song on the smart device.
- the virtual assistant may then perform the identified task (e.g., play the song).
- By utilizing a task map or other source of correlation that maps action-object pairs to tasks, a task may be efficiently identified for performance by the virtual assistant.
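- As a rough illustration of the lookup described above, a task map can be sketched as a dictionary keyed by action-object pairs. The names `TASK_MAP` and `identify_task` and the task labels are hypothetical and do not come from the application; a look-up table is only one of the forms a task map may take.

```python
from typing import Optional

# Hypothetical task map: (action, object) pairs mapped to task names.
TASK_MAP = {
    ("listen", "music"): "play_song",
    ("send", "email"): "send_email",
}


def identify_task(action: str, obj: str) -> Optional[str]:
    """Return the task mapped to the action-object pair, or None if unmapped."""
    return TASK_MAP.get((action.lower(), obj.lower()))


task = identify_task("listen", "music")
```

- Returning `None` for an unmapped pair leaves room for a caller to fall back on contextual information or a prompt to the user.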
- the virtual assistant may utilize contextual information to identify a task to be performed by the virtual assistant.
- the contextual information may include a conversation history of the user with the virtual assistant, content output history identifying content that has been output to the user, user preferences, location of the user, and so on.
- the contextual information may provide some indication of what task the user would like the virtual assistant to perform (e.g., what the user is requesting the virtual assistant to do).
- the virtual assistant may reference the recent conversation to infer that the user may be interested in flight status information.
- the virtual assistant may identify a task that is relevant to the user's context.
- the task map may be personalized for a particular user (e.g., on a user-by-user basis).
- the virtual assistant may learn what task to perform for a particular action-object pair of input from the user. For example, if the virtual assistant has identified input of “let's rock-out” from the user in a previous conversation as corresponding to a task of playing music, the virtual assistant may update the task map for that user such that an action-object pair for “let's rock-out” corresponds to the task of playing music.
- the virtual assistant may learn the types of content that are output to a user, and personalize tasks to those types of content. To illustrate, if a user frequently views sports content on a particular sports web site, the task map may be personalized so that an action-object pair associated with sports may be associated with a task of navigating to the particular sports web site.
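- One way to picture a personalized task map is as a per-user layer of overrides on top of shared defaults. The sketch below assumes that structure; `DEFAULT_MAP`, `user_maps`, `learn`, and `identify_task` are illustrative names, and the parse of "let's rock-out" into the pair `("rock-out", "music")` is an assumption made only for the example.

```python
from collections import defaultdict

# Assumed shared defaults; per-user entries take precedence over them.
DEFAULT_MAP = {("listen", "music"): "play_song"}
user_maps = defaultdict(dict)  # user_id -> {(action, object): task}


def learn(user_id, action, obj, task):
    """Record that this user's action-object pair corresponded to the task."""
    user_maps[user_id][(action, obj)] = task


def identify_task(user_id, action, obj):
    """Prefer the user's personalized mapping, then fall back to the default."""
    pair = (action, obj)
    personal = user_maps.get(user_id, {})
    return personal.get(pair, DEFAULT_MAP.get(pair))


# Assumed parse of "let's rock-out" learned from a previous conversation.
learn("user-104", "rock-out", "music", "play_song")
```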
- a task map may be customized for a particular industry application, platform, device type, and so on, in which the virtual assistant is to be deployed.
- a task map may be generated for an airline industry implementation so that action-object pairs that are relevant to the airlines are associated with tasks that are relevant to the airlines.
- an action-object pair of check-status may be associated with a task of checking the status of an airline flight, instead of a task of checking the status of a purchased item, which may be the case in another industry application, such as an e-commerce implementation.
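- The check-status example can be sketched by keeping one task map per industry application and selecting the map before the lookup. The industry keys and task names below are hypothetical placeholders, not identifiers from the application.

```python
# Hypothetical per-industry task maps; keys and task names are illustrative.
TASK_MAPS = {
    "airline": {("check", "status"): "check_flight_status"},
    "e-commerce": {("check", "status"): "check_order_status"},
}


def identify_task(industry, action, obj):
    """Look up the pair in the task map customized for the given industry."""
    return TASK_MAPS.get(industry, {}).get((action, obj))


airline_task = identify_task("airline", "check", "status")
shop_task = identify_task("e-commerce", "check", "status")
```

- The same action-object pair resolves to a different task depending on which map is consulted, which is the behavior the passage describes.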
- a virtual assistant may perform tasks that are relevant to the particular context.
- the virtual assistant may provide a personalized interaction with the user (e.g., a conversation that is adapted to the user).
- the virtual assistant may provide functionality that is adapted to the particular industry application.
- the virtual assistant may provide accurate task determination, which may enhance a user's experience with the virtual assistant.
- the techniques described herein may learn over time tasks that may be relevant to particular action-object pairs and evolve a task map based on the learning.
- FIG. 1 illustrates an example architecture 100 in which techniques described herein may be implemented.
- the architecture 100 includes a smart device 102 configured to interact with one or more users 104 (hereinafter the user 104 ) and perform other processing discussed herein.
- the smart device 102 may comprise any type of computing device that is configured to perform an operation.
- the smart device 102 may be implemented as a laptop computer, a desktop computer, a server, a smart phone, an electronic reader device, a mobile handset, a personal digital assistant (PDA), a portable navigation device, a portable gaming device, a tablet computer, a watch, a portable media player, a television, a set-top box, a computer system in a car, an appliance, a camera, a robot, a hologram system, a security system, a home-based computer system (e.g., intercom system, home media system, etc.), a projector, an automated teller machine (ATM), a pair of glasses with computing capabilities, a wearable computer, and so on.
- the smart device 102 may be equipped with one or more processors 106 , memory 108 , a display(s), a microphone(s), a speaker(s), a camera(s), a sensor(s), and a network interface(s).
- the one or more processors 106 may include a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, a digital signal processor, and so on.
- the sensor(s) may include an accelerometer, compass, gyroscope, magnetometer, Global Positioning System (GPS), olfactory sensor (e.g., for smell), or other sensor.
- the display(s) is implemented as one or more touch screens.
- the camera(s) may include a front facing camera and/or a rear facing camera.
- the display(s), microphone(s), speaker(s), camera(s), and/or sensor(s) may be configured to receive user input, such as gesture input (e.g., through the camera), touch input, audio or speech input, and so on, and/or may be configured to output content, such as audio, images, video, and so on.
- the memory 108 may include a client application 110 (e.g., module) configured to interface with the user 104 .
- the client application 110 may receive any type of input from the user 104 , such as audio or speech, text, touch, or gesture input received through a sensor or other element of the smart device 102 .
- the client application 110 may also provide any type of response, such as audio, text, interface items (e.g., icons, buttons, menu elements, etc.), and so on.
- the client application 110 is implemented as, or in association with, a mobile application, a browser (e.g., mobile browser), and so on.
- the client application 110 may be implemented as, or in conjunction with, a virtual assistant 112 (e.g., an intelligent personal assistant).
- a “virtual assistant” may act as an interface between end users and information of one or more service providers 114 (hereinafter the service provider 114 ), information of the smart device 102 , information of a virtual assistant service 116 , or any type of information.
- the virtual assistant 112 may access content items stored on the service provider 114 to formulate a response to the user 104 .
- the virtual assistant 112 may be configured for multi-modal input/output (e.g., receive and/or respond in audio or speech, text, touch, gesture, etc.), multi-language communication (e.g., receive and/or respond according to any type of human language), multi-channel communication (e.g., carry out conversations through a variety of computing devices, such as continuing a conversation as a user transitions from using one computing device to another), and other types of input/output or communication.
- the virtual assistant 112 may embody a human-like persona and/or artificial intelligence (AI).
- the virtual assistant 112 may be represented by an image or avatar that is displayed on the smart device 102 .
- An avatar may comprise an animated character that may take on any number of shapes and appearances, and/or resemble a human talking to a user.
- the avatar may be arranged as a representative of the service provider 114 , while in other instances the avatar may be a dedicated personal assistant to a user.
- the virtual assistant 112 may interface with the user through a conversation user interface 118 .
- the conversation user interface 118 may provide conversation items representing information from the virtual assistant 112 and/or information from the user 104 .
- the conversation user interface 118 may display a dialog representation of the user's query and a response item of the virtual assistant 112 that identifies the nearest restaurant to the user 104 .
- a conversation item may comprise an icon (e.g., selectable or non-selectable), a menu item (e.g., drop down menu, radio control, etc.), text, a link, audio, video, or any other type of information.
- the conversation user interface 118 may include other interface items, such as a microphone icon for speech input, a text box to input text, a keyboard (e.g., touch screen keyboard), other input icons, and so on.
- Although the conversation user interface 118 has been described as being associated with the smart device 102 , in other examples the conversation user interface 118 is associated with the service provider 114 and/or the virtual assistant service 116 .
- the interface 118 is displayed through an online site of the service provider 114 , such as when the user navigates to the online site.
- the interface 118 may include a virtual assistant that embodies characteristics of the service provider 114 , such as a flight attendant for an online airline site.
- the user 104 may generally interact with the virtual assistant 112 to cause a task to be performed by the virtual assistant 112 .
- a task may be performed in response to explicit user input, such as playing music in response to “please play music.” In other instances, a task may be performed in response to inferred user input requesting that the task be performed, such as providing weather information in response to “the weather looks nice today.” In yet further instances, a task may be performed when an event has occurred, such as providing flight information an hour before a flight.
- a task may include any type of operation that is performed at least in part by a computing device.
- a task may include logging a user into a site, setting a calendar appointment, resetting a password for a user, purchasing an item, opening an application, sending an instruction to a device to perform an act, sending an email, outputting content (e.g., outputting audio (an audible answer), video, an image, text, a hyperlink, etc.), navigating to a web site, upgrading a user's seat assignment, and so on.
- a task may include providing a response to a user.
- a task may include performing an operation according to one or more criteria (e.g., one or more default settings).
- a task may include sending an email through a particular email account, providing directions with a particular mobile application, searching for content through a particular search engine, and so on.
- a task may include providing information through the conversation user interface 118 .
- a task may be associated with variables for performing the task.
- a task of playing music may be associated with an artist variable indicating the artist and a song variable indicating the song.
- a value for a variable is obtained from the input that initiated the task. For example, if the user requests “please play Free Fallin' by Tom Petty,” the virtual assistant 112 may identify “Free Fallin'” as a value for the song variable and “Tom Petty” as a value for the artist variable.
- values for variables may be known and/or obtained from contextual information.
- the virtual assistant 112 may identify a particular Megan in the user's contacts (e.g., when the contacts include multiple Megans) that was recently texted as a value for the person ID variable for the task.
- a value for a variable may be obtained by prompting a user for the value. For example, if the user requests “book a flight,” and has not provided a destination, the virtual assistant 112 may ask the user “where would you like to fly to?” and the user may provide a destination as the value.
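- The three sources of variable values described above (the initiating input, contextual information, and a prompt to the user) can be sketched as a fill order. The `Task` class, its fields, and the `PROMPTS` table are assumptions introduced for illustration; the application does not specify a data structure or a precedence between input and context.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    required: tuple  # names of the variables the task needs
    values: dict = field(default_factory=dict)

    def fill(self, from_input: dict, from_context: dict) -> list:
        """Fill variables from input first, then context; return those still missing."""
        for var in self.required:
            if var in from_input:
                self.values[var] = from_input[var]
            elif var in from_context:
                self.values[var] = from_context[var]
        return [v for v in self.required if v not in self.values]


# Assumed prompt text per variable, used when a value cannot be found.
PROMPTS = {"destination": "where would you like to fly to?"}

play = Task("play_music", required=("song", "artist"))
play_missing = play.fill({"song": "Free Fallin'", "artist": "Tom Petty"}, {})

book = Task("book_flight", required=("destination",))
questions = [PROMPTS[v] for v in book.fill(from_input={}, from_context={})]
```

- Here `play_missing` is empty because both values came from the input, while `questions` holds the prompt for the missing destination.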
- the virtual assistant 112 may generally determine a task to perform by referencing one or more task maps.
- a task map may map action-object pairs to tasks.
- a task map may generally refer to any type of data that associates a task with an action-object pair.
- a task map may comprise a look-up table, data in a database, data of a state machine, or any other data to correlate tasks and action-object pairs.
- an action may comprise a verb, while an object may comprise a noun.
- a task map may specify associations for a particular type of noun, such as a common noun (e.g., a class of entities) or a proper noun (e.g., a unique entity).
- a variable value for the task may specify the proper noun.
- the object may comprise a song (e.g., the common noun) while the variable value may comprise “Free Fallin'” (e.g., the proper noun).
- the virtual assistant 112 operates in cooperation with the virtual assistant service 116 . That is, one or more functions of the virtual assistant 112 may be performed by the virtual assistant service 116 .
- the virtual assistant service 116 may generally provide one or more services, such as input processing, speech recognition, response formulation, task mapping, context analysis, user characteristic analysis, and so on.
- the virtual assistant service 116 may generally act as a “back-end” resource for the smart device 102 .
- the smart device 102 may receive input 120 from the user 104 (e.g., “what's the score of the game?”) and send the input 120 to the virtual assistant service 116 for processing.
- the virtual assistant service 116 may analyze the input 120 to determine an action and an object 122 .
- the action comprises “provide,” while the object comprises “score.”
- the virtual assistant service 116 may then reference a task map 124 that associates action-object pairs with tasks.
- the action-object pair of provide-score maps to multiple tasks, namely a task 126 ( a ) of providing the score of a sports game and a task 126 ( b ) of providing the score of a video game.
- the virtual assistant service 116 may reference contextual information 128 stored in a context data store 130 and rank the tasks 126 based on which task is most relevant to the contextual information 128 .
- the user 104 had a conversation with the virtual assistant 112 yesterday about the NXT Lions basketball team.
- the virtual assistant service 116 may identify the task 126 ( a ) of providing a score of a sports game as most relevant to the input 120 .
- the virtual assistant service 116 may prompt the user for further clarification regarding a task (e.g., “would you like to view the score of the sports game or view the score of the video game?”).
- the virtual assistant service 116 may then determine variable values 132 ( a ) for performing the task 126 ( a ). As illustrated in FIG. 1 , the tasks 126 may be associated with variables 132 for performing the tasks 126 . In this example, the virtual assistant service 116 again references the contextual information 128 to identify a value for the sport and team variables (e.g., basketball and NXT Lions) of the task 126 ( a ). However, in other examples the virtual assistant service 116 may cause the virtual assistant 112 to prompt the user 104 for the variable values 132 ( a ). Upon identifying the variable values 132 ( a ), the virtual assistant service 116 may cause the score of the game to be provided to the user 104 , as illustrated at 134 .
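- The provide-score example leaves open how candidate tasks are ranked against contextual information. The sketch below substitutes a simple keyword-overlap score between each task and the conversation history; this scoring method, the keyword sets, and the names `CANDIDATES` and `rank_tasks` are assumptions for illustration and are not the ranking method of the application.

```python
import re

# Candidate tasks for the provide-score pair, each with assumed keywords.
CANDIDATES = {
    "provide_sports_score": {"sports", "basketball", "team", "game"},
    "provide_video_game_score": {"video", "console", "level", "game"},
}


def rank_tasks(candidates, context_text):
    """Order tasks by how many of their keywords appear in the context text."""
    words = set(re.findall(r"[a-z]+", context_text.lower()))
    return sorted(candidates, key=lambda t: len(candidates[t] & words), reverse=True)


# Hypothetical line from yesterday's conversation with the virtual assistant.
history = "How did the NXT Lions basketball team do last night?"
ranked = rank_tasks(CANDIDATES, history)
```

- With this history the sports task ranks first because "basketball" and "team" match its keywords, while the video game task matches none.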
- a task was identified based on contextual information that provided content of a previous conversation.
- other types of contextual information may be used.
- a task may be identified based on contextual information that indicates a type of device a user is using to interact with a virtual assistant. If, for instance, the user requests “call Michelle,” and the user is using a desktop computer, a task for that context may be identified, such as calling an individual through a voice over internet protocol service or setting a reminder to call the individual at a later time (e.g., when the user is on his cell phone). Whereas, if the user is using a cell phone, a different task may be identified, such as calling the individual through a cellular connection.
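- The "call Michelle" example can be sketched by adding the device type to the lookup key. The tuple key, the device labels, and the task names are hypothetical; the application does not state how device context is encoded.

```python
# Hypothetical mapping from (action, object, device type) to a task.
DEVICE_TASKS = {
    ("call", "person", "desktop"): "call_via_voip",
    ("call", "person", "cell_phone"): "call_via_cellular",
}


def identify_task(action, obj, device_type, default=None):
    """Pick the task registered for this device type, else the default."""
    return DEVICE_TASKS.get((action, obj, device_type), default)


desktop_task = identify_task("call", "person", "desktop")
phone_task = identify_task("call", "person", "cell_phone")
```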
- contextual information may be used to identify an order to perform multiple tasks.
- the task for buying the movie tickets may be performed first based on a calendar event for a date with the user's girlfriend (e.g., indicating that the user may want to buy the tickets first so that he can mention the tickets to his girlfriend). Thereafter, the task for calling the user's girlfriend may be performed.
- the architecture 100 also includes the service provider 114 that includes one or more data stores 136 for storing content items.
- the one or more data stores 136 may include a mobile web data store, a smart web data store, an information and content data store, a content management service (CMS) data store, and so on.
- a mobile web data store may store content items that are designed to be viewed on a mobile device, such as a mobile telephone, tablet device, etc.
- a web data store includes content items that are generally designed to be viewed on a device that includes a relatively large display, such as a desktop computer.
- An information and content data store may include content items associated with an application, content items from a database, and so on.
- a CMS data store may include content items providing information about a user, such as a user preference, user profile information, information identifying offers that are configured to a user based on profile and purchase preferences, etc.
- the service provider 114 may include content items from any type of source.
- Although the one or more data stores 136 are illustrated as included in the service provider 114 , the one or more data stores 136 may alternatively, or additionally, be included in the virtual assistant service 116 and/or the smart device 102 .
- the architecture 100 may include a current context service 138 to provide current information about a context.
- the current context service 138 may provide information about current events (e.g., news articles, sports scores, blog content, social media content, a current flight status (e.g., on-time, delayed, etc.), and so on), location information about a user (e.g., a user's current location), current weather information, current times and/or dates (e.g., a current time in Japan, a current time in the US, a current time and date where a user is located, etc.), and so on.
- this information is stored at the current context service 138 , while in other instances the information is sent to the service provider 114 , the virtual assistant service 116 , and/or the smart device 102 for storage.
- the current context service 138 may communicate with the virtual assistant service 116 to provide information that may be useful to the virtual assistant 112 .
- the architecture 100 may also include one or more networks 140 to enable the smart device 102 , the virtual assistant service 116 , the service provider 114 , and/or the current context service 138 to communicate with each other.
- the one or more networks 140 may include any one or combination of multiple different types of networks, such as cellular networks, wireless networks, Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and so on.
- FIG. 2 illustrates further details of the example virtual assistant service 116 of FIG. 1 .
- the virtual assistant service 116 may generally provide one or more services to implement the virtual assistant 112 on the smart device 102 .
- the virtual assistant service 116 may include one or more computing devices.
- the one or more computing devices may be implemented as one or more desktop computers, laptop computers, servers, and the like.
- the one or more computing devices may be configured in a cluster, data center, cloud computing environment, or a combination thereof.
- the virtual assistant service 116 provides cloud computing resources, including computational resources, storage resources, and the like, that operate remotely to the smart device 102 .
- the one or more computing devices of the virtual assistant service 116 may include one or more processors 202 , memory 204 , and one or more network interfaces 206 .
- the one or more processors 202 may include a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, a digital signal processor, and so on.
- the memory 204 may include software functionality configured as one or more “modules.”
- The term “module” is intended to represent example divisions of the software for purposes of discussion, and is not intended to represent any type of requirement or required method, manner or necessary organization.
- the memory 204 includes an input processing module 208 , a task mapping module 210 , a learning module 212 , and a context module 214 .
- the input processing module 208 may be configured to obtain and/or process input received from a user. If, for example, the input is speech input, the input processing module 208 may perform speech recognition techniques to convert the input into a format that is understandable by a computing device, such as text. The input processing module 208 may store input in an input data store 216 . The input processing module 208 may also be configured to determine a task to perform. To make such a determination, the input processing module 208 may include an action-object module 218 and a task module 220 .
- the action-object module 218 may determine (e.g., identify) an action and/or an object for user input.
- An action and/or object may be explicitly included in user input, inferred from the structure and/or context of the user input, and/or obtained by prompting a user (e.g., if missing information or context).
- the action-object module 218 may utilize various techniques, such as Part-of-Speech Tagging (POST), probabilistic or statistical speech modeling, Natural Language Processing (NLP), pattern recognition language modeling, and so on. These techniques may seek to interpret or derive a meaning and/or concept of input and may include new and/or existing techniques.
- an action and/or object may be associated with a confidence score indicating an estimated level of accuracy that the action or object was correctly identified.
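- As a toy stand-in for the techniques listed above, the sketch below picks the first known verb and noun from the input and attaches a crude confidence score. The hand-made `VERBS` and `NOUNS` lexicons replace a real part-of-speech tagger, and the confidence formula is an assumption; neither reflects how the action-object module 218 is actually implemented.

```python
import re

# Tiny hand-made lexicons standing in for a real part-of-speech tagger (assumption).
VERBS = {"listen", "play", "book", "call", "check"}
NOUNS = {"music", "song", "flight", "status", "weather"}


def extract_action_object(text):
    """Return (action, object, confidence); parts that are not found are None."""
    tokens = re.findall(r"[a-z']+", text.lower())
    action = next((t for t in tokens if t in VERBS), None)
    obj = next((t for t in tokens if t in NOUNS), None)
    # Crude confidence: the fraction of the pair that was found explicitly.
    confidence = sum(part is not None for part in (action, obj)) / 2
    return action, obj, confidence


explicit = extract_action_object("listen to music")
partial = extract_action_object("the weather looks nice today")
```

- In the second call no action is stated, so the action would have to be inferred from context or obtained by prompting the user, as the text notes.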
- the task module 220 may determine (e.g., identify) a task to be performed based on an action, object, and/or variable value of user input.
- the task module 220 may reference one or more task maps stored in a task map data store 222 .
- the task module 220 may identify matching information in a task map (e.g., all information in the map that includes a determined action, object, and/or variable value) and tasks that are associated with the matching information.
- the task module 220 may identify all candidate tasks in a task map that are associated with an action, object, and/or variable value that are identified from user input (e.g., all rows of a table-based task map that include an identified action, object, or variable value).
- Each candidate task may be associated with a confidence score that is based on the confidence scores of the associated action and/or object for the task.
- the task module 220 may then determine whether or not any of the candidate tasks satisfy one or more criteria, such as being the only task that is associated with identified information in the task map and/or being associated with a confidence score that is greater than a threshold. When such criteria are satisfied, the task may be selected. Alternatively, if the criteria are not satisfied, then the virtual assistant service 116 may identify a task to be performed by prompting the user for information and/or ranking the candidate tasks.
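- The selection criteria above can be sketched as a small decision function. The threshold value of 0.8, the choice to treat the two criteria as alternatives, and the outcome labels are assumptions; the text only states that a task may be selected when it is the only candidate and/or its confidence score exceeds a threshold.

```python
def select_task(candidates, threshold=0.8):
    """candidates: list of (task, confidence) pairs. Return (task, outcome)."""
    if not candidates:
        return None, "prompt_user"
    # Rank by confidence so the strongest candidate is considered first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_task, best_score = ranked[0]
    # Assumed criteria: sole candidate, or confidence above the threshold.
    if len(ranked) == 1 or best_score > threshold:
        return best_task, "selected"
    # Otherwise fall back to asking the user to clarify.
    return None, "prompt_user"


clear = select_task([("provide_sports_score", 0.9), ("provide_video_game_score", 0.5)])
unclear = select_task([("provide_sports_score", 0.6), ("provide_video_game_score", 0.5)])
```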
- the task module 220 may make an initial determination as to a context in which user input is received and reference a task map that is customized for the context.
- the context may comprise a particular industry (e.g., field of use), platform (e.g., type of software/hardware architecture—mobile operating system, desktop operating system, etc.), device or device type, user, user type, location (e.g., user location), and so on.
- the task module 220 may reference a task map that is customized for a mobile platform, whereas when the user is using a laptop, the task module 220 may reference a task map that is customized for a laptop platform.
- the task module 220 may reference a task map that is personalized for a user, when the user is interacting with the virtual assistant 112 .
- the user may be identified through voice recognition, device identification information, etc.
- the virtual assistant service 116 may utilize different task maps for different contexts.
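- One way to picture the use of different task maps for different contexts is a layered lookup. The sketch below is illustrative only: the map contents, the identifier `user-104`, the task names, and the precedence order (user entries over platform entries over defaults) are assumptions, not details given in this description.

```python
from typing import Dict, Optional, Tuple

ActionObject = Tuple[str, str]
TaskMap = Dict[ActionObject, str]

# Hypothetical default, platform-specific, and user-specific entries.
DEFAULT_MAP: TaskMap = {
    ("send", "message"): "send_text_message",
    ("play", "music"): "play_music",
}
PLATFORM_MAPS: Dict[str, TaskMap] = {
    "mobile": {("play", "music"): "open_mobile_music_app"},
    "laptop": {("play", "music"): "open_desktop_music_app"},
}
USER_MAPS: Dict[str, TaskMap] = {
    "user-104": {("send", "message"): "send_email"},
}


def resolve_task_map(platform: Optional[str] = None,
                     user_id: Optional[str] = None) -> TaskMap:
    """Build the task map for a context by layering customizations over defaults."""
    resolved = dict(DEFAULT_MAP)
    # Assumed precedence: platform entries override defaults, user entries override both.
    resolved.update(PLATFORM_MAPS.get(platform, {}))
    resolved.update(USER_MAPS.get(user_id, {}))
    return resolved
```

- For instance, `resolve_task_map("mobile", "user-104")` would map the pair send-message to the hypothetical `send_email` task while still using the mobile entry for play-music.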
- the task module 220 may additionally, or alternatively, determine variable values for variables that are associated with a task.
- a variable value may generally relate to any type of information that may be useful for performance of a task.
- a variable value may be obtained from user input, contextual information, and so on.
- a variable of an associated variable value may include, for example:
- the task module 220 may also cause a task to be performed by the virtual assistant 112 . This may include performing the task at the virtual assistant service 116 and/or sending an instruction to another device (e.g., the smart device 102 ) to perform the task. To illustrate, in response to input of “what is the weather like today?,” the task module 220 may send an instruction to the smart device 102 to retrieve weather information and output the information to the user 104 . In another illustration, in response to “please change my password to Hawaii39,” the virtual assistant service 116 may reference information of the user to change the password. The virtual assistant service 116 may then cause a response to be output indicating that the password has been changed (e.g., “your password has been changed to Hawaii39”).
- the task mapping module 210 may configure one or more task maps. This may generally include associating an action, object, and/or variable value with a task.
- the task mapping module 210 may generate and/or customize a task map for a particular context, such as a particular industry, platform, device type, user, user type, location, and so on.
- a task map may be personalized for a particular user based on contextual information related to that user. To illustrate, if the virtual assistant service 116 learns over time that a user inputs “send a message” to initiate a task of sending an email, in contrast to sending a text message, the task mapping module 210 may associate “send a message” (e.g., the action-object of the phrase) with the task of sending an email.
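- The personalization described above, in which an association is learned over time, might be approximated by counting which task a user actually initiates for a given action-object pair. The following is a rough sketch under stated assumptions: the class `PersonalTaskMap`, the task names, and the rule of remapping after three observations are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Tuple

ActionObject = Tuple[str, str]


class PersonalTaskMap:
    """Task map that remaps a pair once the user repeatedly chooses another task."""

    def __init__(self, defaults: Dict[ActionObject, str], min_observations: int = 3):
        self.mapping = dict(defaults)
        self.min_observations = min_observations  # assumed threshold
        self._observed = defaultdict(Counter)

    def observe(self, action_object: ActionObject, task: str) -> None:
        """Record the task the user actually initiated for this action-object pair."""
        counts = self._observed[action_object]
        counts[task] += 1
        top_task, top_count = counts.most_common(1)[0]
        if top_count >= self.min_observations:
            self.mapping[action_object] = top_task


personal = PersonalTaskMap({("send", "message"): "send_text_message"})
for _ in range(3):
    personal.observe(("send", "message"), "send_email")
```

- After the three observations in the usage lines, the pair send-message resolves to the email task rather than the text message task.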
- the learning module 212 may learn information to be associated with a task, such as an action, object, and/or variable value. To do so, the learning module 212 may generally analyze contextual information related to a user or conversation. To illustrate, assume that the user states “let's jam” in an effort to listen to music and the virtual assistant service 116 incorrectly interprets this input as corresponding to a different task (e.g., searching for fruit jam on the internet), which is then performed and a response is sent to the user. Here, the user may have ignored the response of the virtual assistant 112 (e.g., closed a browser window) and opened a music application to listen to music. In this illustration the learning module 212 may learn that the particular action-object pair for “let's jam” is to be associated with the task of playing music.
- the learning module 212 may also observe user activity and attempt to learn characteristics about a user.
- the learning module 212 may learn any number of characteristics about the user over time, such as user preferences (e.g., likes and dislikes), track patterns (e.g., user normally reads the news starting with the sports, followed by the business section, followed by the world news), behaviors (e.g., listens to music in the morning and watches movies at night, speaks with an accent that might impact language models, prefers own music collection rather than looking for new music in the cloud, etc.), and so on.
- the learning module 212 may access a user profile, track a pattern, monitor navigation of the user, monitor content that is output to the user, and so on. Each of these learned characteristics may be useful to provide context that may be utilized to interpret user input and/or to identify a task.
- the learning module 212 can record this correction from “Cobo” to “Cabo” in the event that a similar situation arises in the future.
- the virtual assistant service 116 may use the learned correction and make a new assumption that the user means “Cabo” and respond accordingly.
- the learning module 212 will learn over time that this is the user preference and make this assumption.
- the virtual assistant service 116 will make a different initial assumption to begin play of the movie, rather than the original assumption of the song “Crazy” by Willie Nelson.
- the context module 214 may be configured to identify (e.g., determine) one or more pieces of contextual information.
- Contextual information may be used to identify and/or weight an action, object, variable value, and/or task. For example, for input of “I want to buy a new coat,” the virtual assistant service 116 may reference a recent conversation in which the user requested directions to a clothing store to purchase a coat. Based on this conversation, it may be determined that the user is more interested in purchasing the coat at the store (e.g., a task of creating a reminder to purchase the coat upon arrival at the store), rather than purchasing the coat through a phone (e.g., a task of navigating to an online e-commerce site).
- contextual information may be utilized when providing a response to a user and/or when no query has been received (e.g., providing relevant information to a user upon arrival at a particular location).
- contextual information may be weighted toward providing more or less impact than other contextual information.
- contextual information may comprise any type of information that aids the virtual assistant 112 in interacting with a user (e.g., understanding the meaning of a query of a user, formulating a response, determining a task to be performed, etc.).
- Contextual information may be stored in the context data store 130 .
- Example, non-limiting pieces of contextual information may include:
- Although the modules 208-214 are illustrated as being included in the virtual assistant service 116, in some instances one or more of these modules may be included in the smart device 102 or elsewhere. As such, in some examples the virtual assistant service 116 may be eliminated entirely, such as in the case when all processing is performed locally at the smart device 102 (e.g., the smart device 102 operates independently).
- any of these operations, and/or other techniques described herein may be implemented as one or more hardware logic components, such as Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.
- the memory 108 and/or 204 may include one or a combination of computer storage media.
- Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
- Computer storage media includes, but is not limited to, phase change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device.
- FIG. 3 illustrates an example process to determine a task to be performed by the virtual assistant 112 .
- the user 104 has provided input 302 “I want to buy a new coat,” which has been sent to the virtual assistant service 116 for processing.
- the action-object module 218 may determine actions and objects 304 for the input 302 (actions—want, buy; object—coat).
- the actions and objects 304 may each be associated with a confidence score indicating an estimated level of accuracy that the action or object was correctly identified.
- the actions and objects 304 may be passed to the task module 220 .
- the task module 220 may identify information in a task map 306 that matches the actions and objects 304 .
- the task module 220 may identify all rows in the task map 306 that include an action or an object of the determined actions and objects 304.
- two rows have been identified, a row associated with a task 308 of setting a reminder and a row associated with a task 310 of purchasing an item.
- in order to select a task for performance, the task module 220 ranks the tasks 308 and 310 and selects the task that ranks the highest, namely the task 308.
- the ranking may be based on confidence scores of the tasks 308 and 310 , which are based on the confidence scores of the associated actions and objects.
- the confidence scores of the actions and/or objects may be assigned by the action-object module 218 .
- a task may be selected by asking the user 104 what task the user is requesting. As illustrated in FIG. 3 , the task 308 ranks the highest and, as such, is selected for performance.
- the task module 220 may also identify values for variables 312 that are associated with the task 308 by analyzing the input 302 and/or contextual information.
- the task module 220 references conversation history 314 that indicates that the user 104 recently requested directions to the mall (e.g., in a conversation early that morning). Based on this conversation, the task module 220 may determine a value for a destination variable for triggering a reminder (e.g., the mall) and a value for when to trigger the reminder (e.g., upon arrival).
- the virtual assistant service 116 may then perform the task 308 of setting a reminder based on the values for the variables 312 .
- Although the values for the variables 312 are identified upon determining a task to be performed, in other examples the values may be identified when the input 302 is processed and/or at other times.
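- The way variable values for the reminder task of FIG. 3 could be drawn from conversation history can be sketched as below. The `ConversationEntry` type, its `request` labels, and the variable names `destination` and `trigger` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ConversationEntry:
    request: str  # hypothetical label for what the user asked for
    place: str    # place mentioned in that request


def reminder_variable_values(history: List[ConversationEntry]) -> Dict[str, str]:
    """Derive reminder variables from the most recent request for directions."""
    for entry in reversed(history):  # newest entries are at the end of the list
        if entry.request == "directions":
            return {"destination": entry.place, "trigger": "upon arrival"}
    return {}  # nothing usable in the history; the values remain missing


history = [
    ConversationEntry("weather", "Spokane"),
    ConversationEntry("directions", "the mall"),
]
values = reminder_variable_values(history)
```

- With this history, the sketch yields a destination of the mall and a trigger of upon arrival, matching the example values described for the variables 312.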
- FIG. 4 illustrates an example user interface 400 to enable a user to customize task preferences of the virtual assistant 112 .
- the interface 400 may be provided through the smart device 102 to enable the user 104 to configure information related to a task.
- the interface 400 may also be presented through other devices.
- the user 104 may input a phrase, such as “book it,” to be associated with a task selected through a drop down menu 404 , such as a task of reserving a flight.
- the user 104 may input an action (e.g., verb) into an input field 406 and/or may input an object (e.g., noun) into an input field 408 to be associated with the selected task.
- the user 104 may input variable values to be associated with the task that is selected through the drop down menu 404 .
- the user 104 may specify a window seat as a seat preference for the seat preference variable.
- the virtual assistant 112 may seek to find a window seat for the user 104 when reserving a flight.
- the user 104 may select a submit button 412 to configure the virtual assistant 112 according to the specified information (e.g., associate the information with the task of reserving a flight).
- a user may customize the virtual assistant 112 to operate in a personalized manner (e.g., customize a task map of the virtual assistant 112 ).
- the virtual assistant 112 may be customized so that the phrase “book it” corresponds to the task of reserving a flight.
- the interface 400 may enable the user 104 to specify custom tasks to be performed by the virtual assistant 112 .
- the user 104 may specify a custom task of vibrating in response to “shake it.” This may further allow the user 104 to customize a task map of the virtual assistant 112 .
- FIGS. 5A, 5B, and 6 illustrate example processes 500 and 600 for employing the techniques described herein.
- processes 500 and 600 are described as being performed in the architecture 100 of FIG. 1 .
- one or more of the individual operations of the processes 500 and 600 may be performed by the smart device 102 and/or the virtual assistant service 116 .
- the processes 500 and 600 are implemented at least in part by the virtual assistant 112 .
- the processes 500 and 600 may be performed in other architectures.
- the architecture 100 may be used to perform other processes.
- the processes 500 and 600 are illustrated as a logical flow graph, each operation of which represents a sequence of operations that can be implemented in hardware, software, or a combination thereof.
- the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations.
- computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular abstract data types.
- the order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement the process. Further, any number of the described operations may be omitted.
- FIGS. 5A-5B illustrate the example process 500 to determine a task to be performed by a virtual assistant.
- the virtual assistant service 116 may obtain user input from the smart device 102 .
- the user input may be received at the smart device 102 during a conversation between a user and the virtual assistant 112 .
- the user input may be sent to the virtual assistant service 116 for processing.
- the virtual assistant service 116 may obtain contextual information and/or weight the contextual information.
- the contextual information may include, for example, conversation history of the user with the virtual assistant 112 , content output history that identifies content that has been consumed by the user, user preference information, device information indicating a type of device that is being used by the user, and so on.
- a piece of contextual information may be weighted more or less heavily than another piece of contextual information (e.g., weighted toward providing more or less impact than another piece of contextual information).
- the weighting may be based on a time associated with contextual information (e.g., a time that the contextual information was created), a predetermined value, and so on.
- a recent conversation with the virtual assistant 112 may be weighted more heavily than other contextual information that was created last week, such as content output history that indicates what the user viewed on the web last week.
- a user preference may be weighted more heavily than a current time of day based on a predetermined weighting scheme, which may be configurable by a user.
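- A weighting based on both a time associated with contextual information and a predetermined value might look like the sketch below. The base weights, the exponential decay, and the 24-hour half-life are assumptions for illustration; the description does not specify a formula.

```python
from datetime import datetime, timedelta

# Hypothetical predetermined weights per kind of contextual information.
BASE_WEIGHTS = {
    "conversation_history": 1.0,
    "content_output_history": 1.0,
    "user_preference": 2.0,
    "time_of_day": 0.5,
}


def context_weight(kind: str, created_at: datetime, now: datetime,
                   half_life_hours: float = 24.0) -> float:
    """Combine a predetermined weight with a recency factor that halves every half-life."""
    age_hours = max((now - created_at).total_seconds() / 3600.0, 0.0)
    recency = 0.5 ** (age_hours / half_life_hours)  # assumed decay model
    return BASE_WEIGHTS.get(kind, 1.0) * recency


now = datetime(2018, 8, 20, 12, 0)
recent_conversation = context_weight("conversation_history", now - timedelta(hours=2), now)
last_week_browsing = context_weight("content_output_history", now - timedelta(days=7), now)
```

- Under these assumptions, the conversation from two hours ago receives a larger weight than browsing history from a week ago, and a user preference outweighs the current time of day when both are equally recent.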
- the virtual assistant service 116 may analyze the user input and/or contextual information to determine (e.g., identify) an action(s) and/or an object(s). This may include utilizing various input processing techniques, such as POST, probabilistic or statistical speech modeling, NLP, pattern recognition language modeling, and so on.
- An action and/or an object may be expressly found in the user input (e.g., identifying an action of “send” for input of “please send a text message”), determined based on an analysis of the user input (e.g., determining an action of “purchase” for input of “I want to buy a new coat,” where “purchase” corresponds to a synonym for “buy”), and so on.
- An action may comprise a verb, while an object may comprise a noun (e.g., proper noun or common noun).
- the operation 506 may include determining all candidate actions and/or objects for the user input (e.g., all possible actions and/or objects). Each identified action or object may be associated with a confidence score indicating an estimated level of accuracy that the action or object was correctly identified. To illustrate, if a user states “I want to buy a new coat,” the virtual assistant service 116 may identify an action of “want” as being associated with a relatively low confidence score and identify an action of “buy” as being associated with a relatively high confidence score.
- contextual information may be used to identify an action and/or object and/or to assign a confidence score to the action and/or object.
- the contextual information may be weighted in some examples.
- the virtual assistant service 116 may reference a recent conversation in which the user requested directions to a clothing store to purchase a coat. Based on this conversation, an action of “want” may be assigned a relatively high confidence score, while an action of “buy” may be assigned a relatively low confidence score.
- confidence scores may suggest that the user is more interested in purchasing the coat at the store (e.g., an action-object pair of want-coat that is associated with a task of creating a reminder to purchase the coat upon arrival at the store), rather than purchasing the coat through a phone (e.g., an action-object pair of buy-coat that is associated with a task of navigating to an online e-commerce site).
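- The effect of contextual information on confidence scores in the coat example can be sketched as an adjustment applied to baseline scores. The numeric scores and adjustments below are invented for illustration, and the additive, clamped update is an assumption rather than a method stated in this description.

```python
from typing import Dict


def apply_context(scores: Dict[str, float], adjustments: Dict[str, float]) -> Dict[str, float]:
    """Shift each confidence score by a context-derived adjustment, clamped to [0, 1]."""
    return {
        name: min(1.0, max(0.0, score + adjustments.get(name, 0.0)))
        for name, score in scores.items()
    }


# Baseline scores for "I want to buy a new coat" without contextual information.
baseline = {"want": 0.3, "buy": 0.8}
# Hypothetical adjustment derived from a recent request for directions to a clothing store.
store_directions_context = {"want": 0.5, "buy": -0.5}
rescored = apply_context(baseline, store_directions_context)
```

- Before the adjustment the action "buy" scores higher; afterwards "want" scores higher, which favors the want-coat pair and its reminder task.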
- the virtual assistant service 116 may identify matching information in a task map for the action(s) and/or object(s) determined at 506 .
- the virtual assistant service 116 may identify actions and objects in the task map that correspond to all candidate action(s) and/or object(s) determined at 506 .
- if the task map is represented as a table with columns for actions, objects, and tasks (e.g., the example task map 124 illustrated in FIG. 1), the virtual assistant service 116 may identify each row that includes at least one piece of matching information (e.g., at least one action or object determined at 506).
- the virtual assistant service 116 may determine whether or not a task that is associated with matching information in the task map satisfies one or more criteria, such as being the only task that is associated with matching information and/or being associated with a confidence score that is greater than a threshold. This may generally include an initial determination as to whether or not a task is identified to be performed. For example, the virtual assistant service 116 may reference the task map to determine if the matching information in the task map corresponds to a single task (e.g., a single row is identified). In another example, the virtual assistant service 116 may determine whether or not a task that is associated with matching information in the task map is associated with a confidence score that is greater than a threshold. A confidence score of a task may be based on confidence scores of associated actions and/or objects. To illustrate, a task may be associated with a relatively high confidence score, in comparison to another task, when an action and object of the task are associated with relatively high confidence scores.
- when no task that is associated with matching information satisfies the one or more criteria, the process 500 may proceed to 512 (e.g., the NO path). In many instances, the process 500 may proceed to 512 when the matching information in the task map corresponds to multiple tasks. Alternatively, when a task that is associated with matching information satisfies the one or more criteria, the process 500 may proceed to FIG. 5B (e.g., the YES path).
- the virtual assistant service 116 may determine whether or not to prompt the user for additional information regarding task performance. This may include determining whether or not a setting has been set to prompt the user. This setting may be configured by end-users, users of the virtual assistant service 116 , applications, and so on.
- when the virtual assistant service 116 determines to prompt the user, the process 500 may proceed to 514 (e.g., the YES path).
- otherwise, the process 500 may proceed to 516 (e.g., the NO path).
- the virtual assistant service 116 may prompt the user for input regarding what task the user is requesting to be performed. This may include sending an instruction to the smart device 102 to prompt the user for information that clarifies what task the user is requesting to be performed.
- the virtual assistant service 116 may provide the user with information that the virtual assistant service 116 has identified, such as an identified action, object, or task.
- the virtual assistant service 116 may identify a candidate task of setting a reminder based on an identified action-object pair of want-coat and may identify another candidate task of purchasing the coat based on an identified action-object pair of purchase-coat.
- the process 500 may return to 506 and analyze the user input and/or contextual information to determine actions and/or objects in order to further narrow down what task the user is requesting.
- the conversation between the virtual assistant 112 and the user may be goal-based.
- the virtual assistant service 116 may seek to accomplish a goal, such as collecting a threshold amount of information to identify a task.
- the conversation between the user and the virtual assistant 112 may be substantially driven by input of the user.
- the virtual assistant service 116 may ask questions and/or receive user input until a task is identified. If the user asks a question that is not related to identifying a task, the virtual assistant 112 may seek to resolve the question and then return to the task identification conversation.
- the virtual assistant service 116 may rank multiple tasks that are associated with matching information in the task map and may select a task(s) from the ranking. This may be useful when the virtual assistant service 116 has identified multiple candidate tasks from the task map (e.g., potential tasks). The ranking may be based on confidence scores associated with the multiple tasks.
- the virtual assistant service 116 may identify a candidate task of setting a reminder (e.g., a row within the task map that includes a matching action and/or object) and another candidate task of purchasing the coat (e.g., another row within the task map that includes a matching action and/or object).
- the virtual assistant service 116 may then, for example, rank the task of purchasing the coat higher than the task of setting a reminder based on the purchasing task being associated with a higher confidence score.
- a confidence score of a task may be based on a confidence score of associated actions and/or objects, which may be based on contextual information.
- the virtual assistant service 116 may then select a task(s) that ranks the highest/lowest (or the nth highest/lowest) within the ranking.
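- Ranking candidate tasks by confidence score and selecting from the ranking can be sketched in a few lines. The task names and scores are illustrative assumptions; only the idea of ranking by confidence and picking the nth entry comes from the description above.

```python
from typing import Dict, List, Tuple


def rank_tasks(task_scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order candidate tasks from highest to lowest confidence score."""
    return sorted(task_scores.items(), key=lambda item: item[1], reverse=True)


def select_ranked(task_scores: Dict[str, float], n: int = 1) -> str:
    """Return the task that ranks nth highest (n=1 is the top-ranked task)."""
    ranking = rank_tasks(task_scores)
    if not 1 <= n <= len(ranking):
        raise ValueError("n is outside the ranking")
    return ranking[n - 1][0]


# Hypothetical confidence scores for the two candidate tasks in the coat example.
scores = {"set_reminder": 0.65, "purchase_item": 0.75}
top_task = select_ranked(scores)
```

- With these scores the purchasing task ranks above the reminder task, as in the example above; different contextual information could reverse the order.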
- A variable value may include a value for a variable that is used to perform a task (also referred to as a value for a task variable). For example, in order to perform a task of purchasing a flight, particular variable values may be gathered, such as a departure location, a destination location, an airline, a type of seat requested (e.g., first class, coach, etc.), a date of departure, and so on.
- the analysis at 518 may generally seek to identify variable values from the user input obtained at 502 , the user input received in response to prompting the user at 514 , and/or the contextual information obtained and/or weighted at 504 .
- the analysis may include referencing a variable(s) associated with a task and analyzing user input and contextual information to determine if a term or phrase in the user input or contextual information matches a word type of a variable(s) (e.g., noun, verb, adjective, etc.) and/or a category of a variable(s) (e.g., location, number, item, food, or any general classification of a word or variable).
- for a destination variable (e.g., a city category), the virtual assistant service 116 may search within user input and/or contextual information for a destination city (e.g., which may be included within the user input, described in user preference information, etc.). If, for example, the user previously had a conversation about traveling to Seattle, the virtual assistant service 116 may identify Seattle as the value for the destination variable. Additionally, a departure city of Spokane may be identified based on the user's current location, namely Spokane, and a seat type may be identified based on a seat preference that the user has set.
- the virtual assistant service 116 may determine whether or not a variable value(s) for performing a task is missing. If a predetermined number of variable values is missing for a task (e.g., more than 1 or 2), the process 500 may proceed to 522 (e.g., the YES path). Alternatively, if the predetermined number of variable values is not missing, the process 500 may proceed to 524 (e.g., the NO path).
- the virtual assistant service 116 may prompt the user for the missing variable value(s). This may include sending an instruction to the smart device 102 to prompt the user for the missing variable value(s). In some instances, this may also include informing the user of the variable values that have been gathered. For example, if the variable value for an airline is missing, the virtual assistant service 116 may ask “What airline would you like to use for your flight?” Upon receiving user input, the process 500 may return to 518 and analyze the user input and/or contextual information. In some examples, the operation 522 may include performing a goal-based dialog for each of the variables (e.g., carrying out separate requests to the user for each variable value).
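- The check for missing variable values and the resulting prompts can be sketched as follows. The variable names, the `allowed_missing` parameter, and the generic fallback prompt are assumptions; the airline prompt text is taken from the example above, and the return values 522 and 524 simply echo the operation numbers of the process 500.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical variables for a flight task, loosely following the example in the text.
FLIGHT_VARIABLES = ("departure", "destination", "airline", "seat_type", "date")
PROMPTS = {"airline": "What airline would you like to use for your flight?"}


def missing_variables(values: Dict[str, Optional[str]]) -> List[str]:
    """List the task variables that do not yet have a value."""
    return [name for name in FLIGHT_VARIABLES if not values.get(name)]


def next_operation(values: Dict[str, Optional[str]],
                   allowed_missing: int = 0) -> Tuple[int, List[str]]:
    """Return (522, prompts) when too many values are missing, else (524, [])."""
    missing = missing_variables(values)
    if len(missing) > allowed_missing:
        prompts = [PROMPTS.get(name, f"What is the {name.replace('_', ' ')}?")
                   for name in missing]
        return 522, prompts
    return 524, []


gathered = {"departure": "Spokane", "destination": "Seattle",
            "seat_type": "window", "date": "tomorrow"}
operation, prompts = next_operation(gathered)
```

- Here only the airline is missing, so the sketch routes to the prompting operation with a single question; once the user answers, the same check would route to task performance.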
- the virtual assistant service 116 may cause the task to be performed (e.g., by the virtual assistant 112 ). This may include performing the task at the virtual assistant service 116 , sending an instruction to the smart device 102 to perform the task, sending an instruction to another device, and so on. If the task is associated with variables, the values for the variables may be used to perform the task.
- the virtual assistant service 116 may learn information to be associated with a task, such as an action(s), object(s), and/or variable value(s).
- the virtual assistant service 116 may seek to identify a task that was desired by a user for input.
- the virtual assistant service 116 may identify input that is received from a user during a conversation and determine whether or not one or more criteria are satisfied to classify a particular task that was performed by the virtual assistant 112 for the input as an accurately identified task.
- the one or more criteria may be satisfied when the user views a response of the virtual assistant 112 for more than a predetermined amount of time, the user continues a conversation with the virtual assistant 112 (e.g., provides further input that does not clarify the previous input), the virtual assistant 112 confirms that it performed the correct task through direct questioning (e.g., asking the user if a performed task was the task the user desired), or the user otherwise acts to indicate that the virtual assistant 112 performed a task that the user desired.
- the virtual assistant service 116 may identify a task that was initiated by the user after the particular task was performed by the virtual assistant 112 (e.g., the user accessing an app, navigating to content, etc.). The virtual assistant service 116 may then identify an action and/or an object of the input to be associated with the task that was initiated by the user. In some instances, at 526 the virtual assistant 112 may ask the user if it should apply learned information to future conversations. By performing learning techniques, the virtual assistant service 116 may learn a task that is to be associated with an action and/or object that is determined for input.
- the virtual assistant service 116 may learn that the particular action-object pair that is determined for “let's jam” is to be associated with the task of playing music.
- the virtual assistant service 116 may learn that the object determined for “my team” is to be associated with the particular baseball team. That is, the virtual assistant service 116 may learn that the action-object pair that is determined for “how did my team do last night?” is to be associated with a task of navigating to the specific page of the particular baseball team.
- the virtual assistant service 116 may learn a variable value based on a conversation between the user and the virtual assistant 112 .
- the virtual assistant service 116 may learn that when the user refers to “Jaime,” the user is actually referring to “James” who is listed as a contact on the user's device (e.g., the user says “send a text message to Jaime . . . oh wait I mean James,” the user corrects a to-field to “James” for a text message that is generated by the virtual assistant 112 as a response to “send a text message to Jaime,” etc.).
- the learning at 526 may alternatively, or additionally, be based on explicit input from the user requesting an association.
- the virtual assistant service 116 may learn that a task of playing music is to be associated with an action-object pair that is determined for “let's jam” based on input from a user of “please associate let's jam with playing music.”
- the input is received through a user interface that enables customization of task and action-object relationships, such as the interface 400 of FIG. 4.
- the virtual assistant service 116 may configure a task map. This may include associating an action, object, and/or variable value with a task based on the learning at 526. In returning to the example above where the virtual assistant service 116 has learned that input of “let's jam” is to be associated with the task of playing music, the virtual assistant service 116 may associate an action-object pair that is determined for “let's jam” with the task of playing music. Alternatively, or additionally, a task map may be configured according to the process 600 of FIG. 6.
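- The explicit-association learning and task map configuration described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the request pattern, the `TaskMap` class, the `learn_from_request` helper, and the placeholder action-object pair returned for “let's jam” are all hypothetical names introduced here.

```python
import re
from collections import defaultdict

# Hypothetical request format: "please associate <phrase> with <task>".
ASSOCIATE_PATTERN = re.compile(
    r"^\s*(?:please\s+)?associate\s+(?P<phrase>.+?)\s+with\s+(?P<task>.+?)\s*$",
    re.IGNORECASE,
)


class TaskMap:
    """Maps (action, object) pairs to one or more task names."""

    def __init__(self):
        self._tasks = defaultdict(list)

    def associate(self, action, obj, task):
        # Avoid storing the same task twice for a pair.
        if task not in self._tasks[(action, obj)]:
            self._tasks[(action, obj)].append(task)

    def tasks_for(self, action, obj):
        return list(self._tasks.get((action, obj), []))


def learn_from_request(task_map, request, determine_pair):
    """Parse an explicit association request and update the task map.

    `determine_pair` is a stand-in for whatever analysis yields the
    action-object pair for a phrase; it is supplied by the caller.
    Returns True when the request was understood.
    """
    match = ASSOCIATE_PATTERN.match(request)
    if match is None:
        return False
    action, obj = determine_pair(match.group("phrase"))
    task_map.associate(action, obj, match.group("task"))
    return True


# Placeholder analysis: the actual pair for "let's jam" is not specified
# in the text, so ("jam", None) is an assumption for illustration.
def toy_determine_pair(phrase):
    return ("jam", None) if phrase.lower() == "let's jam" else ("unknown", None)


task_map = TaskMap()
learned = learn_from_request(
    task_map, "please associate let's jam with playing music", toy_determine_pair
)
```

Storing a list of tasks per pair, rather than a single task, leaves room for a pair that maps to several candidate tasks.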
- FIG. 6 illustrates the example process 600 to configure a task map of a virtual assistant.
- the virtual assistant service 116 may identify a context for configuring a task map.
- the task map may map tasks to be performed by a virtual assistant to action-object pairs.
- the context may comprise, for example, an industry to which the virtual assistant 112 is to be deployed (e.g., field of use), a platform for which the virtual assistant 112 is to be deployed, a device type for which the virtual assistant 112 is to be deployed, a user for which the virtual assistant 112 is to be deployed, and so on.
- the virtual assistant service 116 may obtain information related to the context.
- the information may include, for example, one or more terms or phrases that are used for an industry, platform, device type, etc.
- the information may comprise contextual information related to a user.
- the virtual assistant service 116 may configure the task map for the context. This may include assigning a task to a particular action-object pair based on the information related to the context. For example, the virtual assistant service 116 may select a task based on the information related to the context and associate the task with a particular action-object pair. To illustrate, if the virtual assistant 112 is to be deployed into an airline industry application, then an action-object pair of provide-status may be associated with a task of providing a flight status (e.g., instead of a task of providing other status information, as may be the case in another industry).
- an action-object pair of provide-directions may be associated with a task of opening a navigation app (e.g., instead of a task of opening a directions-based web site, as may be the case in another platform).
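- The steps of the process 600 (identify a context, obtain information related to the context, and configure the task map) might be sketched as follows. The `CONTEXT_INFO` table, the task names, the `configure_task_map` function, and the "e-commerce" and "desktop" entries are assumptions chosen for illustration; the text names only the airline and navigation-app examples.

```python
# Hypothetical context-specific information; a real deployment would obtain
# this from data for the industry, platform, device type, or user.
CONTEXT_INFO = {
    ("industry", "airline"): {("provide", "status"): "provide_flight_status"},
    ("industry", "e-commerce"): {("provide", "status"): "provide_shipping_status"},
    ("platform", "mobile"): {("provide", "directions"): "open_navigation_app"},
    ("platform", "desktop"): {("provide", "directions"): "open_directions_web_site"},
}


def configure_task_map(contexts):
    """Build a task map for the given contexts.

    `contexts` is an iterable of (kind, value) tuples, e.g.
    [("industry", "airline"), ("platform", "mobile")]. Later contexts
    override earlier ones when they assign the same action-object pair.
    """
    task_map = {}
    for context in contexts:
        # Unknown contexts contribute nothing rather than raising.
        task_map.update(CONTEXT_INFO.get(context, {}))
    return task_map


airline_mobile_map = configure_task_map(
    [("industry", "airline"), ("platform", "mobile")]
)
```

With this arrangement the same action-object pair of provide-status resolves to a different task depending on which context the map was configured for.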
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Robotics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
- A growing number of users are using smart devices, such as smart phones, tablet computers, and so on, to interact with virtual assistants. The users may communicate with virtual assistants to perform a desired task, such as searching for content, checking into a flight, setting a calendar appointment, and so on. As the users provide input, the virtual assistants often incorrectly determine a task that the users are requesting. Accordingly, there is an increasing need to accurately identify a task to be performed by the virtual assistant for a user.
- The detailed description is set forth with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items or features.
- FIG. 1 illustrates an example architecture in which techniques described herein may be implemented.
- FIG. 2 illustrates details of an example virtual assistant service.
- FIG. 3 illustrates an example process to determine a task to be performed by a virtual assistant.
- FIG. 4 illustrates an example user interface to enable a user to customize task preferences of a virtual assistant.
- FIGS. 5A-5B illustrate an example process to determine a task to be performed by a virtual assistant.
- FIG. 6 illustrates an example process to configure a task map of a virtual assistant.
- This disclosure describes, in part, techniques for mapping actions and objects to tasks of a virtual assistant. In some instances, a user may interact with a virtual assistant on a smart device by providing input to the virtual assistant and/or receiving information from the virtual assistant. While interacting with the virtual assistant, the user may provide input that requests or otherwise facilitates a task to be performed by the virtual assistant. The virtual assistant may process the input to determine an action (e.g., verb) and an object (e.g., noun). For example, if the user inputs “listen to music,” the virtual assistant may identify the term “listen” as corresponding to the action and the term “music” as corresponding to the object.
- The virtual assistant may then identify a task to be performed by the virtual assistant. In some instances, the virtual assistant may reference a task map. The task map may map action-object pairs to tasks of the virtual assistant. A task may include any type of operation that is performed at least in part by a computing device. In returning to the example above, the virtual assistant may determine that the action of “listen” and the object of “music” are associated with a task of playing a song on the smart device. The virtual assistant may then perform the identified task (e.g., play the song). By utilizing a task map or other source of correlation that maps action-object pairs to tasks, a task may be efficiently identified for performance by the virtual assistant.
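- The flow described above (determine an action and an object, then reference a task map) might be sketched as follows. The first-word/last-word heuristic is a toy stand-in for real language analysis, and `STOP_WORDS`, `TASK_MAP`, the task names, and both function names are assumptions introduced for illustration.

```python
# Toy stop words dropped before picking the action and object.
STOP_WORDS = {"to", "the", "a", "an", "please", "some", "my"}

# Illustrative task map: action-object pair -> task name.
TASK_MAP = {
    ("listen", "music"): "play_song",
    ("set", "appointment"): "set_calendar_appointment",
}


def determine_action_object(text):
    """Return (action, object) using a first-word/last-word heuristic."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    content = [w for w in words if w and w not in STOP_WORDS]
    if len(content) < 2:
        return None
    return content[0], content[-1]


def identify_task(text, task_map=TASK_MAP):
    """Look up the task for the input, or None if nothing matches."""
    pair = determine_action_object(text)
    if pair is None:
        return None
    return task_map.get(pair)
```

For instance, `identify_task("listen to music")` looks up the pair ("listen", "music"). A dictionary keyed by the pair is only one possible form of task map.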
- In some instances, the virtual assistant may utilize contextual information to identify a task to be performed by the virtual assistant. The contextual information may include a conversation history of the user with the virtual assistant, content output history identifying content that has been output to the user, user preferences, location of the user, and so on. The contextual information may provide some indication of what task the user would like the virtual assistant to perform (e.g., what the user is requesting the virtual assistant to do). For example, if a user has discussed a flight in a recent conversation with the virtual assistant, and the user has just input “please provide a status,” which may map to multiple tasks (e.g., provide a flight status, provide a shipping status of a purchased item, provide a battery or download status, etc.), the virtual assistant may reference the recent conversation to infer that the user may be interested in flight status information. By referencing contextual information, the virtual assistant may identify a task that is relevant to the user's context.
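- One way the conversation history could be used to choose among multiple tasks for “please provide a status” is sketched below. The keyword-overlap scoring, the `CANDIDATES` table, and the `rank_by_context` function are assumptions made for illustration; the text does not prescribe a particular relevance measure.

```python
# Hypothetical candidate tasks for the pair provide-status, each with
# keywords that suggest the task is relevant.
CANDIDATES = {
    "provide_flight_status": {"flight", "airline", "gate", "boarding"},
    "provide_shipping_status": {"order", "package", "shipping", "purchase"},
    "provide_battery_status": {"battery", "charge", "power"},
}


def rank_by_context(candidates, conversation_history):
    """Rank candidate tasks by keyword overlap with recent conversation."""
    seen = set()
    for utterance in conversation_history:
        seen.update(word.strip(".,!?").lower() for word in utterance.split())
    scores = {task: len(keywords & seen) for task, keywords in candidates.items()}
    # sorted() is stable, so ties keep the candidates' original order.
    return sorted(candidates, key=lambda task: scores[task], reverse=True)


history = ["Is my flight to Denver on time?", "Which gate does it leave from?"]
ranked = rank_by_context(CANDIDATES, history)
```

Here the recent discussion of a flight places the flight status task first in `ranked`.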
- This disclosure also describes techniques for customizing a task map of a virtual assistant. In some instances, the task map may be personalized for a particular user (e.g., on a user-by-user basis). Here, the virtual assistant may learn what task to perform for a particular action-object pair of input from the user. For example, if the virtual assistant has identified input of “let's rock-out” from the user in a previous conversation as corresponding to a task of playing music, the virtual assistant may update the task map for that user such that an action-object pair for “let's rock-out” corresponds to the task of playing music. In another example, the virtual assistant may learn the types of content that are output to a user, and personalize tasks to those types of content. To illustrate, if a user frequently views sports content on a particular sports web site, the task map may be personalized so that an action-object pair associated with sports may be associated with a task of navigating to the particular sports web site.
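- Per-user personalization of a task map might be sketched as an overlay on a shared default map, as below. The `PersonalizedTaskMaps` class, the user identifiers, the task names, and the action-object pair assumed for “let's rock-out” are hypothetical and are used only to illustrate the idea.

```python
from collections import ChainMap

# Shared defaults (illustrative entries only).
DEFAULT_TASK_MAP = {
    ("listen", "music"): "play_music",
    ("provide", "score"): "search_web_for_score",
}


class PersonalizedTaskMaps:
    """Keeps a per-user overlay on top of a shared default task map."""

    def __init__(self, defaults):
        self._defaults = defaults
        self._overrides = {}

    def learn(self, user_id, action, obj, task):
        # Record what the assistant learned for this user only.
        self._overrides.setdefault(user_id, {})[(action, obj)] = task

    def task_for(self, user_id, action, obj):
        # ChainMap searches the user's overlay first, then the defaults.
        view = ChainMap(self._overrides.get(user_id, {}), self._defaults)
        return view.get((action, obj))


maps = PersonalizedTaskMaps(DEFAULT_TASK_MAP)
# Assumed pair for "let's rock-out"; the text does not specify it.
maps.learn("user-1", "rock-out", None, "play_music")
# A user who frequently views a particular sports web site.
maps.learn("user-1", "provide", "score", "open_favorite_sports_site")
```

Keeping the learned entries in a separate overlay means one user's associations do not change the tasks identified for other users.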
- Alternatively, or additionally, a task map may be customized for a particular industry application, platform, device type, and so on, in which the virtual assistant is to be deployed. To illustrate, a task map may be generated for an airline industry implementation so that action-object pairs that are relevant to the airlines are associated with tasks that are relevant to the airlines. In this illustration, an action-object pair of check-status may be associated with a task of checking the status of an airline flight, instead of a task of checking the status of a purchased item, which may be the case in another industry application, such as an e-commerce implementation.
- By customizing a task map for a particular context, a virtual assistant may perform tasks that are relevant to the particular context. In one example, by personalizing the task map for a particular user, the virtual assistant may provide a personalized interaction with the user (e.g., a conversation that is adapted to the user). In another example, by customizing a task map for a particular industry application, the virtual assistant may provide functionality that is adapted to the particular industry application. Further, by customizing a task map based on a context for which the virtual assistant is to be utilized, the virtual assistant may provide accurate task determination, which may enhance a user's experience with the virtual assistant. Moreover, the techniques described herein may learn over time tasks that may be relevant to particular action-object pairs and evolve a task map based on the learning.
- This brief introduction is provided for the reader's convenience and is not intended to limit the scope of the claims or of the sections that follow. Furthermore, the techniques described in detail below may be implemented in a number of ways and in a number of contexts. Example implementations and contexts are provided with reference to the following figures, as described below in more detail. It is to be appreciated, however, that the following implementations and contexts are but some of many.
- FIG. 1 illustrates an example architecture 100 in which techniques described herein may be implemented. The architecture 100 includes a smart device 102 configured to interact with one or more users 104 (hereinafter the user 104) and perform other processing discussed herein. The smart device 102 may comprise any type of computing device that is configured to perform an operation. For example, the smart device 102 may be implemented as a laptop computer, a desktop computer, a server, a smart phone, an electronic reader device, a mobile handset, a personal digital assistant (PDA), a portable navigation device, a portable gaming device, a tablet computer, a watch, a portable media player, a television, a set-top box, a computer system in a car, an appliance, a camera, a robot, a hologram system, a security system, a home-based computer system (e.g., intercom system, home media system, etc.), a projector, an automated teller machine (ATM), a pair of glasses with computing capabilities, a wearable computer, and so on.
- The smart device 102 may be equipped with one or more processors 106, memory 108, a display(s), a microphone(s), a speaker(s), a camera(s), a sensor(s), and a network interface(s). The one or more processors 106 may include a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, a digital signal processor, and so on. The sensor(s) may include an accelerometer, compass, gyroscope, magnetometer, Global Positioning System (GPS), olfactory sensor (e.g., for smell), or other sensor. In some instances, the display(s) is implemented as one or more touch screens. The camera(s) may include a front facing camera and/or a rear facing camera. The display(s), microphone(s), speaker(s), camera(s), and/or sensor(s) may be configured to receive user input, such as gesture input (e.g., through the camera), touch input, audio or speech input, and so on, and/or may be configured to output content, such as audio, images, video, and so on.
- The memory 108 may include a client application 110 (e.g., module) configured to interface with the user 104. The client application 110 may receive any type of input from the user 104, such as audio or speech, text, touch, or gesture input received through a sensor or other element of the smart device 102. The client application 110 may also provide any type of response, such as audio, text, interface items (e.g., icons, buttons, menu elements, etc.), and so on. In some implementations, the client application 110 is implemented as, or in association with, a mobile application, a browser (e.g., mobile browser), and so on.
- The client application 110 may be implemented as, or in conjunction with, a virtual assistant 112 (e.g., an intelligent personal assistant). A “virtual assistant” may act as an interface between end users and information of one or more service providers 114 (hereinafter the service provider 114), information of the smart device 102, information of a virtual assistant service 116, or any type of information. For example, in response to input from the user 104, the virtual assistant 112 may access content items stored on the service provider 114 to formulate a response to the user 104. The virtual assistant 112 may be configured for multi-modal input/output (e.g., receive and/or respond in audio or speech, text, touch, gesture, etc.), multi-language communication (e.g., receive and/or respond according to any type of human language), multi-channel communication (e.g., carry out conversations through a variety of computing devices, such as continuing a conversation as a user transitions from using one computing device to another), and other types of input/output or communication. In some instances, the virtual assistant 112 may embody a human-like persona and/or artificial intelligence (AI). For example, the virtual assistant 112 may be represented by an image or avatar that is displayed on the smart device 102. An avatar may comprise an animated character that may take on any number of shapes and appearances, and/or resemble a human talking to a user. In some instances, the avatar may be arranged as a representative of the service provider 114, while in other instances the avatar may be a dedicated personal assistant to a user.
- The virtual assistant 112 may interface with the user through a conversation user interface 118. The conversation user interface 118 may provide conversation items representing information from the virtual assistant 112 and/or information from the user 104. For example, in response to a query from the user 104 to “find the nearest restaurant,” the conversation user interface 118 may display a dialog representation of the user's query and a response item of the virtual assistant 112 that identifies the nearest restaurant to the user 104. A conversation item may comprise an icon (e.g., selectable or non-selectable), a menu item (e.g., drop down menu, radio control, etc.), text, a link, audio, video, or any other type of information. In addition to conversation items, the conversation user interface 118 may include other interface items, such as a microphone icon for speech input, a text box to input text, a keyboard (e.g., touch screen keyboard), other input icons, and so on.
- Although the conversation user interface 118 has been described as being associated with the smart device 102, in other examples the conversation user interface 118 is associated with the service provider 114 and/or the virtual assistant service 116. In one instance, the interface 118 is displayed through an online site of the service provider 114, such as when the user navigates to the online site. Here, the interface 118 may include a virtual assistant that embodies characteristics of the service provider 114, such as a flight attendant for an online airline site.
- The user 104 may generally interact with the virtual assistant 112 to cause a task to be performed by the virtual assistant 112. In some instances, a task may be performed in response to explicit user input, such as playing music in response to “please play music.” In other instances, a task may be performed in response to inferred user input requesting that the task be performed, such as providing weather information in response to “the weather looks nice today.” In yet further instances, a task may be performed when an event has occurred, such as providing flight information an hour before a flight.
- A task may include any type of operation that is performed at least in part by a computing device. For example, a task may include logging a user into a site, setting a calendar appointment, resetting a password for a user, purchasing an item, opening an application, sending an instruction to a device to perform an act, sending an email, outputting content (e.g., outputting audio (an audible answer), video, an image, text, a hyperlink, etc.), navigating to a web site, upgrading a user's seat assignment, and so on. In some instances, a task may include providing a response to a user. The response may be addressed to or otherwise tailored to the user (e.g., “Yes, John, as a Gold Customer you are entitled to a seat upgrade, and I have provided some links below that may be of interest to you . . . . ”). Further, in some instances a task may include performing an operation according to one or more criteria (e.g., one or more default settings). To illustrate, a task may include sending an email through a particular email account, providing directions with a particular mobile application, searching for content through a particular search engine, and so on. Alternatively, or additionally, a task may include providing information through the conversation user interface 118.
- A task may be associated with variables for performing the task. For example, a task of playing music may be associated with an artist variable indicating the artist and a song variable indicating the song. In some instances, a value for a variable is obtained from the input that initiated the task. For example, if the user requests “please play Free Fallin' by Tom Petty,” the virtual assistant 112 may identify “Free Fallin'” as a value for the song variable and “Tom Petty” as a value for the artist variable. In other instances, values for variables may be known and/or obtained from contextual information. For example, if a user requests “please text Megan,” the virtual assistant 112 may identify a particular Megan in the user's contacts (e.g., when the contacts include multiple Megans) that was recently texted as a value for the person ID variable for the task. Alternatively, or additionally, a value for a variable may be obtained by prompting a user for the value. For example, if the user requests “book a flight,” and has not provided a destination, the virtual assistant 112 may ask the user “where would you like to fly to?” and the user may provide a destination as the value.
- The virtual assistant 112 may generally determine a task to perform by referencing one or more task maps. A task map may map action-object pairs to tasks. A task map may generally refer to any type of data that associates a task with an action-object pair. For example, a task map may comprise a look-up table, data in a database, data of a state machine, or any other data to correlate tasks and action-object pairs. As used herein, an action may comprise a verb, while an object may comprise a noun. In some examples, a task map may specify associations for a particular type of noun, such as a common noun (e.g., a class of entities) or a proper noun (e.g., a unique entity). If, for example, a task map includes an object that corresponds to a common noun, the variable value for the task may specify the proper noun. To illustrate, if a task map includes an object that corresponds to a common noun and a user requests “please play Free Fallin',” the object may comprise a song (e.g., the common noun) while the variable value may comprise “Free Fallin'” (e.g., the proper noun).
- In many instances, the virtual assistant 112 operates in cooperation with the virtual assistant service 116. That is, one or more functions of the virtual assistant 112 may be performed by the virtual assistant service 116. The virtual assistant service 116 may generally provide one or more services, such as input processing, speech recognition, response formulation, task mapping, context analysis, user characteristic analysis, and so on. The virtual assistant service 116 may generally act as a “back-end” resource for the smart device 102.
- In one illustrative example of the operations performed by the virtual assistant service 116, the smart device 102 may receive input 120 from the user 104 (e.g., “what's the score of the game?”) and send the input 120 to the virtual assistant service 116 for processing. The virtual assistant service 116 may analyze the input 120 to determine an action and an object 122. Here, the action comprises “provide,” while the object comprises “score.” The virtual assistant service 116 may then reference a task map 124 that associates action-object pairs with tasks. In this example, the action-object pair of provide-score maps to multiple tasks, namely a task 126(a) of providing the score of a sports game and a task 126(b) of providing the score of a video game. In one instance, in order to identify the particular task to be performed, the virtual assistant service 116 may reference contextual information 128 stored in a context data store 130 and rank the tasks 126 based on which task is most relevant to the contextual information 128. Here, the user 104 had a conversation with the virtual assistant 112 yesterday about the NXT Lions basketball team. Based on this information, the virtual assistant service 116 may identify the task 126(a) of providing a score of a sports game as most relevant to the input 120. In other instances, the virtual assistant service 116 may prompt the user for further clarification regarding a task (e.g., “would you like to view the score of the sports game or view the score of the video game?”).
- The virtual assistant service 116 may then determine variable values 132(a) for performing the task 126(a). As illustrated in FIG. 1, the tasks 126 may be associated with variables 132 for performing the tasks 126. In this example, the virtual assistant service 116 again references the contextual information 128 to identify a value for the sport and team variables (e.g., basketball and NXT Lions) of the task 126(a). However, in other examples the virtual assistant service 116 may cause the virtual assistant 112 to prompt the user 104 for the variable values 132(a). Upon identifying the variable values 132(a), the virtual assistant service 116 may cause the score of the game to be provided to the user 104, as illustrated at 134.
- In the example above, a task was identified based on contextual information that provided content of a previous conversation. However, in other examples, other types of contextual information may be used. In one example, a task may be identified based on contextual information that indicates a type of device a user is using to interact with a virtual assistant. If, for instance, the user requests “call Michelle,” and the user is using a desktop computer, a task for that context may be identified, such as calling an individual through a voice over internet protocol service or setting a reminder to call the individual at a later time (e.g., when the user is on his cell phone). By contrast, if the user is using a cell phone, a different task may be identified, such as calling the individual through a cellular connection.
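- The device-type example above (“call Michelle”) might be sketched as a lookup that is keyed by device type as well as by the action-object pair. The `DEVICE_TASK_MAPS` table, the device type labels, the task names, and the `identify_task_for_device` function are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical device-specific task maps for the pair call-person.
DEVICE_TASK_MAPS = {
    "desktop": {("call", "person"): "call_via_voip"},
    "cell_phone": {("call", "person"): "call_via_cellular"},
}


def identify_task_for_device(action, obj, device_type, default=None):
    """Pick the task for an action-object pair given the device type.

    Returns `default` when the device type or the pair is unknown.
    """
    device_map = DEVICE_TASK_MAPS.get(device_type, {})
    return device_map.get((action, obj), default)


# "Michelle" would be carried separately as a variable value for the task.
desktop_task = identify_task_for_device("call", "person", "desktop")
phone_task = identify_task_for_device("call", "person", "cell_phone")
```

The same input thus yields a voice over internet protocol task on a desktop computer and a cellular task on a cell phone.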
- In some instances, contextual information may be used to identify an order to perform multiple tasks. To illustrate, if it is determined that the user would like to buy tickets to a movie and call his girlfriend, the task for buying the movie tickets may be performed first based on a calendar event for a date with the user's girlfriend (e.g., indicating that the user may want to buy the tickets first so that he can mention the tickets to his girlfriend). Thereafter, the task for calling the user's girlfriend may be performed.
- The architecture 100 also includes the service provider 114 that includes one or more data stores 136 for storing content items. The one or more data stores 136 may include a mobile web data store, a smart web data store, an information and content data store, a content management service (CMS) data store, and so on. A mobile web data store may store content items that are designed to be viewed on a mobile device, such as a mobile telephone, tablet device, etc. Meanwhile, a web data store includes content items that are generally designed to be viewed on a device that includes a relatively large display, such as a desktop computer. An information and content data store may include content items associated with an application, content items from a database, and so on. A CMS data store may include content items providing information about a user, such as a user preference, user profile information, information identifying offers that are configured to a user based on profile and purchase preferences, etc. As such, the service provider 114 may include content items from any type of source. Although the one or more data stores 136 are illustrated as included in the service provider 114, the one or more data stores 136 may alternatively, or additionally, be included in the virtual assistant service 116 and/or the smart device 102.
- As illustrated, the architecture 100 may include a current context service 138 to provide current information about a context. For example, the current context service 138 may provide information about current events (e.g., news articles, sports scores, blog content, social media content, a current flight status (e.g., on-time, delayed, etc.), and so on), location information about a user (e.g., a user's current location), current weather information, current times and/or dates (e.g., a current time in Japan, a current time in the US, a current time and date where a user is located, etc.), and so on. In some instances, this information is stored at the current context service 138, while in other instances the information is sent to the service provider 114, the virtual assistant service 116, and/or the smart device 102 for storage. The current context service 138 may communicate with the virtual assistant service 116 to provide information that may be useful to the virtual assistant 112.
- The architecture 100 may also include one or more networks 140 to enable the smart device 102, the virtual assistant service 116, the service provider 114, and/or the current context service 138 to communicate with each other. The one or more networks 140 may include any one or combination of multiple different types of networks, such as cellular networks, wireless networks, Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and so on.
- FIG. 2 illustrates further details of the example virtual assistant service 116 of FIG. 1. As noted above, the virtual assistant service 116 may generally provide one or more services to implement the virtual assistant 112 on the smart device 102.
- As illustrated, the virtual assistant service 116 may include one or more computing devices. The one or more computing devices may be implemented as one or more desktop computers, laptop computers, servers, and the like. The one or more computing devices may be configured in a cluster, data center, cloud computing environment, or a combination thereof. In one example, the virtual assistant service 116 provides cloud computing resources, including computational resources, storage resources, and the like, that operate remotely from the smart device 102.
- The one or more computing devices of the virtual assistant service 116 may include one or more processors 202, memory 204, and one or more network interfaces 206. The one or more processors 202 may include a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, a digital signal processor, and so on. The memory 204 may include software functionality configured as one or more “modules.” The term “module” is intended to represent example divisions of the software for purposes of discussion, and is not intended to represent any type of requirement or required method, manner or necessary organization. Accordingly, while various “modules” are discussed, their functionality and/or similar functionality could be arranged differently (e.g., combined into a fewer number of modules, broken into a larger number of modules, etc.). Further, while certain functions and modules are described herein as being implemented by software and/or firmware executable on a processor, in other embodiments, any or all of the modules may be implemented in whole or in part by hardware (e.g., as an ASIC, a specialized processing unit, etc.) to execute the described functions. As illustrated in FIG. 2, the memory 204 includes an input processing module 208, a task mapping module 210, a learning module 212, and a context module 214.
- The input processing module 208 may be configured to obtain and/or process input received from a user. If, for example, the input is speech input, the input processing module 208 may perform speech recognition techniques to convert the input into a format that is understandable by a computing device, such as text. The input processing module 208 may store input in an input data store 216. The input processing module 208 may also be configured to determine a task to perform. To make such a determination, the input processing module 208 may include an action-object module 218 and a task module 220.
- The action-object module 218 may determine (e.g., identify) an action and/or an object for user input. An action and/or object may be explicitly included in user input, inferred from the structure and/or context of the user input, and/or obtained by prompting a user (e.g., if missing information or context). The action-object module 218 may utilize various techniques, such as Part-of-Speech Tagging (POST), probabilistic or statistical speech modeling, Natural Language Processing (NLP), pattern recognition language modeling, and so on. These techniques may seek to interpret or derive a meaning and/or concept of input and may include new and/or existing techniques. In some instances, an action and/or object may be associated with a confidence score indicating an estimated level of accuracy that the action or object was correctly identified.
- The task module 220 may determine (e.g., identify) a task to be performed based on an action, object, and/or variable value of user input. The task module 220 may reference one or more task maps stored in a task map data store 222. In one example, the task module 220 may identify matching information in a task map (e.g., all information in the map that includes a determined action, object, and/or variable value) and tasks that are associated with the matching information. In other words, the task module 220 may identify all candidate tasks in a task map that are associated with an action, object, and/or variable value that are identified from user input (e.g., all rows of a table-based task map that include an identified action, object, or variable value). Each candidate task may be associated with a confidence score that is based on the confidence scores of the associated action and/or object for the task. The task module 220 may then determine whether or not any of the candidate tasks satisfy one or more criteria, such as being the only task that is associated with identified information in the task map and/or being associated with a confidence score that is greater than a threshold. When such criteria are satisfied, the task may be selected. Alternatively, if the criteria are not satisfied, then the virtual assistant service 116 may identify a task to be performed by prompting the user for information and/or ranking the candidate tasks.
- In some instances, the task module 220 may make an initial determination as to a context in which user input is received and reference a task map that is customized for the context. The context may comprise a particular industry (e.g., field of use), platform (e.g., type of software/hardware architecture, such as a mobile operating system, desktop operating system, etc.), device or device type, user, user type, location (e.g., user location), and so on. To illustrate, when a user is using a cell phone to interact with the virtual assistant 112, the task module 220 may reference a task map that is customized for a mobile platform, whereas when the user is using a laptop, the task module 220 may reference a task map that is customized for a laptop platform. In another illustration, the task module 220 may reference a task map that is personalized for a user, when the user is interacting with the virtual assistant 112. Here, the user may be identified through voice recognition, device identification information, etc. As such, the virtual assistant service 116 may utilize different task maps for different contexts.
- The task module 220 may additionally, or alternatively, determine variable values for variables that are associated with a task. A variable value may generally relate to any type of information that may be useful for performance of a task. A variable value may be obtained from user input, contextual information, and so on. A variable of an associated variable value may include, for example:
- media variables for outputting media, such as a song or movie title, an artist name, lyrics to a song, an album name, and so on;
- message variables for creating, viewing, and/or otherwise interacting with messages (e.g., emails, text messages, telephone calls, etc.), such as an email address, telephone number, content of a message, a subject line, an attachment (e.g., information identifying an attachment), and so on;
- navigation variables for directions, such as a destination location, a starting location, a route, a travel mode (e.g., by road, foot, or bike), etc.;
- travel variables, such as a flight number, a confirmation number, an airline, a number of bags to check, a number of passengers, and so on;
- purchase variables for purchasing an item, such as item identification information, a type of item (e.g., a shoe, a bike, etc.), a shipping address, an account to charge, a type of shipment, etc.;
- calendar variables to set, view, or update a calendar event, such as a time of day for the event, a location of the event, a date for the event, an individual to be involved in the event, and so on;
- social media variables for posting and/or viewing information of a social networking service, such as content to post to a social networking site (e.g., information identifying an image or video of a user), a name of the social networking service, etc.;
- reminder variables to set and/or view a reminder, such as a time for a reminder, a date for a reminder, a type of alarm to be triggered (e.g., a ringer type), etc.;
- application variables to identify an application to use for a task, as in user input requesting “provide directions via google® maps,” “send an email via my yahoo® account,” “find movie reviews with fandango®,” “find a review through yelp®,” and so on;
- a source of content variable to identify a content source for performing a task, such as user input requesting “find information about the basketball game on the web,” “look at my contact list on my phone to find Jane,” “search google® for a new car,” and so on; or
- any other type of variable.
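The idea that a variable value may come from user input or from contextual information can be sketched as follows. This is a minimal illustration only: the function name `resolve_variables`, the variable names, and the rule that user input takes precedence over context are all assumptions made for the example, not something the description specifies.

```python
def resolve_variables(required, from_input, from_context):
    """Fill each required task variable from user input or context.

    Assumed precedence (not stated in the description): a value found in
    the user input wins over a value found in contextual information.
    Variables with no value are returned so the user could be prompted.
    """
    values, missing = {}, []
    for name in required:
        if name in from_input:
            values[name] = from_input[name]
        elif name in from_context:
            values[name] = from_context[name]
        else:
            missing.append(name)
    return values, missing


# Hypothetical navigation task with three variables.
required = ["destination", "starting_location", "travel_mode"]
from_input = {"destination": "the mall"}
from_context = {"destination": "work", "starting_location": "home"}

values, missing = resolve_variables(required, from_input, from_context)
```

Here `values` holds the destination taken from the input and the starting location taken from context, while `missing` lists the travel mode, which could then be obtained by prompting the user.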
- The task module 220 may also cause a task to be performed by the virtual assistant 112. This may include performing the task at the virtual assistant service 116 and/or sending an instruction to another device (e.g., the smart device 102) to perform the task. To illustrate, in response to input of “what is the weather like today?,” the task module 220 may send an instruction to the smart device 102 to retrieve weather information and output the information to the user 104. In another illustration, in response to “please change my password to Hawaii39,” the virtual assistant service 116 may reference information of the user to change the password. The virtual assistant service 116 may then cause a response to be output indicating that the password has been changed (e.g., “your password has been changed to Hawaii39”).
- The task mapping module 210 may configure one or more task maps. This may generally include associating an action, object, and/or variable value with a task. The task mapping module 210 may generate and/or customize a task map for a particular context, such as a particular industry, platform, device type, user, user type, location, and so on. In one example, a task map may be personalized for a particular user based on contextual information related to that user. To illustrate, if the virtual assistant service 116 learns over time that a user inputs “send a message” to initiate a task of sending an email, in contrast to sending a text message, the task mapping module 210 may associate “send a message” (e.g., the action-object of the phrase) with the task of sending an email.
- The learning module 212 may learn information to be associated with a task, such as an action, object, and/or variable value. To do so, the learning module 212 may generally analyze contextual information related to a user or conversation. To illustrate, assume that the user states “let's jam” in an effort to listen to music and the virtual assistant service 116 incorrectly interprets this input as corresponding to a different task (e.g., searching for fruit jam on the internet), which is then performed and a response is sent to the user. Here, the user may have ignored the response of the virtual assistant 112 (e.g., closed a browser window) and opened a music application to listen to music. In this illustration, the learning module 212 may learn that the particular action-object pair for “let's jam” is to be associated with the task of playing music.
- The learning module 212 may also observe user activity and attempt to learn characteristics about a user. The learning module 212 may learn any number of characteristics about the user over time, such as user preferences (e.g., likes and dislikes), track patterns (e.g., user normally reads the news starting with the sports, followed by the business section, followed by the world news), behaviors (e.g., listens to music in the morning and watches movies at night, speaks with an accent that might impact language models, prefers own music collection rather than looking for new music in the cloud, etc.), and so on. To observe user activity and learn a characteristic, the learning module 212 may access a user profile, track a pattern, monitor navigation of the user, monitor content that is output to the user, and so on. Each of these learned characteristics may be useful to provide context that may be utilized to interpret user input and/or to identify a task.
- As an example of learning a characteristic, consider a scenario where a user incorrectly inputs “Cobo” or a speech recognition system incorrectly recognizes the user input as “Cobo”. Once the user corrects this to say “Cabo”, the learning module 212 can record this correction from “Cobo” to “Cabo” in the event that a similar situation arises in the future. Thus, when the user next speaks the phrase “Cabo San Lucas,” even though the speech recognition might recognize the user input as “Cobo,” the virtual assistant service 116 may use the learned correction, assume that the user means “Cabo,” and respond accordingly. As another example, if a user routinely asks for the movie “Crazy,” the learning module 212 will learn over time that this is the user preference and make this assumption. Hence, in the future, when the user says “Play Crazy,” the virtual assistant service 116 will make a different initial assumption to begin play of the movie, rather than the original assumption of the song “Crazy” by Willie Nelson.
- The context module 214 may be configured to identify (e.g., determine) one or more pieces of contextual information. Contextual information may be used to identify and/or weight an action, object, variable value, and/or task. For example, for input of “I want to buy a new coat,” the virtual assistant service 116 may reference a recent conversation in which the user requested directions to a clothing store to purchase a coat. Based on this conversation, it may be determined that the user is more interested in purchasing the coat at the store (e.g., a task of creating a reminder to purchase the coat upon arrival at the store), rather than purchasing the coat through a phone (e.g., a task of navigating to an online e-commerce site). In addition, contextual information may be utilized when providing a response to a user and/or when no query has been received (e.g., providing relevant information to a user upon arrival at a particular location). In some examples, contextual information may be weighted toward providing more or less impact than other contextual information. By taking context into account, a more accurate task may be identified, in comparison to traditional techniques.
- Generally, contextual information may comprise any type of information that aids the virtual assistant 112 in interacting with a user (e.g., understanding the meaning of a query of a user, formulating a response, determining a task to be performed, etc.). In some instances, contextual information is expressed as a value of one or more variables, such as whether or not a user has signed in with a site (e.g., “is_signed_in=true” or “is_signed_in=false”). Contextual information may be stored in the context data store 130. Example, non-limiting pieces of contextual information may include:
- conversation history between a user and a virtual assistant, either during a current session(s) or during a previous session(s) (e.g., input and/or output information), the conversation history may indicate terms and/or phrases that are frequently used (e.g., more than a particular number of times);
- content output history that identifies content that has been output to the user (e.g., movies that have been viewed, songs that have been listened to, web sites that have been viewed, pictures that have been viewed, etc.);
- what type of content the user prefers to view or listen to (e.g., the user frequently views sports content);
- navigation history indicating content that has been navigated to by a user; in some instances the navigation history may indicate content that is navigated to for performing a task (e.g., the virtual assistant 112 provides a sports web site in response to “what happened at the game last night?,” and the user navigates on the sports web site to a particular college basketball team);
- information identifying a content source that is accessed by a user; in some instances the information may indicate a content source that is accessed during a conversation (e.g., the virtual assistant 112 opens a sports app in response to “what happened at the game last night?,” and the user disregards the app and accesses a web site to view sports information); the content source may comprise a web source, an application, local storage, remote storage (e.g., cloud source), etc.;
- input mode history indicating one or more input modes that a user has used to interact with a user interface;
- what type of input mode the user prefers to interact with a virtual assistant (e.g., input mode—whether the user prefers to submit a query textually, using voice input, touch input, gesture input, etc.), the preferred input mode may be inferred from previous interactions, explicit input of the user, profile information, etc.;
- device information indicating a type of device that is used by a user to interact with a virtual assistant (e.g., a mobile device, a desktop computer, game system, etc.);
- a user preference indicating a preference of a user (e.g., a seat preference, a home airport, a preference of whether schedule or price is important to a user, a type of weather a user enjoys, types of items acquired by a user and identifying information for those items, types of stock a user owns or sold, etc.);
- calendar information describing one or more events of a user (e.g., a scheduled flight, a work meeting, etc.);
- a location of a cursor on a site when a user provides input to a virtual assistant;
- a time of day or date on which a user provides input to a virtual assistant;
- a current time of day;
- an age or gender of a user;
- a location of a user (e.g., a geo-location of the user associated with a device through which the user provides a query, location based on network information, address of the user, etc.);
- sensor information obtained from a sensor of a device with which a user is interacting (e.g., a geo-location, environmental data including background noise or video/audio from a surrounding of the device, etc.);
- an orientation of a device which a user is using to interact with a virtual assistant (e.g., landscape or portrait);
- a communication channel which a device of a user uses to interface with the virtual assistant service (e.g., wireless network (e.g., Wi-Fi®), wired network, cellular network, etc.);
- information indicating whether a communication channel is secured or non-secured (e.g., public network communications vs. private network communications);
- a language associated with a user (e.g., a language of a query submitted by the user);
- how an interaction with a virtual assistant is initiated (e.g., via user selection of a link or graphic, via the virtual assistant proactively engaging a user, etc.);
- how a user has been communicating recently (e.g., via text messaging, via email, etc.);
- information derived from a user's location (e.g., current, forecasted, or past weather at a location, major sports teams at the location, nearby restaurants, etc.);
- current topics of interest, either to a user or generally (e.g., trending micro-blog or blog topics, current news, recent micro-blog or blog posts made by the user, etc.);
- whether or not a user has signed-in with a site of a service provider (e.g., with a user name and password);
- a status of a user with a service provider (e.g., based on miles flown, a type of membership of the user, a type of subscription purchased by the user, etc.);
- a page of a site from which a user provides a query to a virtual assistant;
- how long a user has remained on a page of a site from which the user provides a query to the virtual assistant;
- social media information (e.g., posts or other content posted to a social networking site or blog);
- user profile information (e.g., information identifying friends/family of a user, information identifying where a user works or lives, information identifying a car a user owns, etc.);
- a characteristic of a user; or
- any other type of information.
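Contextual information expressed as variable values (such as “is_signed_in=true”) and weighted toward more or less impact can be pictured with a small in-memory store. The sketch below is only illustrative: the `ContextItem` and `ContextStore` names, the numeric weights, and the convention that a larger weight means more impact are assumptions, and the actual context data store 130 is not described at this level of detail.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ContextItem:
    name: str
    value: Any
    weight: float = 1.0  # assumed convention: larger weight, more impact


class ContextStore:
    """Hypothetical stand-in for a store of contextual variables."""

    def __init__(self):
        self._items = {}

    def set(self, name, value, weight=1.0):
        self._items[name] = ContextItem(name, value, weight)

    def get(self, name, default=None):
        item = self._items.get(name)
        return default if item is None else item.value

    def by_impact(self):
        # Most heavily weighted context first.
        return sorted(self._items.values(), key=lambda i: i.weight, reverse=True)


store = ContextStore()
store.set("is_signed_in", True)
store.set("device_type", "mobile", weight=0.5)
store.set("recent_conversation", "directions to a clothing store", weight=2.0)

ordered_names = [item.name for item in store.by_impact()]
```

In this example the recent conversation is listed first because it was given the largest (assumed) weight, so a consumer of the store would consult it before the device type.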
- Although the modules 208-214 are illustrated as being included in the virtual assistant service 116, in some instances one or more of these modules may be included in the smart device 102 or elsewhere. As such, in some examples the virtual assistant service 116 may be eliminated entirely, such as in the case when all processing is performed locally at the smart device 102 (e.g., the smart device 102 operates independently).
- While various operations are described as being performed by modules, any of these operations, and/or other techniques described herein, may be implemented as one or more hardware logic components, such as Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.
- The memory 108 and/or 204 (as well as all other memory described herein) may include one or a combination of computer storage media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, phase change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device. As defined herein, computer storage media does not include communication media, such as modulated data signals and carrier waves. As such, computer storage media is non-transitory media. -
FIG. 3 illustrates an example process to determine a task to be performed by the virtual assistant 112. In this example, the user 104 has provided input 302 “I want to buy a new coat,” which has been sent to the virtual assistant service 116 for processing. At the virtual assistant service 116, the action-object module 218 may determine actions and objects 304 for the input 302 (actions: want, buy; object: coat). The actions and objects 304 may each be associated with a confidence score indicating an estimated level of accuracy that the action or object was correctly identified. The actions and objects 304 may be passed to the task module 220.
- The task module 220 may identify information in a task map 306 that matches the actions and objects 304. In this example where the task map 306 is represented as a table, the task module 220 may identify all rows in the task map 306 that include an action or an object of the determined actions and objects 304. Here, two rows have been identified: a row associated with a task 308 of setting a reminder and a row associated with a task 310 of purchasing an item. In this example, in order to select a task for performance, the task module 220 ranks the tasks 308 and 310. The ranking may be based on confidence scores of the tasks 308 and 310, which may in turn be based on the confidence scores of the actions and objects 304 determined by the action-object module 218. In other examples, a task may be selected by asking the user 104 what task the user is requesting. As illustrated in FIG. 3, the task 308 ranks the highest and, as such, is selected for performance.
- The task module 220 may also identify values for variables 312 that are associated with the task 308 by analyzing the input 302 and/or contextual information. In this example, the task module 220 references conversation history 314 that indicates that the user 104 recently requested directions to the mall (e.g., in a conversation early that morning). Based on this conversation, the task module 220 may determine a value for a destination variable for triggering a reminder (e.g., the mall) and a value for when to trigger the reminder (e.g., upon arrival). The virtual assistant service 116 may then perform the task 308 of setting a reminder based on the values for the variables 312. Although in this example the values for the variables 312 are identified upon determining a task to be performed, in other examples the values may be identified when the input 302 is processed and/or at other times. -
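The FIG. 3 flow of matching rows in a table-based task map and ranking the resulting candidate tasks can be sketched as below. This is a rough illustration under stated assumptions: the row contents, the task names, the confidence values, and in particular the scoring rule (averaging the confidence of a row's action and object) are hypothetical, since the description only says that a task's confidence is based on the confidences of its associated action and/or object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskMapRow:
    action: str
    obj: str
    task: str


# Hypothetical rows; a real task map would be configured per context.
TASK_MAP = [
    TaskMapRow("want", "coat", "set_reminder"),
    TaskMapRow("buy", "coat", "purchase_item"),
    TaskMapRow("send", "message", "send_text_message"),
]


def find_candidates(task_map, actions, objects):
    """Rows that include at least one identified action or object."""
    return [r for r in task_map if r.action in actions or r.obj in objects]


def task_confidence(row, actions, objects):
    """Assumed rule: average the action and object confidences (0.0 if absent)."""
    return (actions.get(row.action, 0.0) + objects.get(row.obj, 0.0)) / 2


def rank_tasks(task_map, actions, objects):
    """Candidate task names ordered from highest to lowest confidence."""
    candidates = find_candidates(task_map, actions, objects)
    ranked = sorted(
        candidates,
        key=lambda r: task_confidence(r, actions, objects),
        reverse=True,
    )
    return [r.task for r in ranked]


# Confidences for "I want to buy a new coat" after context is considered
# (illustrative numbers only).
actions = {"want": 0.8, "buy": 0.4}
objects = {"coat": 0.9}
ranking = rank_tasks(TASK_MAP, actions, objects)
```

With these illustrative scores the reminder task ranks above the purchase task, mirroring the outcome shown in FIG. 3; different confidence values or a different scoring rule could reverse the order.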
FIG. 4 illustrates an example user interface 400 to enable a user to customize task preferences of the virtual assistant 112. As illustrated, the interface 400 may be provided through the smart device 102 to enable the user 104 to configure information related to a task. The interface 400 may also be presented through other devices.
- Through an input field 402, the user 104 may input a phrase, such as “book it,” to be associated with a task selected through a drop-down menu 404, such as a task of reserving a flight. Alternatively, or additionally, the user 104 may input an action (e.g., verb) into an input field 406 and/or may input an object (e.g., noun) into an input field 408 to be associated with the selected task. Through input fields 410(a)-410(n), the user 104 may input variable values to be associated with the task that is selected through the drop-down menu 404. For example, the user 104 may specify a window seat as a seat preference for the seat preference variable. Based on this seat preference, the virtual assistant 112 may seek to find a window seat for the user 104 when reserving a flight. The user 104 may select a submit button 412 to configure the virtual assistant 112 according to the specified information (e.g., associate the information with the task of reserving a flight). By doing so, a user may customize the virtual assistant 112 to operate in a personalized manner (e.g., customize a task map of the virtual assistant 112). For example, the virtual assistant 112 may be customized so that the phrase “book it” corresponds to the task of reserving a flight.
- Although not illustrated in FIG. 4, in some instances the interface 400 may enable the user 104 to specify custom tasks to be performed by the virtual assistant 112. For example, the user 104 may specify a custom task of vibrating in response to “shake it.” This may further allow the user 104 to customize a task map of the virtual assistant 112. -
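The effect of submitting the form of FIG. 4 can be pictured as recording user-specific associations that are consulted before the general task map. The following is a minimal sketch under assumptions: the `TaskPreferences` class, its method names, and the lowercase and whitespace normalization of phrases are hypothetical and are not taken from the description.

```python
class TaskPreferences:
    """Hypothetical per-user customizations gathered through the interface."""

    def __init__(self):
        self.phrase_to_task = {}   # e.g., "book it" -> "reserve_flight"
        self.pair_to_task = {}     # (action, object) -> task
        self.variable_values = {}  # task -> {variable: value}

    @staticmethod
    def _norm(text):
        # Assumed normalization: lowercase and collapse whitespace.
        return " ".join(text.lower().split())

    def submit(self, task, phrase=None, action=None, obj=None, variables=None):
        """Record what the user entered when pressing the submit button."""
        if phrase:
            self.phrase_to_task[self._norm(phrase)] = task
        if action or obj:
            key = (
                self._norm(action) if action else None,
                self._norm(obj) if obj else None,
            )
            self.pair_to_task[key] = task
        if variables:
            self.variable_values.setdefault(task, {}).update(variables)

    def task_for_phrase(self, text):
        return self.phrase_to_task.get(self._norm(text))


prefs = TaskPreferences()
prefs.submit(
    "reserve_flight",
    phrase="book it",
    action="book",
    obj="flight",
    variables={"seat_preference": "window"},
)
```

After this call, input of “Book it” would resolve to the flight reservation task, and the stored seat preference could be used as a default variable value when that task is performed.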
FIGS. 5A, 5B, and 6 illustrate example processes 500 and 600 for employing the techniques described herein. For ease of illustration, the processes 500 and 600 are described as being performed in the architecture 100 of FIG. 1. For example, one or more of the individual operations of the processes 500 and 600 may be performed by the smart device 102 and/or the virtual assistant service 116. In many instances, the processes 500 and 600 may be performed by the virtual assistant 112. However, the processes 500 and 600 may be performed in other architectures, and the architecture 100 may be used to perform other processes.
- The processes 500 and 600 (as well as each process described herein) are illustrated as a logical flow graph, each operation of which represents a sequence of operations that can be implemented in hardware, software, or a combination thereof. In the context of software, the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular abstract data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement the process. Further, any number of the described operations may be omitted. -
FIGS. 5A-5B illustrate the example process 500 to determine a task to be performed by a virtual assistant.
- At 502 in FIG. 5A, the virtual assistant service 116 may obtain user input from the smart device 102. The user input may be received at the smart device 102 during a conversation between a user and the virtual assistant 112. The user input may be sent to the virtual assistant service 116 for processing.
- At 504, the virtual assistant service 116 may obtain contextual information and/or weight the contextual information. The contextual information may include, for example, conversation history of the user with the virtual assistant 112, content output history that identifies content that has been consumed by the user, user preference information, device information indicating a type of device that is being used by the user, and so on. In some instances, a piece of contextual information may be weighted more or less heavily than another piece of contextual information (e.g., weighted toward providing more or less impact than another piece of contextual information). The weighting may be based on a time associated with contextual information (e.g., a time that the contextual information was created), a predetermined value, and so on. For example, if a user had a conversation with the virtual assistant 112 yesterday (e.g., within a predetermined time period), the conversation with the virtual assistant 112 may be weighted more heavily than other contextual information that was created last week, such as content output history that indicates what the user viewed on the web last week. In another example, a user preference may be weighted more heavily than a current time of day based on a predetermined weighting scheme, which may be configurable by a user.
- At 506, the virtual assistant service 116 may analyze the user input and/or contextual information to determine (e.g., identify) an action(s) and/or an object(s). This may include utilizing various input processing techniques, such as POST, probabilistic or statistical speech modeling, NLP, pattern recognition language modeling, and so on. An action and/or an object may be expressly found in the user input (e.g., identifying an action of “send” for input of “please send a text message”), determined based on an analysis of the user input (e.g., determining an action of “purchase” for input of “I want to buy a new coat,” where “purchase” corresponds to a synonym for “buy”), and so on. An action may comprise a verb, while an object may comprise a noun (e.g., proper noun or common noun).
- In some instances, the operation 506 may include determining all candidate actions and/or objects for the user input (e.g., all possible actions and/or objects). Each identified action or object may be associated with a confidence score indicating an estimated level of accuracy that the action or object was correctly identified. To illustrate, if a user states “I want to buy a new coat,” the virtual assistant service 116 may identify an action of “want” as being associated with a relatively low confidence score and identify an action of “buy” as being associated with a relatively high confidence score.
- In some instances, contextual information may be used to identify an action and/or object and/or to assign a confidence score to the action and/or object. As noted above, the contextual information may be weighted in some examples. Returning to the illustrative input of “I want to buy a new coat,” the virtual assistant service 116 may reference a recent conversation in which the user requested directions to a clothing store to purchase a coat. Based on this conversation, an action of “want” may be assigned a relatively high confidence score, while an action of “buy” may be assigned a relatively low confidence score. These confidence scores may suggest that the user is more interested in purchasing the coat at the store (e.g., an action-object pair of want-coat that is associated with a task of creating a reminder to purchase the coat upon arrival at the store), rather than purchasing the coat through a phone (e.g., an action-object pair of buy-coat that is associated with a task of navigating to an online e-commerce site). - At 508, the
virtual assistant service 116 may identify matching information in a task map for the action(s) and/or object(s) determined at 506. For instance, the virtual assistant service 116 may identify actions and objects in the task map that correspond to all candidate action(s) and/or object(s) determined at 506. To illustrate, if the task map is represented as a table with columns for actions, objects, and tasks (e.g., the example task map 124 illustrated in FIG. 1), then the virtual assistant service 116 may identify each row that includes at least one piece of matching information (e.g., at least one action or object determined at 506).
- At 510, the virtual assistant service 116 may determine whether or not a task that is associated with matching information in the task map satisfies one or more criteria, such as being the only task that is associated with matching information and/or being associated with a confidence score that is greater than a threshold. This may generally include an initial determination as to whether or not a task is identified to be performed. For example, the virtual assistant service 116 may reference the task map to determine if the matching information in the task map corresponds to a single task (e.g., a single row is identified). In another example, the virtual assistant service 116 may determine whether or not a task that is associated with matching information in the task map is associated with a confidence score that is greater than a threshold. A confidence score of a task may be based on confidence scores of associated actions and/or objects. To illustrate, a task may be associated with a relatively high confidence score, in comparison to another task, when an action and object of the task are associated with relatively high confidence scores.
- When a task that is associated with matching information does not satisfy the one or more criteria, the process 500 may proceed to 512 (e.g., the NO path). In many instances, the process 500 may proceed to 512 when the matching information in the task map corresponds to multiple tasks. Alternatively, when a task that is associated with matching information satisfies the one or more criteria, the process 500 may proceed to FIG. 5B (e.g., the YES path).
- At 512, the virtual assistant service 116 may determine whether or not to prompt the user for additional information regarding task performance. This may include determining whether or not a setting has been set to prompt the user. This setting may be configured by end-users, users of the virtual assistant service 116, applications, and so on. When it is determined to prompt the user, the process 500 may proceed to 514 (e.g., the YES path). Alternatively, when it is determined to not prompt the user, the process 500 may proceed to 516 (e.g., the NO path).
- At 514, the virtual assistant service 116 may prompt the user for input regarding what task the user is requesting to be performed. This may include sending an instruction to the smart device 102 to prompt the user for information that clarifies what task the user is requesting to be performed. Here, the virtual assistant service 116 may provide the user with information that the virtual assistant service 116 has identified, such as an identified action, object, or task. Returning to the illustrative input of “I want to buy a new coat,” the virtual assistant service 116 may identify a candidate task of setting a reminder based on an identified action-object pair of want-coat and may identify another candidate task of purchasing the coat based on an identified action-object pair of purchase-coat. Here, the user may be asked “Would you like to set a reminder to purchase the coat or purchase the coat through an online site?” When user input is received, the process 500 may return to 506 and analyze the user input and/or contextual information to determine actions and/or objects in order to further narrow down what task the user is requesting.
- In some instances at 514, the conversation between the virtual assistant 112 and the user may be goal-based. In a goal-based conversation (e.g., dialog), the virtual assistant service 116 may seek to accomplish a goal, such as collecting a threshold amount of information to identify a task. The conversation between the user and the virtual assistant 112 may be substantially driven by input of the user. To illustrate, if the virtual assistant service 116 is attempting to identify a task to perform, the virtual assistant 112 may ask questions and/or receive user input until a task is identified. If the user asks a question that is not related to identifying a task, the virtual assistant 112 may seek to resolve the question and then return to the task identification conversation.
- At 516, the virtual assistant service 116 may rank multiple tasks that are associated with matching information in the task map and may select a task(s) from the ranking. This may be useful when the virtual assistant service 116 has identified multiple candidate tasks from the task map (e.g., potential tasks). The ranking may be based on confidence scores associated with the multiple tasks. Returning to the illustrative input of “I want to buy a new coat,” the virtual assistant service 116 may identify a candidate task of setting a reminder (e.g., a row within the task map that includes a matching action and/or object) and another candidate task of purchasing the coat (e.g., another row within the task map that includes a matching action and/or object). The virtual assistant service 116 may then, for example, rank the task of purchasing the coat higher than the task of setting a reminder based on the purchasing task being associated with a higher confidence score. As noted above, a confidence score of a task may be based on a confidence score of associated actions and/or objects, which may be based on contextual information. The virtual assistant service 116 may then select a task(s) that ranks the highest/lowest (or the nth highest/lowest) within the ranking. - At 518 in
FIG. 5B, the virtual assistant service 116 may analyze user input and/or contextual information to determine a variable value for performing a task. A variable value may include a value for a variable that is used to perform a task (also referred to as a value for a task variable). For example, in order to perform a task of purchasing a flight, particular variable values may be gathered, such as a departure location, a destination location, an airline, a type of seat requested (e.g., first class, coach, etc.), a date of departure, and so on. Accordingly, the analysis at 518 may generally seek to identify variable values from the user input obtained at 502, the user input received in response to prompting the user at 514, and/or the contextual information obtained and/or weighted at 504. The analysis may include referencing a variable(s) associated with a task and analyzing user input and contextual information to determine if a term or phrase in the user input or contextual information matches a word type of a variable(s) (e.g., noun, verb, adjective, etc.) and/or a category of a variable(s) (e.g., location, number, item, food, or any general classification of a word or variable). Although operation 518 is illustrated as a separate operation, in some instances, the operation 518 may be performed at operation 506 or at other locations.
- To illustrate the analysis of 518, assume a task of purchasing a flight has been identified, which is associated with a destination variable (e.g., city category). Here, the
virtual assistant service 116 may search within user input and/or contextual information for a destination city (e.g., which may be included within the user input, described in user preference information, etc.). If, for example, the user previously had a conversation about traveling to Seattle, the virtual assistant service 116 may identify Seattle as the value for the destination variable. Additionally, a departure city of Spokane may be identified based on the user's current location, namely Spokane, and a seat type may be identified based on a seat preference that the user has set.
- At 520, the
virtual assistant service 116 may determine whether or not a variable value(s) for performing a task is missing. If a predetermined number of variable values is missing for a task (e.g., more than 1 or 2), the process 500 may proceed to 522 (e.g., the YES path). Alternatively, if the predetermined number of variable values is not missing, the process 500 may proceed to 524 (e.g., the NO path).
- At 522, the
virtual assistant service 116 may prompt the user for the missing variable value(s). This may include sending an instruction to the smart device 102 to prompt the user for the missing variable value(s). In some instances, this may also include informing the user of the variable values that have been gathered. For example, if the variable value for an airline is missing, the virtual assistant service 116 may ask “What airline would you like to use for your flight?” Upon receiving user input, the process 500 may return to 518 and analyze the user input and/or contextual information. In some examples, the operation 522 may include performing a goal-based dialog for each of the variables (e.g., carrying out separate requests to the user for each variable value).
- At 524, the
virtual assistant service 116 may cause the task to be performed (e.g., by the virtual assistant 112). This may include performing the task at the virtual assistant service 116, sending an instruction to the smart device 102 to perform the task, sending an instruction to another device, and so on. If the task is associated with variables, the values for the variables may be used to perform the task.
- At 526, the
virtual assistant service 116 may learn information to be associated with a task, such as an action(s), object(s), and/or variable value(s). In general, the virtual assistant service 116 may seek to identify a task that was desired by a user for input. In one example, the virtual assistant service 116 may identify input that is received from a user during a conversation and determine whether or not one or more criteria are satisfied to classify a particular task that was performed by the virtual assistant 112 for the input as an accurately identified task. The one or more criteria may be satisfied when the user views a response of the virtual assistant 112 for more than a predetermined amount of time, the user continues a conversation with the virtual assistant 112 (e.g., provides further input that does not clarify the previous input), the virtual assistant 112 confirms that it did the correct task through direct questioning (e.g., asking the user if a performed task was the task the user desired), or the user otherwise acts to indicate that the virtual assistant 112 performed a task that the user desired. When the one or more criteria are not satisfied (e.g., the performed task was not desired), the virtual assistant service 116 may identify a task that was initiated by the user after the particular task was performed by the virtual assistant 112 (e.g., the user accessing an app, navigating to content, etc.). The virtual assistant service 116 may then identify an action and/or an object of the input to be associated with the task that was initiated by the user. In some instances, at 526 the virtual assistant 112 may ask the user if it should apply learned information to future conversations. By performing learning techniques, the virtual assistant service 116 may learn a task that is to be associated with an action and/or object that is determined for input.
- To illustrate, assume that the user states “let's jam” in an effort to listen to music and the
virtual assistant service 116 incorrectly interprets this input as corresponding to a different task (e.g., searching for fruit jam on the internet), which is then performed and a response is sent to the user. Here, the user may have ignored the response of the virtual assistant 112 (e.g., closed a browser window, quickly moved on to something else, etc.) and opened a music application to listen to music. Accordingly, in this illustration the virtual assistant service 116 may learn that the particular action-object pair that is determined for “let's jam” is to be associated with the task of playing music.
- In another illustration, assume that the user requests “how did my team do last night?” in an effort to navigate to a particular baseball team's site and the
virtual assistant service 116 has returned a home page of a sports site (e.g., the home page of ESPN®). Here, the user may navigate from that home page to a specific page for the particular baseball team. Thus, the virtual assistant service 116 may learn that the object determined for “my team” is to be associated with the particular baseball team. That is, the virtual assistant service 116 may learn that the action-object pair that is determined for “how did my team do last night?” is to be associated with a task of navigating to the specific page of the particular baseball team.
- In a further illustration, the
virtual assistant service 116 may learn a variable value based on a conversation between the user and the virtual assistant 112. For example, the virtual assistant service 116 may learn that when the user refers to “Jaime,” the user is actually referring to “James” who is listed as a contact on the user's device (e.g., the user says “send a text message to Jaime . . . oh wait I mean James,” the user corrects a to-field to “James” for a text message that is generated by the virtual assistant 112 as a response to “send a text message to Jaime,” etc.).
- The learning at 526 may alternatively, or additionally, be based on explicit input from the user requesting an association. To illustrate, the
virtual assistant service 116 may learn that a task of playing music is to be associated with an action-object pair that is determined for “let's jam” based on input from a user of “please associate let's jam with playing music.” In some instances, the input is received through a user interface that enables customization of task and action-object relationships, such as the interface 400 of FIG. 4.
- At 528, the
virtual assistant service 116 may configure a task map. This may include associating an action, object, and/or variable value with a task based on the learning at 526. In returning to the example above where the virtual assistant service 116 has learned that input of “let's jam” is to be associated with the task of playing music, the virtual assistant service 116 may associate an action-object pair that is determined for “let's jam” with the task of playing music. Alternatively, or additionally, a task map may be configured according to the process 600 of FIG. 6.
- Although the
operations 526 and 528 are illustrated at the end of the process 500, these operations, and/or any other operations, may be performed at any time during the process 500.
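By way of non-limiting illustration, operations 516 through 522 may be sketched in Python as follows. The sketch is an assumption made for clarity and is not the disclosed implementation: the `Task` structure, the function names, the numeric confidence scores, and the prompt wording for variables other than the airline are all hypothetical, and the ranking simply selects the highest confidence score.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    """Hypothetical candidate task; field names are illustrative only."""
    name: str
    confidence: float                                    # score used for ranking at 516
    variables: List[str] = field(default_factory=list)   # task variables gathered at 518


def select_task(candidates: List[Task]) -> Task:
    """516: rank candidate tasks by confidence score and select the highest."""
    return max(candidates, key=lambda task: task.confidence)


def missing_variables(task: Task, values: Dict[str, str]) -> List[str]:
    """520: list the task variables for which no value has been gathered."""
    return [name for name in task.variables if name not in values]


def prompts_for(missing: List[str]) -> List[str]:
    """522: build one prompt per missing variable value (flight example wording)."""
    return [f"What {name} would you like to use for your flight?" for name in missing]


if __name__ == "__main__":
    # Assumed scores; the disclosure does not give numeric values.
    candidates = [
        Task("set_reminder", 0.4),
        Task("purchase_flight", 0.8, ["departure", "destination", "airline"]),
    ]
    chosen = select_task(candidates)
    # Values assumed to come from user input and contextual information.
    gathered = {"departure": "Spokane", "destination": "Seattle"}
    for prompt in prompts_for(missing_variables(chosen, gathered)):
        print(prompt)
```

In this sketch every missing value produces a prompt, whereas operation 520 as described may proceed to 522 only when a predetermined number of values is missing; that threshold is omitted here for brevity.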
FIG. 6 illustrates the example process 600 to configure a task map of a virtual assistant.
- At 602, the
virtual assistant service 116 may identify a context for configuring a task map. The task map may map tasks to be performed by a virtual assistant to action-object pairs. The context may comprise, for example, an industry to which the virtual assistant 112 is to be deployed (e.g., field of use), a platform for which the virtual assistant 112 is to be deployed, a device type for which the virtual assistant 112 is to be deployed, a user for which the virtual assistant 112 is to be deployed, and so on.
- At 604, the
virtual assistant service 116 may obtain information related to the context. The information may include, for example, one or more terms or phrases that are used for an industry, platform, device type, etc. In another example, the information may comprise contextual information related to a user.
- At 606, the
virtual assistant service 116 may configure the task map for the context. This may include assigning a task to a particular action-object pair based on the information related to the context. For example, the virtual assistant service 116 may select a task based on the information related to the context and associate the task with a particular action-object pair. To illustrate, if the virtual assistant 112 is to be deployed into an airline industry application, then an action-object pair of provide-status may be associated with a task of providing a flight status (e.g., instead of a task of providing other status information, as may be the case in another industry). In another example, if the virtual assistant 112 is to be deployed on a mobile platform (e.g., mobile operating system), then an action-object pair of provide-directions may be associated with a task of opening a navigation app (e.g., instead of a task of opening a directions-based web site, as may be the case in another platform).
- Although embodiments have been described in language specific to structural features and/or methodological acts, it is to be understood that the disclosure is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed herein as illustrative forms of implementing the embodiments.
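As a further non-limiting illustration, the context-specific configuration described for operation 606 may be sketched in Python as follows. The dictionary layout, the context keys ("airline", "mobile"), the task names, and the default assignments are hypothetical and are assumed only to show how one action-object pair might map to different tasks in different contexts.

```python
from typing import Dict, Tuple

ActionObject = Tuple[str, str]

# Hypothetical default assignments used when no context-specific task applies.
DEFAULT_TASKS: Dict[ActionObject, str] = {
    ("provide", "status"): "provide_other_status",
    ("provide", "directions"): "open_directions_web_site",
}

# Hypothetical context-specific assignments (industry or platform).
CONTEXT_TASKS: Dict[str, Dict[ActionObject, str]] = {
    "airline": {("provide", "status"): "provide_flight_status"},
    "mobile": {("provide", "directions"): "open_navigation_app"},
}


def configure_task_map(context: str) -> Dict[ActionObject, str]:
    """606: start from the defaults, then apply assignments for the context."""
    task_map = dict(DEFAULT_TASKS)
    task_map.update(CONTEXT_TASKS.get(context, {}))
    return task_map
```

Copying the defaults before applying the overrides keeps each configured task map independent, so configuring the map for one deployment context does not alter the map used for another.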
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/105,671 US20190057298A1 (en) | 2013-10-31 | 2018-08-20 | Mapping actions and objects to tasks |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/069,074 US10055681B2 (en) | 2013-10-31 | 2013-10-31 | Mapping actions and objects to tasks |
US16/105,671 US20190057298A1 (en) | 2013-10-31 | 2018-08-20 | Mapping actions and objects to tasks |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/069,074 Continuation US10055681B2 (en) | 2013-10-31 | 2013-10-31 | Mapping actions and objects to tasks |
Publications (1)
Publication Number | Publication Date |
---|---|
US20190057298A1 (en) | 2019-02-21 |
Family
ID=52996906
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/069,074 Active 2034-05-09 US10055681B2 (en) | 2013-10-31 | 2013-10-31 | Mapping actions and objects to tasks |
US16/105,671 Abandoned US20190057298A1 (en) | 2013-10-31 | 2018-08-20 | Mapping actions and objects to tasks |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/069,074 Active 2034-05-09 US10055681B2 (en) | 2013-10-31 | 2013-10-31 | Mapping actions and objects to tasks |
Country Status (1)
Country | Link |
---|---|
US (2) | US10055681B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11394923B2 (en) * | 2016-10-05 | 2022-07-19 | Avaya Inc. | Embedding content of interest in video conferencing |
Families Citing this family (179)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10255566B2 (en) | 2011-06-03 | 2019-04-09 | Apple Inc. | Generating and processing task items that represent tasks to perform |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US11068954B2 (en) | 2015-11-20 | 2021-07-20 | Voicemonk Inc | System for virtual agents to help customers and businesses |
US20180143989A1 (en) * | 2016-11-18 | 2018-05-24 | Jagadeshwar Nomula | System to assist users of a software application |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
AU2014214676A1 (en) | 2013-02-07 | 2015-08-27 | Apple Inc. | Voice trigger for a digital assistant |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
CN110442699A (en) | 2013-06-09 | 2019-11-12 | 苹果公司 | Operate method, computer-readable medium, electronic equipment and the system of digital assistants |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
AU2014306221B2 (en) | 2013-08-06 | 2017-04-06 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
EP3480811A1 (en) | 2014-05-30 | 2019-05-08 | Apple Inc. | Multi-command single utterance input method |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9462112B2 (en) * | 2014-06-19 | 2016-10-04 | Microsoft Technology Licensing, Llc | Use of a digital assistant in communications |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
KR102047500B1 (en) * | 2014-11-27 | 2019-11-21 | 삼성전자주식회사 | System and method for providing to-do-list of user |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US11080284B2 (en) * | 2015-05-01 | 2021-08-03 | Microsoft Technology Licensing, Llc | Hybrid search connector |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10446142B2 (en) * | 2015-05-20 | 2019-10-15 | Microsoft Technology Licensing, Llc | Crafting feedback dialogue with a digital assistant |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US9578173B2 (en) * | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
US20180308473A1 (en) * | 2015-09-02 | 2018-10-25 | True Image Interactive, Inc. | Intelligent virtual assistant systems and related methods |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US9635167B2 (en) | 2015-09-29 | 2017-04-25 | Paypal, Inc. | Conversation assistance system |
US20170092278A1 (en) * | 2015-09-30 | 2017-03-30 | Apple Inc. | Speaker recognition |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
KR102453603B1 (en) * | 2015-11-10 | 2022-10-12 | 삼성전자주식회사 | Electronic device and method for controlling thereof |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10389543B2 (en) * | 2015-12-31 | 2019-08-20 | Microsoft Technology Licensing, Llc | Starting meeting using natural user input |
US9904669B2 (en) * | 2016-01-13 | 2018-02-27 | International Business Machines Corporation | Adaptive learning of actionable statements in natural language conversation |
US10755195B2 (en) | 2016-01-13 | 2020-08-25 | International Business Machines Corporation | Adaptive, personalized action-aware communication and conversation prioritization |
US10291565B2 (en) * | 2016-05-17 | 2019-05-14 | Google Llc | Incorporating selectable application links into conversations with personal assistant modules |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
US20180047038A1 (en) * | 2016-08-10 | 2018-02-15 | International Business Machines Corporation | Leveraging hashtags to dynamically scope a target audience for a social network message |
US10217462B2 (en) | 2016-08-31 | 2019-02-26 | Microsoft Technology Licensing, Llc | Automating natural language task/dialog authoring by leveraging existing content |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US20180095938A1 (en) * | 2016-09-30 | 2018-04-05 | Sap Se | Synchronized calendar and timeline adaptive user interface |
US10437841B2 (en) * | 2016-10-10 | 2019-10-08 | Microsoft Technology Licensing, Llc | Digital assistant extension automatic ranking and selection |
JP2018072560A (en) * | 2016-10-28 | 2018-05-10 | 富士通株式会社 | Information processing system, information processor, and information processing method |
US10024671B2 (en) | 2016-11-16 | 2018-07-17 | Allstate Insurance Company | Multi-stop route selection system |
GB201620235D0 (en) * | 2016-11-29 | 2017-01-11 | Microsoft Technology Licensing Llc | Neural network data entry system |
US10664146B2 (en) * | 2017-01-04 | 2020-05-26 | Amazon Technologies, Inc. | Creation of custom user interface controls that are associated with physical devices |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US11157490B2 (en) * | 2017-02-16 | 2021-10-26 | Microsoft Technology Licensing, Llc | Conversational virtual assistant |
US11231943B2 (en) * | 2017-03-24 | 2022-01-25 | Google Llc | Smart setup of assistant services |
US11451956B1 (en) | 2017-04-27 | 2022-09-20 | Snap Inc. | Location privacy management on map-based social media platforms |
US10212541B1 (en) * | 2017-04-27 | 2019-02-19 | Snap Inc. | Selective location-based identity communication |
KR102389625B1 (en) | 2017-04-30 | 2022-04-25 | 삼성전자주식회사 | Electronic apparatus for processing user utterance and controlling method thereof |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | User interface for correcting recognition errors |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
DK201770439A1 (en) * | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
CN110612566B (en) * | 2017-05-11 | 2020-12-29 | 苹果公司 | Privacy maintenance of personal information |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK201770427A1 (en) | 2017-05-12 | 2018-12-20 | Apple Inc. | Low-latency intelligent automated assistant |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770411A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Multi-modal interfaces |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
DK179549B1 (en) | 2017-05-16 | 2019-02-12 | Apple Inc. | Far-field extension for digital assistant services |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US11048995B2 (en) | 2017-05-16 | 2021-06-29 | Google Llc | Delayed responses by computational assistant |
US11228613B2 (en) * | 2017-05-22 | 2022-01-18 | International Business Machines Corporation | Adaptive adjustment using sensor data and distributed data |
US20180364798A1 (en) * | 2017-06-16 | 2018-12-20 | Lenovo (Singapore) Pte. Ltd. | Interactive sessions |
KR102349681B1 (en) * | 2017-07-28 | 2022-01-12 | 삼성전자주식회사 | Electronic device for acquiring and registering lacking parameter |
US10303773B2 (en) * | 2017-07-31 | 2019-05-28 | Massively.Ai Inc. | Chatbot system and method |
KR102126207B1 (en) * | 2017-08-21 | 2020-06-24 | 주식회사 마인드웨어p스 | Intelligent type message processing system |
US20190068527A1 (en) * | 2017-08-28 | 2019-02-28 | Moveworks, Inc. | Method and system for conducting an automated conversation with a virtual agent system |
US10740726B2 (en) * | 2017-10-05 | 2020-08-11 | Servicenow, Inc. | Systems and methods for providing message templates in an enterprise system |
EP3704689A4 (en) * | 2017-11-05 | 2021-08-11 | Walkme Ltd. | Chat-based application interface for automation |
US10535346B2 (en) * | 2017-12-07 | 2020-01-14 | Ca, Inc. | Speech processing computer system forming collaborative dialog data structures |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10599469B2 (en) * | 2018-01-30 | 2020-03-24 | Motorola Mobility Llc | Methods to present the context of virtual assistant conversation |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US20190251417A1 (en) * | 2018-02-12 | 2019-08-15 | Microsoft Technology Licensing, Llc | Artificial Intelligence System for Inferring Grounded Intent |
US10977446B1 (en) * | 2018-02-23 | 2021-04-13 | Lang Artificial Intelligence Inc. | Unsupervised language agnostic intent induction and related systems and methods |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
KR102685523B1 (en) * | 2018-03-27 | 2024-07-17 | 삼성전자주식회사 | The apparatus for processing user voice input |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US20190348033A1 (en) * | 2018-05-10 | 2019-11-14 | Fujitsu Limited | Generating a command for a voice assistant using vocal input |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
US11113595B2 (en) * | 2018-05-21 | 2021-09-07 | The Travelers Indemnify Company | On-demand intelligent assistant |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
DK179822B1 (en) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | Virtual assistant operation in multi-device environments |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10944859B2 (en) | 2018-06-03 | 2021-03-09 | Apple Inc. | Accelerated task performance |
WO2020018525A1 (en) | 2018-07-17 | 2020-01-23 | iT SpeeX LLC | Method, system, and computer program product for an intelligent industrial assistant |
WO2020018541A1 (en) | 2018-07-17 | 2020-01-23 | iT SpeeX LLC | Method, system, and computer program product for role- and skill-based privileges for an intelligent industrial assistant |
WO2020018536A1 (en) | 2018-07-17 | 2020-01-23 | iT SpeeX LLC | Method, system, and computer program product for communication with an intelligent industrial assistant and industrial machine |
CN109166574B (en) * | 2018-07-25 | 2022-09-30 | 重庆柚瓣家科技有限公司 | Information grabbing and broadcasting system for endowment robot |
CN110811115A (en) * | 2018-08-13 | 2020-02-21 | 丽宝大数据股份有限公司 | Electronic cosmetic mirror device and script operation method thereof |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US10616419B1 (en) * | 2018-12-12 | 2020-04-07 | Mitel Networks Corporation | Devices, systems and methods for communications that include social media clients |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11613008B2 (en) * | 2019-01-14 | 2023-03-28 | International Business Machines Corporation | Automating a process using robotic process automation code |
US10817317B2 (en) | 2019-01-24 | 2020-10-27 | Snap Inc. | Interactive informational interface |
JP2022520763A (en) * | 2019-02-08 | 2022-04-01 | アイ・ティー スピークス エル・エル・シー | Methods, systems, and computer program products for developing dialog templates for intelligent industry assistants |
US11741529B2 (en) | 2019-02-26 | 2023-08-29 | Xenial, Inc. | System for eatery ordering with mobile interface and point-of-sale terminal |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11537359B2 (en) * | 2019-03-28 | 2022-12-27 | Microsoft Technology Licensing, Llc | Self-learning digital assistant |
KR20200123560A (en) * | 2019-04-22 | 2020-10-30 | 라인플러스 주식회사 | Method, system, and non-transitory computer readable record medium for providing reminder messages |
US11334383B2 (en) | 2019-04-24 | 2022-05-17 | International Business Machines Corporation | Digital assistant response system to overlapping requests using prioritization and providing combined responses based on combinability |
US11302080B1 (en) * | 2019-05-06 | 2022-04-12 | Apple Inc. | Planner for an objective-effectuator |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11042706B2 (en) * | 2019-05-20 | 2021-06-22 | Sap Se | Natural language skill generation for digital assistants |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK201970511A1 (en) | 2019-05-31 | 2021-02-15 | Apple Inc | Voice identification in digital assistant systems |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | User activity shortcut suggestions |
US11263249B2 (en) * | 2019-05-31 | 2022-03-01 | Kyndryl, Inc. | Enhanced multi-workspace chatbot |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
CN110798506B (en) * | 2019-09-27 | 2023-03-10 | 华为技术有限公司 | Method, device and equipment for executing command |
KR102433964B1 (en) * | 2019-09-30 | 2022-08-22 | 주식회사 오투오 | Realistic AI-based voice assistant system using relationship setting |
US11062270B2 (en) * | 2019-10-01 | 2021-07-13 | Microsoft Technology Licensing, Llc | Generating enriched action items |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11043220B1 (en) | 2020-05-11 | 2021-06-22 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
US11907863B2 (en) * | 2020-07-24 | 2024-02-20 | International Business Machines Corporation | Natural language enrichment using action explanations |
US10930066B1 (en) | 2020-09-11 | 2021-02-23 | Mythical, Inc. | Systems and methods for using natural language processing (NLP) to automatically generate three-dimensional objects in a virtual space |
US11077367B1 (en) * | 2020-10-09 | 2021-08-03 | Mythical, Inc. | Systems and methods for using natural language processing (NLP) to control automated gameplay |
US11935527B2 (en) * | 2020-10-23 | 2024-03-19 | Google Llc | Adapting automated assistant functionality based on generated proficiency measure(s) |
US11893985B2 (en) * | 2021-01-15 | 2024-02-06 | Harman International Industries, Incorporated | Systems and methods for voice exchange beacon devices |
US11995457B2 (en) | 2022-06-03 | 2024-05-28 | Apple Inc. | Digital assistant integration with system interface |
US12032812B1 (en) * | 2023-06-27 | 2024-07-09 | Verizon Patent And Licensing Inc. | Systems and methods for intent-based augmented reality virtual assistant |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030191629A1 (en) * | 2002-02-04 | 2003-10-09 | Shinichi Yoshizawa | Interface apparatus and task control method for assisting in the operation of a device using recognition technology |
US20070022073A1 (en) * | 2005-07-08 | 2007-01-25 | Rakesh Gupta | Building plans for household tasks from distributed knowledge |
US20100145676A1 (en) * | 2008-12-09 | 2010-06-10 | Qualcomm Incorporated | Method and apparatus for adjusting the length of text strings to fit display sizes |
US20120265528A1 (en) * | 2009-06-05 | 2012-10-18 | Apple Inc. | Using Context Information To Facilitate Processing Of Commands In A Virtual Assistant |
US20140115456A1 (en) * | 2012-09-28 | 2014-04-24 | Oracle International Corporation | System for accessing software functionality |
US20140249830A1 (en) * | 2013-03-01 | 2014-09-04 | Nuance Communications, Inc. | Virtual medical assistant methods and apparatus |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100668297B1 (en) * | 2002-12-31 | 2007-01-12 | 삼성전자주식회사 | Method and apparatus for speech recognition |
US7415101B2 (en) * | 2003-12-15 | 2008-08-19 | At&T Knowledge Ventures, L.P. | System, method and software for a speech-enabled call routing application using an action-object matrix |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US8700385B2 (en) * | 2008-04-04 | 2014-04-15 | Microsoft Corporation | Providing a task description name space map for the information worker |
US10241752B2 (en) * | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US8943094B2 (en) * | 2009-09-22 | 2015-01-27 | Next It Corporation | Apparatus, system, and method for natural language processing |
US9082402B2 (en) * | 2011-12-08 | 2015-07-14 | Sri International | Generic virtual personal assistant platform |
US20130346068A1 (en) * | 2012-06-25 | 2013-12-26 | Apple Inc. | Voice-Based Image Tagging and Searching |
US8953764B2 (en) * | 2012-08-06 | 2015-02-10 | Angel.Com Incorporated | Dynamic adjustment of recommendations using a conversation assistant |
US20140164953A1 (en) * | 2012-12-11 | 2014-06-12 | Nuance Communications, Inc. | Systems and methods for invoking virtual agent |
US9384443B2 (en) * | 2013-06-14 | 2016-07-05 | Brain Corporation | Robotic training apparatus and methods |
- 2013-10-31: US application 14/069,074, patent US10055681B2, status Active
- 2018-08-20: US application 16/105,671, publication US20190057298A1, status Abandoned
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11394923B2 (en) * | 2016-10-05 | 2022-07-19 | Avaya Inc. | Embedding content of interest in video conferencing |
Also Published As
Publication number | Publication date |
---|---|
US10055681B2 (en) | 2018-08-21 |
US20150121216A1 (en) | 2015-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20190057298A1 (en) | | Mapping actions and objects to tasks |
US11099867B2 (en) | | Virtual assistant focused user interfaces |
US11823677B2 (en) | | Interaction with a portion of a content item through a virtual assistant |
US10545648B2 (en) | | Evaluating conversation data based on risk factors |
US10809876B2 (en) | | Virtual assistant conversations |
US11093536B2 (en) | | Explicit signals personalized search |
US20170277993A1 (en) | | Virtual assistant escalation |
JP6780001B2 (en) | | Automatic suggestions and other content for messaging applications |
US20190362252A1 (en) | | Learning user preferences in a conversational system |
US10446009B2 (en) | | Contextual notification engine |
US20140245140A1 (en) | | Virtual Assistant Transfer between Smart Devices |
CN104335234A (en) | | Systems and methods for integrating third party services with a digital assistant |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: NEXT IT CORPORATION, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BROWN, FRED A.; MILLER, TANYA M.; BROWN, MEGAN; AND OTHERS; REEL/FRAME: 047441/0808. Effective date: 20131105. Owner name: VERINT AMERICAS INC., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: NEXT IT CORPORATION; REEL/FRAME: 047441/0863. Effective date: 20180131 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |