CN112684913B

CN112684913B - Information correction method and device and electronic equipment

Info

Publication number: CN112684913B
Application number: CN202011608156.XA
Authority: CN
Inventors: 王林林
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2020-12-30
Filing date: 2020-12-30
Publication date: 2023-07-14
Anticipated expiration: 2040-12-30
Also published as: CN112684913A; WO2022143454A1

Abstract

The application discloses an information correction method, an information correction device, and electronic equipment, and belongs to the technical field of communication. The information correction method comprises: displaying, in a target area of an information editing interface, first text information converted from first voice information; displaying corresponding candidate words in a candidate information area of the information editing interface according to a received first input; obtaining first indication information according to an operation of a user in the candidate information area, the first indication information indicating a target candidate word among the candidate words; and correcting the first text information in the target area according to the target candidate word. In the method and the device, the user's input is converted into corresponding pinyin information and candidate words are generated for that pinyin, so that the user can conveniently correct recognition errors that occur during voice input, accurately obtain the intended content, and improve voice input efficiency.

Description

Information correction method and device and electronic equipment

Technical Field

The application belongs to the technical field of communication, and particularly relates to an information correction method, an information correction device and electronic equipment.

Background

At present, users frequently rely on a pinyin input method or a voice input method to express their ideas on a mobile terminal.

With the continuous improvement of speech recognition accuracy, voice input methods are favored by more and more users. However, although a voice input method is fast and easy to use, it also has a serious disadvantage: when a user speaks a long sentence, the recognized sentence often contains wrongly recognized words, so the accuracy of the recognized text information is low and the user has to modify those words.

In the prior art, when modifying text information obtained through voice recognition, the user can only correct the wrong words by typing text with a keyboard input method, which reduces input efficiency.

Disclosure of Invention

An embodiment of the application aims to provide an information correction method, an information correction device and electronic equipment, which can solve the problem that the input efficiency of voice input methods in the prior art still needs improvement.

In order to solve the technical problems, the application is realized as follows:

In a first aspect, an embodiment of the present application provides an information correction method, including:

displaying the first text information generated according to the conversion of the first voice information in a target area in an information editing interface;

displaying corresponding candidate words in a candidate information area in the information editing interface according to the received first input;

obtaining first indication information according to an operation of a user in the candidate information area, the first indication information indicating a target candidate word among the candidate words;

and correcting the first text information in the target area according to the target candidate word.

In a second aspect, an embodiment of the present application provides an information correction apparatus, including:

the display module is used for displaying the first text information generated according to the conversion of the first voice information in a target area in the information editing interface;

the processing module is used for displaying corresponding candidate words in the candidate information area in the information editing interface according to the received first input;

the target determining module is used for obtaining first indication information according to the operation of a user in the candidate information area, wherein the first indication information is used for indicating target candidate words in the candidate words;

and the information correction module is used for correcting the first text information in the target area according to the target candidate word.

In a third aspect, embodiments of the present application provide an electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, the program or instruction implementing the steps of the method according to the first aspect when executed by the processor.

In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the first aspect.

In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and where the processor is configured to execute a program or instructions to implement a method according to the first aspect.

In the embodiments of the application, the user's input is converted into corresponding pinyin information and candidate words are expanded from that pinyin, so that the user can conveniently correct recognition errors that occur during voice input and quickly and accurately obtain the intended content, which improves voice input efficiency and the user experience.

Drawings

FIG. 1 is one of the flowcharts of the information correction method of the embodiment of the present application;

FIG. 2 is one of the schematic diagrams of the speech input interface of the embodiments of the present application;

FIG. 3 is one of the schematic diagrams of the information editing interface in the correction mode of the embodiment of the present application;

FIG. 4 is a schematic diagram of an information editing interface deletion operation in a correction mode according to an embodiment of the present application;

FIG. 5 is one of the schematic diagrams of the process of determining target candidate words in the correction mode according to the embodiment of the present application;

FIG. 6 is a second schematic diagram of a process for determining target candidate words in a correction mode according to an embodiment of the present application;

FIG. 7 is one of the schematic diagrams of the completion of correction in the correction mode of the embodiment of the present application;

FIG. 8 is a second flowchart of a method for information correction according to an embodiment of the present application;

FIG. 9 is a second schematic diagram of a voice input interface according to an embodiment of the present application;

FIG. 10 is a second schematic diagram of an information editing interface in a correction mode according to an embodiment of the present application;

FIG. 11 is a second schematic diagram of the deletion operation of the information editing interface in the correction mode according to the embodiment of the present application;

FIG. 12 is a third diagram illustrating a process of determining target candidate words in a correction mode according to an embodiment of the present application;

FIG. 13 is a second schematic diagram of the completion of correction in the correction mode of the embodiment of the present application;

FIG. 14 is a third flowchart of a method for information correction according to the embodiment of the present application;

FIG. 15 is a block diagram of a display device of an embodiment of the present application;

FIG. 16 is a block diagram of an electronic device of an embodiment of the present application;

FIG. 17 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.

The terms first, second and the like in the description and in the claims, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type and not limited to the number of objects, e.g., the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.

In order to enable those skilled in the art to better understand the embodiments of the present application, the following description is provided.

Multi-level candidates: the user has entered a series of correct pinyin, but the desired sentence is not in the candidate list, so the user has to select lower-order words one by one to compose the desired sentence. This process is called multi-level candidate selection.

Multi-order words (n-gram): a word with a relatively complete meaning is called a single-order word (unigram) or first-order word, such as "United States", "Hangzhou", and "arch bridge". A word formed by combining N single-order words is called a multi-order word. For example, "Hangzhou arch bridge" consists of two single-order words and is therefore a second-order word (bigram).

Higher-order words and sub-order words: these are relative concepts. If a multi-order word A is composed of other multi-order words or single-order words, then A is a higher-order word of those words, and those words are lower-order words of A, also called sub-order words of A. For example, "Hangzhou arch bridge" is a higher-order word of "Hangzhou", and "Hangzhou" and "arch bridge" are both lower-order words of "Hangzhou arch bridge".
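The definitions above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a multi-order word is stored as a tuple of its single-order words and that a sub-order word must appear as a contiguous run inside the higher-order word; the names `order` and `is_sub_order_word` are hypothetical and do not come from the application.

```python
from typing import Tuple

# Assumption: a multi-order word is a tuple of its single-order words.
Word = Tuple[str, ...]


def order(word: Word) -> int:
    """Order of a word = number of single-order words it is built from."""
    return len(word)


def is_sub_order_word(lower: Word, higher: Word) -> bool:
    """True if `lower` is strictly shorter than `higher` and occurs in it contiguously."""
    n, m = len(lower), len(higher)
    if n == 0 or n >= m:
        return False
    return any(higher[i:i + n] == lower for i in range(m - n + 1))


bigram: Word = ("Hangzhou", "arch bridge")
print(order(bigram))                               # order of the bigram
print(is_sub_order_word(("Hangzhou",), bigram))    # "Hangzhou" is a sub-order word
```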

Weighted finite-state transducer (WFST): an efficient white-box graph model. Given a series of inputs, the built model automatically selects the paths that satisfy each input condition and emits the outputs along those paths in order, from a start state to a final state. It can be used to convert pinyin characters into words, words into sentences, and so on. In addition, several models can be composed into a single model, which effectively cascades them; for example, pinyin characters can be converted directly into sentences.
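As a rough illustration of how such a model selects a path, the following Python sketch implements a toy transducer whose arcs carry an input symbol, an output symbol, and a cost, and finds the lowest-cost accepting path for a pinyin sequence. It is a simplified stand-in, not the model used in the application: the class `ToyWFST`, the arc layout, the English glosses, and all costs are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

# (input symbol, output symbol, cost, next state); an empty output emits nothing.
Arc = Tuple[str, str, float, int]


class ToyWFST:
    def __init__(self, arcs: Dict[int, List[Arc]], start: int, finals: Set[int]):
        self.arcs = arcs
        self.start = start
        self.finals = finals

    def best_path(self, inputs: List[str]) -> Optional[Tuple[float, List[str]]]:
        """Return (total cost, outputs) of the cheapest accepting path, or None."""
        frontier = {self.start: (0.0, [])}  # state -> cheapest (cost, outputs) so far
        for sym in inputs:
            nxt: Dict[int, Tuple[float, List[str]]] = {}
            for state, (cost, outs) in frontier.items():
                for in_sym, out_sym, weight, dest in self.arcs.get(state, []):
                    if in_sym != sym:
                        continue
                    cand = (cost + weight, outs + ([out_sym] if out_sym else []))
                    if dest not in nxt or cand[0] < nxt[dest][0]:
                        nxt[dest] = cand
            frontier = nxt
        accepted = [v for s, v in frontier.items() if s in self.finals]
        return min(accepted, key=lambda v: v[0]) if accepted else None


# Hypothetical arcs: the second syllable of each word emits the word.
toy = ToyWFST(
    arcs={
        0: [("fang", "", 0.0, 1), ("wu", "", 0.0, 2)],
        1: [("yi", "epidemic prevention", 1.0, 0), ("yi", "overflow prevention", 3.0, 0)],
        2: [("zi", "materials", 1.5, 0), ("zi", "room", 2.0, 0)],
    },
    start=0,
    finals={0},
)
result = toy.best_path(["fang", "yi", "wu", "zi"])
print(result)
```

A production system would more likely build and compose such models with a dedicated WFST toolkit; the dictionary-based search here only shows the idea of following weighted arcs from a start state to a final state.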

The method, the device and the electronic equipment for information correction provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.

As shown in fig. 1, an embodiment of the present application provides an information correction method, including:

step 11: and displaying the first text information generated according to the conversion of the first voice information in a target area in the information editing interface.

Here, after the user inputs a whole sentence by using the voice input method, the recognized text is displayed directly on the screen in the target area, which takes full advantage of the fact that voice input requires few key presses.

Step 12: and displaying the corresponding candidate words in the candidate information area in the information editing interface according to the received first input.

Here, when the user finds that the recognized text differs from the intended text and needs to be corrected, the user may trigger the correction mode through an operation, for example by clicking or long-pressing the text in the target area. In the correction mode, the recognized text may be corrected based on the received first input.

For example, when the first input is voice, the user may perform the correction by voice. Candidate words are generated from the voice input by the user at this time (namely, the second voice information) for the user to select. Specifically, in the correction mode, the speech that the user inputs for the intended word is recognized and converted into pinyin, and multi-order candidate words are then expanded from the pinyin so that the user can select the correct target candidate word over multiple levels.

For another example, when the first input is second text information selected by the user in the target area, the pinyin of the second text information (the second target pinyin) can be identified, and candidate words corresponding to the second target pinyin are displayed for the user to select.

Step 13: and according to the operation of the user in the candidate information area, obtaining first indication information, wherein the first indication information is used for indicating a target candidate word in the candidate words.

In this step, the first indication information may be obtained from a click or a long press by the user. For example, when the user clicks a candidate word, the first indication information is obtained, and the clicked candidate word is used as the target candidate word.

Step 14: and correcting the first text information in the target area according to the target candidate word.
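Steps 11 to 14 can be summarized in a short Python sketch. The `EditingInterface` class, the `correct_information` function, and the three callbacks are hypothetical names introduced only to show the order of the steps; the sample sentence and the candidate table are likewise assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EditingInterface:
    target_area: str = ""                                      # shows the recognized text
    candidate_area: List[str] = field(default_factory=list)    # shows candidate words


def correct_information(
    recognized_text: str,
    first_input: str,
    lookup_candidates: Callable[[str], List[str]],
    choose: Callable[[List[str]], int],
    apply_correction: Callable[[str, str], str],
) -> EditingInterface:
    ui = EditingInterface()
    ui.target_area = recognized_text                    # step 11: display first text information
    ui.candidate_area = lookup_candidates(first_input)  # step 12: display candidate words
    index = choose(ui.candidate_area)                   # step 13: first indication information
    target = ui.candidate_area[index]
    ui.target_area = apply_correction(ui.target_area, target)  # step 14: correct the text
    return ui


# Assumed example: "room" was recognized where "supplies" was intended (both wu'zi).
ui = correct_information(
    recognized_text="the room arrived",
    first_input="wu'zi",
    lookup_candidates=lambda pinyin: {"wu'zi": ["room", "supplies"]}.get(pinyin, []),
    choose=lambda cands: cands.index("supplies"),
    apply_correction=lambda text, word: text.replace("room", word, 1),
)
print(ui.target_area)
```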

It should be noted that there are only 407 pinyin syllables in Chinese. Even when tones are counted, this is far fewer than the 30000 Chinese characters, so the accuracy of converting speech into pinyin is high.

Using this characteristic, when a user inputs the desired words with the voice input method, the speech is recognized, converted into text, and displayed directly on the screen, which takes full advantage of the fact that voice input requires few key presses. When the user needs to modify the text after it has been displayed, the correction mode can be started. In this mode, the user can still speak the word or sentence to be input. At this point, the embodiment does not convert the speech into Chinese characters and display them directly; instead, it recognizes the pinyin corresponding to the speech and then expands multi-order candidate words from the pinyin for the user to select. Because the voice information is converted into candidate words, the user can select the target candidate word through multi-level candidates and correct the text without pressing many keys for pinyin input, which avoids breaking the rhythm of voice input and improves voice input efficiency.

Optionally, the information correction method further includes: starting the correction mode when it is determined, according to an operation of the user in the information editing interface, that a first target position lies between the text information displayed in the target area, or that second text information in the target area is selected; wherein the first text information in the target area includes the second text information.

For example, the correction mode can be started when the user moves the cursor to a position in the target area that needs correction, or when the user long-presses characters in the target area.
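The trigger condition might be checked as in the following sketch. It assumes that "between the text information" means a cursor index strictly inside the displayed text and that a selection is a non-empty `(start, end)` range; the function name and parameters are hypothetical.

```python
from typing import Optional, Tuple


def should_start_correction_mode(
    text: str,
    cursor: Optional[int] = None,
    selection: Optional[Tuple[int, int]] = None,
) -> bool:
    """Decide whether a user operation in the target area starts the correction mode."""
    if selection is not None:
        start, end = selection
        # Second text information is selected: a non-empty range inside the text.
        return 0 <= start < end <= len(text)
    if cursor is not None:
        # First target position lies between displayed characters (assumed: not at either end).
        return 0 < cursor < len(text)
    return False


print(should_start_correction_mode("abcd", cursor=2))
print(should_start_correction_mode("abcd", cursor=4))
```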

Optionally, displaying, according to the received first input, the corresponding candidate word in the candidate information area in the information editing interface includes: obtaining a first target pinyin according to the received second voice information; and displaying the candidate words corresponding to the second voice information in the candidate information area according to the first target pinyin.

Specifically, because the language model included in the WFST converts pinyin to phrases well, the embodiments of the present application can use the WFST to convert pinyin into words and then words into phrases, provide the resulting candidate words to the user, and arrange them from high to low by weight (such as order or probability). It should be noted that these candidate words include not only high-order words that match all of the pinyin but also high-frequency low-order words that match part of the pinyin. The user thus avoids both the trouble of typing the pinyin of each low-order word on the keyboard one at a time and the higher selection cost caused by the high ambiguity of short words. The words the user wants can therefore be obtained quickly to correct the erroneous text.
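The mix of full-match and partial-match candidates described above can be sketched as follows. A plain dictionary stands in for the WFST here, and the lexicon entries and weights are invented for illustration; only the idea of collecting words that match all or a leading part of the pinyin and sorting them by weight comes from the text.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: pinyin syllables -> [(word, weight)].
LEXICON: Dict[Tuple[str, ...], List[Tuple[str, float]]] = {
    ("fang", "yi", "wu", "zi"): [("put one room", 0.20)],
    ("fang", "yi"): [("epidemic prevention", 0.60), ("overflow prevention", 0.05)],
    ("wu", "zi"): [("room", 0.30), ("materials", 0.50)],
}


def candidates(syllables: List[str]) -> List[Tuple[str, int, float]]:
    """Return (word, syllables matched, weight), highest weight first."""
    found = []
    for n in range(len(syllables), 0, -1):
        # n == len(syllables) matches all the pinyin; smaller n matches a leading part.
        for word, weight in LEXICON.get(tuple(syllables[:n]), []):
            found.append((word, n, weight))
    found.sort(key=lambda c: -c[2])
    return found


for word, matched, weight in candidates(["fang", "yi", "wu", "zi"]):
    print(word, matched, weight)
```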

This embodiment requires relatively few user operations during input, and is suitable for correcting relatively long phrases (e.g., of 2nd to 4th order) as well as relatively short, frequently used words (typically of 1st or 2nd order, and occasionally of 3rd order). The user can select the target candidate word from the candidate words of each order, and the process may be completed by consuming the pinyin over multiple levels.

It should be noted that, in the process of determining the first target pinyin from the second voice information, the user may confirm whether the pinyin produced by voice recognition is correct. Optionally, the received second voice information is converted into corresponding pinyin information, which is displayed in the pinyin information area of the information editing interface. That is, the pinyin recognized from the second voice information may be displayed in a certain area (e.g., a pinyin information area) of the information editing interface, where the user may edit and modify it. The pinyin displayed in that area is then used as the first target pinyin, and the candidate words corresponding to the first target pinyin are displayed in the candidate information area. In this way, if the user finds while confirming the pinyin that it was recognized incorrectly, the user can modify it through the keyboard; if the pinyin was recognized correctly, the user can directly select the target candidate word in the candidate information area.

In this embodiment, the confirmation of the target candidate word may proceed as follows: each time the user clicks a candidate word, the pinyin corresponding to that candidate word in the pinyin information area is replaced by the candidate word. This repeats until all the pinyin information in the pinyin information area has been replaced, at which point the candidate words shown in the pinyin information area are used as the target candidate word.

For example, in the correction mode, after the user speaks "epidemic prevention materials", the pinyin string "fang'yi'wu'zi" is displayed in the pinyin information area, and candidate words such as "house", "epidemic prevention", and "anti-overflow" are displayed in the candidate information area. After the user clicks "epidemic prevention", the pinyin "fang'yi" is consumed and the pinyin information area shows "epidemic prevention wu'zi". The second selection then begins: candidate words corresponding to "wu'zi", such as "house" and "materials", are displayed in the candidate information area. After the user clicks "materials", the pinyin string "fang'yi'wu'zi" is completely consumed, the words "epidemic prevention" and "materials" together are confirmed as the target candidate word, and the target candidate word is displayed directly on the screen.
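The two-level selection in this example can be traced with a small Python sketch. The `PinyinBuffer` class is a hypothetical model of the pinyin information area: each selection replaces the leading pinyin it consumes with the chosen word, and the selection is complete once no pinyin remains.

```python
from typing import List


class PinyinBuffer:
    """Hypothetical model of the pinyin information area during multi-level selection."""

    def __init__(self, syllables: List[str]):
        self.committed: List[str] = []        # words already chosen by the user
        self.remaining: List[str] = list(syllables)

    def select(self, word: str, consumed: int) -> bool:
        """Commit `word`, consume its pinyin, and report whether all pinyin is consumed."""
        if not 1 <= consumed <= len(self.remaining):
            raise ValueError("candidate must consume part of the remaining pinyin")
        self.committed.append(word)
        del self.remaining[:consumed]
        return not self.remaining

    def display(self) -> str:
        """Chosen words followed by the pinyin that is still unconsumed."""
        parts = self.committed + (["'".join(self.remaining)] if self.remaining else [])
        return " ".join(parts)


buf = PinyinBuffer("fang'yi'wu'zi".split("'"))
print(buf.display())                          # fang'yi'wu'zi
buf.select("epidemic prevention", 2)          # first selection consumes fang'yi
print(buf.display())                          # epidemic prevention wu'zi
done = buf.select("materials", 2)             # second selection consumes wu'zi
print(buf.display(), done)
```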

Optionally, displaying, according to the received first input, the corresponding candidate word in the candidate information area in the information editing interface includes: according to the received second voice information and a preset rule, displaying candidate words corresponding to the second voice information in the candidate information area;

Wherein the preset rule comprises: when the candidate words are phrases, a higher-order word among the candidate words is displayed before a lower-order word, and a sub-order word of the higher-order word is displayed before the other words of the same order as that sub-order word; a higher-order word contains more words than a lower-order word, two same-order words contain the same number of words, and a sub-order word of a higher-order word is a word contained in that higher-order word; the second voice information is the voice information corresponding to a higher-order word containing the target candidate word; and the first indication information is obtained according to a long-press operation.

For example, when the user needs to input the word "dominant", the candidate word "dominant" is obtained more easily by speaking a longer common phrase that contains it, such as "dominant gene", than by speaking "dominant" directly. In this embodiment, the first indication information is obtained from the candidate word that the user long-presses, and the long-pressed candidate word is used as the target candidate word.

This addresses the case where the word to be input is short and less common. According to language model theory, the more words a phrase contains, the lower its ambiguity and the fewer higher-order candidate words there are to choose from. That is, the longer and more common a phrase is, the lower the cost of its selection operation.

Therefore, for a first-order or second-order word that is not very commonly used, the user can speak a longer common phrase that contains it, which is a higher-order word of the word to be input. The user can then select the corresponding sub-order word (the word the user actually wants) from the multi-order candidate words and display it directly on the screen, so that the sub-order word is obtained accurately and quickly by way of the longer phrase.

Specifically, in this embodiment, the language model in the correction mode gives more weight to more common combinations and ranks them toward the front of the candidate words. Under the preset rule, same-order words are not ordered purely by probability: the sub-order words contained in a higher-order word appear before the other words of the same order. For example, the single-order word "epidemic prevention" is part of the higher-order word "epidemic prevention measure", so among the first-order words, "epidemic prevention" is guaranteed to be arranged before "put one" or "overflow prevention".
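One way to express this ordering rule is as a three-part sort key, sketched below. Candidates are assumed to be tuples of single-order words paired with a probability; the probabilities are invented, and the tie-break by probability within each group is an assumption rather than something the text specifies.

```python
from typing import List, Tuple

Word = Tuple[str, ...]
Candidate = Tuple[Word, float]   # (word as single-order parts, assumed probability)


def contains(higher: Word, lower: Word) -> bool:
    """True if `lower` occurs as a contiguous run inside `higher`."""
    n = len(lower)
    return any(higher[i:i + n] == lower for i in range(len(higher) - n + 1))


def rank(cands: List[Candidate]) -> List[Candidate]:
    def is_sub_order(word: Word) -> bool:
        return any(len(other) > len(word) and contains(other, word) for other, _ in cands)

    # 1) higher order first; 2) sub-order words of a higher-order candidate first;
    # 3) otherwise by descending probability (assumed tie-break).
    return sorted(cands, key=lambda c: (-len(c[0]), not is_sub_order(c[0]), -c[1]))


cands: List[Candidate] = [
    (("put one",), 0.5),
    (("epidemic prevention", "measure"), 0.3),
    (("overflow prevention",), 0.1),
    (("epidemic prevention",), 0.2),
]
ranked = [" ".join(word) for word, _ in rank(cands)]
print(ranked)
```

In this assumed data, "epidemic prevention" is placed ahead of "put one" even though its probability is lower, because it is contained in the higher-order candidate.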

Optionally, the information correction method further includes: and converting the received second voice information into corresponding pinyin information and displaying the corresponding pinyin information in the information editing interface.

For example, after the user speaks "dominant gene", the speech can be recognized and converted into the corresponding pinyin, which is displayed in the pinyin information area, and the candidate words corresponding to that pinyin are displayed in the candidate information area.

For example, after speaking "dominant gene", the user finds the lower-order word "dominant" among the candidate words and long-presses it; "dominant" is then displayed directly on the screen at the target position, and the redundant pinyin "ji'yin" can be discarded. In this way, the user avoids typing pinyin on the keyboard multiple times and obtains the desired word relatively quickly, which improves input efficiency.
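The difference between clicking and long-pressing a candidate can be sketched as below. The function name and return layout are hypothetical, and the syllables given for "dominant" (`xian`, `xing`) are an assumption, since the text only spells out the leftover pinyin ji'yin.

```python
from typing import List, Tuple


def commit_candidate(
    action: str, word: str, consumed: int, syllables: List[str]
) -> Tuple[str, List[str], List[str]]:
    """Return (committed word, pinyin left for further selection, pinyin discarded)."""
    if not 1 <= consumed <= len(syllables):
        raise ValueError("candidate must consume part of the pinyin")
    leftover = syllables[consumed:]
    if action == "long_press":
        # Long press: the word goes straight to the screen and the rest is dropped.
        return word, [], leftover
    if action == "click":
        # Click: the rest stays in the pinyin information area for the next selection.
        return word, leftover, []
    raise ValueError(f"unknown action: {action}")


# Assumed pinyin for "dominant gene"; only ji'yin is given in the text.
pinyin = ["xian", "xing", "ji", "yin"]
print(commit_candidate("long_press", "dominant", 2, pinyin))
```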

Optionally, displaying, according to the received first input, the corresponding candidate word in the candidate information area in the information editing interface includes: determining a second target pinyin according to the second text information under the condition that the second text information in the target area is selected; displaying candidate words corresponding to the second target pinyin in the candidate information area according to the second target pinyin; wherein the first text information in the target area includes the second text information.

For example, after inputting a sentence by voice, the user finds that the recognized text contains the wrongly recognized word "room" where the correct text is "supplies". The user can long-press the word "room" in the target area. The selected word "room" is converted into the pinyin "wu'zi", and candidate words corresponding to "wu'zi", such as "room" and "supplies", are displayed in the candidate information area. After the user clicks "supplies", "supplies" replaces the word "room" that the user selected in the target area.
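A minimal sketch of this selection-based path follows. The two lookup tables stand in for whatever text-to-pinyin and pinyin-to-word conversion an implementation would use; their contents, the sample sentence, and the function names are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup tables keyed by the English glosses used in this description.
PINYIN_OF: Dict[str, Tuple[str, ...]] = {"room": ("wu", "zi")}
WORDS_FOR: Dict[Tuple[str, ...], List[str]] = {("wu", "zi"): ["room", "supplies"]}


def candidates_for_selection(text: str, start: int, end: int) -> List[str]:
    """Convert the selected text to pinyin, then list the words sharing that pinyin."""
    selected = text[start:end]              # second text information
    second_target_pinyin = PINYIN_OF[selected]
    return WORDS_FOR[second_target_pinyin]


def replace_selection(text: str, start: int, end: int, target: str) -> str:
    """Replace the selected span with the target candidate word."""
    return text[:start] + target + text[end:]


text = "the room arrived"
start = text.index("room")
end = start + len("room")
print(candidates_for_selection(text, start, end))
print(replace_selection(text, start, end, "supplies"))
```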

Optionally, the correcting the first text information in the target area according to the target candidate word includes: deleting the text information corresponding to the second indication information in the target area through the deleting control; determining a second target position according to the second indication information, and displaying target candidate words at the second target position; the second indication information is obtained according to the operation of a user in the target area.

For example, the second indication information may be obtained from the user's operation of clicking the "forward deletion control k2" or "backward deletion control k3" on the candidate information area, which deletes the wrongly recognized text in the target area. Specifically, at least one character adjacent to the target position in the information editing interface can be deleted through the deletion control.

Optionally, the information correction method further includes: and according to the third indication information, positioning the sentence end of the text information in the target area, and exiting the correction mode.

Here, the third indication information may be obtained from the user's operation of clicking the "correction completion confirmation control k1" on the candidate information area. With the correction completion confirmation control k1, the user can conveniently move the cursor to the end of the paragraph and exit the correction mode.

Optionally, before correcting the first text information in the target area according to the target candidate word, the information correction method further includes: displaying a deletion control and a correction completion confirmation control k1 on the candidate information area, wherein the deletion control comprises at least one of a forward deletion control k2 and a backward deletion control k3;

the forward deletion control k2 is used for receiving an instruction to delete the character adjacent to the left of the first target position in the information editing interface;

the backward deletion control k3 is used for receiving an instruction to delete the character adjacent to the right of the first target position in the information editing interface;

the correction completion confirmation control k1 is used for receiving an instruction to move the cursor to the sentence end of the text information in the target area;

and the first target position is determined according to the operation of the user in the information editing interface.

In this embodiment, the forward deletion control k2 deletes the character adjacent to the left of the target position in the information editing interface; the backward deletion control k3 deletes the character adjacent to the right of the target position; and the correction completion confirmation control k1 moves the cursor to the sentence end of the text information in the target area.

That is, in the correction mode, a backward deletion control k3 is provided in addition to the pinyin keypad and the voice button. When the user needs to delete a wrong word segment on the screen (i.e., the characters adjacent to the target position in the information editing interface), the text adjacent to the target position can be deleted backward as well as forward.

The correction mode also provides a correction completion confirmation control k1, which exits the correction mode and returns to the normal mode with one key press. For example, when the user has corrected all the wrong words, pressing the "correction completion confirmation control k1" moves the cursor to the end of the sentence in the target area, ready for a new input. This shortcut key effectively improves input efficiency.
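The behavior of the three controls can be modeled with a cursor index into the displayed text, as in the sketch below. The `TargetArea` class and its method names are hypothetical; only the effects of k1, k2, and k3 follow the description.

```python
from dataclasses import dataclass


@dataclass
class TargetArea:
    text: str
    cursor: int                    # the first target position, as an index into text
    correction_mode: bool = True

    def delete_forward(self) -> None:
        """k2: delete the character adjacent to the left of the cursor."""
        if self.cursor > 0:
            self.text = self.text[:self.cursor - 1] + self.text[self.cursor:]
            self.cursor -= 1

    def delete_backward(self) -> None:
        """k3: delete the character adjacent to the right of the cursor."""
        if self.cursor < len(self.text):
            self.text = self.text[:self.cursor] + self.text[self.cursor + 1:]

    def confirm_done(self) -> None:
        """k1: move the cursor to the end of the text and exit the correction mode."""
        self.cursor = len(self.text)
        self.correction_mode = False


area = TargetArea(text="abXYcd", cursor=3)   # cursor sits between X and Y
area.delete_forward()                        # removes X
area.delete_backward()                       # removes Y
area.confirm_done()
print(area)
```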

In order to facilitate an intuitive understanding of the embodiments of the present application, the embodiments of the present application are explained below in a graphical manner.

Embodiment one:

As shown in FIG. 2 to FIG. 7, the user wants to input the sentence: "at present, the household epidemic prevention materials are all filled up".

As shown in FIG. 2, the user speaks the sentence through the voice input method, the speech is erroneously recognized as "the house is now full at home", and the recognized text is displayed directly on the screen. The user then finds that the words recognized as "put one room" differ from the intended "epidemic prevention materials" and need to be corrected.

As shown in FIG. 3, the user may move the cursor to the error position, which triggers the correction mode. In this mode, the user may press the "forward deletion control k2" or the "backward deletion control k3" shown in FIG. 3 to delete the wrongly recognized text. In this way, at least one character adjacent to the target position can be conveniently deleted.

As shown in FIG. 4, after the erroneous text is deleted, the target area in the information editing interface displays "all at home now". To correctly input the words "epidemic prevention materials", the user may then click the voice recognition button at the bottom and again speak "epidemic prevention materials".

As shown in fig. 5, the speech recognition engine of the correction mode may share most of its modules with the speech recognition engine of the normal mode, or the two may be independent engines. For example, the correction-mode engine is specifically optimized for converting speech to pinyin, which ensures high accuracy for that conversion. For the pinyin string "fang'yi'wu'zi", the application may use a WFST model to convert the string into a number of multi-order words; for example, "put one room" is a bigram consisting of the two single-order words "put one" and "room". The pinyin is also converted into lower-order words, such as the single-order words "epidemic prevention" and "overflow prevention". At this time, since the interface shown in fig. 5 offers "epidemic prevention" but not "epidemic prevention materials", the user may select the word "epidemic prevention" at this stage (for example, by clicking it); the corresponding pinyin "fang'yi" is consumed and the pinyin "wu'zi" remains.

As shown in fig. 6, after "epidemic prevention" is selected, a second-level selection interface is entered. The word "supplies" can now be selected, which consumes the corresponding pinyin "wu'zi". The whole pinyin string "fang'yi'wu'zi" has then been consumed, and the user's selections "epidemic prevention" and "supplies" are displayed in full. It should be noted that the more common a phrase input by voice is, the more likely the language model is to rank the corresponding multi-order or single-order word near the front. If the user considers that all words have been corrected, the user can press the correction completion confirmation control k1 shown in fig. 6, which moves the cursor to the end of the paragraph and automatically exits the correction mode.

As shown in fig. 7, after exiting the correction mode, a new voice input can be started by pressing the voice recognition key.

In this embodiment, a multi-level candidate function is added to the voice input method. When a small, inaccurately recognized part of a passage entered by voice (or entered with the pinyin input method) is to be corrected, the user speaks the desired word, the speech is recognized and converted into pinyin, and multi-level candidate words are then expanded from the pinyin, so that the user can select the correct target candidate words over several levels.
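By way of illustration only, the level-by-level selection described above can be sketched as follows. The sketch assumes a small hand-written lookup table in place of the trained WFST; the names `LEXICON`, `candidates`, and `select_words` are hypothetical, and the candidate words are written as the English glosses used in this description.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (an assumption for illustration): pinyin syllables ->
# candidate words, written as English glosses. A trained WFST would replace it.
LEXICON: Dict[Tuple[str, ...], List[str]] = {
    ("fang", "yi", "wu", "zi"): ["put one room"],
    ("fang", "yi"): ["epidemic prevention", "put one", "overflow prevention"],
    ("wu", "zi"): ["supplies", "room"],
}


def candidates(remaining: List[str]) -> List[Tuple[str, int]]:
    """List (word, syllables_consumed) for each lexicon entry matching a
    prefix of the remaining pinyin, longest match first."""
    found = []
    for length in range(len(remaining), 0, -1):
        for word in LEXICON.get(tuple(remaining[:length]), []):
            found.append((word, length))
    return found


def select_words(pinyin: List[str], choices: List[str]) -> List[str]:
    """Apply the user's selections level by level until all pinyin is consumed."""
    remaining = list(pinyin)
    committed: List[str] = []
    for choice in choices:
        options = dict(candidates(remaining))  # word -> syllables it consumes
        if choice not in options:
            raise ValueError(f"{choice!r} is not offered for {remaining}")
        committed.append(choice)
        remaining = remaining[options[choice]:]
    if remaining:
        raise ValueError(f"unconsumed pinyin: {remaining}")
    return committed


pinyin = "fang'yi'wu'zi".split("'")
first_level = candidates(pinyin)
result = select_words(pinyin, ["epidemic prevention", "supplies"])
print(result)
```

In this sketch, selecting "epidemic prevention" consumes "fang'yi", and selecting "supplies" at the second level consumes the remaining "wu'zi".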

The flowchart of the above embodiment is shown in fig. 8, and specifically includes the following steps:

Step 801: the voice input by the user (usually a sentence) is recognized, converted into text, and displayed directly on the screen, which makes full use of the fact that voice input needs few key presses.

Step 802: determine whether the text on the screen contains wrong words; if there is no error, go to step 803; if there is an error, go to step 804.

Step 803: the next sentence is input by voice.

Step 804: locate the target position (by moving the cursor), i.e., the position of one of the erroneous words. When the cursor is moved to the wrong word, the correction mode is started. In addition to the forward deletion control k2, a key for deleting text backward (namely, the backward deletion control k3) is provided, so that the wrong text can be deleted more conveniently.

Step 805: the user speaks the text to be input.

Step 806: the optimized speech recognition engine of the embodiment of the application is aimed mainly at converting phrases rather than long sentences, and it converts the speech into pinyin rather than directly into Chinese characters. The reason for this scheme is that there are only about 407 distinct pinyin syllables, and even when tones are counted this is far fewer than the 30000 Chinese characters, so converting speech into a pinyin string is easier and more accurate than converting it into a string of Chinese characters.

Step 807: the user confirms whether the pinyin produced by the voice recognition is correct; if so, go to step 809; otherwise, go to step 808.

Step 808: if the pinyin is incorrectly recognized, the user can still modify it via the keypad. This should be avoided as far as possible, however, because it increases the number of key operations.

Step 809: for the correct pinyin (i.e., the first target pinyin), the trained WFST converts the pinyin into multi-level candidate words and ranks them from high to low according to weights (e.g., order, probability, etc.). This WFST may be a composition of two models: a model mapping pinyin to first-order words, and a WFST language model over those words.

Step 810: from the candidate words, the user clicks the word that matches the target word and is as long as possible, which consumes the corresponding pinyin; this continues until all the pinyin is consumed and the desired word has been put on the screen.

Step 811: judging whether the pinyin is completely consumed; if yes, go to step 812; if not, step 810 is performed, i.e., the candidate word continues to be selected, consuming the corresponding pinyin.

Step 812: when all the pinyin has been consumed, the word or phrase to be input has been put on the screen.

Step 813: the user can continue to confirm whether the text of the target area in the information editing interface is correct or not, and check whether other error fragments exist in the sentence. If yes, go to step 804; if not, then step 814 is performed. That is, when there are a plurality of errors in the sentence, steps 804 to 813 may be repeatedly performed until all the erroneous fragments of the sentence are corrected.

Step 814: when the user confirms that the sentence is corrected, the correction completion confirmation control k1 can be pressed down, at this time, the cursor can automatically move to the tail part of the text in the target area, and the next sentence is ready to be input, so that the positioning operation of moving the cursor can be avoided.

In the above embodiment, for 1 error in total, the user needs to move the cursor 1 time, enter voice 1 time, and press keys 7 times. The 7 key operations comprise deleting the wrong characters 4 times (deleting "put", "one", "room", "son", respectively), pressing the voice key 1 time, and selecting words at two levels 2 times. In contrast, if the same text were entered with the pinyin input method, many key presses would be needed even with abbreviated spelling; with a traditional voice input method, the user would have to move the cursor to the wrong word and type pinyin on the keyboard to correct it, which also takes many key presses and interrupts the rhythm of voice input. In this embodiment, most of the operations are voice input and word selection, and key input is reduced to a small number of presses, so that a correct sentence can be obtained with less cursor positioning and fewer key presses, and voice input efficiency is significantly improved. With the information correction method provided by the application, the user can quickly correct wrong words in the text, achieving fast and accurate input of large amounts of text.
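The key-press count given for embodiment one can be checked with a short tally. The figures are those stated in this description; the category labels are chosen here purely for illustration.

```python
# Key presses counted in embodiment one (figures taken from the description).
key_presses = {
    "delete wrong characters": 4,   # "put", "one", "room", "son"
    "voice key": 1,
    "two-level word selection": 2,
}
total = sum(key_presses.values())
print(total)  # 7
```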

Embodiment two:

As shown in figs. 7 and 9 to 13, assume that the user still wants to input: "at present, the household epidemic prevention materials are all filled up".

As shown in fig. 9, the user inputs a passage of speech using the voice input method. Assume that it is erroneously recognized as "the translation materials are all filled in the home now", and the recognized text is displayed directly on the screen. At this time, the user finds that "translation" differs from the desired text, so correction is required.

As shown in fig. 10, the cursor may be moved to the error position, at which point the correction mode is triggered. The user may delete the word "translation" through the forward deletion control k2.

As shown in fig. 11, after the erroneous text is deleted, the target area in the information editing interface displays "all home materials are full at present". To correctly input the two characters of "epidemic prevention", the user can click the voice recognition key below and input voice again. At this time, to raise the probability that "epidemic prevention" is offered as a candidate, the user may choose to speak "epidemic prevention measure" instead of speaking "epidemic prevention" directly.

As shown in fig. 12, after the user speaks "epidemic prevention measure", the recognized pinyin string "fang'yi'cuo'shi" may be converted by a WFST model into a number of multi-order words; for example, "epidemic prevention measure" is a bigram composed of "epidemic prevention" and "measure". The string is also converted into some lower-order words; for example, "epidemic prevention" and "put one" are single-order words. Typically, higher-order words are always arranged in front of lower-order words, and words of the same order are ranked mainly by WFST model probability. In this embodiment, however, words of the same order are not ranked purely by probability: according to a preset rule, a sub-order word contained in a higher-order word is placed in front of the other words of its own order. For example, "epidemic prevention" is part of the higher-order word "epidemic prevention measure", so among the first-order words "epidemic prevention" is guaranteed to be arranged in front of "put one" and "overflow prevention".
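The preset ranking rule can be illustrated with a minimal sketch. It assumes, for illustration only, that each candidate is stored as a tuple of its first-order words together with a model probability; the `Candidate` type, the `rank` function, and the probability values are hypothetical and are not taken from the application.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    units: Tuple[str, ...]  # first-order words making up the candidate
    prob: float             # made-up probability standing in for the WFST score


def is_part_of(part: Tuple[str, ...], whole: Tuple[str, ...]) -> bool:
    """True if `part` occurs as a contiguous run inside `whole`."""
    n = len(part)
    return any(whole[i:i + n] == part for i in range(len(whole) - n + 1))


def rank(cands: List[Candidate]) -> List[Candidate]:
    """Higher order first; within one order, words contained in a higher-order
    candidate come first; remaining ties are broken by probability."""
    def promoted(c: Candidate) -> bool:
        return any(len(o.units) > len(c.units) and is_part_of(c.units, o.units)
                   for o in cands)

    return sorted(cands, key=lambda c: (-len(c.units), not promoted(c), -c.prob))


cands = [
    Candidate(("put one",), 0.5),
    Candidate(("epidemic prevention",), 0.2),
    Candidate(("overflow prevention",), 0.1),
    Candidate(("epidemic prevention", "measure"), 0.3),
]
ranked = rank(cands)
for c in ranked:
    print(" ".join(c.units))
```

With these made-up probabilities, "epidemic prevention" is listed ahead of "put one" even though its probability is lower, because it is contained in the higher-order candidate "epidemic prevention measure".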

As shown in fig. 13, the user long-presses the target candidate word "epidemic prevention", so that "epidemic prevention" is put directly on the screen and the input of the correct word is completed. In this process, the unconsumed pinyin "cuo'shi" is automatically ignored. It should be noted that the more common the spoken phrase is, the more likely the language model is to offer that phrase and its sub-order words as candidates. If the user considers that all words have been corrected, the user can press the correction completion confirmation control k1 shown in fig. 13, which moves the cursor to the end of the paragraph and automatically exits the correction mode.
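The difference between clicking a candidate (embodiment one) and long-pressing it (embodiment two) can be sketched as follows. The `CorrectionSession` class and its method names are assumptions made for illustration; they do not describe an actual interface of the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CorrectionSession:
    """Hypothetical state of one correction: pinyin still to be consumed and
    the words already put on the screen."""
    remaining: List[str]
    committed: List[str] = field(default_factory=list)

    def tap(self, word: str, syllables: int) -> None:
        """Click: commit the word and consume only its own pinyin."""
        self.committed.append(word)
        self.remaining = self.remaining[syllables:]

    def long_press(self, word: str) -> None:
        """Long press: commit the word and ignore all unconsumed pinyin."""
        self.committed.append(word)
        self.remaining = []

    @property
    def done(self) -> bool:
        return not self.remaining


session = CorrectionSession("fang'yi'cuo'shi".split("'"))
session.long_press("epidemic prevention")
print(session.committed, session.done)
```

Here the extra syllables "cuo'shi" only helped to bring the wanted word forward among the candidates; the long press discards them instead of asking for a further selection.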

The flowchart of the above embodiment is shown in fig. 14, and specifically includes the following steps:

Step 1401: the voice input by the user (usually a sentence) is recognized, converted into text, and displayed directly on the screen.

Step 1402: determine whether the text on the screen contains wrong words; if there is no error, go to step 1403; if there is an error, go to step 1404.

Step 1403: the next sentence is input by voice.

Step 1404: the target position, i.e. one of the wrong words, can be located by moving the cursor. When the cursor is moved to the wrong word, a correction mode is activated in which the wrong text can be deleted either forward or backward at the target position (i.e., adjacent to the target position).

Step 1405: when the word to be input is a low-order word (typically 1st or 2nd order) that is not frequently used, the user may, in order to raise the probability that the word is offered, speak a higher-order word containing it (the more commonly used, the better) instead of speaking the low-order word directly. The language model gives more weight to more common combinations and ranks them near the front of the candidate words.

Step 1406: the optimized voice engine converts the voice input by the user into corresponding pinyin.

Step 1407: for the correctly recognized pinyin, the WFST model converts it into a number of multi-order words; for example, "epidemic prevention measure" is a second-order word consisting of "epidemic prevention" and "measure", while "put one" and "epidemic prevention" are single-order words. Typically, higher-order words always appear in front of lower-order words, and words of the same order are ranked mainly by WFST model probability. In the embodiment of the present application, according to a preset rule, the sub-order words contained in the higher-order candidate words are arranged in front of the other candidate words of the same order as those sub-order words. In this way, because the higher-order word has a high probability of being ranked near the front, its sub-order words get the chance to appear earlier among the candidate words. Of course, a sub-order word is only moved ahead of the other words of its own order; it is still placed after the higher-order words.

Step 1408: determine whether the sub-order words among the candidate words include the word the user wants; if yes, go to step 1410; if not, the user may either modify the pinyin, i.e., perform step 1409, or speak another common higher-order word containing the word to be input, i.e., perform step 1405 again.

Step 1409: the user may modify the pinyin by means of a keypad.

Step 1410: since the user has found the desired word among the candidate words, the extra pinyin has served its purpose (helping to surface the word the user needs) even though it has not been consumed. The user can therefore put the target candidate word directly on the screen by long-pressing it; the unconsumed pinyin is ignored, and the input of the correct word is completed.

Step 1411: the user can continue to confirm whether the text of the target area in the information editing interface is correct or not, and check whether other error fragments exist in the sentence. If yes, go to step 1404; if not, then step 1412 is performed.

Step 1412: when the user confirms that the sentence is corrected, the correction completion confirmation control k1 can be pressed down, at this time, the cursor can automatically move to the tail part of the text in the target area, and the next sentence is ready to be input, so that the positioning operation of moving the cursor can be avoided.

The information correction method provided in this embodiment is suitable for correcting shorter words that are not very common (usually 1st- or 2nd-order words, and occasionally 3rd-order words). For uncommon words, in extreme cases the user would otherwise have to look for the required word character by character and level by level using the pinyin of single characters, which is very laborious. The method spares the user from searching far down the candidate word list for the desired word, and improves input efficiency.

The speech recognition engine of the correction mode is optimized with a large amount of toned pinyin and the corresponding speech data. The engine can be simpler than that of the normal mode, yet because it is optimized for converting speech into pinyin, the accuracy of the pinyin conversion is raised to a high level. It is therefore also possible to attach the corresponding user-input voice information to the corrected text after the voice recognition. On the one hand, this lets the recipient of the information be more confident of the accuracy of the text; on the other hand, the text and voice together form a good parallel corpus that can be used for personalization to improve the accuracy of the user's speech engine.

In the embodiment of the application, the pinyin of the word or the phrase to be input is obtained through voice input, and the Chinese characters corresponding to the pinyin are selected by adopting a multi-level candidate and high-order word auxiliary candidate mode, so that the operation is simple and convenient.

It should be noted that the information correction method provided in the embodiment of the present application may be executed by an information correction device, or by a control module in the information correction device for executing the information correction method. In the embodiment of the present application, the information correction device is described by taking the case in which the information correction device executes the information correction method as an example.

As shown in fig. 15, the embodiment of the present application further provides an information correction apparatus 1500, including:

a display module 1501 for displaying first text information generated according to the conversion of the first voice information in a target area in the information editing interface;

a processing module 1502, configured to display, according to the received first input, a corresponding candidate word in a candidate information area in the information editing interface;

a target determining module 1503, configured to obtain first indication information according to an operation of a user in the candidate information area, where the first indication information is used to indicate a target candidate word in the candidate words;

and the information correction module 1504 is configured to correct the first text information in the target area according to the target candidate word.

Optionally, the processing module 1502 includes:

The first processing unit is used for obtaining a first target pinyin according to the received second voice information;

and the second processing unit is used for displaying the candidate words corresponding to the second voice information in the candidate information area according to the first target pinyin.

Optionally, the processing module 1502 includes:

the third processing unit is used for displaying candidate words corresponding to the second voice information in the candidate information area according to the received second voice information and a preset rule;

wherein, the preset rule comprises: when the candidate word is a phrase, displaying a higher-order word in the candidate words before a lower-order word, and displaying a sub-order word of the higher-order word before the other words of the same order as that sub-order word; the number of words contained in a higher-order word is larger than the number contained in a lower-order word, two same-order words contain the same number of words, and the sub-order words of a higher-order word are words contained in that higher-order word;

the second voice information is voice information corresponding to a higher-order word containing the target candidate word;

the first indication information is obtained according to long-press operation.

Optionally, the processing module 1502 includes:

The fourth processing unit is used for determining a second target pinyin according to the second text information when the second text information in the target area is selected;

a fifth processing unit, configured to display, according to the second target pinyin, a candidate word corresponding to the second target pinyin in the candidate information area;

wherein the first text information in the target area includes the second text information.

Optionally, the information correction apparatus 1500 further includes:

the control display module is used for displaying a deletion control and a correction completion confirmation control on the candidate information area, wherein the deletion control comprises at least one of a forward deletion control and a backward deletion control;

the forward deleting control is used for receiving an instruction of deleting characters at the left adjacent position of the first target position in the information editing interface;

the backward deleting control is used for receiving an instruction of deleting characters at the right adjacent position of the first target position in the information editing interface;

the correction completion confirmation control is used for receiving an instruction of moving a cursor to the sentence end of the text information in the target area.

Optionally, the information correction module 1504 includes:

the first correction unit is used for deleting the text information corresponding to the second indication information in the target area through the deletion control;

the second correction unit is used for determining a second target position according to the second indication information and displaying target candidate words at the second target position;

the second indication information is obtained according to the operation of a user in the target area.
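The behaviour of the deletion controls, the display of the target candidate word at the target position, and the correction completion confirmation control can be sketched with a simple text buffer. The `EditBuffer` class, its method names, and the sample string are hypothetical; the sketch assumes the cursor is an index between characters and that each press deletes a single character.

```python
from dataclasses import dataclass


@dataclass
class EditBuffer:
    """Hypothetical text buffer; `cursor` is an index between characters."""
    text: str
    cursor: int

    def delete_forward(self) -> None:
        """Forward deletion control: delete the character left of the cursor."""
        if self.cursor > 0:
            self.text = self.text[:self.cursor - 1] + self.text[self.cursor:]
            self.cursor -= 1

    def delete_backward(self) -> None:
        """Backward deletion control: delete the character right of the cursor."""
        if self.cursor < len(self.text):
            self.text = self.text[:self.cursor] + self.text[self.cursor + 1:]

    def insert(self, word: str) -> None:
        """Display the target candidate word at the cursor position."""
        self.text = self.text[:self.cursor] + word + self.text[self.cursor:]
        self.cursor += len(word)

    def finish(self) -> None:
        """Correction completion confirmation: move the cursor to the end."""
        self.cursor = len(self.text)


buf = EditBuffer(text="abXYcd", cursor=3)  # cursor sits between "X" and "Y"
buf.delete_forward()    # removes "X", the left-adjacent character
buf.delete_backward()   # removes "Y", the right-adjacent character
buf.insert("ZZ")        # the corrected word is displayed at the target position
buf.finish()
print(buf.text, buf.cursor)
```

Offering both deletion directions means the cursor can be placed anywhere inside the wrong segment, rather than exactly at its end, before the segment is removed.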

The information correction device in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a cell phone, tablet computer, notebook computer, palm computer, vehicle-mounted electronic device, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA), etc., and the non-mobile electronic device may be a server, network attached storage (NAS), personal computer (PC), television (TV), teller machine, or self-service machine, etc.; the embodiments of the present application are not specifically limited in this respect.

The information correction device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.

The information correction device provided in the embodiment of the present application can implement each process implemented by the method embodiments of fig. 1 to 14, and in order to avoid repetition, a detailed description is omitted here.

Optionally, as shown in fig. 16, the embodiment of the present application further provides an electronic device 1600, which includes a processor 1601, a memory 1602, and a program or an instruction stored in the memory 1602 and capable of being executed on the processor 1601, where the program or the instruction implements each process of the above-mentioned information correction method embodiment when executed by the processor 1601, and the process can achieve the same technical effect, and for avoiding repetition, a detailed description is omitted herein.

The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device described above.

Fig. 17 is a schematic hardware structure of an electronic device implementing an embodiment of the present application.

The electronic device 1700 includes, but is not limited to: radio frequency unit 1701, network module 1702, audio output unit 1703, input unit 1704, sensor 1705, display unit 1706, user input unit 1707, interface unit 1708, memory 1709, and processor 1710.

Those skilled in the art will appreciate that the electronic device 1700 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 1710 via a power management system so as to perform functions such as managing charge, discharge, and power consumption via the power management system. The electronic device structure shown in fig. 17 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than those shown in the drawings, may combine some components, or may have a different arrangement of components, which will not be described in detail herein.

The processor 1710 is configured to display first text information generated by converting the first voice information in a target area in the information editing interface; displaying corresponding candidate words in a candidate information area in the information editing interface according to the received first input; according to the operation of a user in the candidate information area, first indication information is obtained, and the first indication information is used for indicating target candidate words in the candidate words; and correcting the first text information in the target area according to the target candidate word.

Optionally, the processor 1710, when displaying a corresponding candidate word in the candidate information area in the information editing interface according to the received first input, is further configured to: obtaining a first target pinyin according to the received second voice information; and displaying the candidate words corresponding to the second voice information in the candidate information area according to the first target pinyin.

Optionally, the processor 1710, when displaying a corresponding candidate word in the candidate information area in the information editing interface according to the received first input, is further configured to: according to the received second voice information and a preset rule, displaying candidate words corresponding to the second voice information in the candidate information area;

Optionally, the processor 1710, when displaying a corresponding candidate word in the candidate information area in the information editing interface according to the received first input, is further configured to: determining a second target pinyin according to the second text information under the condition that the second text information in the target area is selected;

displaying candidate words corresponding to the second target pinyin in the candidate information area according to the second target pinyin;

Optionally, before correcting the first text information in the target area according to the target candidate word, the processor 1710 is further configured to: displaying a deletion control and a correction completion confirmation control on the candidate information area, wherein the deletion control comprises at least one of a forward deletion control and a backward deletion control;

Optionally, when the processor 1710 corrects the first text information in the target area according to the target candidate word, the processor is further configured to: deleting the text information corresponding to the second indication information in the target area through the deleting control; determining a second target position according to the second indication information, and displaying target candidate words at the second target position; the second indication information is obtained according to the operation of a user in the target area.

It should be appreciated that in embodiments of the present application, the input unit 1704 may include a graphics processor (Graphics Processing Unit, GPU) 17041 and a microphone 17042, with the graphics processor 17041 processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 1706 may include a display panel 17061, and the display panel 17061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1707 includes a touch panel 17071 and other input devices 17072. The touch panel 17071 is also referred to as a touch screen. The touch panel 17071 can include two parts, a touch detection device and a touch controller. Other input devices 17072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein. The memory 1709 may be used to store software programs as well as various data including, but not limited to, application programs and an operating system. The processor 1710 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 1710.

The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored, and when the program or the instruction is executed by a processor, the processes of the embodiment of the information correction method are implemented, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.

The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled with the processor, and the processor is used for running a program or an instruction, so as to implement each process of the above information correction method embodiment, and achieve the same technical effect, so that repetition is avoided, and no redundant description is provided here.

It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or the like.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), including several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method described in the embodiments of the present application.

The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific embodiments described above, which are merely illustrative and not restrictive. Those of ordinary skill in the art may devise many other forms without departing from the spirit of the present application and the scope of protection of the claims, and all such forms fall within the protection of the present application.

Claims

1. An information correction method, comprising:

correcting the first text information in the target area according to the target candidate word;

wherein the displaying the corresponding candidate word in the candidate information area in the information editing interface according to the received first input comprises:

according to the received second voice information and a preset rule, displaying candidate words corresponding to the second voice information in the candidate information area;

wherein the preset rule comprises: ranking the candidate words according to the order of the candidate words, and displaying the candidate words corresponding to the second voice information in the candidate information area according to the ranking result, which comprises: when the candidate words are phrases, displaying a higher-order word among the candidate words before a lower-order word, and displaying a lower-order word of the higher-order word before a same-order word of the lower-order word; wherein the number of words contained in the higher-order word is greater than the number of words contained in the lower-order word, two same-order words contain the same number of words, and the lower-order words of the higher-order word are words contained in the higher-order word;

wherein the first indication information is obtained according to a long-press operation.
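As a purely illustrative aid, and not as part of the claim language, the display order described by the preset rule in claim 1 may be sketched in Python. The sketch assumes that the "order" of a candidate is its character count and that a lower-order word of a higher-order word is one that appears as a substring of a longer candidate. The function name `rank_candidates` and the sample words are hypothetical.

```python
from typing import List


def rank_candidates(candidates: List[str]) -> List[str]:
    """Order candidates for display (illustrative assumptions only).

    1. Longer candidates (higher order) come before shorter ones.
    2. Among candidates of equal length, those contained in some longer
       candidate come before those that are not.
    Ties keep their input order because Python's sort is stable.
    """

    def is_contained_in_longer(word: str) -> bool:
        # True if `word` is a substring of any strictly longer candidate.
        return any(
            len(other) > len(word) and word in other for other in candidates
        )

    return sorted(
        candidates,
        key=lambda w: (-len(w), 0 if is_contained_in_longer(w) else 1),
    )


# Hypothetical sample: candidates produced for one spoken correction.
ordered = rank_candidates(["种", "众果", "中", "中国人", "中国"])
# ordered == ["中国人", "中国", "众果", "中", "种"]
```

Under these assumptions, "中国" precedes "众果" because it is contained in the three-character candidate "中国人", even though both have the same length.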

2. The information correction method according to claim 1, wherein the displaying the corresponding candidate word in the candidate information area in the information editing interface according to the received first input includes:

obtaining a first target pinyin according to the received second voice information;

and displaying the candidate words corresponding to the second voice information in the candidate information area according to the first target pinyin.
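For illustration only, the two steps of claim 2 may be sketched as a lookup from pinyin to candidate words. The sketch assumes that the second voice information has already been converted into a list of pinyin syllables (the speech recognition step itself is not modeled), and it uses a small hypothetical lexicon. The names `LEXICON` and `candidates_for_pinyin`, and the choice to look up every contiguous run of syllables with the longest runs first, are assumptions rather than details taken from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: pinyin syllables -> words with that reading.
LEXICON: Dict[Tuple[str, ...], List[str]] = {
    ("zhong", "guo"): ["中国"],
    ("zhong",): ["中", "种", "重"],
    ("guo",): ["国", "果", "过"],
}


def candidates_for_pinyin(
    syllables: List[str],
    lexicon: Dict[Tuple[str, ...], List[str]] = LEXICON,
) -> List[str]:
    """Collect lexicon words for every contiguous run of syllables.

    Longer runs are looked up first, so multi-character words appear
    before single characters. Duplicates are dropped.
    """
    found: List[str] = []
    n = len(syllables)
    for length in range(n, 0, -1):
        for start in range(0, n - length + 1):
            key = tuple(syllables[start:start + length])
            for word in lexicon.get(key, []):
                if word not in found:
                    found.append(word)
    return found


# Assumed first target pinyin for a spoken "zhong guo".
words = candidates_for_pinyin(["zhong", "guo"])
# words == ["中国", "中", "种", "重", "国", "果", "过"]
```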

3. The information correction method according to claim 1, wherein the displaying the corresponding candidate word in the candidate information area in the information editing interface according to the received first input includes:

determining a second target pinyin according to the second text information under the condition that the second text information in the target area is selected;

4. The information correction method according to claim 1, wherein before correcting the first text information in the target area according to the target candidate word, the information correction method further comprises:

displaying a deletion control and a correction completion confirmation control on the candidate information area, wherein the deletion control comprises at least one of a forward deletion control and a backward deletion control;

5. The information correction method according to claim 4, wherein the correcting the first text information in the target area according to the target candidate word includes:

deleting the text information corresponding to the second indication information in the target area through the deleting control;

determining a second target position according to the second indication information, and displaying the target candidate word at the second target position;
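For illustration only, the deletion and insertion steps recited above may be sketched as plain string operations. The sketch assumes that the text in the target area is a Python string, that the second indication information can be reduced to a cursor index, and that the two deletion controls remove characters before or after that index. Which of these corresponds to the "forward" and which to the "backward" deletion control is not fixed here, and all function names are hypothetical.

```python
from typing import Tuple


def delete_before_cursor(text: str, cursor: int, count: int = 1) -> Tuple[str, int]:
    """Delete up to `count` characters before the cursor.

    Returns the new text and the new cursor position.
    """
    start = max(0, cursor - count)
    return text[:start] + text[cursor:], start


def delete_after_cursor(text: str, cursor: int, count: int = 1) -> Tuple[str, int]:
    """Delete up to `count` characters after the cursor.

    The cursor position does not change.
    """
    end = min(len(text), cursor + count)
    return text[:cursor] + text[end:], cursor


def insert_candidate(text: str, position: int, candidate: str) -> str:
    """Insert the target candidate word at the given position."""
    return text[:position] + candidate + text[position:]


# Hypothetical correction: the misrecognized "种果" is removed and the
# target candidate word "中国" is inserted at the resulting position.
text, cursor = delete_before_cursor("我爱种果", cursor=4, count=2)
corrected = insert_candidate(text, cursor, "中国")
# corrected == "我爱中国"
```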

6. An information correction device, comprising:

the information correction module is used for correcting the first text information in the target area according to the target candidate word;

the processing module comprises:

wherein the first indication information is obtained according to a long-press operation.

7. The information correction device according to claim 6, wherein the processing module includes:

8. The information correction device according to claim 6, wherein the processing module includes:

9. The information correction device according to claim 6, characterized in that the information correction device further comprises:

10. The information correction device according to claim 9, wherein the information correction module includes:

11. An electronic device, comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the information correction method according to any one of claims 1 to 5.

12. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the information correction method according to any one of claims 1 to 5.