CN109085932B - Candidate entry adjustment method, device, equipment and readable storage medium - Google Patents
- Publication number
- CN109085932B (application number CN201810940932.2A)
- Authority
- CN
- China
- Prior art keywords
- word
- target
- candidate
- result
- words
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
- G06F3/0236—Character input methods using selection techniques to select from displayed items
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
The application discloses a candidate entry adjustment method, device, equipment and readable storage medium. The method receives a selection request for a target candidate entry among the displayed candidate entries, then obtains and displays a splitting result of the target candidate entry. In response to a correctness classification operation on the displayed splitting result, it determines which words in the target candidate entry are classified as correct and which as wrong, and then obtains and displays a re-decoding result for the coding segments of the input coding information that correspond to the words classified as wrong. Because the target candidate entry is split, the user does not need to search again for the words that are already correct; because only the coding segments corresponding to the wrong words are re-decoded, the user only needs to select the correct entry from the re-decoding result. This greatly shortens the user's search time and improves input efficiency.
Description
Technical Field
The present disclosure relates to the field of input method technologies, and in particular, to a candidate entry adjustment method, device, apparatus, and readable storage medium.
Background
With the rapid development and adoption of handheld smart terminals, input method software has become indispensable for chatting, web searching and similar tasks on such terminals. As the entry point for interaction between people and electronic devices, an input method is expected by users to be natural, convenient and efficient.
In existing input methods, after the coding information input by a user is received, full-match and half-match candidate entries are decoded and displayed. When none of the full-match candidate entries hits the user's expectation, the user has to compose the desired result by searching for and selecting several half-match candidate entries. FIGS. 1a-1d illustrate the candidate entry lookup process of an existing input method. Take the pinyin string "ceshijiguimo" input on a T9 keyboard as an example, where the user expects to get "test set scale". As FIG. 1a shows, the candidate entries given by the input method fail to hit the user's expectation. The user has to search for and select half-match results in turn (test -> set -> scale) and splice them into the desired result. Each half-match result has to be found by the user. Taking the search for the word "set" in step (3) as an example, dozens of preceding entries have to be skipped in sequence, and the expected word "set" is found only on page 4, so the whole input process is extremely inefficient.
Therefore, in existing input methods, when no full-match candidate entry hits the user's expectation, the user has to search the half-match candidate entries one by one to piece together the desired result, which makes the whole input process inefficient.
Disclosure of Invention
In view of the above, the present application provides a candidate entry adjustment method, device, apparatus and readable storage medium, which are used for solving the problem of low input efficiency of the existing input method.
In order to achieve the above object, the following solutions have been proposed:
A candidate entry adjustment method, comprising:
receiving a selection request for a target candidate entry among displayed candidate entries corresponding to input coding information;
obtaining and displaying a splitting result of the target candidate entry, wherein the splitting result comprises at least two word strings, and each word string comprises at least one word;
in response to a correctness classification operation on the displayed splitting result, determining the words classified as correct and the words classified as wrong in the target candidate entry;
and obtaining and displaying a re-decoding result of the coding segment of the input coding information that corresponds to the words classified as wrong.
Preferably, the process of obtaining the splitting result of the target candidate entry includes:
Acquiring all possible word strings composing the target candidate entry;
for each possible word string, querying a dictionary, determining the possible word strings existing in the dictionary as a split result of the target candidate entry.
Preferably, the process of obtaining the splitting result of the target candidate entry further includes:
acquiring word frequency and mutual information of each possible word string;
and deleting possible word strings with word frequencies lower than the set word frequency threshold or with mutual information lower than the set mutual information threshold.
Preferably, the process of obtaining the splitting result of the target candidate entry further includes:
determining the sorting order of the splitting results of the target candidate entry according to a first sorting criterion and a second sorting criterion, wherein the first sorting criterion comprises that word strings are ordered in the same sequence as their first words appear in the target candidate entry; the second sorting criterion comprises that word strings beginning with the same first word of the target candidate entry are ordered by the number of words they contain, from most to fewest;
then, a process of displaying the splitting result of the target candidate entry includes:
and displaying the split results which are ordered according to the determined ordering order.
Preferably, the determining, in response to the correctness classification operation of the displayed splitting result, the words classified as correct and the words classified as incorrect in the target candidate term includes:
responding to a first editing operation on a first target word string in the displayed splitting result, wherein the first editing operation indicates that a wrong word is selected;
determining the words classified as wrong in the target candidate entry according to the first target word string;
after determining that all first editing operations on the splitting result have been performed, determining the words in the target candidate entry other than the wrong words as the words classified as correct.
Preferably, the determining, in response to the correctness classification operation of the displayed splitting result, the words classified as correct and the words classified as incorrect in the target candidate term includes:
responding to a second editing operation of a second target word string in the displayed splitting result, wherein the second editing operation represents selection of a correct word;
determining words classified as correct in the target candidate vocabulary entry according to the second target vocabulary string;
after determining that all second editing operations are performed on the split result, determining the words except the correct word in the target candidate entry as the words classified as errors.
Preferably, the method further comprises:
and hiding the second target word string and other word strings containing words in the second target word string in the displayed splitting result.
Preferably, the method further comprises:
determining an order in which the objects perform the second editing operation;
and hiding the second target word string and the word strings positioned in front of the second target word string in the splitting result when responding to the second editing operation on the second target word string according to the sequence of the second editing operation executed by the object.
Preferably, obtaining a re-decoding result of the encoded segment corresponding to the word classified as the error in the input encoded information includes:
re-decoding the coding segments corresponding to the words classified as errors in the input coding information according to the words classified as correct in the target candidate vocabulary entries to obtain a plurality of re-decoding candidate vocabulary entries containing matching probabilities;
and sequencing the re-decoding candidate entries at least according to the sequence of the matching probability from high to low.
Preferably, the sorting the re-decoding candidate entries at least in order of the matching probability from high to low includes:
sorting the re-decoding candidate entries in descending order of matching probability, and applying a ranking penalty to the re-decoding candidate entries that contain the words classified as wrong.
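The ranking with penalty described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `rank_redecoded`, the multiplicative form of the penalty, and the toy probabilities are all assumptions introduced here; the source only states that candidates containing words classified as wrong are penalized in the ranking.

```python
def rank_redecoded(candidates, wrong_words, penalty=0.5):
    """Order re-decoding candidates by matching probability, high to low.

    candidates:  list of (entry, matching_probability) pairs
    wrong_words: entries the user has already classified as wrong
    penalty:     hypothetical multiplicative factor applied to the score of
                 any candidate that repeats a wrong word (an assumption)
    """
    def score(item):
        entry, probability = item
        # Demote a candidate the user has already rejected.
        return probability * penalty if entry in wrong_words else probability

    return [entry for entry, _ in sorted(candidates, key=score, reverse=True)]


# Toy data: two readings of the coding segment "ji", glossed in English.
toy_candidates = [("ji(machine)", 0.6), ("ji(set)", 0.4)]
ranked = rank_redecoded(toy_candidates, wrong_words={"ji(machine)"})
print(ranked)
```

With the penalty applied, the rejected reading drops below the alternative even though its raw matching probability is higher, which is the effect the ranking penalty is meant to achieve.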
A candidate term adjustment device, comprising:
the selecting request receiving unit is used for receiving a selecting request of a target candidate vocabulary entry in the displayed candidate vocabulary entries corresponding to the input coding information;
the splitting result obtaining unit is used for obtaining a splitting result of the target candidate entry, wherein the splitting result comprises at least two word strings, and each word string comprises at least one word;
the splitting result display unit is used for displaying the splitting result of the target candidate entry;
the word classification unit is used for responding to the correctness classification operation of the displayed splitting result and determining words classified as correct and words classified as wrong in the target candidate vocabulary entry;
a re-decoding result obtaining unit, configured to obtain a re-decoding result of the encoded segment corresponding to the word classified as the error in the input encoded information;
and the re-decoding result display unit is used for displaying the re-decoding result.
Preferably, the splitting result obtaining unit includes:
a possible word string acquisition unit configured to acquire all possible word strings constituting the target candidate term;
and the dictionary query unit is used for querying a dictionary for each possible word string, and determining the possible word strings existing in the dictionary as the splitting result of the target candidate vocabulary entry.
Preferably, the splitting result obtaining unit further includes:
the word frequency and mutual information acquisition unit is used for acquiring the word frequency and mutual information of each possible word string;
and the word string deleting unit is used for deleting possible word strings with word frequencies lower than the set word frequency threshold value or with mutual information lower than the set mutual information threshold value.
Preferably, the splitting result obtaining unit further includes:
the splitting result ordering unit is used for determining the sorting order of the splitting results of the target candidate entry according to a first sorting criterion and a second sorting criterion, wherein the first sorting criterion comprises that word strings are ordered in the same sequence as their first words appear in the target candidate entry; the second sorting criterion comprises that word strings beginning with the same first word of the target candidate entry are ordered by the number of words they contain, from most to fewest;
the split result display unit is specifically configured to display the split results ordered according to the determined ordering order.
Preferably, the word classifying unit includes:
the first editing operation response unit is used for responding to a first editing operation on a first target word string in the displayed splitting result, wherein the first editing operation indicates that a wrong word is selected;
A first wrong word determining unit, configured to determine, according to the first target word string, a word classified as wrong in the target candidate term;
and the first correct word determining unit is used for determining the words except the wrong word in the target candidate entry as the correct words after determining that all first editing operations are performed on the split result.
Preferably, the word classifying unit includes:
a second editing operation response unit, configured to respond to a second editing operation of a second target word string in the displayed splitting result, where the second editing operation indicates that a correct word is selected;
a second correct word determining unit, configured to determine, according to the second target word string, a word classified as correct in the target candidate term;
and a second wrong word determining unit configured to determine, after determining that all second editing operations have been performed on the split result, words other than the correct word in the target candidate term as words classified as wrong.
Preferably, the device further comprises:
the first word string hiding unit is used for hiding the second target word string and other word strings containing words in the second target word string in the displayed splitting result.
Preferably, the device further comprises:
an execution order determining unit configured to determine an order in which the objects execute the second editing operation;
and the second word string hiding unit is used for hiding the second target word string and the word string positioned in front of the second target word string in the splitting result when responding to the second editing operation on the second target word string according to the sequence of the second editing operation executed by the object.
Preferably, the re-decoding result acquisition unit includes:
the re-decoding unit is used for re-decoding the coding segments corresponding to the words classified as errors in the input coding information according to the words classified as correct in the target candidate vocabulary entries, so as to obtain a plurality of re-decoding candidate vocabulary entries containing matching probabilities;
and the re-decoding ordering unit is used for ordering the re-decoding candidate entries at least according to the order of the matching probability from high to low.
Preferably, the re-decoding ordering unit includes:
and the re-decoding ordering penalty unit is used for ordering each re-decoding candidate entry according to the order of the matching probability from high to low, and ordering penalty is carried out on the re-decoding candidate entries containing the words classified as errors.
A candidate term adjustment device comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of the candidate term adjustment method as described above.
A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the candidate entry adjustment method as described above.
From the above technical solution, it can be seen that the candidate entry adjustment method provided in the embodiments of the present application is applicable to scenarios where no candidate entry hits the user's expectation. The method receives a selection request for a target candidate entry among the displayed candidate entries, where the target candidate entry may be the candidate entry that the user considers closest to the desired result. It then obtains and displays a splitting result of the target candidate entry, where the splitting result includes at least two word strings and each word string includes at least one word. In response to a correctness classification operation on the displayed splitting result, it determines the words classified as correct and the words classified as wrong in the target candidate entry, and then obtains and displays a re-decoding result of the coding segment of the input coding information that corresponds to the words classified as wrong. With this scheme, the user identifies the correct words within the target candidate entry, selects the desired entry from the displayed re-decoding result, and the selected entry together with the correct words forms the final output. Because the target candidate entry is split, the user does not need to search again for the words that are already correct; because only the coding segments corresponding to the wrong words are re-decoded, the user only needs to select the correct entry from the re-decoding result. This greatly shortens the user's search time and improves input efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIGS. 1a-1d illustrate a schematic diagram of a prior art input method candidate entry selection process;
FIG. 2 is a flowchart of a candidate entry adjustment method disclosed in an embodiment of the present application;
FIG. 3 illustrates the effect of one way of selecting a target candidate entry;
FIG. 4 illustrates the effect from the user selecting a target candidate entry to the display of its splitting result;
FIG. 5 illustrates a re-decoding result of a coded segment corresponding to a word classified as erroneous in input coding information;
FIG. 6 illustrates a user performing a first editing operation on the split result to select an erroneous word;
FIG. 7 illustrates a graphical representation of an interface presentation effect in response to a second editing operation by a user;
FIG. 8 is a schematic structural diagram of a candidate entry adjustment device according to an embodiment of the present application;
FIG. 9 is a block diagram of a hardware structure of a candidate entry adjustment device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The embodiment of the application discloses a candidate entry adjustment method that can greatly reduce the user's input cost and improve input efficiency in scenarios where no candidate entry hits the user's expectation. The method is described below with reference to FIG. 2 and includes:
Step S100, receiving a selection request for a target candidate entry among the displayed candidate entries corresponding to the input coding information.
Specifically, after the user inputs coding information, the input method system displays the candidate entries corresponding to it, usually several. When no candidate entry hits the user's expectation, typically some of the words in a candidate entry are what the user wants and some are not. The user can therefore select the candidate entry that contains the most desired words as the object of re-editing. In this step, the request to select that target candidate entry from the displayed candidate entries is received.
The selection request may be initiated in various ways, such as a click or a voice command. FIG. 3 illustrates one way of selecting a target candidate entry. Assume that the user wants to input "test set scale". After the coding information "ceshijiguimo" is input, the system gives the candidate entries shown in FIG. 3. The candidate entry "tester scale" is closest to the user's desired output, so the user can select the displayed candidate entry "tester scale" as the target candidate entry.
Step S110, obtaining and displaying the splitting result of the target candidate entry.
The splitting result comprises at least two word strings, and each word string comprises at least one word. In this step, the splitting result of the target candidate entry is obtained and displayed. FIG. 4 illustrates the effect from the user selecting the target candidate entry to the display of its splitting result.
Step S120, in response to a correctness classification operation on the displayed splitting result, determining the words classified as correct and the words classified as wrong in the target candidate entry.
Specifically, once the splitting result of the target candidate entry is displayed, the user can identify which of its words are correct and which are wrong. Here, a correct word is a word in the target candidate entry that matches the user's desired output, and a wrong word is a word in the target candidate entry that differs from the user's desired output. Still taking the splitting result illustrated in FIG. 4 as an example, since the user's desired output is "test set scale", the word strings "test", "scale", "rule" and "module" in the splitting result are all correct, and "machine" is wrong.
The user can classify the correctness of the split result through various types of correctness classification operations. For example, the correct word may be selected, with the remainder being the wrong word. Alternatively, the wrong word is selected, and the remainder is the correct word.
Step S130, obtaining and displaying a re-decoding result of the coding segment of the input coding information that corresponds to the words classified as wrong.
Specifically, once the words classified as correct and as wrong in the target candidate entry are determined, the correct words are kept as they are. The coding segment of the input coding information that corresponds to each wrong word is determined, the re-decoding result of that coding segment is obtained, and the re-decoding result is displayed so that the user can select the desired result from it. The selected result and the correct words together form the final result.
FIG. 5 illustrates the re-decoding result of the coding segment corresponding to a word classified as wrong. The word classified as wrong is "machine", and its corresponding coding segment in the input coding information is "ji". The re-decoding result of the coding segment "ji" is obtained and shown in FIG. 5. The user can quickly find the desired word "set" in it, which improves input efficiency.
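Step S130 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the input coding information has already been aligned with the words of the target candidate entry (one coding segment per word), and the names `redecode_wrong_segments`, `compose_result`, `toy_decoder` and the lookup table are hypothetical. Each word is written as its pinyin syllable, with an English gloss where the syllable is ambiguous.

```python
def redecode_wrong_segments(segments, wrong_indices, decoder):
    """Re-decode only the coding segments whose words were classified as wrong.

    segments:      coding segment for each word of the target candidate entry
    wrong_indices: positions of the words classified as wrong
    decoder:       callable mapping a coding segment to a list of candidates
    """
    return {index: decoder(segments[index]) for index in wrong_indices}


def compose_result(words, choices):
    """Keep the correct words and substitute the user's re-decoding choices."""
    return [choices.get(index, word) for index, word in enumerate(words)]


# Toy decoder standing in for the input method's real decoding engine.
TOY_TABLE = {"ji": ["ji(set)", "ji(machine)"]}

def toy_decoder(segment):
    return TOY_TABLE.get(segment, [])


words = ["ce", "shi", "ji(machine)", "gui", "mo"]   # target candidate entry
segments = ["ce", "shi", "ji", "gui", "mo"]         # aligned coding segments

options = redecode_wrong_segments(segments, wrong_indices=[2], decoder=toy_decoder)
chosen = {2: options[2][0]}                          # the user picks "ji(set)"
print(compose_result(words, chosen))
```

Only the segment "ji" is decoded again; the four words already classified as correct are carried over unchanged.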
According to the scheme provided by the embodiment of the application, the target candidate entry is split, so that the user does not need to search for the correct word again, the encoding segment corresponding to the wrong word is re-decoded, and the user only needs to select the correct entry from the re-decoding result, so that the user searching time is greatly shortened, and the input efficiency is improved.
Another embodiment of the present application describes the process of obtaining the splitting result of the target candidate entry in step S110.
For the splitting process of the target candidate term, there may be multiple splitting manners, such as splitting the target candidate term according to a single word or phrase form. Considering that the splitting result needs to ensure that the operation cost of the user is reduced as much as possible, the embodiment provides a splitting mode, which is as follows:
first, all possible word strings that make up the target candidate entry are obtained.
Specifically, the target candidate entry is traversed from its first word toward its end. Each traversed word is taken as the target word, and 0 to L words are selected in sequence after the target word, where L is the number of words located after the target word in the target candidate entry. The target word together with the selected words forms a possible word string.
Taking the target candidate entry "tester scale" as an example, assume that the currently traversed target word is "machine". The possible word strings formed are then "machine" alone, "machine" followed by the next word, and "machine" followed by the next two words ("machine scale"). The next word, "rule", is then taken as the target word, and the above process is repeated.
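The enumeration described above can be sketched as follows. This is a minimal illustration; the function name `possible_word_strings` and the use of pinyin syllables as stand-ins for the individual words of the entry are assumptions made for this example.

```python
def possible_word_strings(words):
    """Enumerate every possible word string of a target candidate entry.

    For each target word, 0..L following words are appended, where L is the
    number of words after the target word. Each result is returned together
    with the position of its first word in the entry.
    """
    strings = []
    for start in range(len(words)):
        remaining = len(words) - start - 1          # L in the description
        for extra in range(remaining + 1):          # select 0..L more words
            strings.append((start, tuple(words[start:start + extra + 1])))
    return strings


# Pinyin syllables stand in for the five words of the target candidate entry.
entry = ["ce", "shi", "ji", "gui", "mo"]
for start, string in possible_word_strings(entry):
    if start == 2:                                  # target word "ji"
        print(string)
```

For an entry of n words this yields n(n+1)/2 possible word strings, so a five-word entry produces 15 of them before any filtering.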
Further, a dictionary is queried for each possible word string, and the possible word strings existing in the dictionary are determined as a split result of the target candidate entry.
It will be appreciated that some of the possible word strings obtained above may not be in accordance with the grammatical specification, i.e., are not canonical words or word strings. In this step, by querying the dictionary, the possible word strings existing in the dictionary are determined as the split result of the target candidate entry.
Take the possible word string "machine scale" obtained from "tester scale" as an example. It is clearly not a canonical word string and does not exist in the dictionary, so it cannot be part of the splitting result.
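The dictionary lookup can be sketched as follows. This is a minimal illustration: the function name `split_by_dictionary` and the contents of `TOY_DICTIONARY` are assumptions made for this example (a real input method would query its own lexicon), and pinyin syllables again stand in for the words of the entry.

```python
def split_by_dictionary(words, dictionary):
    """Keep only the possible word strings that exist in the dictionary.

    Returns (start_position, word_string) pairs in enumeration order.
    """
    kept = []
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            candidate = tuple(words[start:end])
            if candidate in dictionary:             # canonical word string
                kept.append((start, candidate))
    return kept


# Hypothetical dictionary: single words plus a few multi-word strings.
TOY_DICTIONARY = {
    ("ce",), ("shi",), ("ji",), ("gui",), ("mo",),
    ("ce", "shi"), ("ce", "shi", "ji"), ("gui", "mo"),
}

entry = ["ce", "shi", "ji", "gui", "mo"]
for start, string in split_by_dictionary(entry, TOY_DICTIONARY):
    print(start, " ".join(string))
```

Strings such as ("ji", "gui", "mo") are generated by the enumeration but never reach the splitting result, because the lookup fails.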
Finally, the splitting result for "tester scale" is shown in Table 1 below.
TABLE 1
On the basis of the above, certain words or word strings can be filtered out in consideration of low usage of the words or word strings or low coherence between words contained in the word strings. Specifically:
The word frequency and mutual information of each possible word string are acquired. Further, possible word strings whose word frequencies are below a set word frequency threshold, or whose mutual information is below a set mutual information threshold, are deleted.
The term frequency reflects the use frequency of terms and can be obtained according to statistics in advance.
Mutual information can likewise be obtained from statistics collected in advance. The entry is divided into two substrings x and y, and the mutual information is calculated as follows:
$$I(x, y) = \log \frac{P(x, y)}{P(x)\,P(y)}$$
where P(x, y) is the joint distribution of the substrings x and y, P(x) is the marginal distribution of x, and P(y) is the marginal distribution of y.
By filtering word strings using word frequency and mutual information, the "tester" in table 1 above can be filtered out, and the split result of "tester scale" is shown in table 2 below.
TABLE 2
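The filtering by word frequency and mutual information can be sketched as follows. This is a minimal illustration under stated assumptions: mutual information is taken to be the standard pointwise form log(P(x, y) / (P(x) P(y))), single-word strings (which cannot be divided into two substrings) are assumed to skip the mutual information check, and the names, thresholds and statistics below are invented purely for the example.

```python
import math


def mutual_information(p_xy, p_x, p_y):
    """Pointwise mutual information of two substrings x and y."""
    return math.log(p_xy / (p_x * p_y))


def filter_word_strings(stats, min_frequency, min_mutual_information):
    """Delete word strings whose frequency or mutual information is too low.

    stats maps a word string to (frequency, mutual_information); the mutual
    information is None for single-word strings, which cannot be divided
    into two substrings (an assumption of this sketch).
    """
    kept = []
    for string, (frequency, mi) in stats.items():
        if frequency < min_frequency:
            continue
        if mi is not None and mi < min_mutual_information:
            continue
        kept.append(string)
    return kept


# Invented statistics for illustration only.
toy_stats = {
    "ce shi": (900, 4.2),
    "ce shi ji": (3, 0.4),      # rare and weakly coherent, so it is removed
    "gui mo": (700, 3.9),
    "ji": (500, None),
}
print(filter_word_strings(toy_stats, min_frequency=10, min_mutual_information=1.0))
```

A string is removed as soon as either threshold is violated, which matches the "or" condition in the description.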
Further, for the splitting result of the target candidate entry, word strings included in the splitting result may be sequenced, and the sequenced splitting result may be displayed.
In this embodiment, two sorting criteria are set: a first sorting criterion and a second sorting criterion. The first sorting criterion is that word strings are ordered in the same sequence as their first words appear in the target candidate entry. The second sorting criterion is that word strings beginning with the same first word of the target candidate entry are ordered by the number of words they contain, from most to fewest.
Based on this, a ranking order of the split results of the target candidate entries is determined according to a first ranking criterion and a second ranking criterion.
Through the sorting mode, the user can determine correct words and incorrect words in the words more conveniently, and the implementation process of the correctness classification operation is more convenient.
For the "tester scale" split results illustrated in table 2 above, the results are shown in fig. 4, ordered according to the two principles described above.
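The two sorting criteria can be sketched as a single sort key. This is a minimal illustration: the function name `sort_split_result` is hypothetical, and the direction of the second criterion (longer word strings first) is this sketch's reading of the description, exposed as a parameter so the opposite direction can be tried.

```python
def sort_split_result(strings, longest_first=True):
    """Order (start_position, word_string) pairs for display.

    First criterion:  follow the order in which the first words of the word
                      strings appear in the target candidate entry.
    Second criterion: among strings sharing the same first word, order by
                      the number of words they contain.
    """
    def key(item):
        start, string = item
        length = len(string)
        return (start, -length if longest_first else length)

    return sorted(strings, key=key)


# Splitting result in arbitrary order; pinyin syllables stand in for words.
unsorted_result = [
    (3, ("gui",)), (0, ("ce",)), (4, ("mo",)), (0, ("ce", "shi")),
    (2, ("ji",)), (3, ("gui", "mo")), (1, ("shi",)),
]
for start, string in sort_split_result(unsorted_result):
    print(" ".join(string))
```

Because the start position is the leading component of the key, the display follows the reading order of the entry, which is what lets the user scan it from front to back when classifying words.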
Another embodiment of the present application describes step S120, that is, the process of determining, in response to the correctness classification operation on the displayed splitting result, the words classified as correct and the words classified as wrong in the target candidate entry.
Specifically, this embodiment may define a first editing operation by which the user selects wrong words, or a second editing operation by which the user selects correct words. Accordingly, there are two kinds of correctness classification operation, described separately below:
First kind:
S1, responding to a first editing operation on a first target word string in the displayed splitting result, where the first editing operation represents selecting an erroneous word.
The first editing operation may be a gesture such as a click or double click, or may take the form of a voice instruction.
The object operated on by the first editing operation is the first target word string.
S2, determining, according to the first target word string, the words classified as errors in the target candidate entry.
Specifically, if the first target word string consists of a single word, that word in the target candidate entry may be determined to be an erroneous word.
S3, after determining that all first editing operations on the splitting result have been performed, determining the words in the target candidate entry other than the erroneous words to be the words classified as correct.
Specifically, the user may perform first editing operations on the splitting result one after another, each corresponding to one first target word string. Once it is determined that all first editing operations have been performed, the user can be taken to have selected every erroneous word, so the remaining words in the target candidate entry may be classified as correct.
Referring to fig. 6, a process in which a user performs a first editing operation on a split result to select an erroneous word is illustrated.
For the split result illustrated in fig. 4, where "machine" is the wrong word, the user may perform a first editing operation on it, such as a click, to mark it as the wrong word. All words in the target candidate entry other than "machine" are then correct words.
Second kind:
S1, responding to a second editing operation on a second target word string in the displayed splitting result, where the second editing operation represents selecting a correct word.
The second editing operation may be a gesture such as a click or double click, or may take the form of a voice instruction.
The object operated on by the second editing operation is the second target word string.
S2, determining, according to the second target word string, the words classified as correct in the target candidate entry.
Specifically, the words of the second target word string contained in the target candidate entry may be taken directly as correct words.
S3, after determining that all second editing operations on the splitting result have been performed, determining the words in the target candidate entry other than the correct words to be the words classified as errors.
Specifically, the user may perform second editing operations on the splitting result one after another, each corresponding to one second target word string. Once it is determined that all second editing operations have been performed, the user can be taken to have selected every correct word, so the remaining words in the target candidate entry may be classified as errors.
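Both kinds of correctness classification reduce to the same set operation: the selected word strings cover one class, and every remaining position of the entry falls into the other. The sketch below is illustrative only; it assumes word strings are hypothetical `(start, end)` spans over the entry, and `covered` and `classify` are names introduced here.

```python
from typing import List, Set, Tuple

# Assumption: a word string is a (start, end) span over the entry, end exclusive.
Span = Tuple[int, int]


def covered(spans: List[Span]) -> Set[int]:
    """Positions of the entry covered by at least one selected word string."""
    return {i for start, end in spans for i in range(start, end)}


def classify(entry: str, selected: List[Span],
             selected_are_wrong: bool) -> Tuple[Set[int], Set[int]]:
    """Return (correct_positions, wrong_positions).

    selected_are_wrong=True models the first editing operation (the user
    picks wrong words); False models the second (the user picks correct words).
    """
    picked = covered(selected)
    rest = set(range(len(entry))) - picked
    return (rest, picked) if selected_are_wrong else (picked, rest)


if __name__ == "__main__":
    # First kind: the user marks position 2 as wrong.
    print(classify("ABCD", [(2, 3)], selected_are_wrong=True))
    # Second kind: the user marks "AB" and "D" as correct.
    print(classify("ABCD", [(0, 2), (3, 4)], selected_are_wrong=False))
```

In this example both calls yield the same classification, which shows that the two kinds of operation are complementary ways of expressing the same user intent.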
On this basis, the splitting result obtained from the target candidate entry may contain different word strings that share the same word, such as "test" and "test" in fig. 4, where the latter two splitting results are both contained in the first. When the user performs the second editing operation on "test", it can be determined that "test" and "test" also belong to the correct words. To spare the user the cost of performing the second editing operation again on "test" and "test", this embodiment of the present application may further add the following processing:
and hiding the second target word string and other word strings containing words in the second target word string in the displayed splitting result.
That is, when the user performs the second editing operation on the split result once and determines the second target word string corresponding to the second editing operation, the second target word string and other word strings including the words in the second target word string in the split result may be hidden.
Referring to fig. 7, an interface presentation effect diagram in response to a second editing operation by a user is illustrated. After the user performs the second editing operation on "test", both "test" and "test" are hidden.
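The hiding rule can be sketched as an overlap test. This is an illustration under stated assumptions: word strings are hypothetical `(start, end)` spans over the entry, and "containing words in the second target word string" is read as "sharing at least one position with it". The name `hide_overlapping` is introduced here.

```python
from typing import List, Tuple

# Assumption: a word string is a (start, end) span over the entry, end exclusive.
Span = Tuple[int, int]


def hide_overlapping(displayed: List[Span], target: Span) -> List[Span]:
    """Return the spans that stay visible after `target` is marked correct.

    A span is hidden when it shares at least one position with the target;
    the target overlaps itself, so it is hidden as well.
    """
    t_start, t_end = target
    return [(start, end) for start, end in displayed
            if end <= t_start or start >= t_end]


if __name__ == "__main__":
    displayed = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
    # Marking the two-word string (0, 2) as correct also hides (0, 1) and (1, 2).
    print(hide_overlapping(displayed, (0, 2)))
```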
Still further, the user will usually perform second editing operations on the splitting result in a certain order, for example selecting second target word strings from left to right, or from right to left. When the user selects a second target word string and performs the second editing operation on it, that word string has already been handled and need not remain on display, so it can be hidden. Moreover, given the order in which the user selects, the word strings that precede the second target word string in that order are ones the user regards as wrong, so they can be hidden as well. The specific process may include:
determining the order in which the object performs second editing operations; and, when responding to a second editing operation on a second target word string, hiding, according to that order, the second target word string and the word strings located before it in the splitting result.
Suppose the splitting result includes A, B, C, D, and E, and it is determined that the object performs second editing operations, that is, selects correct words, from left to right.
The object first selects C, so C is confirmed as a second target word string. According to the scheme of the present application, C and the word strings before it are hidden, leaving only D and E on display.
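The A to E example can be reproduced with a short sketch. The function name `hide_up_to` and the boolean `left_to_right` flag are assumptions made for illustration; the source only says that an operation order is determined.

```python
from typing import List


def hide_up_to(displayed: List[str], target: str,
               left_to_right: bool = True) -> List[str]:
    """Return the word strings still shown after `target` is selected.

    The target and everything before it in the user's operation order are
    hidden; only the strings the user has not yet reached remain.
    """
    index = displayed.index(target)
    if left_to_right:
        return displayed[index + 1:]
    return displayed[:index]


if __name__ == "__main__":
    print(hide_up_to(["A", "B", "C", "D", "E"], "C"))
    print(hide_up_to(["A", "B", "C", "D", "E"], "C", left_to_right=False))
```

With a right-to-left order, "before" means to the right of the target, so the strings that remain are the ones to its left.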
The process of determining the order in which the object performs second editing operations may include:
the object inputs its own operation order, either by entering the order directly or by choosing from a list of operation orders provided by the present application, and the order chosen by the object is taken as the order in which the object performs second editing operations.
Or,
the object's habitual operation order is determined from the order in which the object has historically performed second editing operations, and is taken as the order in which the object performs second editing operations.
Yet another embodiment of the present application describes the process, in the foregoing step S130, of obtaining the re-decoding result of the encoded segment that corresponds, in the input encoded information, to the words classified as errors.
In one optional manner, the encoded segment corresponding to the words classified as errors in the input encoded information may be re-decoded directly to obtain the re-decoding result.
In another optional manner, the re-decoding of the encoded segment corresponding to the words classified as errors may take into account the influence of the words classified as correct in the target candidate entry, namely:
re-decoding, according to the words classified as correct in the target candidate entry, the encoded segment corresponding to the words classified as errors in the input encoded information, to obtain a plurality of re-decoding candidate entries each with a matching probability; and sorting the re-decoding candidate entries at least in descending order of matching probability.
It will be appreciated that if the wrong word is at the end of the target candidate entry, the text preceding it (its "above" context) is already determined; if the wrong word is at the head of the target candidate entry, the text following it (its "below" context) is already determined; and if the wrong word is in the middle of the target candidate entry, both are determined. When the encoded segment corresponding to the wrong word is re-decoded, the quantity of interest is:
$$P(A\,Y_i\,B \mid \text{above} = A,\ \text{below} = B)$$
where $Y_i$ is the i-th re-decoding candidate entry for the encoded segment corresponding to the wrong word, A is the context preceding the wrong word in the target candidate entry, and B is the context following the wrong word in the target candidate entry.
According to this formula, decoding yields each re-decoding candidate entry together with its matching probability P.
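The source does not state how the matching probability is computed. As a rough illustration only, the sketch below assumes a toy bigram language model and scores each candidate Y by the probability of the sequence A + Y + B; the bigram table, the `floor` value for unseen pairs, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

Bigram = Dict[Tuple[str, str], float]


def sequence_prob(seq: str, bigram: Bigram, floor: float = 1e-6) -> float:
    """Product of bigram probabilities over adjacent words of `seq`.

    Unseen pairs get the small `floor` probability (an assumption).
    """
    prob = 1.0
    for left, right in zip(seq, seq[1:]):
        prob *= bigram.get((left, right), floor)
    return prob


def score_candidates(above: str, below: str, candidates: List[str],
                     bigram: Bigram) -> List[Tuple[str, float]]:
    """Score each candidate Y by the probability of above + Y + below."""
    return [(y, sequence_prob(above + y + below, bigram)) for y in candidates]


if __name__ == "__main__":
    toy_bigram = {("a", "x"): 0.5, ("x", "b"): 0.4,
                  ("a", "y"): 0.1, ("y", "b"): 0.2}
    for candidate, prob in score_candidates("a", "b", ["x", "y"], toy_bigram):
        print(candidate, prob)
```

The point of the sketch is only that the fixed context A and B enters the score of every candidate, so a candidate that fits its neighbours is ranked higher than one that merely matches the encoded segment.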
In this embodiment, the re-decoding candidate entries may be sorted directly in descending order of matching probability.
Further, some re-decoding candidate entries may contain a word that has already been determined to be wrong. Such candidates are very unlikely to be the entry the user wants, so when sorting in descending order of matching probability, a sorting penalty may be applied to re-decoding candidate entries that contain words classified as errors.
As illustrated in fig. 5, for the encoded segment "ji" corresponding to the wrong word, the target candidate entry contains the wrong word "machine" for that segment. The re-decoding candidate "machine", which contains that wrong word, is therefore given a sorting penalty during re-decoding and ends up at a later position in the final ordering.
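The sorting penalty can be sketched as a multiplicative discount applied before sorting. The form of the penalty is not specified in the source; the factor `penalty=0.1` and the function name `rank_candidates` are assumptions chosen for illustration.

```python
from typing import List, Set, Tuple


def rank_candidates(candidates: List[Tuple[str, float]],
                    wrong_words: Set[str],
                    penalty: float = 0.1) -> List[str]:
    """Sort candidates by matching probability, highest first.

    Candidates containing any word already classified as wrong have their
    probability multiplied by `penalty` (an assumed form of the penalty).
    """
    def score(item: Tuple[str, float]) -> float:
        text, prob = item
        contains_wrong = any(word in text for word in wrong_words)
        return prob * penalty if contains_wrong else prob

    return [text for text, _ in sorted(candidates, key=score, reverse=True)]


if __name__ == "__main__":
    # Placeholder candidates; "ab" contains the word "a" that the user rejected.
    candidates = [("ab", 0.6), ("cd", 0.3), ("ef", 0.1)]
    print(rank_candidates(candidates, wrong_words={"a"}))
```

A multiplicative discount keeps a penalised candidate in the list, at a later position, rather than removing it, which matches the description of the candidate ending up at a later position.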
The candidate entry adjustment device provided in the embodiments of the present application is described below. The device described below and the candidate entry adjustment method described above may be read with reference to each other.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a candidate entry adjustment device according to an embodiment of the present application.
As shown in fig. 8, the apparatus may include:
a selection request receiving unit 11, configured to receive a selection request for a target candidate entry among the displayed candidate entries corresponding to input coding information;
a splitting result obtaining unit 12, configured to obtain a splitting result of the target candidate entry, where the splitting result includes at least two word strings, and each word string includes at least one word;
a splitting result display unit 13, configured to display the splitting result of the target candidate entry;
a word classification unit 14, configured to determine the words classified as correct and the words classified as incorrect in the target candidate entry in response to a correctness classification operation on the displayed splitting result;
a re-decoding result obtaining unit 15, configured to obtain a re-decoding result of the encoded segment corresponding to the words classified as errors in the input coding information;
and a re-decoding result display unit 16, configured to display the re-decoding result.
Optionally, the splitting result obtaining unit may include:
a possible word string acquisition unit configured to acquire all possible word strings constituting the target candidate term;
and the dictionary query unit is used for querying a dictionary for each possible word string, and determining the possible word strings existing in the dictionary as the splitting result of the target candidate vocabulary entry.
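The two sub-units above can be sketched as follows. The sketch assumes that "all possible word strings" means every contiguous substring of the entry, and that the dictionary is a plain set of strings; both are assumptions, as are the names `possible_spans` and `split_entry`.

```python
from typing import List, Set, Tuple

# Assumption: a possible word string is any contiguous run of words,
# represented as a (start, end) span with `end` exclusive.
Span = Tuple[int, int]


def possible_spans(entry: str) -> List[Span]:
    """Enumerate every contiguous substring of the entry as a span."""
    length = len(entry)
    return [(start, end)
            for start in range(length)
            for end in range(start + 1, length + 1)]


def split_entry(entry: str, dictionary: Set[str]) -> List[str]:
    """Keep only the possible word strings that exist in the dictionary."""
    return [entry[start:end]
            for start, end in possible_spans(entry)
            if entry[start:end] in dictionary]


if __name__ == "__main__":
    toy_dictionary = {"a", "ab", "bc", "c", "xyz"}  # hypothetical dictionary
    print(split_entry("abc", toy_dictionary))
```

An entry of n words has n(n+1)/2 contiguous substrings, so the enumeration stays small for the short entries typical of input-method candidates.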
Optionally, the splitting result obtaining unit may further include:
the word frequency and mutual information acquisition unit is used for acquiring the word frequency and mutual information of each possible word string;
and the word string deleting unit is used for deleting possible word strings with word frequencies lower than the set word frequency threshold value or with mutual information lower than the set mutual information threshold value.
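The deletion rule can be sketched as a threshold filter. The sketch assumes precomputed frequency and mutual-information tables, uses pointwise mutual information as one possible definition, and checks single-word strings on frequency only because they cannot be divided into two substrings; all of these are assumptions, and the names are hypothetical.

```python
import math
from typing import Dict, List


def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Pointwise mutual information, one common (assumed) definition."""
    return math.log(p_xy / (p_x * p_y))


def filter_word_strings(strings: List[str],
                        freq: Dict[str, int],
                        mutual_info: Dict[str, float],
                        min_freq: int,
                        min_mi: float) -> List[str]:
    """Drop strings whose frequency or mutual information is below threshold."""
    kept = []
    for string in strings:
        if freq.get(string, 0) < min_freq:
            continue
        # Assumption: single-word strings have no x/y split, so skip the MI check.
        if len(string) >= 2 and mutual_info.get(string, float("-inf")) < min_mi:
            continue
        kept.append(string)
    return kept


if __name__ == "__main__":
    toy_freq = {"a": 50, "ab": 40, "bc": 2}
    toy_mi = {"ab": 3.0, "bc": 5.0}
    print(filter_word_strings(["a", "ab", "bc"], toy_freq, toy_mi,
                              min_freq=10, min_mi=1.0))
```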
Optionally, the splitting result obtaining unit may further include:
the splitting result ordering unit is used for determining the display order of the splitting result of the target candidate entry according to a first sorting criterion and a second sorting criterion, wherein the first sorting criterion is that word strings are ordered by the position, within the target candidate entry, of their first word; and the second sorting criterion is that word strings whose first word is the same word in the target candidate entry are ordered by the number of words they contain, from fewest to most;
the splitting result display unit is specifically configured to display the splitting result sorted in the determined order.
Optionally, the word classifying unit may include:
the first editing operation response unit is used for responding to a first editing operation on a first target word string in the displayed splitting result, wherein the first editing operation represents selecting an erroneous word;
a first wrong word determining unit, configured to determine, according to the first target word string, the words classified as errors in the target candidate entry;
and the first correct word determining unit is used for determining, after determining that all first editing operations have been performed on the splitting result, the words in the target candidate entry other than the erroneous words to be the words classified as correct.
Optionally, the word classifying unit may include:
a second editing operation response unit, configured to respond to a second editing operation of a second target word string in the displayed splitting result, where the second editing operation indicates that a correct word is selected;
a second correct word determining unit, configured to determine, according to the second target word string, a word classified as correct in the target candidate term;
and a second wrong word determining unit configured to determine, after determining that all second editing operations have been performed on the split result, words other than the correct word in the target candidate term as words classified as wrong.
Further, the apparatus of the present application may further include:
the first word string hiding unit is used for hiding the second target word string and other word strings containing words in the second target word string in the displayed splitting result.
Further, the apparatus of the present application may further include:
an execution order determining unit configured to determine an order in which the objects execute the second editing operation;
and the second word string hiding unit is used for hiding the second target word string and the word string positioned in front of the second target word string in the splitting result when responding to the second editing operation on the second target word string according to the sequence of the second editing operation executed by the object.
Optionally, the re-decoding result obtaining unit may include:
the re-decoding unit is used for re-decoding the coding segments corresponding to the words classified as errors in the input coding information according to the words classified as correct in the target candidate vocabulary entries, so as to obtain a plurality of re-decoding candidate vocabulary entries containing matching probabilities;
and the re-decoding ordering unit is used for ordering the re-decoding candidate entries at least according to the order of the matching probability from high to low.
Optionally, the re-decoding ordering unit may include:
and the re-decoding ordering penalty unit is used for sorting the re-decoding candidate entries in descending order of matching probability while applying a sorting penalty to re-decoding candidate entries that contain words classified as errors.
The candidate entry adjustment device provided by the embodiments of the present application can be applied to candidate entry adjustment equipment, such as a mobile terminal, a PC terminal, a cloud platform, a server cluster and the like. Optionally, fig. 9 shows a block diagram of the hardware structure of the candidate entry adjustment equipment. Referring to fig. 9, the hardware structure may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;
in the embodiment of the application, there is at least one each of the processor 1, the communication interface 2, the memory 3 and the communication bus 4, and the processor 1, the communication interface 2 and the memory 3 communicate with one another through the communication bus 4;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the invention, etc.;
the memory 3 may comprise high-speed RAM, and may further comprise non-volatile memory, such as at least one magnetic disk memory;
wherein the memory stores a program that the processor can invoke, the program being configured to:
receiving a selection request of target candidate entries in displayed candidate entries corresponding to input coding information;
obtaining and displaying a splitting result of the target candidate entry, wherein the splitting result comprises at least two word strings, and each word string comprises at least one word;
responding to the correctness classification operation of the displayed splitting result, and determining words classified as correct and words classified as incorrect in the target candidate entry;
and obtaining and displaying the re-decoding result of the coding segment corresponding to the word classified as the error in the input coding information.
Optionally, for the refined and extended functions of the program, refer to the description above.
The embodiment of the application also provides a readable storage medium, which can store a program suitable for being executed by a processor, the program being configured to:
receiving a selection request of target candidate entries in displayed candidate entries corresponding to input coding information;
obtaining and displaying a splitting result of the target candidate entry, wherein the splitting result comprises at least two word strings, and each word string comprises at least one word;
Responding to the correctness classification operation of the displayed splitting result, and determining words classified as correct and words classified as incorrect in the target candidate entry;
and obtaining and displaying the re-decoding result of the coding segment corresponding to the word classified as the error in the input coding information.
Optionally, for the refined and extended functions of the program, refer to the description above.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the present specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for parts that are identical or similar, the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (13)
1. A method for candidate term adjustment, comprising:
receiving a selection request of target candidate entries in displayed candidate entries corresponding to input coding information; the coded information is correct coded information;
obtaining and displaying a splitting result of the target candidate entry, wherein the splitting result comprises at least two word strings, and each word string comprises at least one word;
Responding to the correctness classification operation of the displayed splitting result, and determining words classified as correct and words classified as incorrect in the target candidate entry;
obtaining and displaying the re-decoding result of the coding segment corresponding to the word classified as the error in the input coding information;
the process for obtaining the splitting result of the target candidate entry comprises the following steps:
acquiring all possible word strings composing the target candidate entry;
for each possible word string, querying a dictionary, determining the possible word strings existing in the dictionary as a split result of the target candidate entry.
2. The method of claim 1, wherein the determining, in response to the correctness classification operation of the presented split result, words in the target candidate term that are classified as correct and words that are classified as incorrect comprises:
responding to a first editing operation on a first target word string in the displayed splitting result, wherein the first editing operation represents selecting an erroneous word;
determining words classified as errors in the target candidate vocabulary entries according to the first target vocabulary strings;
after determining that all first editing operations have been performed on the splitting result, determining the words in the target candidate entry other than the erroneous words to be the words classified as correct.
3. The method of claim 1, wherein the determining, in response to the correctness classification operation of the presented split result, words in the target candidate term that are classified as correct and words that are classified as incorrect comprises:
responding to a second editing operation of a second target word string in the displayed splitting result, wherein the second editing operation represents selection of a correct word;
determining words classified as correct in the target candidate vocabulary entry according to the second target vocabulary string;
after determining that all second editing operations are performed on the split result, determining the words except the correct word in the target candidate entry as the words classified as errors.
4. A method according to claim 3, further comprising:
and hiding the second target word string and other word strings containing words in the second target word string in the displayed splitting result.
5. The method according to claim 3 or 4, further comprising:
determining an order in which the objects perform the second editing operation;
and hiding the second target word string and the word strings positioned in front of the second target word string in the splitting result when responding to the second editing operation on the second target word string according to the sequence of the second editing operation executed by the object.
6. The method of claim 1, wherein obtaining the re-decoding result of the encoded segment corresponding to the word classified as the error in the input encoded information comprises:
re-decoding the coding segments corresponding to the words classified as errors in the input coding information according to the words classified as correct in the target candidate vocabulary entries to obtain a plurality of re-decoding candidate vocabulary entries containing matching probabilities;
and sorting the re-decoding candidate entries at least in descending order of matching probability.
7. The method of claim 6, wherein the sorting of the re-decoding candidate entries at least in descending order of matching probability comprises:
sorting the re-decoding candidate entries in descending order of matching probability while applying a sorting penalty to re-decoding candidate entries that contain words classified as errors.
8. A candidate term adjustment device, comprising:
the selecting request receiving unit is used for receiving a selecting request of a target candidate vocabulary entry in the displayed candidate vocabulary entries corresponding to the input coding information; the coded information is correct coded information;
A splitting result obtaining unit, configured to obtain a splitting result of the target candidate entry, where the splitting result includes at least two word strings, and each word string includes at least one word;
the splitting result display unit is used for displaying the splitting result of the target candidate entry;
the word classification unit is used for responding to the correctness classification operation of the displayed splitting result and determining words classified as correct and words classified as wrong in the target candidate vocabulary entry;
a re-decoding result obtaining unit, configured to obtain a re-decoding result of the encoded segment corresponding to the word classified as the error in the input encoded information;
a re-decoding result display unit, configured to display the re-decoding result;
the word classification unit includes:
a second editing operation response unit, configured to respond to a second editing operation of a second target word string in the displayed splitting result, where the second editing operation indicates that a correct word is selected;
a second correct word determining unit, configured to determine, according to the second target word string, a word classified as correct in the target candidate term;
and a second wrong word determining unit configured to determine, after determining that all second editing operations have been performed on the split result, words other than the correct word in the target candidate term as words classified as wrong.
9. The apparatus as recited in claim 8, further comprising:
the first word string hiding unit is used for hiding the second target word string and other word strings containing words in the second target word string in the displayed splitting result.
10. The apparatus according to claim 8 or 9, further comprising:
an execution order determining unit configured to determine an order in which the objects execute the second editing operation;
and the second word string hiding unit is used for hiding the second target word string and the word string positioned in front of the second target word string in the splitting result when responding to the second editing operation on the second target word string according to the sequence of the second editing operation executed by the object.
11. The apparatus of claim 8, wherein the re-decoding result acquisition unit comprises:
the re-decoding unit is used for re-decoding the coding segments corresponding to the words classified as errors in the input coding information according to the words classified as correct in the target candidate vocabulary entries, so as to obtain a plurality of re-decoding candidate vocabulary entries containing matching probabilities;
and the re-decoding ordering unit is used for ordering the re-decoding candidate entries at least according to the order of the matching probability from high to low.
12. A candidate term adjustment device, comprising a memory and a processor;
the memory is used for storing programs;
the processor being configured to execute the program to implement the steps of the candidate term adjustment method as defined in any one of claims 1 to 7.
13. A readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the candidate term adjustment method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810940932.2A CN109085932B (en) | 2018-08-17 | 2018-08-17 | Candidate entry adjustment method, device, equipment and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109085932A (en) | 2018-12-25 |
CN109085932B (en) | 2023-07-25 |
Family
ID=64793794
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810940932.2A Active CN109085932B (en) | 2018-08-17 | 2018-08-17 | Candidate entry adjustment method, device, equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109085932B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2003270985A1 (en) * | 2002-12-23 | 2004-07-08 | Canon Kabushiki Kaisha | Method for Presenting Hierarchical Data |
CN101556596A (en) * | 2007-08-31 | 2009-10-14 | 北京搜狗科技发展有限公司 | Input method system and intelligent word making method |
CN101694608A (en) * | 2008-12-04 | 2010-04-14 | 北京搜狗科技发展有限公司 | Input method and system of same |
CN101697109A (en) * | 2009-10-26 | 2010-04-21 | 北京搜狗科技发展有限公司 | Method and system for acquiring candidates of input method |
CN102103416A (en) * | 2009-12-17 | 2011-06-22 | 新浪网技术(中国)有限公司 | Chinese character input method and device |
CN103942190A (en) * | 2014-04-16 | 2014-07-23 | 安徽科大讯飞信息科技股份有限公司 | Text word-segmentation method and system |
CN106202153A (en) * | 2016-06-21 | 2016-12-07 | 广州智索信息科技有限公司 | The spelling error correction method of a kind of ES search engine and system |
CN107390892A (en) * | 2016-05-17 | 2017-11-24 | 富士通株式会社 | The method and apparatus for generating user-oriented dictionary |
CN108259132A (en) * | 2018-01-03 | 2018-07-06 | 重庆邮电大学 | One kind is based on adaptive multiple decoded two-way cooperation cut-in method |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1256650C (en) * | 2004-01-05 | 2006-05-17 | 郑方 | Chinese whole sentence input method |
CN101699385B (en) * | 2008-12-31 | 2015-11-25 | 北京搜狗科技发展有限公司 | Input method interface display method and device |
CN102541282B (en) * | 2010-12-25 | 2016-04-27 | 上海量明科技发展有限公司 | Method, apparatus and system for re-editing vocabulary by moving icons |
CN103345308B (en) * | 2013-06-08 | 2016-02-24 | 百度在线网络技术(北京)有限公司 | Method and apparatus for input correction |
CN104915264A (en) * | 2015-05-29 | 2015-09-16 | 北京搜狗科技发展有限公司 | Input error-correction method and device |
CN105930340A (en) * | 2016-03-31 | 2016-09-07 | 北京奇虎科技有限公司 | Entry error correction method and apparatus based on encyclopedic entries |
CN106250364A (en) * | 2016-07-20 | 2016-12-21 | 科大讯飞股份有限公司 | Text modification method and device |
Non-Patent Citations (4)
Title |
---|
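An evaluation of statistical approaches to text categorization; Yang Yiming; Information Retrieval; Vol. 1; 69-90 * |
Incremental decoding for phrase-based statistical machine translation; Sankaran B. et al.; Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR; 216-223 * |
Research on interactive machine translation technology based on positive feedback; Xu Ping; China Master's Theses Full-text Database, Information Science and Technology (No. 05); I138-552 * |
Research and development of an intelligent digital Chinese input method; Wang Xiaobo; China Master's Theses Full-text Database, Information Science and Technology (No. 03); I138-6333 * |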
Also Published As
Publication number | Publication date |
---|---|
CN109085932A (en) | 2018-12-25 |
Similar Documents
Publication | Title |
---|---|
JP5597255B2 (en) | Ranking search results based on word weights |
JP6517352B2 (en) | Method and system for providing translation information |
US9449076B2 (en) | Phrase generation using part(s) of a suggested phrase |
CN106708799B (en) | Text error correction method and device and terminal |
US12118336B2 (en) | Method for presenting associated conflict block and device |
TW200945079A (en) | Search results ranking using editing distance and document information |
CN111428120B (en) | Information determination method and device, electronic equipment and storage medium |
AU2017268599A1 (en) | Method, device, server and storage medium of searching a group based on social network |
JP6185379B2 (en) | RECOMMENDATION DEVICE AND RECOMMENDATION METHOD |
CN109948122A (en) | Error correction method and device for input text and electronic equipment |
CN110377684A (en) | Spatial key personalized semantic query method based on user feedback |
CN106886294B (en) | Input method error correction method and device |
CN103984754A (en) | Search system and search method |
CN109085932B (en) | Candidate entry adjustment method, device, equipment and readable storage medium |
CN114297143A (en) | File searching method, file displaying device and mobile terminal |
CN111191087A (en) | Character matching method, terminal device and computer-readable storage medium |
CN109144290B (en) | Candidate entry adjustment method, device, equipment and readable storage medium |
CN109614542B (en) | Public number recommendation method, device, computer equipment and storage medium |
JP5971794B2 (en) | Patent search support device, patent search support method, and program |
CN106610733B (en) | Conventional input method determining method and device and input information determining method and device |
CN111261165A (en) | Station name identification method, device, equipment and storage medium |
JPWO2018203510A1 (en) | Question estimator |
CN103885669B (en) | Cloud candidate input method and mobile terminal |
CN111310419B (en) | Method and device for updating word rewriting candidate set |
CN117290325A (en) | Task sequence discovery method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||