CN1952928A - Computer system to constitute natural language base and automatic dialogue retrieve - Google Patents
- Publication number
- CN1952928A (application CN200510100419A)
- Authority
- CN
- China
- Prior art keywords
- ere
- knowledge
- sentence
- semantic
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This invention relates to a system, based on HNC natural language processing, for building a knowledge base and automatically retrieving answers. It extracts ERE knowledge to build an ERE knowledge base, and defines knowledge frames to build a framework knowledge base that describes frame knowledge. Both knowledge bases serve user queries posed in natural language, through analysis of the question sentence and extraction of its target ERE structure.
Description
Technical field
The present invention relates to a computer system, and in particular to a computer system for building a natural language knowledge base and performing automatic dialogue retrieval on it. Based on HNC natural language processing technology, the system builds the knowledge base by processing natural language, accepts queries posed as natural language questions, and provides answers automatically.
Background technology
At present, the tools people use to look up information, such as query software and search engines like Google, rely mainly on keyword matching, web link analysis and statistical analysis to find the content a user wants in a vast ocean of information. However, it is hard to define a search intention precisely with a simple combination of keywords, and keyword matching does not handle semantic factors such as the combination of word meanings or the semantic relations inside a sentence. As a result, people often have to spend a great deal of time manually reviewing a huge list of result pages to find the answer they want.
Query software that lets the user define the search intention in natural language, and has the computer directly return the desired answer in natural language, would therefore save people a great deal of time.
Existing natural language processing technology labels the linguistic properties of each part of a sentence. Such labeling of sentences and sentence groups still depends on the form of the natural language and retains the complexity of that form when expressing knowledge, so it is ill-suited to building an efficient, unified model of knowledge processing.
Summary of the invention
The objective of the invention is to establish an efficient, unified model of knowledge processing by providing a computer system that builds a natural language knowledge base.
The computer system for building a natural language knowledge base disclosed by the invention applies HNC natural language processing to label the sentences of the chapter texts it obtains, extracts ERE knowledge expressions from those sentence labels, and builds an ERE knowledge base. An ERE knowledge expression is a triple consisting of E1, E2 and R, where R is equivalent to a logical predicate expressing the semantic relation between E1 and E2. E1 and E2 can represent any semantic primitive, such as a sentence, a semantic chunk, a component of a semantic chunk, a combination of words, a word, or another ERE knowledge expression. E1 and E2 can each be a single semantic primitive or a combination of several semantic primitives.
The system disclosed by the invention also builds knowledge frame structures on top of the ERE knowledge base built from the sentence labels of the HNC natural language processing. It builds the framework knowledge base according to the framework definitions, solving for the specific ERE knowledge that the framework knowledge base requires. A knowledge frame structure takes a class of things as its description center and defines the attribute (Slot) structure of that class, the ERE semantic relations between each Slot and the description center and between different Slots, and the way each Slot is extracted from ERE knowledge. The framework knowledge base also defines, for each Slot of each knowledge frame, the features of the corresponding target ERE and the mapping from target ERE entries to the Slot.
The computer system for automatic dialogue retrieval over the natural language knowledge base disclosed by the invention performs the following steps. Step 1: analyze the interrogative sentence to obtain a target ERE structure that captures the requirements on the target answer. Step 2: analyze the probability with which the target concepts occur in each candidate chapter and how the sentence groups containing target concepts are distributed in it, and make a preliminary evaluation of how well the candidate chapter covers the target concepts. Step 3: find the semantic primitives scattered within the chapter that can be merged through reference, identical concepts, shared semantic chunks and similar mechanisms, and fuse the related EREs. Step 4: obtain an integrated score for the chapter by combining the similarity between the chapter and the target ERE structure with the degree to which each sentence group in the chapter answers the question. Step 5: return a list of answer chapters sorted by integrated score.
In the automatic dialogue retrieval system disclosed by the invention, the following steps are performed after step 1 above. Step 2: determine, by analyzing the concepts of the target ERE structure, the framework knowledge types that may contain the target answer. Step 3: match the ERE relations and concepts of the target ERE structure against the concepts of the knowledge entries in the framework knowledge base to find the entries that contain the target answer, and read the answer value from the designated Slot of the designated entry. Step 4: generate the answer sentence and return it to the user.
Compared with earlier similar technology, the invention has the following advantages. Because the system accepts queries posed in natural language, users can define their query intention conveniently and accurately. Because the system analyzes the question sentence, it can identify the semantic relations in the question and the requirements on the target answer. The system extracts ERE semantic expressions from the complex forms of language, which frees semantic processing from the complexity of linguistic form. The system discovers implication relations between meanings through deduction over ERE knowledge, which gives it good extensibility in discovering knowledge latent in language. By fusing the EREs within a chapter, the system grasps the chapter's overall expression of the target semantics, which increases the depth and breadth of semantic processing. The system finds the target answer by computing and matching the similarity between ERE combinations, so that the answer satisfies both semantic and knowledge requirements.
The computer system of the invention processes natural language at three levels: HNC natural language processing, ERE extraction, and construction of the framework knowledge base. Information is thereby progressively structured into knowledge, and natural language information that is hard for a computer to handle is progressively converted into a knowledge base that a computer can process flexibly. Because the ERE triple expressions adopted by the invention express the content of diverse linguistic forms in a unified and concise way, independent of the surface form of the language, they make it easier to design efficient knowledge processing systems around the ERE structure. Based on the unified ERE knowledge representation, the invention defines a series of production rules for deriving implications in various semantic situations, so that new semantic EREs are derived from existing combinations of semantic EREs, which deepens the computer's understanding of natural language semantics. The invention also proposes a technique for semantic fusion within a chapter: by finding the semantic equivalence between semantic primitives implied by language phenomena such as identical concepts, reference and shared semantic chunks, it merges the semantic EREs scattered across the chapter according to their equivalent semantic primitives, and can thus give accurate answers. The invention analyzes the user's interrogative sentence, extracts the target ERE, makes a preliminary evaluation of the candidate chapters, performs deduction and matching against the target ERE structure within the candidate chapters, and solves for the target ERE in the framework knowledge base, finally returning to the user, automatically, an answer that is consistent with both semantics and knowledge. The invention makes its preliminary evaluation of how likely each candidate chapter is to contain the target answer using methods such as counting the probability with which the target concepts occur in the chapter, which reduces the amount of computation.
Description of drawings
The present invention includes following accompanying drawing:
Fig. 1 is a schematic diagram of same-concept fusion within a chapter;
Fig. 2 is a flow chart of the comparison between polarization class EREs;
Fig. 3 is a flow chart of the comparison between sentence class EREs;
Fig. 4 is a flow chart of automatic question answering based on the framework knowledge base.
Specific implementation method
The invention is described in further detail below with reference to the drawings.
The technical scheme disclosed by the invention is a computer system, based on HNC natural language understanding, for building a knowledge base and performing automatic dialogue retrieval. Natural language is an unstructured form of information. Processing it with HNC technology yields label information for each sentence, such as its sentence group, sentence class, semantic chunks and semantic chunk structure. On the basis of this processing, the invention uses ERE expressions to express semantic knowledge: ERE knowledge is extracted from the results of HNC sentence category analysis (SCA), the ERE knowledge base is built from it, and the framework knowledge base is then defined and filled.
The ERE knowledge expression and the framework knowledge base are explained first. ERE (Entity-Relation-Entity) is a knowledge representation in triple form. R is equivalent to a logical predicate that expresses the semantic relation between E1 and E2; its meaning and computational properties are defined by the system. E1 and E2 can represent any semantic primitive, such as a sentence, a semantic chunk, a component of a semantic chunk, a combination of words, a word, or another ERE, so EREs can express semantics through nested combination.
E1 and E2 can each be a single semantic primitive or several semantic primitives combined through AND/OR relations. ERE types form a hierarchy with upward inheritance: if a subclass ERE does not declare a value for some attribute, it implicitly inherits the corresponding value from its parent ERE; if it explicitly declares its own value, that value overrides the parent's. An example ERE structure definition is given in the following table:
ERE name | ModifyERE (polarization relation) |
Parent ERE | RootERE |
E1 description | The polarizing term, which expresses a polarization of some attribute, such as color, shape, character or state. |
R description | The polarization relation. ModifyERE has subclasses that express polarization of a particular attribute of E2. |
E2 description | The polarized entity; it can be another ERE or a combination of EREs. |
Meaning of the whole ERE | ModifyERE expresses that the attribute characteristic represented by E1 polarizes E2. |
Structural features of the extraction source | ModifyERE is extracted from language structures with a polarization relation, that is, polarization relations characterized by the " " 141 concept. The polarizing word is a concept of type u, x, z, etc.; the polarized term is a concept of type p, w, g, etc. |
Extraction mapping rule | In the polarization relation of the semantic structure, the polarizing term maps to E1 and the polarized term (the core term) maps to E2. |
Related ERE derivation rules |
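The triple form, the nesting of EREs and the upward inheritance of attributes described above can be sketched in Python. This is only an illustration under stated assumptions: the class names `ERE` and `EREType`, the attribute keys `symmetric` and `weight`, and the example words are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional, Union

# A semantic primitive is a plain string (word, chunk label) or a nested ERE.
Primitive = Union[str, "ERE"]

@dataclass(frozen=True)
class ERE:
    e1: Optional[Primitive]   # None stands in for an empty (NULL) position
    r: str                    # relation name, acting as a logical predicate
    e2: Optional[Primitive]

@dataclass
class EREType:
    """Hypothetical type record: attributes resolve upward through `parent`."""
    name: str
    parent: Optional["EREType"] = None
    attrs: dict = field(default_factory=dict)

    def get(self, key):
        node = self
        while node is not None:
            if key in node.attrs:      # an explicit declaration overrides the parent
                return node.attrs[key]
            node = node.parent         # otherwise the value is inherited implicitly
        raise KeyError(key)

root = EREType("RootERE", attrs={"symmetric": False, "weight": 1.0})
modify = EREType("ModifyERE", parent=root, attrs={"weight": 0.5})

# Nested expression: the inner ERE serves as E2 of the outer one.
inner = ERE("red", "ModifyERE", "car")
outer = ERE("fast", "ModifyERE", inner)
```

Here `modify.get("weight")` returns the subclass's own value, while `modify.get("symmetric")` falls through to `RootERE`.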
Extracting ERE knowledge from the HNC sentence category analysis (SCA) results involves the following process: first obtain chapter texts from the web or another content source and label their sentences with HNC natural language processing; then extract ERE knowledge from the HNC sentence class results; then apply the ERE deduction rules, which run in a rule-based inference engine, to infer new implicit ERE knowledge or the same knowledge in a different form of expression.
Extracting ERE knowledge from the HNC sentence class results comprises:
Extraction of sentence class EREs:
Sentence classes are divided by compositional structure into basic sentence classes, mixed sentence classes and compound sentence classes.
For a basic sentence class, the ERE is extracted according to the system-defined correspondence between sentence classes and ERE relations.
For a mixed sentence class, take the two sentence class EREs that correspond to the semantic chunk structures of its two sentence classes.
Example: mixed sentence class XP01*211J=A+XP01+PBC
Extraction yields two sentence class EREs: <A X NULL>, <NULL P01 PBC>
For a compound sentence class, take the sentence class ERE that corresponds to the semantic chunk structure of each of its two sentence classes. If the two sentence classes share a semantic chunk, that chunk is also shared by the two sentence class EREs.
Example: compound sentence class (T2b+Y0)*1J
T2bJ=TA+T2b+TB2
Y0J=YB+Y0+YC
Extraction yields two sentence class EREs: <TA T2b TB2>, <YB Y0 YC>, where the second ERE shares the first semantic chunk of the first ERE.
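The compound sentence class example can be sketched as a small parser. This is a sketch under assumptions, not the patent's extraction rule set: it assumes each basic expression has exactly three chunks joined by `+`, and the helper names `parse_basic` and `extract_compound` and the `alias` map used to record chunk sharing are hypothetical.

```python
def parse_basic(expr: str):
    """Parse a basic sentence-class expression such as 'T2bJ=TA+T2b+TB2'
    into an (E1, R, E2) triple. Assumes exactly three '+'-joined chunks."""
    _, rhs = expr.split("=", 1)
    parts = rhs.split("+")
    if len(parts) != 3:
        raise ValueError(f"expected three chunks, got {parts}")
    e1, r, e2 = parts
    return (e1, r, e2)

def extract_compound(first: str, second: str, shared: bool = True):
    """Return the two sentence class EREs of a compound sentence class.
    When `shared` is true, the first chunk of the second ERE is recorded
    as an alias of the first chunk of the first ERE."""
    a = parse_basic(first)
    b = parse_basic(second)
    alias = {b[0]: a[0]} if shared else {}
    return a, b, alias

a, b, alias = extract_compound("T2bJ=TA+T2b+TB2", "Y0J=YB+Y0+YC")
```

For the example expressions, `a` and `b` are the two triples listed above and `alias` maps `YB` to `TA`.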
Extraction of main/auxiliary semantic chunk relation EREs:
According to the category of the auxiliary semantic chunk, the system defines several polarization class EREs, together with their subclasses, in which an auxiliary chunk (means, instrument, approach, condition, reference, or cause and effect) polarizes the main chunk. They are extracted from the corresponding SCA results.
Extraction of word concept collocation EREs:
Word concept collocation relations comprise polarization class EREs and logical combination class EREs.
For polarization class EREs, the system defines a number of subclasses, and polarization class EREs are extracted accordingly from the SCA results.
Logical combination class EREs cover AND/OR relations and the logical combination ERE knowledge expressions that correspond to other logic concepts (1 genus).
HNC class EREs:
The system defines several HNC class EREs and their subclasses: actor, object, content, reference, sentence slough-off and piece expansion. They are extracted from the SCA results.
ERE inference falls into the following classes:
Inference rules between sentence class EREs:
An inference rule between sentence class EREs defines how to derive another, implied piece of sentence class ERE knowledge from one piece of sentence class ERE knowledge. A rule of this kind applies to the semantic deduction of every sentence that belongs to the sentence class.
Inference rules for sentence class EREs whose verbs (LV genus) are grouped by derivation characteristics:
Some verbs share the same semantic implication characteristics. The system groups such verbs by these characteristics; the inference rules of a group are defined in the same way as the inference rules of sentence class EREs.
Deduction from sentence class EREs to polarization class EREs:
The main purpose of converting a sentence class ERE into a polarization class ERE is to allow sentence class EREs to be compared directly with polarization class EREs and word concepts, so that semantic similarity can be computed.
Just as a sentence can take the form of an element sentence slough-off, a sentence class ERE can be converted into a polarization class ERE (or ERE combination) with similar semantics.
Example sentence: Concorde has raised people's travelling speed.
Sentence class ERE: <A X B>
It can be converted into the polarization class ERE:
<<A actor-object B> polarization X>: Concorde's raising of people's travelling speed.
The actor-object ERE is a kind of HNC class ERE; it expresses that E1 and E2 are the actor and the object of the same V concept.
Building the ERE knowledge base consists of recording the extracted ERE entries in a database, where indexes can easily be built on the ERE class, the source chapter, and the concepts of E1, R and E2. Each ERE entry records references to the sentence parts that correspond to E1, R and E2, together with references to the source chapter and the sentence ID of the entry.
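The indexing described here can be illustrated with an in-memory store. This is a minimal sketch that stands in for the database the text mentions; the names `Entry` and `EREStore`, the field names, and the choice of an inverted index keyed by (field, value) are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    e1: str
    r: str
    e2: str
    ere_class: str      # e.g. "sentence" or "polarization"
    chapter_id: str     # reference to the source chapter
    sentence_id: int    # reference to the source sentence

class EREStore:
    """Hypothetical in-memory store with one index per (field, value) pair."""
    def __init__(self):
        self.entries = []
        self.index = defaultdict(set)   # (field, value) -> entry positions

    def add(self, entry: Entry) -> int:
        pos = len(self.entries)
        self.entries.append(entry)
        for key in ("e1", "r", "e2", "ere_class", "chapter_id"):
            self.index[(key, getattr(entry, key))].add(pos)
        return pos

    def find(self, **criteria):
        """Return the entries that match every given field=value criterion."""
        sets = [self.index.get((k, v), set()) for k, v in criteria.items()]
        hits = set.intersection(*sets) if sets else set(range(len(self.entries)))
        return [self.entries[i] for i in sorted(hits)]

store = EREStore()
store.add(Entry("A", "X", "B", "sentence", "doc1", 3))
store.add(Entry("A", "P01", "C", "sentence", "doc2", 1))
```

A lookup such as `store.find(e1="A", chapter_id="doc2")` intersects the two index sets instead of scanning every entry.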
First, the interrogative sentence is analyzed to obtain a target ERE structure that includes the requirements of the query center. This comprises the following steps: HNC sentence category analysis (SCA) of the interrogative sentence; analysis of its query center; and extraction of its ERE structure.
The SCA of the interrogative sentence uses the HNC SCA method, and its ERE structure is extracted in the same way that ERE knowledge is extracted from HNC SCA results.
Analysis of the query center of the interrogative sentence: interrogative sentences led by different interrogatives place different requirements on the target answer. To make the analysis easier, the system defines two concepts: the query center and the query center word. The query center word is the word that the interrogative introduces and polarizes. For example, in the question "What place did he go to?", the query center word is "place". The query center is the structure formed by the interrogative and the query center word; in the same question, the query center is "what place". By analyzing the query center and the query center word, the system obtains the concept and semantic structure required of the target answer, matches them against the corresponding structure of each candidate sentence, and uses the result as a key factor in computing the answer accuracy of the candidate sentence.
The analysis of interrogative sentences is summarized in the following table:
Interrogative | Frequency | Typical structure | Query center and target answer |
what | 913 | [statement expression J] [is, j111] what [category, country, time, content, etc.]? | The interrogative "what" and the query center word [category, country, time, content, etc.] serve as a JK of the sentence and express a query about the semantic chunk they replace. Target answer: must meet the concept similarity requirement with the query center word. |
who | 214 | [polarizing term] [h$141, h$ug] [person, p genus] [is, j111] who? | "Who" serves as a JK in the sentence. Target answer: p, pe. |
how much / how many | 166 | [polarizing term] [quantity attribute concept: length, height, speed] [has, is j111] how much [unit-of-quantity concept zz]? | The interrogative "how much" replaces the polarization of quantity and expresses a query about quantity. Target answer: number j3. |
how [+ attribute] | 112 | [polarizing term] [Jkn] [has, is j111] how [attribute concept: long, tall, big, long in time, fast, etc., u]? | The interrogative "how" stands in for the quantity description that polarizes the query center word and expresses a query about quantity or degree. Target answer: number j3, or a concept that expresses an amount. The unit-of-quantity concepts j41, jzu41 of the target answer must correspond to the query center. |
which; which [ones] | 58 | which [measure word zz] [p, pe, w, pw, jw genus, or static g, effect r concept, or class concept] [statement expression J]? which [ones] [concept with class meaning]? | The combination of the interrogative "which" and the query center word it polarizes often serves as a JK of the question. Target answer: the query center word introduced by "which" usually denotes a concept range or a category concept, and the target answer is usually a concrete concept, a proper noun, etc. "Which ones" is a special interrogative: the answer it requires is not a single item but every item that meets the requirement. |
where | 105 | [statement expression J] [at v50001] where? | The query center serves as the place auxiliary piece FK. Target answer: a concept of the wj2 genus. |
why | 15 | [why] [statement expression J]? | The interrogative "why" replaces the polarization of the E piece by a reason, purpose, etc., and expresses a query about the reason Pr or the purpose Rt. Target answer: a semantic component that has the corresponding semantic relation structure with the question. |
how | 35 | [how] [statement expression J]? | Expresses a query about the components that polarize the E piece, such as means Ms, approach Wy, instrument In and condition Cn. Target answer: a semantic component that has the corresponding semantic relation structure with the question. |
Applying the deduction rules to the ERE structure of the interrogative sentence yields a new group of target ERE structures.
In deriving new, semantically equivalent target ERE structures for the interrogative sentence, the deduction rules applied must satisfy a symmetry requirement. That is, each newly derived ERE structure must express the same semantics as the ERE structure of the interrogative sentence in a different form, rather than being new ERE knowledge that it merely implies.
Identical concepts within the chapter are then fused; Fig. 1 is a schematic diagram of the fusion. Fusion replaces the ERE semantic primitives inside the chapter that have the same semantic representation with the target concept A. The process comprises: finding identical concepts that arise through shared semantic chunks, identical word concepts at different positions, reference and similar mechanisms; and, in every ERE related to such an identical concept, replacing it with the target concept.
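The replacement step of the fusion can be sketched as follows. The sketch assumes that the set of expressions referring to the target concept has already been found (the detection of reference and chunk sharing is not shown), and it represents an ERE as a plain `(e1, r, e2)` tuple; the function name `fuse` and the sample data are hypothetical.

```python
def fuse(eres, aliases, target):
    """Replace every primitive listed in `aliases` with `target`.
    EREs are (e1, r, e2) tuples; e1 and e2 may themselves be nested tuples."""
    def sub(x):
        if isinstance(x, tuple):                 # nested ERE: recurse into E1 and E2
            return (sub(x[0]), x[1], sub(x[2]))
        return target if x in aliases else x
    return [sub(e) for e in eres]

chapter_eres = [
    ("Concorde", "X", "travelling speed"),
    ("it", "X2", "noise"),
    (("the aircraft", "actor-object", "fuel"), "polarization", "X3"),
]
fused = fuse(chapter_eres, aliases={"it", "the aircraft"}, target="Concorde")
```

After fusion, all three EREs mention `Concorde` directly, so they can be matched against a target ERE structure as one connected description.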
On the basis of the fusion, deduction directed at the target ERE structure is then performed, as follows: starting from the fused ERE structures of the candidate chapter that relate to the target concepts, deduce forward repeatedly, and stop when the deduction no longer produces new ERE knowledge. The inference method is the same as that used in ERE extraction.
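Deducing forward until no new ERE knowledge appears is a fixed-point iteration, which can be sketched as below. The sketch assumes single-premise rules written as Python generator functions; the rule `to_polarization` loosely mirrors the sentence class to polarization class conversion given earlier, and all names are hypothetical rather than the patent's rule base.

```python
def deduce(facts, rules):
    """Forward-chain over `facts` until no rule yields anything new."""
    known = set(facts)
    frontier = set(facts)
    while frontier:
        new = set()
        for rule in rules:
            for fact in frontier:
                for derived in rule(fact):
                    if derived not in known:
                        new.add(derived)
        known |= new
        frontier = new          # only newly derived facts are expanded next round
    return known

def to_polarization(ere):
    """Hypothetical rule: <A X B> implies <<A actor-object B> polarization X>."""
    e1, r, e2 = ere
    if r == "X":
        yield ((e1, "actor-object", e2), "polarization", r)

result = deduce({("Concorde", "X", "travelling speed")}, [to_polarization])
```

The loop stops after the second round here, because the derived polarization ERE triggers no further rule.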
Then, within the fused and deduced ERE results, the system looks for matching ERE structures that are identical or similar to the target ERE structure, and computes the similarity between each candidate ERE and the target ERE. The similarity between EREs (or ERE combinations) is computed as follows:
In ERE similarity comparison during automatic question answering, the target ERE (combination) is the reference: the similarity of a candidate ERE (combination) to the target ERE (combination) is obtained by computing the differences between them. Two kinds of semantic ERE take part in similarity computation: polarization class EREs and sentence class EREs. Single EREs are compared in three ways: polarization class ERE against polarization class ERE, sentence class ERE against sentence class ERE, and sentence class ERE against polarization class ERE. Each is discussed below.
The comparison between polarization class EREs follows the flow in Fig. 2. It is computed as follows:
simLean(t,b)=recur(simConp(tCore,bCore)*βcore+simConp(tLean,bLean)*βlean)
simLean: the similarity of polarization class EREs.
t: the target ERE.
b: the candidate ERE.
recur: the function that recursively computes nested polarization EREs.
βcore: the weight parameter of the core part of the polarization class ERE.
βlean: the weight parameter of the polarizing part of the polarization class ERE.
simConp: the concept similarity of b with respect to t.
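The formula can be sketched as a recursive function. This is one reading of `recur`, not a definitive one: it assumes a polarization ERE is a `(lean, core)` pair whose parts may themselves be pairs, that a nested ERE never matches a bare concept, and that `simConp` is supplied by the caller. The default weights 0.6 and 0.4 and the toy `exact` similarity are hypothetical.

```python
def sim_lean(t, b, sim_conp, beta_core=0.6, beta_lean=0.4):
    """Similarity of candidate polarization ERE `b` to target `t`.
    Each ERE is a (lean, core) pair; either part may be a nested pair."""
    def part(x, y):
        if isinstance(x, tuple) and isinstance(y, tuple):
            return sim_lean(x, y, sim_conp, beta_core, beta_lean)  # nested ERE
        if isinstance(x, tuple) or isinstance(y, tuple):
            return 0.0   # assumption: a nested ERE never matches a bare concept
        return sim_conp(x, y)
    t_lean, t_core = t
    b_lean, b_core = b
    return part(t_core, b_core) * beta_core + part(t_lean, b_lean) * beta_lean

def exact(x, y):
    """Toy concept similarity: 1.0 for identical concepts, else 0.0."""
    return 1.0 if x == y else 0.0

flat = sim_lean(("blue", "car"), ("red", "car"), exact)
nested = sim_lean(("fast", ("red", "car")), ("fast", ("blue", "car")), exact)
```

With these weights, a mismatch in the polarizing part costs less than a mismatch in the core, and a mismatch inside a nested core is discounted again by the outer core weight.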
Comparison between sentence class EREs and polarization class EREs. When sentence class EREs are being compared with each other and a semantic chunk on one side is filled by a nested sentence class ERE, such as a sentence slough-off or a piece expansion, a sentence class ERE has to be compared with a polarization class ERE. Because polarization EREs converted from sentence class EREs differ in both semantics and structure from ordinary polarization EREs, the system calls them generalized polarization EREs. The system defines the rules for converting sentence class EREs into generalized polarization class EREs from the viewpoint of the cross-combinations of actor, object, content and E piece. The EREs that relate actor, object, content and E piece are also classed as generalized polarization EREs because they can be expressed in the polarization form "XX of XX". Example: She beats me.
<A X B>
The corresponding rule converts it to:
<<E EObject B> subjectIs A>
The she who beats me.
After conversion, the comparison is the same as the comparison between polarization EREs.
Comparison between sentence class EREs.
Each HNC sentence class expression can be converted into one sentence class ERE expression (or a group of them). The comparison between sentence class EREs follows the flow in Fig. 3 and proceeds as follows:
The upper dress of an E piece can be classified by semantic function as:
Basic judgment logic modification.
Language logic modification.
Tense modification.
Description of space or effect.
Attribute modification.
The lower dress of an E piece can be classified by semantic function as:
Tense modification.
Description of effect and space.
Attribute modification.
Whether two upper dresses, two lower dresses, or an upper dress and a lower dress of two E pieces are comparable is determined by whether their semantic function categories are the same.
The Ek of an E piece can be composed in the following ways:
Ek=E
Ek=EQ+EH
Ek=EQ+E
Ek=E+EH
For E pieces with a combined structure, the principle is to compare EQ with EQ, E with E, and EH with EH. When an EQ+EH combination is compared with an Ek combination that contains E, and EQ+EH is a dynamic-static collocation, the dynamic part is compared with E.
The similarity of the R parts of sentence class EREs (which correspond to the E pieces of HNC sentence category analysis (SCA)) is computed as follows:
simE(t,b)=(∑simUD(t,b)*βud-∑difUD(t,b)*βud)+(∑simEk(t,b)*βek-∑difEk(t,b)*βek)
simE: the similarity between two E pieces.
t: the target E piece.
b: the candidate E piece.
simUD: the similarity function between the upper dresses or lower dresses of the E pieces.
βud: the weight parameter of an upper dress or lower dress.
difUD: the difference function between the upper dresses or lower dresses of the E pieces.
simEk: the similarity function of Ek.
βek: the weight parameter of each component of Ek.
difEk: the difference function of Ek.
The computing method of sentence class ERE are as follows:
simSERE(t,b)=simE(t,b)+∑simC(t,b)*βc
SimSERE: the similarity of two sentence class ERE.
T: target sentence class ERE.
B: sentence class ERE to be selected.
SimC: the similarity between the generalized object semantic chunk, (t b) calculates by polarization class ERE similarity simLean.
β c: the weight parameter of this semantic chunk.
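The two formulas combine weighted similarity and difference terms, which the following sketch makes concrete. It assumes the per-part similarity, difference and weight values have already been computed elsewhere; the function names and the numbers in the example are hypothetical.

```python
def sim_e(ud_terms, ek_terms):
    """simE: each term is (similarity, difference, weight) for one compared part.
    `ud_terms` covers upper/lower dresses, `ek_terms` the components of Ek."""
    ud = sum(s * w for s, _, w in ud_terms) - sum(d * w for _, d, w in ud_terms)
    ek = sum(s * w for s, _, w in ek_terms) - sum(d * w for _, d, w in ek_terms)
    return ud + ek

def sim_sere(ud_terms, ek_terms, chunk_terms):
    """simSERE: simE plus the weighted similarity of each generalized object
    semantic chunk; `chunk_terms` holds (simC, beta_c) pairs."""
    return sim_e(ud_terms, ek_terms) + sum(s * w for s, w in chunk_terms)

# One dress comparison, one Ek component, two semantic chunks (made-up values).
score_e = sim_e([(1.0, 0.0, 0.1)], [(0.8, 0.2, 0.5)])
score = sim_sere([(1.0, 0.0, 0.1)], [(0.8, 0.2, 0.5)], [(1.0, 0.2), (0.5, 0.2)])
```

Because differences are subtracted with the same weights as similarities, a part that differs strongly can pull the E piece score below zero, which the chunk terms then offset or not.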
The next step computes the degree to which each sentence group in the chapter answers the question, the similarity between the chapter and the target ERE structure, and the degree to which the chapter as a whole answers the target. Combining the similarity between the chapter and the target ERE structure with the chapter's overall answer degree yields the chapter's integrated score.
synAtc(t,b)=syn(∑simERE(t,b)*βere,satfAns(t,b),quaAtc)
synAtc: the integrated score of a chapter with respect to the target ERE structure of the interrogative sentence.
t: the target ERE structure of the interrogative sentence.
b: the candidate chapter.
syn: the function that combines the parts into the chapter score.
simERE: the similarity of each candidate ERE to its corresponding target ERE, for example the result computed for a sentence class ERE or a polarization class semantic chunk ERE.
βere: the weight parameter of the ERE.
satfAns: the degree to which the ERE combinations in the chapter answer the target.
quaAtc: the preliminary evaluation of the distribution quality of the target concepts within the chapter.
Finally, the list of answer chapters is returned to the user in descending order of integrated score, and under each chapter link the sentence groups of that chapter with the highest answer degree are displayed.
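The scoring and descending sort can be sketched as below. The text does not define the combining function `syn`, so the sketch assumes a plain sum purely for illustration and lets the caller pass a different function; `syn_atc`, `rank` and the sample values are hypothetical.

```python
def syn_atc(ere_terms, satf_ans, qua_atc, syn=None):
    """Integrated score of one chapter.
    `ere_terms` holds (simERE, beta_ere) pairs for the matched EREs."""
    matched = sum(sim * beta for sim, beta in ere_terms)
    if syn is None:
        # Assumption: `syn` is not defined in the text; a plain sum is used here.
        syn = lambda m, s, q: m + s + q
    return syn(matched, satf_ans, qua_atc)

def rank(chapters):
    """`chapters` maps a chapter id to (ere_terms, satf_ans, qua_atc).
    Returns the chapter ids in descending order of integrated score."""
    scored = {cid: syn_atc(*args) for cid, args in chapters.items()}
    return sorted(scored, key=scored.get, reverse=True)

order = rank({
    "a": ([(0.9, 1.0)], 0.5, 0.2),
    "b": ([(0.4, 1.0)], 0.1, 0.3),
})
```

Passing, say, `syn=lambda m, s, q: m * s * q` would make a chapter with no answering sentence group score zero regardless of its ERE similarity; which behaviour is intended is not stated in the text.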
The preliminary evaluation of how well a candidate chapter covers the target answer proceeds as follows:
From all the word concepts of the interrogative sentence, remove those of the 1 (language logic) genus and keep the remaining concepts as target concepts.
Find the probability of occurrence of the target concepts in the candidate chapter and their distribution across sentences.
Evaluate each sentence group that contains target concepts.
A sentence group containing target concepts is defined here as a set of immediately adjacent sentences that each contain a target concept.
quaSC=tgtSC/tTgt*βs+βf
quaSC: the quality of the target concept distribution within the sentence group.
tgtSC: the number of target concepts the sentence group contains.
tTgt: the total number of target concepts.
βs: a parameter derived from the number of sentences in this group.
βf: a parameter derived from the total number of times target concepts occur in this group.
Analyze the possibility that sentence groups containing target concepts at scattered positions can be merged through anaphoric reference.
proSC=(quaSC1+quaSC2)*p(distS)
proSC: the possibility that two sentence groups containing target concepts refer to each other through anaphoric reference; proSC expresses the gain that this possibility contributes to the quality with which the chapter contains the target answer.
p: a function, obtained statistically, that gives the probability that sentence groups separated by a given distance refer to each other through anaphoric reference.
distS: the number of sentences lying between the two sentence groups that contain target concepts.
Comprehensively evaluate the target concept distribution of the chapter.
quaAtc=[∑(quaSC)+∑(proSC)]*dTgt(tgtS,tS)
quaAtc: the quality with which the chapter contains the target answer.
dTgt: the function that calculates the target concept density inside the chapter.
tgtS: the number of sentences that contain target concepts.
tS: the total number of sentences in the chapter.
Rank the chapters by this quality.
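The three formulas of the preliminary assessment can be sketched as follows. The source only names the functions p and dTgt without defining them, so both are passed in as parameters; the decay function and the density ratio used in the example are assumptions made for illustration, and all numbers are invented.

```python
from typing import Callable, Sequence


def qua_sc(tgt_sc: int, t_tgt: int, beta_s: float, beta_f: float) -> float:
    """quaSC = tgtSC / tTgt * beta_s + beta_f for one sentence group."""
    return tgt_sc / t_tgt * beta_s + beta_f


def pro_sc(qua_sc1: float, qua_sc2: float, dist_s: int,
           p: Callable[[int], float]) -> float:
    """proSC = (quaSC1 + quaSC2) * p(distS) for two separated groups."""
    return (qua_sc1 + qua_sc2) * p(dist_s)


def qua_atc(qua_scs: Sequence[float], pro_scs: Sequence[float],
            tgt_s: int, t_s: int,
            d_tgt: Callable[[int, int], float]) -> float:
    """quaAtc = [sum(quaSC) + sum(proSC)] * dTgt(tgtS, tS)."""
    return (sum(qua_scs) + sum(pro_scs)) * d_tgt(tgt_s, t_s)


# Hypothetical stand-ins for the statistically derived functions.
def p_decay(dist_s: int) -> float:
    return 1.0 / (1 + dist_s)      # fewer intervening sentences, higher chance


def density(tgt_s: int, t_s: int) -> float:
    return tgt_s / t_s             # share of sentences holding target concepts


group_quality = qua_sc(2, 4, 1.0, 0.5)
merge_gain = pro_sc(group_quality, group_quality, 1, p_decay)
chapter_quality = qua_atc([group_quality, group_quality], [merge_gain],
                          2, 4, density)
```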
The knowledge frame structure defines the attribute list of a class of things. That class of things is called the description center, and each of its attributes is represented as a Slot. The structure types of a Slot are: single-value structure, combination structure, enumeration structure, nested frame structure and composite structure.
In a single-value structure, a Slot simply corresponds to the value of one attribute. In a combination structure, a Slot corresponds to a combination of several values. In an enumeration structure, a Slot corresponds to a queue of values that share the same structure. In a nested frame structure, the value of a Slot is represented by another knowledge frame. In a composite structure, the value of a Slot is a mixture of the four structures above.
A knowledge frame defines the ERE semantic relations between each Slot and the description center and between different Slots. An example of a knowledge frame definition is given in the following table:
Knowledge frame name | Person |
Slot definition list |
Slot name | Structure definition of the Slot | Slot extraction and recognition rule | Rule for mapping ERE knowledge sources to the Slot | ERE representation of the relations with the description center and other Slots |
Name | Combination structure: surname, given name | Matches the name recognition features | | |
Nationality | Single-value structure: nationality | Belongs to category Pj52 | | |
Education experience | Composite structure: 1. start time, end time, school, degree; 2. start time, end time, school, degree (a degree is itself another knowledge frame structure) | Sentences that express the education experience of the description center (the person) | | |
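The Person frame in the table above could be modeled with a small data structure such as the one below. This is a sketch, not the patent's storage format: the class names `SlotKind`, `Slot` and `KnowledgeFrame` and all field names are hypothetical, and the reading of the degree field as a nested frame follows the note in the table.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SlotKind(Enum):
    SINGLE_VALUE = "single-value"
    COMBINATION = "combination"
    ENUMERATION = "enumeration"
    NESTED_FRAME = "nested frame"
    COMPOSITE = "composite"


@dataclass
class Slot:
    name: str
    kind: SlotKind
    parts: List["Slot"] = field(default_factory=list)  # sub-values, if any
    nested_frame: Optional[str] = None                 # name of another frame
    recognition_rule: str = ""                         # free-text rule label


@dataclass
class KnowledgeFrame:
    description_center: str
    slots: List[Slot]

    def slot(self, name: str) -> Slot:
        """Look up a Slot by name; raises KeyError if it is not defined."""
        for candidate in self.slots:
            if candidate.name == name:
                return candidate
        raise KeyError(name)


person = KnowledgeFrame(
    description_center="Person",
    slots=[
        Slot("Name", SlotKind.COMBINATION,
             parts=[Slot("Surname", SlotKind.SINGLE_VALUE),
                    Slot("Given name", SlotKind.SINGLE_VALUE)],
             recognition_rule="matches the name recognition features"),
        Slot("Nationality", SlotKind.SINGLE_VALUE,
             recognition_rule="category Pj52"),
        Slot("Education experience", SlotKind.COMPOSITE,
             parts=[Slot("Start time", SlotKind.SINGLE_VALUE),
                    Slot("End time", SlotKind.SINGLE_VALUE),
                    Slot("School", SlotKind.SINGLE_VALUE),
                    Slot("Degree", SlotKind.NESTED_FRAME,
                         nested_frame="Degree")]),
    ],
)
```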
The automatic question answering procedure based on the frame knowledge base is:
Analyze the interrogative sentence.
The interrogative sentence analysis comprises the following steps:
HNC sentence category analysis (SCA) of the interrogative sentence.
Analysis of the query center of the interrogative sentence.
Extraction of the ERE structure of the interrogative sentence.
The analysis of the interrogative sentence yields a target ERE structure that includes the requirement of the query center.
Apply reasoning and deduction rules to the ERE structure of the interrogative sentence to obtain a new group of semantically equivalent target ERE structures.
Determine, through conceptual analysis of the target ERE structure, the type of frame knowledge that may contain the target answer.
The target concept is required to have a class-membership relation with the description center of the knowledge frame.
Determine the Slots of the frame knowledge that may contain the target answer by matching the target ERE against the categories of ERE relations between the Slots inside the knowledge frame.
Seek the target ERE structure by performing target-oriented ERE structure deduction on the ERE relations of the Slots of the knowledge frame.
Match the concepts of the target ERE structure against the concepts of the knowledge entries in the frame knowledge base to obtain the knowledge entries that contain the target answer.
Obtain the answer value from the specified Slot of the specified knowledge entry.
Generate the answer sentence and return it to the user.
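The last few steps of this procedure (choose the frame type, choose the Slot, find the knowledge entry, read the answer value) can be illustrated with a toy lookup. Everything here is a simplifying assumption: the dictionary layout, the relation labels and the sample entries are invented, and the interrogative sentence analysis is taken as already done.

```python
from typing import Dict, List, Optional

# Hypothetical frame definition: which ERE relation label maps to which Slot.
PERSON_FRAME = {
    "description_center": "person",
    "relation_to_slot": {
        "nationality_of": "Nationality",
        "educated_at": "Education experience",
    },
}

# Invented knowledge entries of the Person frame.
ENTRIES: List[Dict[str, str]] = [
    {"Name": "Person A", "Nationality": "Country X"},
    {"Name": "Person B", "Nationality": "Country Y"},
]


def answer(target: Dict[str, str], frame: dict,
           entries: List[Dict[str, str]]) -> Optional[str]:
    """Resolve a target ERE (E1, R, unknown E2) against one knowledge frame."""
    # The target concept must belong to the class the frame describes.
    if target["e1_category"] != frame["description_center"]:
        return None
    # Match the target relation against the frame's Slot relations.
    slot = frame["relation_to_slot"].get(target["relation"])
    if slot is None:
        return None
    # Match the target concept against the knowledge entries.
    for entry in entries:
        if entry.get("Name") == target["e1"] and slot in entry:
            # Read the answer value and generate the answer sentence.
            return f"{target['e1']}: {slot} = {entry[slot]}"
    return None


question = {"e1": "Person A", "e1_category": "person",
            "relation": "nationality_of"}
reply = answer(question, PERSON_FRAME, ENTRIES)
```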
The present invention adopts a series of techniques to solve the problems encountered in building a knowledge base and answering questions automatically on the basis of natural language understanding technology:
Expressing natural language as formalized knowledge. The invention uses ERE triple expressions to express the semantic knowledge of natural language. Single ERE expressions and ERE combinations have strong semantic expressive power, and the ERE semantic expression of a sentence can easily be extracted and mapped from the result of HNC sentence category analysis (SCA). ERE expressions do not depend on the surface form of the language; they capture the underlying semantics and express diverse linguistic forms in a unified, concise way. ERE and ERE combinations form a unified semantic representation structure, which helps in designing efficient knowledge processing systems for ERE structures.
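An ERE triple, including the nesting in which E1 or E2 is itself an ERE, could be represented as below. This is a sketch only: the class name `ERE`, the relation labels and the way the example sentence is decomposed are assumptions, not the output of HNC sentence category analysis.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ERE:
    """A triple (E1, R, E2); R names the semantic relation between E1 and E2."""
    e1: "Unit"
    r: str
    e2: "Unit"


# A semantic unit is a word or chunk (str), another ERE, or a combination.
Unit = Union[str, ERE, Tuple[Union[str, ERE], ...]]

# Hypothetical decomposition of "He arrived in Shanghai by train that afternoon".
arrival = ERE("he", "arrive", "Shanghai")
by_train = ERE(arrival, "means", "train")
when = ERE(arrival, "time", "that afternoon")
```

Making the class frozen keeps triples hashable, so identical EREs extracted from different sentences collapse to one element when stored in a set.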
Mutual implication derivation between items of semantic knowledge. When the human brain understands language, it habitually recognizes that one meaning naturally implies another. For example, the sentence "He arrived in Shanghai by train that afternoon" implies: "That afternoon his spatial location was Shanghai." Based on the unified ERE knowledge representation, the system can define a series of production rules for implication derivation under various semantic situations and derive new semantic EREs from existing combinations of semantic EREs, thereby deepening the computer's understanding of natural language semantics.
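One such production rule, covering the arrival example, might look like this. The rule, the tuple encoding of EREs and the relation labels `arrive`, `time` and `located_in` are all hypothetical; the source does not list its actual rule set.

```python
from typing import Set, Tuple

Triple = Tuple[object, str, object]  # (E1, R, E2); E1 may itself be a triple


def derive_location(facts: Set[Triple]) -> Set[Triple]:
    """If x arrived at a place at time t, derive that x was located there at t."""
    derived: Set[Triple] = set()
    for e1, r, e2 in facts:
        # Pattern: ((x, "arrive", place), "time", t)
        if (r == "time" and isinstance(e1, tuple)
                and len(e1) == 3 and e1[1] == "arrive"):
            x, _, place = e1
            derived.add(((x, "located_in", place), "time", e2))
    return derived - facts  # only the EREs that are actually new


arrive = ("he", "arrive", "Shanghai")
known = {arrive, (arrive, "time", "that afternoon"), (arrive, "means", "train")}
new_facts = derive_location(known)
```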
Fusion of ERE knowledge within a chapter. A chapter's description of one thing is often distributed across different parts of the chapter: for example, the opening paragraph gives the time of the thing, and its other characteristics are explained several paragraphs later. People can understand as a whole the semantics that the various parts of the chapter express about the thing. The invention proposes a technique for semantic fusion within a chapter: by finding the semantic equivalence between semantic units that arises from language phenomena such as identical concepts, anaphoric reference and semantic chunk sharing, the semantic EREs scattered across the chapter are merged according to their equivalent semantic units.
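The merging step can be sketched with a union-find structure over semantic units. This is one possible technique chosen for illustration, not the method the patent specifies; the equivalence pairs (for instance from anaphora resolution) are assumed to be supplied by an earlier step, and the sample data is invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def fuse(eres: Iterable[Tuple[str, str, str]],
         equivalences: Iterable[Tuple[str, str]]
         ) -> Dict[str, List[Tuple[str, str]]]:
    """Group (R, E2) pairs under one representative of each equivalent E1."""
    parent: Dict[str, str] = {}

    def find(unit: str) -> str:
        parent.setdefault(unit, unit)
        while parent[unit] != unit:
            parent[unit] = parent[parent[unit]]  # path halving
            unit = parent[unit]
        return unit

    # Each pair says the two semantic units denote the same thing.
    for first, second in equivalences:
        parent[find(second)] = find(first)

    merged: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for e1, r, e2 in eres:
        merged[find(e1)].append((r, e2))
    return dict(merged)


scattered = [("the bridge", "opened_in", "1990"), ("it", "length", "2 km")]
fused = fuse(scattered, [("the bridge", "it")])
```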
Automatically answering questions that users pose in natural language. Users who query information want a more natural and more precise way to define their search request, and want answers that match the query intent at the semantic and knowledge levels rather than just a list of answers that match keywords. The invention performs interrogative sentence analysis on the user's question, extracts the target ERE, preliminarily assesses the candidate chapters, carries out target-oriented derivation, deduction and identification matching of ERE structures in the candidate chapters, analyzes the solution of the target ERE in the frame knowledge base, and finally returns to the user answers that are consistent with the semantics and with the knowledge.
Preliminary assessment of the degree to which a candidate chapter contains the target answer. Because target-oriented derivation, deduction and identification matching of ERE structures over a chapter are computationally expensive, the system first applies a preliminary assessment to the candidate chapters to obtain a sequence of chapters ordered by the likelihood that they contain the target answer, so that deep semantic analysis is performed on the chapters most likely to contain it. The invention assesses each candidate chapter by counting the probability of occurrence of the target concepts in the chapter, analyzing the sentence groups that contain target concepts, and estimating the possibility that scattered target concepts can be merged through anaphoric reference, identical concepts, semantic chunk sharing and similar means.
Identification matching between ERE structures. The system uses ERE expressions to express knowledge, so seeking the target knowledge during automatic question answering is a process of identification matching between ERE structures. The invention uses a different matching calculation, each meeting the semantic requirements, for each kind of ERE, uses different calculations for different parts of the language structure, and proposes calculation methods for the various nested and combined ERE structures, thereby computing the semantic similarity between a candidate ERE structure and the target ERE structure.
Conversion and comparison of sentence class ERE and polarization class ERE. In ERE matching, EREs can be divided by semantic structure into sentence class ERE and polarization class ERE. When the semantic similarity of a sentence class ERE and a polarization class ERE must be compared, the two need to be converted to the same form. The invention proposes a method of converting sentence class ERE to polarization class ERE: by defining the convertibility of the various sentence class EREs to polarization class ERE, a sentence class ERE is converted into one ERE structure or a group of ERE structures. A polarization class ERE can then be matched partly or wholly against the sentence class ERE to calculate the semantic similarity between them.
Seeking the target answer in the frame knowledge base. The frame knowledge base is a more structured knowledge representation model; compared with the loose knowledge representation of the ERE knowledge base, it describes the knowledge about a class of things from all angles. The system expresses the semantic structure of a knowledge frame by defining the description center of the frame and the ERE relations between the slots (Slot) in the frame. In automatic question answering based on the frame knowledge base, the system seeks the target answer by matching the target ERE structure against the ERE relations inside the frame knowledge base.
The natural language processing method of this system is as follows:
1.1.1 The steps for expressing the word-level knowledge base of the Hierarchical Network of Concepts (HNC) are as follows:
1.1.1.1 Sentences are divided by semantics into 7 basic sentence classes: action sentence, process sentence, transfer sentence, effect sentence, relation sentence, state sentence and judgment sentence. According to how strongly a semantic chunk depends on the sentence class, semantic chunks are divided into main semantic chunks and auxiliary semantic chunks; the auxiliary semantic chunks comprise: condition, means, instrument, approach, reference, cause and result. By their common features, main semantic chunks are divided into: feature semantic chunk, actor, object and content. The general representation of a semantic chunk is established as: SK = individuality + commonality = sentence class information + semantic chunk type information. When the feature semantic chunk of a sentence includes the features of two basic sentence classes, a mixed sentence class is formed; when a sentence uses two or more feature semantic chunks to express the features of two or more basic sentence classes, a compound sentence class is formed. The above information is expressed in symbolic form to build the knowledge base.
1.1.1.2 For a word in the knowledge base whose concept category contains v, determine from its own semantic knowledge which of the corresponding nodes represents its main content: action Φ0, process Φ1, transfer Φ2, effect Φ3, relation Φ4, state Φ5, or general judgment Φ8 and the other judgment subclasses jl1. On that basis determine which of the 7 basic sentence classes the word belongs to and the corresponding sentence class code among the 57 subclasses. If the main semantic content of the word includes two of these nodes, it is handled as a mixed sentence class. By convention, the code of a mixed sentence class is built starting from the first semantic chunk of the two basic sentence classes that make up the mixed sentence class; k denotes the total number of non-E semantic chunks and n denotes the starting index of the semantic chunks taken from the second basic sentence class E2; when n=m+1, n may be omitted. For a word that gives rise to a compound sentence class, the sentence class information is filled in by joining the codes E1 and E2 of the two basic sentence classes that form the compound sentence class with a * between them. During analysis, the format information of the two sentence classes can then be retrieved from the concept-level sentence class representation knowledge base according to E1 and E2.
1.1.1.3 When the sentence class code is valid, determine from the sentence class code in (1.1.1.2) whether the sentence class is a two-main-chunk, three-main-chunk or four-main-chunk sentence. This is done as follows. The general expression of a sentence can be written as:
The first generalized object semantic chunk JK is followed by the feature semantic chunk of the sentence, then by the second generalized object semantic chunk, then by the third generalized object semantic chunk, and the remaining generalized object semantic chunks are listed in order.
The expression does not limit the number of generalized object semantic chunks JK, but for the basic sentence classes natural language in practice only requires considering the cases in which the number of JK is 1, 2 or 3, which correspond to two-main-chunk, three-main-chunk and four-main-chunk sentences respectively.
For four-main-chunk sentences, JK2 is necessarily based on the object B and JK3 is necessarily based on the content C. For three-main-chunk sentences, either B or C can serve as the main body of JK. Two-main-chunk sentences may have no E, but in that case JK2 must be based on C. The specific format code is determined from the order in which the main semantic chunks usually appear when the word forms a sentence, and the format that the word usually adopts when forming a sentence is given in the form of a code. When there are several formats, they are labeled [1], [2], ... so that the items below can express the different situations of each format. If the word usually adopts the standard format or the canonical format when forming a sentence, this item need not be filled in.
1.1.1.4 When the sentence class code is valid and the word forms a sentence according to the sentence class code in (1.1.1.2), if there is an expectation relation between the word and a generalized object semantic chunk, that is, the word requires a specific concept to serve as one of its generalized object semantic chunks, then the specific concepts that preferentially collocate with the word are given using the description method F=∑(letter string)(digit string). This expectation includes expectations about a component in the structure of the generalized object semantic chunk; in that case the composition of the semantic chunk is described first, and then the preferred concept of the relevant component is given. The composition knowledge of semantic chunks and the preferred concepts of each component are represented by @S. The composition knowledge of a JK semantic chunk is filled in this item using = and +; the preferred concept knowledge of each part is expressed with : and is also filled in this item. If the sentence formed by a v concept often requires a sentence to serve as one of its semantic chunks, and the word has this knowledge, this item uses JK=J and JK:=J in the knowledge base to indicate respectively that a semantic chunk JK must be expanded into a sentence or may be expanded into a sentence; for details see (1.1.1.6).
A semantic chunk or a component of a semantic chunk can be divided by content into two parts, object B and content C, or by form into two parts, front Q and back H. Such compositions are conventional and need no explicit expression: it suffices to append one of the four letters B, C, Q, H to the semantic chunk or component and give its preferred concept, which both indicates that the composition exists and states the preferred concept of that part.
1.1.1.5 If one part of the composition structure of the generalized object semantic chunk described in (1.1.1.4) is not immediately adjacent to the other parts but appears at a separate position in the sentence, this situation is expressed in the semantic chunk composition: ( ) marks a part that may be separated from the semantic chunk, and [ ] marks a part that is necessarily separated from it.
1.1.1.6 When the sentence class code is valid and the word, forming a sentence according to the given sentence class code, requires a sentence to serve as one of its generalized object semantic chunks, this situation is indicated; that is, the knowledge that certain semantic chunks caused by the word expand into sentences is given.
1.1.2 The specific steps of sentence category analysis (SCA) are as follows:
1.1.2.1 Match the input sentence against the dictionary, segment it into the words it contains, and obtain the semantic knowledge of these words from the knowledge base.
1.1.2.2 According to the concept category information, and taking the semantic chunk distinguishing markers of category 10 and the verb (v) concepts as the basis, form rough semantic chunks and form the E hypotheses.
1.1.2.3 If no E hypothesis can be formed, go to (1.1.2.9); otherwise continue.
1.1.2.4 Screen and rank all E hypotheses; the main information used is the sentence class code and the format code.
1.1.2.5 Perform the sentence class check on the selected E hypotheses in ranked order; the main information used is the concept preference knowledge of the semantic chunk cores. If all checks fail, go to (1.1.2.10); otherwise continue.
1.1.2.6 Perform the semantic chunk composition check; the main information used is the semantic chunk composition knowledge and the knowledge of the preferred concepts of each part of the semantic chunk. If all checks fail, go to (1.1.2.10); otherwise continue.
1.1.2.7 Where necessary, perform the semantic chunk separation check; the main information used is the sentence class conversion knowledge caused by words.
1.1.2.8 Perform the check for sentence classes without an E semantic chunk: if it fails, continue; otherwise go to (1.1.2.10).
1.1.2.9 Form the E hypotheses again; on success go to (1.1.2.4).
1.1.2.10 Record the final analysis result.
Claims (10)
1. computer system of setting up the natural language knowledge base, by the various chapter texts that obtain being carried out the statement mark of HNC natural language processing, it is characterized in that, also, set up the ERE knowledge base according to from the statement mark of described HNC natural language processing, extracting the ERE knowledge expression; Described ERE knowledge expression is the triple form that comprises E1, E2 and R, and wherein R is equivalent to a logical predicate, the semantic relation between expression E1 and the E2; E1 and E2 can represent any semantic primitive, as the component part of statement, semantic chunk, semantic chunk, combination, word or the another one ERE knowledge expression of word; E1, E2 can be single semantic primitives, also can be the combinations of a plurality of semantic primitives.
2. the computer system of the right language knowledge base of foundation according to claim 1 is characterized in that, extracts the ERE knowledge expression in the described statement mark according to the HNC natural language processing, comprises and extracts sentence class ERE; Extract the ERE of auxilliary piece and main piece relation; And extract the ERE that concerns between inner each word notion of semantic chunk, and by writing down the E1 of described ERE knowledge expression, R, the quoting of E2 statement part pairing with it, the chapter in ERE source, reference statement identifies ID and sets up the ERE knowledge base.
3. the computer system of setting up the natural language knowledge base according to claim 2, it is characterized in that, described statement mark according to the HNC natural language processing extracts the ERE knowledge expression, comprise that also using the ERE reasoning deducts the implicit ERE knowledge that makes new advances, with the different expression-form of same ERE knowledge; Above-mentioned application ERE reasoning is deducted the implicit ERE knowledge expression that makes new advances and comprised: the reasoning between the sentence class ERE is deduced, and the reasoning deduction and the sentence class ERE that carry out the sentence class ERE of verb grouping according to the derivation characteristic deduce to the reasoning of polarization class ERE; And, be filled in the ERE knowledge base according to the organization definition of ERE with the reasoning resulting new ERE knowledge of deducing.
4. The computer system for building a natural language knowledge base according to claim 1, 2 or 3, characterized in that knowledge frame structures are further defined on the basis of the ERE knowledge base built from the sentence annotations of the HNC natural language processing, and a frame knowledge base is built according to the requirements of the frame knowledge base definitions by solving for and obtaining the specific ERE knowledge that the frame knowledge base requires; a knowledge frame structure takes one class of things as its description center and defines the attribute (Slot) structure of that class of things, together with the ERE semantic relations between each Slot and the description center and between different Slots; the frame knowledge base also defines, for the Slots of each knowledge frame, the features of the corresponding target ERE and the way in which target ERE entries are mapped to the Slots.
5. The computer system for building a natural language knowledge base according to claim 4, characterized in that the attribute (Slot) structure of the knowledge frame of the description center comprises: a single-value structure, a combination structure, an enumeration structure, a nested-frame structure and a composite structure; in the single-value structure a Slot simply corresponds to the value of one attribute; in the combination structure a Slot corresponds to a combination of several values; in the enumeration structure a Slot corresponds to a sequence of values that share the same structure; in the nested-frame structure the value of a Slot is represented by another knowledge frame; in the composite structure the value of a Slot is a mixture of the single-value, combination, enumeration and nested-frame structures.
6. A computer system for automatic dialogue retrieval over a natural language knowledge base, characterized in that it comprises the following processing steps: step 1, analyzing the interrogative sentence to obtain a target ERE structure that captures the requirements for the target answer; step 2, analyzing the probability with which the target concepts occur in each candidate document and the distribution of the sentence groups containing the target concepts within that document, and making a preliminary assessment of the degree to which the candidate document contains the target concepts; step 3, finding semantic units scattered within the document that can be merged by means such as reference, identical concepts and shared semantic chunks, and fusing the related EREs; step 4, obtaining a composite score for the document by jointly computing the similarity between the document and the target ERE structure and the degree to which each sentence group in the document answers the question; step 5, returning the list of answer documents ranked by composite score.
7. The computer system for automatic dialogue retrieval over a natural language knowledge base according to claim 6, characterized in that, after step 3 ends, the system further performs reasoning directed at the target ERE structure on the basis of the fusion: it reasons forward from the fused ERE structures of the candidate document that concern the target concepts to obtain more candidate EREs, stops when the reasoning can produce no new ERE knowledge, and then carries out step 4.
8. The computer system for automatic dialogue retrieval over a natural language knowledge base according to claim 6, characterized in that the analysis of the interrogative sentence in step 1 comprises: sentence category analysis (SCA) of the interrogative sentence, analysis of the query center of the interrogative sentence, extraction of the ERE structure of the interrogative sentence, and application of reasoning transformation rules to that ERE structure, so as to obtain a target ERE structure of expanded scope for the interrogative sentence.
9. The computer system for automatic dialogue retrieval over a natural language knowledge base according to claim 6, characterized in that, after step 1 is completed, the following steps are executed: step 2, determining, by conceptual analysis of the target ERE structure, the frame knowledge types that may contain the target answer; step 3, matching the ERE relations and concepts of the target ERE structure against the knowledge entries of the frame knowledge base to obtain the knowledge entry that contains the target answer, and obtaining the answer value from the designated Slot of the designated knowledge entry; step 4, generating the answer sentence and returning it to the user.
10. The computer system for automatic dialogue retrieval over a natural language knowledge base according to claim 9, characterized in that step 3 is completed by performing reasoning directed at the target ERE structure on the ERE relations of the frame knowledge base Slots that may contain the target answer, and then identifying by matching the Slot in the knowledge base that contains the target answer.
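By way of illustration only, the ERE triple of claim 1 and the reasoning that stops once no new ERE knowledge is produced (claims 3 and 7) might be sketched in Python as follows. The class name `ERE`, the functions `deduce` and `inverse_rule`, and the relation names `part_of` and `has_part` are assumptions made for this sketch; the claims do not specify concrete reasoning rules or a storage format.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set, Union


@dataclass(frozen=True)
class ERE:
    """An ERE triple: R acts as a predicate relating E1 and E2."""
    e1: "Unit"
    r: str
    e2: "Unit"


# Assumption: a semantic unit is modelled as a string (word, chunk, sentence),
# another ERE (nesting), or a tuple that combines several units.
Unit = Union[str, ERE, tuple]

# A reasoning rule maps one ERE to zero or more newly derived EREs.
Rule = Callable[[ERE], Iterable[ERE]]


def inverse_rule(ere: ERE) -> Iterable[ERE]:
    """Hypothetical rule: derive the inverse relation of a known relation."""
    inverses = {"part_of": "has_part"}  # illustrative relation names only
    if ere.r in inverses:
        yield ERE(ere.e2, inverses[ere.r], ere.e1)


def deduce(knowledge: Iterable[ERE], rules: Iterable[Rule]) -> Set[ERE]:
    """Apply the rules repeatedly and stop when no new ERE is produced."""
    rules = list(rules)
    known = set(knowledge)
    frontier = set(known)
    while frontier:
        derived = {new for ere in frontier for rule in rules for new in rule(ere)}
        frontier = derived - known  # keep only EREs not seen before
        known |= frontier
    return known


# Usage: one extracted fact, closed under the single illustrative rule.
facts = {ERE("wheel", "part_of", "car")}
closed = deduce(facts, [inverse_rule])

# E1 may itself be an ERE, as claim 1 allows.
nested = ERE(ERE("wheel", "part_of", "car"), "stated_in", "doc-1")
```

Making the triple immutable and hashable lets the knowledge base be an ordinary set, so the stopping condition of claim 7 reduces to an empty set difference.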
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200510100419 CN1952928A (en) | 2005-10-20 | 2005-10-20 | Computer system to constitute natural language base and automatic dialogue retrieve |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200510100419 CN1952928A (en) | 2005-10-20 | 2005-10-20 | Computer system to constitute natural language base and automatic dialogue retrieve |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1952928A (en) | 2007-04-25 |
Family
ID=38059272
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200510100419 Pending CN1952928A (en) | 2005-10-20 | 2005-10-20 | Computer system to constitute natural language base and automatic dialogue retrieve |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1952928A (en) |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103221952B (en) * | 2010-09-24 | 2016-01-20 | 国际商业机器公司 | The method and system of morphology answer type reliability estimating and application |
CN103221952A (en) * | 2010-09-24 | 2013-07-24 | 国际商业机器公司 | Lexical answer type confidence estimation and application |
US11409751B2 (en) | 2010-09-28 | 2022-08-09 | International Business Machines Corporation | Providing answers to questions using hypothesis pruning |
US9323831B2 (en) | 2010-09-28 | 2016-04-26 | International Business Machines Corporation | Providing answers to questions using hypothesis pruning |
CN103229120A (en) * | 2010-09-28 | 2013-07-31 | 国际商业机器公司 | Providing answers to questions using hypothesis pruning |
US10216804B2 (en) | 2010-09-28 | 2019-02-26 | International Business Machines Corporation | Providing answers to questions using hypothesis pruning |
US9317586B2 (en) | 2010-09-28 | 2016-04-19 | International Business Machines Corporation | Providing answers to questions using hypothesis pruning |
CN103380426A (en) * | 2011-02-16 | 2013-10-30 | 英派尔科技开发有限公司 | Performing queries using semantically restricted relations |
US9245049B2 (en) | 2011-02-16 | 2016-01-26 | Empire Technology Development Llc | Performing queries using semantically restricted relations |
CN103380426B (en) * | 2011-02-16 | 2017-09-22 | 英派尔科技开发有限公司 | Inquiry is performed using semantic restriction relation |
WO2012109786A1 (en) * | 2011-02-16 | 2012-08-23 | Empire Technology Development Llc | Performing queries using semantically restricted relations |
CN102955823A (en) * | 2011-08-30 | 2013-03-06 | 方方 | Processing method of sample data in television program assessment surveying process |
CN102955823B (en) * | 2011-08-30 | 2016-01-20 | 方方 | A kind of disposal route to television program assessment investigation sample data |
CN103761242A (en) * | 2012-12-31 | 2014-04-30 | 威盛电子股份有限公司 | Indexing method, indexing system and natural language understanding system |
CN103186676B (en) * | 2013-04-08 | 2016-03-02 | 湖南农业大学 | A kind of thematic knowledge self-propagation type search method for focused web crawler |
CN103186676A (en) * | 2013-04-08 | 2013-07-03 | 湖南农业大学 | Method for searching thematic knowledge self growth form focused crawlers |
CN104657346A (en) * | 2015-01-15 | 2015-05-27 | 深圳市前海安测信息技术有限公司 | Question matching system and question matching system in intelligent interaction system |
CN104809676A (en) * | 2015-05-11 | 2015-07-29 | 成都准星云学科技有限公司 | Method and device for analyzing mistake type of answer |
CN104809676B (en) * | 2015-05-11 | 2019-12-17 | 林辉 | Method and device for analyzing error type of answer |
CN104978396A (en) * | 2015-06-02 | 2015-10-14 | 百度在线网络技术(北京)有限公司 | Knowledge database based question and answer generating method and apparatus |
CN105068995A (en) * | 2015-08-19 | 2015-11-18 | 刘战雄 | Natural language semantic calculation method and apparatus based on question semantics |
CN105068995B (en) * | 2015-08-19 | 2018-05-29 | 刘战雄 | A kind of method and device of the natural language semantic computation based on query semanteme |
CN106844368A (en) * | 2015-12-03 | 2017-06-13 | 华为技术有限公司 | For interactive method, nerve network system and user equipment |
CN106844368B (en) * | 2015-12-03 | 2020-06-16 | 华为技术有限公司 | Method for man-machine conversation, neural network system and user equipment |
US11640515B2 (en) | 2015-12-03 | 2023-05-02 | Huawei Technologies Co., Ltd. | Method and neural network system for human-computer interaction, and user equipment |
CN108369596A (en) * | 2015-12-11 | 2018-08-03 | 微软技术许可有限责任公司 | Personalized natural language understanding system |
CN105912645A (en) * | 2016-04-08 | 2016-08-31 | 上海智臻智能网络科技股份有限公司 | Intelligent question and answer method and apparatus |
CN105912645B (en) * | 2016-04-08 | 2019-03-05 | 上海智臻智能网络科技股份有限公司 | A kind of intelligent answer method and device |
US10740564B2 (en) | 2016-07-19 | 2020-08-11 | Tencent Technology (Shenzhen) Company Limited | Dialog generation method, apparatus, and device, and storage medium |
CN107632987A (en) * | 2016-07-19 | 2018-01-26 | 腾讯科技(深圳)有限公司 | One kind dialogue generation method and device |
CN107632987B (en) * | 2016-07-19 | 2018-12-07 | 腾讯科技(深圳)有限公司 | A kind of dialogue generation method and device |
CN109074353B (en) * | 2016-10-10 | 2022-11-08 | 微软技术许可有限责任公司 | Method, device and system for information retrieval |
CN109074353A (en) * | 2016-10-10 | 2018-12-21 | 微软技术许可有限责任公司 | The combination of language understanding and information retrieval |
WO2019000240A1 (en) * | 2017-06-27 | 2019-01-03 | 华为技术有限公司 | Question answering system and question answering method |
CN110083818A (en) * | 2018-01-26 | 2019-08-02 | 尹岩 | A kind of system according to spatial term animation based on tool body cognition |
CN110516157A (en) * | 2019-08-30 | 2019-11-29 | 盈盛智创科技(广州)有限公司 | A kind of document retrieval method, equipment and storage medium |
WO2024016139A1 (en) * | 2022-07-19 | 2024-01-25 | 华为技术有限公司 | Query method and related device |
Similar Documents
Publication | Title |
---|---|
CN1952928A (en) | Computer system to constitute natural language base and automatic dialogue retrieve |
CN109684448B (en) | Intelligent question and answer method | |
CN106776711B (en) | Chinese medical knowledge map construction method based on deep learning | |
RU2662688C1 (en) | Extraction of information from sanitary blocks of documents using micromodels on basis of ontology | |
Casellas | Legal ontology engineering: Methodologies, modelling trends, and the ontology of professional judicial knowledge | |
CN110502642B (en) | Entity relation extraction method based on dependency syntactic analysis and rules | |
Shutova et al. | Multilingual metaphor processing: Experiments with semi-supervised and unsupervised learning | |
CN113987212A (en) | Knowledge graph construction method for process data in numerical control machining field | |
CN105701253A (en) | Chinese natural language interrogative sentence semantization knowledge base automatic question-answering method | |
WO2014160379A1 (en) | Dimensional articulation and cognium organization for information retrieval systems | |
CN113312922B (en) | Improved chapter-level triple information extraction method | |
CN113094449B (en) | Large-scale knowledge map storage method based on distributed key value library | |
Barkovich | Informational Linguistics: The New Communicational Reality | |
Song et al. | Scalable distributed semantic network for knowledge management in cyber physical system | |
RU2662699C2 (en) | Comprehensive automatic processing of text information | |
Litvin et al. | A New Approach to Automatic Ontology Generation from the Natural Language Texts with Complex Inflection Structures in the Dialogue Systems Development | |
CN107423439A (en) | A kind of Chinese charater problem mapping method based on LDA | |
Zaiß | Instance-based ontology matching and the evaluation of matching systems. | |
Schneider | A database-driven ontology for German grammar | |
Saad et al. | Methodology of Ontology Extraction for Islamic Knowledge Text | |
CN113268608A (en) | Knowledge concept construction method and device | |
Awangga et al. | Ontology design based on data family planning field officer using OWL and RDF | |
Oltramari et al. | New trends of research in ontologies and lexical resources: Ideas, projects, systems | |
Priya et al. | A novel approach for merging ontologies using formal concept analysis | |
Zhu et al. | Auto-construction of course knowledge graph based on course knowledge |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | |
WD01 | Invention patent application deemed withdrawn after publication | |