CN106886516A - Method and device for automatically identifying sentence relations and entities - Google Patents


Info

Publication number
CN106886516A
Authority
CN
China
Prior art keywords
entity
read statement
relation
deep learning
statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710108288.8A
Other languages
Chinese (zh)
Inventor
简仁贤
王海波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intelligent Technology (shanghai) Co Ltd
Original Assignee
Intelligent Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intelligent Technology (shanghai) Co Ltd
Priority to CN201710108288.8A
Publication of CN106886516A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of intelligent recognition and provides a method and device for automatically identifying the relation and the entity in a sentence. The method includes: projecting the user's input sentence into a space of fixed dimension to obtain the sentence vector of the input sentence in that space; feeding the sentence vector into a pre-trained deep learning classifier to obtain the relation category of the input sentence; and, if a relation category is identified, recognizing the entity in the input sentence. The method and system provided by the invention use deep learning to judge the user's input at the semantic level, so that relations can be recognized accurately. Entity recognition is modeled as a sequence labeling problem and the optimal labeling is solved with a conditional random field, so that entities can be recognized accurately. By combining deep learning with a conditional random field, relations and entities are extracted automatically.

Description

Method and device for automatically identifying sentence relations and entities
Technical field
The present invention relates to the technical field of intelligent recognition, and in particular to a method and device for automatically identifying the relation and the entity in a sentence.
Background
In interactive systems, it is often necessary to identify whether the user is expressing information in some specific domain, such as preferences or nicknames. If so, it is often also necessary to accurately extract the specific object that this information refers to. In general, such information can be represented by a relation and an entity. The relation mainly describes what kind of information the user is expressing, for example whether it is a preference or a nickname; the entity is the specific object that the relation refers to. For example, if the user says "I like eating spicy hot pot", the corresponding relation is "like" and the corresponding entity is "spicy hot pot". In dialogue systems, automatically identifying relations and entities in such specific domains is a highly challenging problem.
The two most common methods for recognizing relations and entities are keyword-based and regular-expression-based.
The keyword-based method recognizes relations mainly through keywords. Taking preferences as an example, if the user's input sentence contains the word "like", it is taken to express liking; if it contains "dislike", it is taken to express disliking. The entity of the relation is then extracted with syntactic dependency analysis or semantic role labeling (SRL). For example, "I like Zhou Jielun" contains "like", so the keyword-based method considers the sentence to express "like"; dependency analysis shows that "Zhou Jielun" depends on the head word "like", so the object of liking is "Zhou Jielun", that is, the recognized entity is "Zhou Jielun". The drawback of the keyword-based method is a large number of misjudgments: a sentence that contains a keyword does not necessarily express the corresponding relation. Continuing the preference example, the input "I can't say yet whether I like Zhou Jielun" contains the keyword "like", but the meaning expressed is a state of uncertainty. Treating it as a "like" relation merely because it contains "like" is inevitably wrong. This example shows that keywords alone cannot determine the relation, because the information a keyword carries is limited. When the information needed to judge the relation exceeds what the keyword itself carries (for example, "can't say whether I like" carries more information than the single word "like"), the keyword-based method is powerless.
To solve this problem, more constraints are usually added by means of regular expressions, which are then used for relation judgment and entity extraction. For example, the "like" relation can be recognized with the regular expression "I like (.*)": only a sentence that contains "I like" counts as expressing the "like" relation, and the trailing "(.*)" captures all the words that follow "I like", which are taken to be the object of liking, that is, the entity. For "I like Zhou Jielun", the recognized relation is "like" and the entity is "Zhou Jielun".
The regular-expression-based method shares the drawback of the keyword-based method: there are many misjudgments, and cases that do not belong to a relation are still recognized as that relation. Another drawback is that its entity extraction is fragile and often extracts the wrong entity. For example, the sarcastic denial "I like Zhou Jielun, as if" matches the pattern "I like (.*)" above, yet its meaning is exactly the opposite: the user is expressing dislike. With the regular expression above, the system recognizes a "like" relation whose object is "Zhou Jielun, as if"; in this case both the relation and the entity are recognized incorrectly.
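The failure mode described above can be reproduced with a short sketch. The pattern is an English stand-in for the regular expression "I like (.*)", `extract_by_regex` is a hypothetical helper that does not appear in the patent, and "as if" is an assumed English rendering of the sarcastic denial in the example.

```python
import re

# English stand-in for the pattern "I like (.*)" discussed in the text.
LIKE_PATTERN = re.compile(r"^I like (.*)$")

def extract_by_regex(sentence):
    """Return (relation, entity) if the pattern matches, otherwise None."""
    match = LIKE_PATTERN.match(sentence)
    if match is None:
        return None
    # Everything after "I like " is blindly treated as the entity.
    return ("like", match.group(1))

print(extract_by_regex("I like Zhou Jielun"))
# prints ('like', 'Zhou Jielun')
print(extract_by_regex("I like Zhou Jielun, as if"))
# prints ('like', 'Zhou Jielun, as if')
```

The second call returns the wrong relation and the wrong entity, because the pattern has no way to see that the trailing words reverse the meaning.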
A further drawback of the keyword-based and regular-expression-based methods is that they are hard to maintain. Because natural language can express the same thing in many ways, a large number of keywords and regular expressions are needed to cover the various cases, and as their number grows the system becomes very complex. Newly added keywords and regular expressions may conflict with existing ones. Worse still, such conflicts are usually well hidden, and it is generally hard to tell in advance whether one exists. In many cases it is only after something goes wrong, and the root cause is traced, that the problem turns out to be a conflict between rules.
Extracting entities based on SRL or dependency relations is not perfect either. Because of the complexity of Chinese expression, the accuracy of SRL and dependency analysis is itself not high. When various rules are then applied on top of these inaccurate results to recognize entities, the precision suffers as well, leading to inaccurate entity extraction.
In summary, the defects of the prior art are as follows:
1. Relation judgment is inaccurate. Relying only on keywords or regular expressions ignores the semantics of the sentence itself and leads to misjudged relations.
2. Entity extraction is inaccurate. Entities extracted with regular expressions, SRL, syntactic analysis or dependency analysis are easily affected by the limited precision of these methods, which leads to extraction errors.
3. As the number of rules grows, system complexity increases, and it is hard to judge in advance whether a newly added rule is compatible with the existing rules, so the system is difficult to maintain.
Summary of the invention
In view of the defects in the prior art, the present invention provides a method and device for automatically identifying sentence relations and entities. Deep learning is used to judge the user's input at the semantic level, so that relations can be recognized accurately; entity recognition is modeled as a sequence labeling problem and the optimal labeling is solved with a conditional random field, so that entities can be recognized accurately; by combining deep learning with a conditional random field, relations and entities are extracted automatically.
In a first aspect, the present invention provides a method for automatically identifying sentence relations and entities, including: projecting the user's input sentence into a space of fixed dimension to obtain the sentence vector of the input sentence in the space of fixed dimension; feeding the sentence vector into a pre-trained deep learning classifier to obtain the relation category of the input sentence; and, if a relation category is identified, recognizing the entity in the input sentence.
The method provided by the present invention uses deep learning to judge the user's input sentence at the semantic level, so that relations can be recognized accurately, which helps to improve the accuracy of entity recognition.
Preferably, projecting the user's input sentence into a space of fixed dimension to obtain the sentence vector of the input sentence in the space of fixed dimension includes: segmenting the user's input sentence into words; converting each word into its corresponding word vector by looking up word2vec word vectors; and obtaining, from the word vectors of the words, the sentence vector of the input sentence in a space of fixed dimension.
Preferably, feeding the sentence vector into the pre-trained deep learning classifier to obtain the relation category of the input sentence includes: feeding the sentence vector into a CNN layer for convolution to obtain the local features of the input sentence; feeding the local features into an LSTM layer to obtain the relation encoding between preceding and following words in the input sentence; feeding the relation encoding into a ReLU layer for a nonlinear transformation; and passing the result of the nonlinear transformation to the output layer to obtain the relation category of the input sentence.
Preferably, the deep learning classifier includes multiple CNN layers.
Preferably, the deep learning classifier includes multiple LSTM layers.
Preferably, the output layer of the deep learning classifier uses a Softmax function or a Sigmoid function.
Preferably, recognizing the entity in the input sentence includes: feeding the input sentence into a CRF model to obtain the optimal sequence labeling of the input sentence, and obtaining the entity in the input sentence from the optimal sequence labeling.
Preferably, the training steps of the deep learning classifier include: feeding the sentence vector of a training sample into a pre-built deep learning classifier and obtaining the predicted relation category LP of the training sample by a feed-forward pass; obtaining a loss value from the loss function F(LP, L), where L is the relation category actually annotated for the sample and the loss value measures the difference between LP and L; according to the loss value, performing gradient backpropagation with stochastic gradient descent to update the parameters of the deep learning classifier; and training the deep learning classifier iteratively until the loss value between the predicted relation category output by the classifier and the relation category actually annotated for the sample is below a preset threshold, or the number of iterations exceeds a preset threshold.
Preferably, the loss function can be cross entropy or mean squared error.
In a second aspect, the present invention provides a device for automatically identifying sentence relations and entities, including: a preprocessing module for projecting the user's input sentence into a space of fixed dimension to obtain the sentence vector of the input sentence in the space of fixed dimension; a relation recognition module for feeding the sentence vector into a pre-trained deep learning classifier to obtain the relation category of the input sentence; and an entity recognition module for recognizing the entity in the input sentence if a relation category is identified.
The device provided by the present invention uses deep learning to judge the user's input sentence at the semantic level, so that relations can be recognized accurately, which helps to improve the accuracy of entity recognition.
Brief description of the drawings
Fig. 1 is a flowchart of a method for automatically identifying sentence relations and entities provided by an embodiment of the present invention;
Fig. 2 is a structural block diagram of a device for automatically identifying sentence relations and entities provided by an embodiment of the present invention;
Fig. 3 shows the deep learning architecture used by the deep learning classifier provided by an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the technical solution of the present invention are described in detail below with reference to the accompanying drawings. The following embodiments are only intended to illustrate the technical solution of the present invention more clearly; they serve as examples only and cannot be used to limit the scope of protection of the present invention.
It should be noted that, unless otherwise stated, the technical and scientific terms used in this application have the ordinary meaning understood by those of ordinary skill in the art to which the present invention belongs.
As shown in Fig. 1, a method for automatically identifying sentence relations and entities provided by an embodiment of the present invention includes:
Step S1: project the user's input sentence into a space of fixed dimension to obtain the sentence vector of the input sentence in the space of fixed dimension.
Step S2: feed the sentence vector into a pre-trained deep learning classifier to obtain the relation category of the input sentence.
Step S3: if a relation category is identified, recognize the entity in the input sentence.
Here, an entity must first of all be a noun. An entity is an independently existing object, such as a person's name or the name of a thing, but pronouns such as "I", "you" and "he" are excluded. For example, if the input sentence is "I like Zhou Jielun", the entity in it is "Zhou Jielun".
The method provided by this embodiment uses deep learning to judge the user's input sentence at the semantic level, so that relations can be recognized accurately, which helps to improve the accuracy of entity recognition.
A preferred implementation of step S1 is as follows:
Step S11: segment the user's input sentence into words.
Step S12: convert each word into its corresponding word vector by looking up word2vec word vectors.
Step S13: from the word vectors of the words, obtain the sentence vector of the input sentence in a space of fixed dimension.
Steps S11 to S13 are implemented as follows:
Segment the input sentence into words; if the number of words exceeds N, discard the excess words. N is a preset maximum number of words in an input sentence, for example N = 25. Because users type their input in a chat setting, N does not need to be large. According to statistics, most chat inputs are within 10 words.
Convert each word into its corresponding word vector by looking up word2vec word vectors. Assume that the dimension of each word vector is M, for example M = 300. The word2vec word vectors are trained offline in advance; it is only necessary to call the relevant published interface to look up the word2vec word vectors and convert the segmented words into the corresponding word vectors.
Concatenate these word vectors. If there are fewer than N words, pad with zeros at the end until an N × M-dimensional vector is formed. For example, with N = 25 and M = 300, if the user's input has only 23 words, then in addition to concatenating the 23 word vectors of 300 dimensions, 2 more M-dimensional zero vectors must be appended, that is, 2 × 300 = 600 zeros. This practice of appending M-dimensional zero vectors is called padding.
Through the above steps, the input sentence is projected into a space of fixed dimension, namely the N × M-dimensional space of the example above. With N = 25 and M = 300, the sentence is projected into a space of 25 × 300 dimensions.
The vector representation of the input sentence in the N × M-dimensional space is the sentence vector of the input sentence.
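The truncate, look up, concatenate and pad procedure can be sketched in a few lines. This is only an illustration under stated assumptions: the input is already segmented into words, a plain dictionary (`toy_lookup`, hypothetical) stands in for the word2vec table, and words missing from the table are mapped to a zero vector, which the text does not specify.

```python
N = 25   # maximum number of words kept per sentence (example value from the text)
M = 300  # dimension of each word vector (example value from the text)

def sentence_vector(tokens, lookup, n=N, m=M):
    """Map a list of words to a flat vector of exactly n * m floats."""
    zero = [0.0] * m
    # Keep at most n words; unknown words become zero vectors (assumption).
    vectors = [lookup.get(tok, zero) for tok in tokens[:n]]
    # Padding: append zero vectors until there are n of them.
    vectors += [zero] * (n - len(vectors))
    # Concatenate the n word vectors into one n * m-dimensional vector.
    return [value for vec in vectors for value in vec]

toy_lookup = {"I": [0.1, 0.2], "like": [0.3, 0.4]}  # hypothetical 2-dimensional vectors
print(sentence_vector(["I", "like"], toy_lookup, n=4, m=2))
# prints [0.1, 0.2, 0.3, 0.4, 0.0, 0.0, 0.0, 0.0]
```

With the default N = 25 and M = 300, a 23-word input yields 7500 values, the last 600 of which are padding zeros.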
The deep learning architecture used by the deep learning classifier in step S2 is shown in Fig. 3. The bottom layer is a convolutional neural network (CNN), which performs convolution on the sentence vector extracted from the input sentence to obtain the local features of the input sentence; preferably two CNN layers are stacked, which yields more abstract local features. The local features serve as the input of a long short-term memory network (LSTM, a type of recurrent neural network); passing through two LSTM layers, they encode the dependencies between preceding and following words in the sentence. The resulting relation encoding is then passed to an activation layer of rectified linear units (ReLU) for a nonlinear transformation, and the result of the nonlinear transformation is passed to the output layer, which finally gives the relation category of the input sentence. The output layer can use a Softmax function or a Sigmoid function. With Softmax, the classifier produces a multi-class output; for example, a preference classifier can be modeled as a multi-class classifier with the classes "like", "dislike" and "other". With Sigmoid, the classifier produces a binary output; for example, a nickname classifier can be modeled as a binary classifier with the classes "nickname" and "other".
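The difference between the two output-layer choices can be illustrated with a minimal sketch. The raw scores passed in are hypothetical stand-ins for what the last hidden layer would produce; the CNN, LSTM and ReLU layers themselves are not modeled here, and the function names and the 0.5 threshold are assumptions.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # shift for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(score):
    """Turn one raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

PREFERENCE_LABELS = ["like", "dislike", "other"]

def classify_preference(scores):
    """Multi-class output: pick the label with the highest Softmax probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return PREFERENCE_LABELS[best], probs

def classify_nickname(score, threshold=0.5):
    """Binary output: compare the Sigmoid probability with a threshold."""
    prob = sigmoid(score)
    return ("nickname" if prob >= threshold else "other"), prob

print(classify_preference([2.0, 0.5, -1.0])[0])  # prints like
print(classify_nickname(-3.0)[0])                # prints other
```

Softmax couples the classes so that exactly one wins, which suits mutually exclusive categories; Sigmoid scores a single relation on its own.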
Based on the above deep learning architecture, supervised training is carried out with labeled data from the specific domain, so that the deep learning classifier can accurately and efficiently recognize the relation category expressed in a sentence. The training steps of the deep learning classifier include:
Step S21: feed the sentence vector of a training sample into the pre-built deep learning classifier and obtain the predicted relation category LP of the training sample by a feed-forward pass (forward pass).
Step S22: obtain the loss value from the loss function F(LP, L), where LP is the predicted relation category and L is the relation category actually annotated for the sample. The loss value measures the difference between the predicted relation category and the annotated relation category. F can be cross entropy or mean squared error (MSE).
Step S23: according to the loss value, perform the backward pass (also called back propagation, that is, gradient backpropagation) with stochastic gradient descent (SGD) and update the parameters of the deep learning classifier, so that the predicted relation category output by the updated classifier is closer to the relation category actually annotated for the sample.
Step S24: train the deep learning classifier iteratively until the loss value between the predicted relation category output by the classifier and the relation category actually annotated for the sample is below a preset threshold, or the number of iterations exceeds a preset threshold.
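Steps S21 to S24 can be illustrated with a toy training loop. As a loudly stated assumption, a single-layer logistic classifier stands in for the CNN and LSTM network, the two-dimensional `SAMPLES` are made-up data, and the learning rate, loss threshold and iteration limit are arbitrary illustrative values.

```python
import math
import random

def forward(weights, bias, features):
    """S21: feed-forward pass giving the predicted probability of the relation."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(predicted, label):
    """S22: loss F(LP, L) for one sample; eps guards against log(0)."""
    eps = 1e-12
    return -(label * math.log(predicted + eps)
             + (1 - label) * math.log(1.0 - predicted + eps))

def train(samples, dim, lr=0.5, loss_threshold=0.05, max_iterations=10000, seed=0):
    rng = random.Random(seed)
    weights, bias = [0.0] * dim, 0.0
    mean_loss, iteration = float("inf"), 0
    while iteration < max_iterations:                  # S24: iteration limit
        iteration += 1
        features, label = rng.choice(samples)          # SGD: one random sample per step
        predicted = forward(weights, bias, features)
        error = predicted - label                      # gradient of the cross entropy w.r.t. z
        weights = [w - lr * error * x for w, x in zip(weights, features)]  # S23
        bias -= lr * error
        mean_loss = sum(cross_entropy(forward(weights, bias, f), l)
                        for f, l in samples) / len(samples)
        if mean_loss < loss_threshold:                 # S24: loss threshold
            break
    return weights, bias, mean_loss, iteration

# Made-up, linearly separable data: label 1 = relation present, 0 = absent.
SAMPLES = [([0.0, 1.0], 1), ([1.0, 0.0], 0), ([0.2, 0.9], 1), ([0.9, 0.1], 0)]
weights, bias, loss, iterations = train(SAMPLES, dim=2)
print(iterations, round(loss, 4))
```

The loop stops on whichever condition is met first, mirroring the two stopping criteria of step S24.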
The architecture used by the above deep learning classifier models the sequential relationships between the words of a sentence well. For this reason it is quite sensitive to negation: it can distinguish "I like Zhou Jielun" from the sarcastic denial "I like Zhou Jielun, as if", and it can also recognize plain negation such as "I don't like Zhou Jielun" and double negation such as "It's not that I don't like Zhou Jielun".
Entity recognition can be modeled as a sequence labeling problem. Specifically, each character in the sentence is labeled with one of the tags B, M, E, S or O: B (Begin) marks the first character of an entity, M (Middle) marks an intermediate character of an entity, E (End) marks the last character of an entity, and S (Single) marks an entity consisting of a single character. Characters that are not part of an entity are labeled O (Other). For example, the character sequence "我/喜/欢/周/杰/伦" ("I like Zhou Jielun") can be labeled "我O/喜O/欢O/周B/杰M/伦E"; joining the characters tagged B, M and E gives "周杰伦" (Zhou Jielun), which means that the entity that is liked is "Zhou Jielun". Likewise, "我/喜/欢/歌" ("I like songs") can be labeled "我O/喜O/欢O/歌S", where S marks a single-character entity, so the entity that is liked here is "歌" (song).
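Reading entities back out of a BMESO tag sequence can be sketched as follows. `decode_entities` is a hypothetical helper, and the Chinese character sequences are an assumed reconstruction of the examples in the text; ill-formed tag sequences are simply dropped, which is a design choice of this sketch rather than something the patent specifies.

```python
def decode_entities(chars, tags):
    """Collect entities from per-character B/M/E/S/O tags."""
    entities, current = [], []
    for ch, tag in zip(chars, tags):
        if tag == "S":                 # single-character entity
            entities.append(ch)
            current = []
        elif tag == "B":               # start a new multi-character entity
            current = [ch]
        elif tag == "M" and current:   # continue the open entity
            current.append(ch)
        elif tag == "E" and current:   # close the open entity
            current.append(ch)
            entities.append("".join(current))
            current = []
        else:                          # "O", or an M/E without a preceding B
            current = []
    return entities

print(decode_entities(list("我喜欢周杰伦"), list("OOOBME")))  # prints ['周杰伦']
print(decode_entities(list("我喜欢歌"), list("OOOS")))        # prints ['歌']
```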
The entity recognition problem can be solved for its optimal labeling with a conditional random field (CRF), so that the entity in the sentence is extracted accurately. The preferred implementation of step S3 is therefore as follows: feed the input sentence into a CRF model to obtain the optimal sequence labeling of the input sentence, and obtain the entity in the input sentence from the optimal sequence labeling.
The optimal sequence labeling of the input sentence is obtained from the CRF model as follows:
The sequence labeling problem can be solved with a conditional random field. Formally, for a given input sentence x (that is, a character sequence) and a label sequence y over that sequence, the conditional random field models the conditional probability:
$$p(y \mid x, w) = \frac{\exp\left(w^{T} F(x, y)\right)}{\sum_{y'} \exp\left(w^{T} F(x, y')\right)}$$
where exp(x) denotes e^x, e is the natural constant, w is a trainable weight vector, w^T is the transpose of the vector w, y' ranges over all possible labelings of the sequence x, and F(x, y) is the feature vector of the label sequence y on x. The conditional probability p(y | x, w) expresses how likely it is that the character sequence x is labeled with the label sequence y, given the weights w.
Given n pairs of training data {(x_i, y_i)}, the following objective function is solved:
The optimal w can be found with stochastic gradient descent (SGD).
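The objective function itself is not reproduced in this text. As a sketch only, assuming plain maximum conditional likelihood training with no regularization term (the patent may use a different or regularized form), the objective and the gradient used by SGD would read:

```latex
w^{*} = \arg\max_{w} \sum_{i=1}^{n} \log p(y_i \mid x_i, w),
\qquad
\nabla_{w} \sum_{i=1}^{n} \log p(y_i \mid x_i, w)
  = \sum_{i=1}^{n} \Big( F(x_i, y_i) - \sum_{y'} p(y' \mid x_i, w)\, F(x_i, y') \Big)
```

Under this assumption each gradient term is the difference between the feature vector of the annotated labeling and the expected feature vector under the current model, so training stops moving w once the two agree.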
Once the optimal w has been found, the value p(y' | x, w) can be computed for every possible labeling y'. The optimal labeling y is the label sequence that maximizes p(y | x, w). To improve computational performance, the optimal label sequence can be found with the Viterbi algorithm.
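A minimal Viterbi sketch over the BMESO tags is shown below. It assumes the usual linear-chain decomposition, in which the score w^T F(x, y) splits into a per-character (emission) score plus a score for each pair of adjacent tags (transition); the concrete score tables are made-up numbers, not trained weights.

```python
TAGS = ["B", "M", "E", "S", "O"]

def sequence_score(emissions, transitions, tags):
    """Total score of one label sequence (stand-in for w^T F(x, y))."""
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[(tags[t - 1], tags[t])] + emissions[t][tags[t]]
    return score

def viterbi(emissions, transitions, tags=TAGS):
    """Return the highest-scoring label sequence by dynamic programming."""
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):        # follow back-pointers to the start
        path.append(pointers[path[-1]])
    return path[::-1]

# Made-up scores: tag pairs that cannot follow each other get a large penalty.
LEGAL = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"), ("E", "B"), ("E", "S"),
         ("E", "O"), ("S", "B"), ("S", "S"), ("S", "O"), ("O", "B"), ("O", "S"), ("O", "O")}
TRANSITIONS = {(p, c): (0.0 if (p, c) in LEGAL else -10.0) for p in TAGS for c in TAGS}

def emit(preferred, bonus=2.0):
    return {tag: (bonus if tag == preferred else 0.0) for tag in TAGS}

EMISSIONS = [emit(tag) for tag in "OOOBME"]
# The fifth character looks slightly more like "O" than "M" when viewed in isolation.
EMISSIONS[4] = {"B": 0.0, "M": 1.5, "E": 0.0, "S": 0.0, "O": 2.0}

print(viterbi(EMISSIONS, TRANSITIONS))  # prints ['O', 'O', 'O', 'B', 'M', 'E']
```

Picking the best tag for each character independently would choose "O" for the fifth character and break the entity; scoring the whole sequence recovers the consistent B, M, E labeling without enumerating all 5^6 candidates.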
After the optimal label sequence has been found, the entity in the sentence is extracted accurately from the B, M, E or S tags in it.
Based on the same inventive concept as the above method, an embodiment of the present invention also provides a device for automatically identifying sentence relations and entities, including: a preprocessing module 101 for projecting the user's input sentence into a space of fixed dimension to obtain the sentence vector of the input sentence in the space of fixed dimension; a relation recognition module 102 for feeding the sentence vector into a pre-trained deep learning classifier to obtain the relation category of the input sentence; and an entity recognition module 103 for recognizing the entity in the input sentence if a relation category is identified.
The method and device provided by the embodiments of the present invention use deep learning to judge the user's input sentence at the semantic level, so that relations can be recognized accurately; entity recognition is modeled as a sequence labeling problem and the optimal labeling is solved with a conditional random field, so that entities can be recognized accurately; by combining deep learning with a conditional random field, relations and entities are extracted automatically. Because machine learning is used to judge relations and entities at the semantic level, the influence of the diversity of natural language expression is overcome. For example, "I like Zhou Jielun's songs", "Zhou Jielun's songs are my favorite" and "I love Zhou Jielun's songs to death" can all be recognized as expressing the "like" relation, with "Zhou Jielun's songs" as the object that is liked. In addition, the method and system provided by the embodiments of the present invention are easier to maintain than traditional methods: to increase coverage, it is only necessary to add new data and train a new model.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be equivalently replaced. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention, and they shall all fall within the scope of the claims and the specification of the present invention.

Claims (10)

1. a kind of method of automatic identification statement relationship and entity, it is characterised in that including:
During the read statement of user projected into a space for fixed dimension, the read statement is obtained in the fixed dimension Space in sentence vector;
By the good deep learning grader of sentence vector input training in advance, the relation classification of the read statement is obtained;
If identifying relation classification, the entity in the read statement is recognized.
2. method according to claim 1, it is characterised in that described that the read statement of user is projected into a fixed dimension In the space of degree, sentence vector of the read statement in the space of the fixed dimension is obtained, including:
Read statement to user carries out participle;
By searching word2vec term vectors, each participle is converted into corresponding term vector;
According to the term vector of each participle, sentence vector of the read statement in a space for fixed dimension is obtained.
3. method according to claim 2, it is characterised in that the depth that sentence vector input training in advance is good Degree Study strategies and methods, obtain the relation classification of the read statement, including:
By the input of sentence vector, CNN layers carries out convolution operation, obtains the local feature of the read statement;
The local feature is input into LSTM layers, the relation coding between the front and rear word in the read statement is obtained;
By the relation coding input, ReLU layers carries out nonlinear transformation;
Nonlinear transformation result is passed into output layer, the relation classification of the read statement is obtained.
4. method according to claim 3, it is characterised in that the deep learning grader includes CNN layers of multiple.
5. method according to claim 3, it is characterised in that the deep learning grader includes LSTM layers of multiple.
6. method according to claim 3, it is characterised in that the output layer of the deep learning grader is used Softmax functions or Sigmoid functions.
7. method according to claim 1, it is characterised in that the entity in the identification read statement, including:
The read statement is input into CRF models, the optimal sequence mark of the read statement is obtained, according to the optimal sequence Mark obtains the entity in the read statement.
8. method according to claim 1, it is characterised in that the training step of the deep learning grader includes:
The deep learning grader that the sentence vector input of training sample is built in advance, the pre- of training sample is obtained by feedforward Survey relation classification LP;
Loss values are obtained by loss function F (LP, L), wherein, L is the relation classification of the actual mark of sample, loss values for LP and Difference degree between L,
According to the loss values, gradient backpropagation is carried out using stochastic gradient descent, change the deep learning grader Parameter;
Deep learning grader described in repetitive exercise, until the projected relationship classification and sample of deep learning grader output The other loss values of relation object of actual mark are less than threshold value set in advance, or iterations exceedes number of times threshold set in advance Value.
9. The method according to claim 8, characterized in that the loss function is cross-entropy or mean squared error.
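The training procedure of claims 8 and 9 (feedforward pass, loss F(LP, L), stochastic gradient descent, and the two stopping conditions) can be sketched as follows. To keep the example short, a single linear layer with Softmax stands in for the deep learning classifier; that substitution, the toy data, the learning rate, the thresholds, and all names are assumptions of this sketch and not part of the patent.

```python
import math
import random

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Feedforward pass: class probabilities LP for sentence vector x."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def loss_and_grad(probs, label, kind):
    """Return F(LP, L) and its gradient with respect to the logits."""
    target = [1.0 if k == label else 0.0 for k in range(len(probs))]
    if kind == "cross_entropy":
        return -math.log(probs[label]), [p - y for p, y in zip(probs, target)]
    # Mean squared error on the Softmax probabilities, chained through Softmax.
    n = len(probs)
    loss = sum((p - y) ** 2 for p, y in zip(probs, target)) / n
    g = [2.0 * (p - y) / n for p, y in zip(probs, target)]
    dot = sum(gk * pk for gk, pk in zip(g, probs))
    return loss, [p * (gk - dot) for p, gk in zip(probs, g)]

def train(samples, n_classes, kind="cross_entropy", lr=0.5,
          loss_threshold=0.05, max_iters=500, seed=0):
    rng = random.Random(seed)
    dim = len(samples[0][0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    mean_loss = float("inf")
    iteration = 0
    for iteration in range(1, max_iters + 1):
        order = samples[:]
        rng.shuffle(order)  # stochastic: visit samples in random order
        total = 0.0
        for x, label in order:
            probs = predict(weights, x)
            loss, grad = loss_and_grad(probs, label, kind)
            total += loss
            for k in range(n_classes):  # one gradient step per sample
                for d in range(dim):
                    weights[k][d] -= lr * grad[k] * x[d]
        mean_loss = total / len(order)  # running mean over this pass
        if mean_loss < loss_threshold:  # stop: loss below the preset threshold
            break
    # Otherwise the loop ends when the iteration count reaches max_iters.
    return weights, mean_loss, iteration

# Toy sentence vectors (last component acts as a bias feature) with labels.
samples = [
    ([1.0, 0.0, 1.0], 0), ([0.9, 0.1, 1.0], 0),
    ([0.0, 1.0, 1.0], 1), ([0.1, 0.9, 1.0], 1),
]
w_ce, loss_ce, iters_ce = train(samples, 2, kind="cross_entropy")
w_mse, loss_mse, iters_mse = train(samples, 2, kind="mse")
```

Both loss choices plug into the same loop; only `loss_and_grad` changes, which mirrors how claim 9 treats the loss function as interchangeable.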
10. A device for automatically identifying sentence relations and entities, characterized by comprising:
a preprocessing module, configured to project the user's input sentence into a space of fixed dimension, obtaining the sentence vector of the input sentence in that space;
a relation recognition module, configured to input the sentence vector into a pre-trained deep learning classifier, obtaining the relation category of the input sentence;
an entity recognition module, configured to identify the entities in the input sentence if a relation category has been identified.
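The three modules of claim 10 can be wired together so that entity recognition runs only when a relation category was identified. The sketch below shows that control flow with stub components; the fixed vocabulary, the keyword-based classifier stub, the capitalization-based entity stub, and every name are hypothetical placeholders for the trained classifier and CRF model the claims describe.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical vocabulary; the last dimension collects out-of-vocabulary tokens.
VOCAB = ["weather", "capital", "of", "in"]
DIM = len(VOCAB) + 1

def preprocess(sentence: str) -> List[float]:
    """Preprocessing module: project a sentence into a fixed-dimension space."""
    vector = [0.0] * DIM
    for token in sentence.lower().split():
        index = VOCAB.index(token) if token in VOCAB else DIM - 1
        vector[index] += 1.0
    return vector

@dataclass
class Recognizer:
    classify_relation: Callable[[List[float]], Optional[str]]
    recognize_entities: Callable[[str], List[str]]

    def run(self, sentence: str) -> Tuple[Optional[str], List[str]]:
        vector = preprocess(sentence)
        relation = self.classify_relation(vector)
        if relation is None:
            return None, []  # no relation identified: skip entity recognition
        return relation, self.recognize_entities(sentence)

def stub_classifier(vector: List[float]) -> Optional[str]:
    """Stand-in for the pre-trained deep learning classifier."""
    return "weather_of" if vector[VOCAB.index("weather")] > 0 else None

def stub_entities(sentence: str) -> List[str]:
    """Stand-in for the CRF model: capitalized tokens after the first word."""
    tokens = sentence.rstrip("?.!").split()
    return [t for t in tokens[1:] if t[:1].isupper()]

recognizer = Recognizer(stub_classifier, stub_entities)
print(recognizer.run("What is the weather in Shanghai?"))  # ('weather_of', ['Shanghai'])
print(recognizer.run("Hello there"))                       # (None, [])
```

Passing the two recognizers in as callables keeps the gating logic separate from the models, so either stub could be replaced by a real component without changing `run`.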
CN201710108288.8A 2017-02-27 2017-02-27 The method and device of automatic identification statement relationship and entity Pending CN106886516A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710108288.8A CN106886516A (en) 2017-02-27 2017-02-27 The method and device of automatic identification statement relationship and entity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710108288.8A CN106886516A (en) 2017-02-27 2017-02-27 The method and device of automatic identification statement relationship and entity

Publications (1)

Publication Number Publication Date
CN106886516A true CN106886516A (en) 2017-06-23

Family

ID=59180680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710108288.8A Pending CN106886516A (en) 2017-02-27 2017-02-27 The method and device of automatic identification statement relationship and entity

Country Status (1)

Country Link
CN (1) CN106886516A (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316654A (en) * 2017-07-24 2017-11-03 湖南大学 Emotion identification method based on DIS NV features
CN107451433A (en) * 2017-06-27 2017-12-08 中国科学院信息工程研究所 A kind of information source identification method and apparatus based on content of text
CN107526799A (en) * 2017-08-18 2017-12-29 武汉红茶数据技术有限公司 A kind of knowledge mapping construction method based on deep learning
CN107622050A (en) * 2017-09-14 2018-01-23 武汉烽火普天信息技术有限公司 Text sequence labeling system and method based on Bi LSTM and CRF
CN107797989A (en) * 2017-10-16 2018-03-13 平安科技(深圳)有限公司 Enterprise name recognition methods, electronic equipment and computer-readable recording medium
CN107797993A (en) * 2017-11-13 2018-03-13 成都蓝景信息技术有限公司 A kind of event extraction method based on sequence labelling
CN108038209A (en) * 2017-12-18 2018-05-15 深圳前海微众银行股份有限公司 Answer system of selection, device and computer-readable recording medium
CN108228568A (en) * 2018-01-24 2018-06-29 上海互教教育科技有限公司 A kind of mathematical problem semantic understanding method
CN108416058A (en) * 2018-03-22 2018-08-17 北京理工大学 A kind of Relation extraction method based on the enhancing of Bi-LSTM input informations
CN108920448A (en) * 2018-05-17 2018-11-30 南京大学 A method of the comparison based on shot and long term memory network extracts
CN109033068A (en) * 2018-06-14 2018-12-18 北京慧闻科技发展有限公司 It is used to read the method, apparatus understood and electronic equipment based on attention mechanism
CN109062897A (en) * 2018-07-26 2018-12-21 苏州大学 Sentence alignment method based on deep neural network
CN109062910A (en) * 2018-07-26 2018-12-21 苏州大学 Sentence alignment method based on deep neural network
CN109460434A (en) * 2018-10-25 2019-03-12 北京知道创宇信息技术有限公司 Data extract method for establishing model and device
CN109815456A (en) * 2019-02-13 2019-05-28 北京航空航天大学 A method of it is compressed based on term vector memory space of the character to coding
WO2019174422A1 (en) * 2018-03-16 2019-09-19 北京国双科技有限公司 Method for analyzing entity association relationship, and related apparatus
CN110826320A (en) * 2019-11-28 2020-02-21 上海观安信息技术股份有限公司 Sensitive data discovery method and system based on text recognition
CN111046180A (en) * 2019-12-05 2020-04-21 竹间智能科技(上海)有限公司 Label identification method based on text data
CN111209751A (en) * 2020-02-14 2020-05-29 全球能源互联网研究院有限公司 Chinese word segmentation method, device and storage medium
CN111339250A (en) * 2020-02-20 2020-06-26 北京百度网讯科技有限公司 Mining method of new category label, electronic equipment and computer readable medium
CN111914547A (en) * 2020-07-17 2020-11-10 深圳宜搜天下科技股份有限公司 Improved semantic intention recognition method and LSTM framework system
CN112270179A (en) * 2020-10-15 2021-01-26 和美(深圳)信息技术股份有限公司 Entity identification method and device and electronic equipment
WO2021073254A1 (en) * 2019-10-18 2021-04-22 平安科技(深圳)有限公司 Knowledge graph-based entity linking method and apparatus, device, and storage medium
CN113011170A (en) * 2021-02-25 2021-06-22 万翼科技有限公司 Contract processing method, electronic equipment and related products
CN113468309A (en) * 2021-06-30 2021-10-01 竹间智能科技(上海)有限公司 Answer extraction method in text and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510221A (en) * 2009-02-17 2009-08-19 北京大学 Enquiry statement analytical method and system for information retrieval
CN103309926A (en) * 2013-03-12 2013-09-18 中国科学院声学研究所 Chinese and English-named entity identification method and system based on conditional random field (CRF)
CN105628951A (en) * 2015-12-31 2016-06-01 北京小孔科技有限公司 Method and device for measuring object speed
CN106096568A (en) * 2016-06-21 2016-11-09 同济大学 A kind of pedestrian's recognition methods again based on CNN and convolution LSTM network
CN106446526A (en) * 2016-08-31 2017-02-22 北京千安哲信息技术有限公司 Electronic medical record entity relation extraction method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李弼程 等: "《网络舆情分析理论技术与应对策略》", 31 March 2015 *

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451433A (en) * 2017-06-27 2017-12-08 中国科学院信息工程研究所 A kind of information source identification method and apparatus based on content of text
CN107451433B (en) * 2017-06-27 2020-05-22 中国科学院信息工程研究所 Information source identification method and device based on text content
CN107316654A (en) * 2017-07-24 2017-11-03 湖南大学 Emotion identification method based on DIS NV features
CN107526799A (en) * 2017-08-18 2017-12-29 武汉红茶数据技术有限公司 A kind of knowledge mapping construction method based on deep learning
CN107622050A (en) * 2017-09-14 2018-01-23 武汉烽火普天信息技术有限公司 Text sequence labeling system and method based on Bi LSTM and CRF
CN107622050B (en) * 2017-09-14 2021-02-26 武汉烽火普天信息技术有限公司 Bi-LSTM and CRF-based text sequence labeling system and method
CN107797989A (en) * 2017-10-16 2018-03-13 平安科技(深圳)有限公司 Enterprise name recognition methods, electronic equipment and computer-readable recording medium
CN107797993A (en) * 2017-11-13 2018-03-13 成都蓝景信息技术有限公司 A kind of event extraction method based on sequence labelling
CN108038209A (en) * 2017-12-18 2018-05-15 深圳前海微众银行股份有限公司 Answer system of selection, device and computer-readable recording medium
CN108228568A (en) * 2018-01-24 2018-06-29 上海互教教育科技有限公司 A kind of mathematical problem semantic understanding method
CN108228568B (en) * 2018-01-24 2021-06-04 上海互教教育科技有限公司 Mathematical problem semantic understanding method
WO2019174422A1 (en) * 2018-03-16 2019-09-19 北京国双科技有限公司 Method for analyzing entity association relationship, and related apparatus
CN110276066A (en) * 2018-03-16 2019-09-24 北京国双科技有限公司 The analysis method and relevant apparatus of entity associated relationship
CN108416058A (en) * 2018-03-22 2018-08-17 北京理工大学 A kind of Relation extraction method based on the enhancing of Bi-LSTM input informations
CN108416058B (en) * 2018-03-22 2020-10-09 北京理工大学 Bi-LSTM input information enhancement-based relation extraction method
CN108920448A (en) * 2018-05-17 2018-11-30 南京大学 A method of the comparison based on shot and long term memory network extracts
CN108920448B (en) * 2018-05-17 2021-09-14 南京大学 Comparison relation extraction method based on long-term and short-term memory network
CN109033068B (en) * 2018-06-14 2022-07-12 北京慧闻科技(集团)有限公司 Method and device for reading and understanding based on attention mechanism and electronic equipment
CN109033068A (en) * 2018-06-14 2018-12-18 北京慧闻科技发展有限公司 It is used to read the method, apparatus understood and electronic equipment based on attention mechanism
CN109062897A (en) * 2018-07-26 2018-12-21 苏州大学 Sentence alignment method based on deep neural network
CN109062910A (en) * 2018-07-26 2018-12-21 苏州大学 Sentence alignment method based on deep neural network
CN109460434A (en) * 2018-10-25 2019-03-12 北京知道创宇信息技术有限公司 Data extract method for establishing model and device
CN109815456A (en) * 2019-02-13 2019-05-28 北京航空航天大学 A method of it is compressed based on term vector memory space of the character to coding
WO2021073254A1 (en) * 2019-10-18 2021-04-22 平安科技(深圳)有限公司 Knowledge graph-based entity linking method and apparatus, device, and storage medium
CN110826320B (en) * 2019-11-28 2023-10-13 上海观安信息技术股份有限公司 Sensitive data discovery method and system based on text recognition
CN110826320A (en) * 2019-11-28 2020-02-21 上海观安信息技术股份有限公司 Sensitive data discovery method and system based on text recognition
CN111046180A (en) * 2019-12-05 2020-04-21 竹间智能科技(上海)有限公司 Label identification method based on text data
CN111209751A (en) * 2020-02-14 2020-05-29 全球能源互联网研究院有限公司 Chinese word segmentation method, device and storage medium
CN111209751B (en) * 2020-02-14 2023-07-28 全球能源互联网研究院有限公司 Chinese word segmentation method, device and storage medium
CN111339250A (en) * 2020-02-20 2020-06-26 北京百度网讯科技有限公司 Mining method of new category label, electronic equipment and computer readable medium
CN111339250B (en) * 2020-02-20 2023-08-18 北京百度网讯科技有限公司 Mining method for new category labels, electronic equipment and computer readable medium
US11755654B2 (en) 2020-02-20 2023-09-12 Beijing Baidu Netcom Science Technology Co., Ltd. Category tag mining method, electronic device and non-transitory computer-readable storage medium
CN111914547A (en) * 2020-07-17 2020-11-10 深圳宜搜天下科技股份有限公司 Improved semantic intention recognition method and LSTM framework system
CN112270179B (en) * 2020-10-15 2021-11-09 和美(深圳)信息技术股份有限公司 Entity identification method and device and electronic equipment
CN112270179A (en) * 2020-10-15 2021-01-26 和美(深圳)信息技术股份有限公司 Entity identification method and device and electronic equipment
CN113011170A (en) * 2021-02-25 2021-06-22 万翼科技有限公司 Contract processing method, electronic equipment and related products
CN113011170B (en) * 2021-02-25 2022-10-14 万翼科技有限公司 Contract processing method, electronic equipment and related products
CN113468309A (en) * 2021-06-30 2021-10-01 竹间智能科技(上海)有限公司 Answer extraction method in text and electronic equipment
CN113468309B (en) * 2021-06-30 2023-12-22 竹间智能科技(上海)有限公司 Answer extraction method in text and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170623