CN113255295B - Automatic generation method and system for formal protocol from natural language to PPTL - Google Patents
- Publication number
- CN113255295B (application number CN202110457578.XA)
- Authority
- CN
- China
- Prior art keywords
- pptl
- natural language
- formula
- sentence
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention belongs to the technical field of computer aided design and discloses a method and a system for automatically generating a PPTL formal specification from natural language. The method comprises the following steps: analyzing the natural language text with natural language processing technology to generate a grammar tree, then traversing the tree to perform preprocessing operations such as extracting, rearranging and tagging sentence components, which yields a tagged text; carrying out grammar and semantic analysis on the tagged text with the JavaCC tool to generate a syntax tree containing clauses, connectives and temporal information, then traversing the syntax tree to generate atomic propositions and combine them into a PPTL formula; and determining the satisfiability of the generated formula with the PPTLSAT tool. The invention helps the user extract a formal specification for model checking from a property described in natural language: it converts the property text written by the user into a PPTL formula, which makes model checking more accessible to ordinary users.
Description
Technical Field
The invention belongs to the technical field of computer aided design, and particularly relates to a method and a system for automatically generating a formal specification from natural language to PPTL.
Background
At present, with the development of computer software technology, the arrival of the age of artificial intelligence, and the introduction of 5G and Internet of Things (IoT) technology, society is moving toward the interconnection of everything. Interconnected systems are quietly becoming part of daily life, and more and more individuals will be linked to them in the future. Computer software has penetrated every aspect of society, from safety-critical systems in fields such as national defense, aviation and aerospace to industrial production and personal life. As software becomes indispensable in every industry, system crashes and losses caused by design vulnerabilities are ever more closely tied to national security and to personal privacy, property and safety. How to address the loss of software reliability and security caused by vulnerabilities and errors has become a pressing problem for academia and industry.
Formal methods are a means of improving the safety and reliability of software and hardware systems: a formal language is used to describe the requirements and characteristics of the system, and rigorous mathematical reasoning verifies that the final product meets those requirements and has those characteristics. Model checking is an important formal method. It is a technique for automatically verifying finite-state systems, and its principle is to search the state space of the model under verification to check whether the model satisfies the expected properties.
In model checking, writing the formal specification that defines the property to be verified is difficult for users. Ordinary users have no background in formal methods and prefer to describe the property to be verified in natural language. Defining a property specification formula that conforms to the formal grammar and carries the intended semantics requires long-term study and rigorous training, which limits the application of model checking to the safety verification of software and hardware systems across industries.
From the above analysis, the problems and defects of the prior art are as follows:
(1) In model checking, writing the formal specification that defines the property to be verified is difficult for users; ordinary users have no background in formal methods and prefer to describe the property to be verified in natural language.
(2) Defining a property specification formula that conforms to the formal grammar and carries the intended semantics requires long-term study and rigorous training, which limits the application of model checking to the safety verification of software and hardware systems across industries.
The difficulty of solving these problems and defects is as follows. First, the grammatical structure of user-written text is variable and may be ambiguous. If the structure of the input text is not constrained, the flexibility of natural language grammar can introduce structural ambiguity errors into the grammar tree, and an incorrect PPTL formal specification is ultimately generated. Second, a PPTL formal specification is a combination of atomic propositions, logical connectives and temporal operators, but the semantics of the logical connectives and temporal operators in PPTL do not correspond one-to-one to the semantics expressed by natural language text.
The significance of solving these problems and defects is as follows: ordinary users no longer need specialist knowledge to define formal specifications when verifying software and hardware systems by model checking, the application of model checking across industries is broadened, and reliability and safety in software and hardware development are improved.
Disclosure of Invention
To address the problems that arise when model checking is applied in practice, the invention provides a method and a system for automatically generating a formal specification from natural language to PPTL.
The invention is realized as follows. A method for automatically generating a formal specification from natural language to PPTL comprises the following steps:
Step one: preprocess the input natural language text using Stanford NLP, delete redundant information, and rearrange the identified sentence components and structures to generate a tagged text, so that the subsequent converter can identify the sentence components more easily;
Step two: carry out grammar and semantic analysis on the tagged text using the JavaCC tool, identify the tags in the sentences, extract sentence component information and construct a syntax tree. The syntax tree is an intermediate representation between the natural language text and the PPTL formal specification; it gives the natural language text a data structure from which the PPTL formal specification can subsequently be generated;
Step three: traverse the syntax tree according to the PPTL formula generation rules, combine all atomic propositions with the logical connectives and temporal operators, and generate a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree. This converts the intermediate syntax tree into the PPTL formal specification and provides the input for the satisfiability determination in the next step;
Step four: determine the satisfiability of the generated PPTL formula using the formula satisfiability checking tool PPTLSAT, and parse the file it generates to obtain the result. Model checking against the formal specification is meaningful only after the satisfiability determination succeeds, which avoids meaningless checking of unsatisfiable formulas and tautologies.
Further, in step one, the preprocessing of the input natural language text includes:
(1) Processing the input natural language text using Stanford NLP to generate a grammar tree;
(2) Traversing the grammar tree to extract the sentence components, rearranging the sentence components, and removing structural ambiguity;
(3) Extracting the part-of-speech tags in the grammar tree and integrating them into the text to generate a tagged text.
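As a minimal sketch of sub-step (3), the traversal below flattens a parse tree into tagged text. The nested-tuple tree and the function name `tagged_text` are assumptions made for illustration; a real tree would come from Stanford NLP, which this sketch does not call.

```python
# Hypothetical stand-in for a constituency parse tree: each node is a tuple
# (label, child, child, ...), and a leaf is (pos_tag, word).
def tagged_text(tree):
    """Depth-first traversal that emits 'TAG word' for every leaf."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"{label} {children[0]}"          # leaf: tag followed by word
    return " ".join(tagged_text(child) for child in children)


TREE = ("S",
        ("NP", ("DT", "the"), ("NN", "vehicle")),
        ("VP", ("VBZ", "switches"), ("NP", ("NNS", "modes"))))

print(tagged_text(TREE))
```

Phrase-level labels such as NP and VP are dropped here; only the part-of-speech tag of each word survives, which matches the shape of the tagged text shown in the embodiment.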
Further, the sentence component extraction and rearrangement includes:
General form of sentence component extraction: the trunk of an English sentence is subject, predicate and object, to which adverbs or prepositional phrases that modify the sentence are then added. For a complex sentence (a main clause plus a subordinate clause) the general form is subordinating conjunction, subordinate clause, main clause; for a compound sentence (coordinated clauses) the general form is clause, coordinating conjunction, clause.
The sentence component rearrangement process comprises: first, taking the leaf nodes under each tree structure and the deepest nodes of each branch as the corresponding sentence components, where the tag of a sentence component is given by the meaning of the leaf node's parent node; then, further optimizing the extracted sentence components by deleting the redundant information in them; and finally, rearranging the extracted sentence components and the complex and compound sentence structures according to the syntactic forms defined for sentence component extraction.
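The rearrangement into the defined syntactic forms can be sketched as follows. The `Clause` class and the `complex_sentence` helper are hypothetical names introduced for illustration; the component values are filled in by hand rather than extracted from a grammar tree.

```python
from dataclasses import dataclass, field


@dataclass
class Clause:
    """Extracted components of one simple sentence (hypothetical container)."""
    subject: str
    predicate: str
    obj: str = ""
    modifiers: list = field(default_factory=list)

    def linearize(self):
        # Defined order: subject, predicate, object, then modifiers.
        parts = [self.subject, self.predicate, self.obj, *self.modifiers]
        return " ".join(p for p in parts if p)


def complex_sentence(conjunction, subordinate, main):
    # Defined order: subordinating conjunction, subordinate clause, main clause.
    return f"{conjunction} {subordinate.linearize()}, {main.linearize()}"


sub = Clause("the driver", "presses", "the switch button")
main = Clause("the vehicle", "switches", "modes")
print(complex_sentence("when", sub, main))
```

Because every clause is emitted in the same fixed order, a later parser only has to recognize one surface form per sentence type.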
Further, in step two, the grammar and semantic analysis includes:
carrying out grammar and semantic analysis on the tagged text using the JavaCC tool, identifying the tags in the tagged text, and extracting sentence component information, including the simple sentence components that form atomic propositions, the overall syntactic structure, and the logical connective semantics and temporal semantics; the extracted information is recorded and used to construct a syntax tree.
Further, in step three, the PPTL formula generation process includes:
traversing the syntax tree and generating an atomic proposition for each clause according to the PPTL formula generation rules; combining all the atomic propositions with the logical connectives and temporal operators, and generating a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree; wherein the traversal comprises:
(1) If a proposition node is encountered, adding a temporal operator to the proposition according to its tag, taking out the description of the proposition, and storing the proposition variable and the proposition description in a hash table;
(2) If a connective node is encountered, examining the description information of its parent node, determining the logical operator from that description, and connecting the corresponding propositions.
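The two traversal cases can be sketched as a small recursive function. Everything concrete here is an assumption for illustration: the `Prop` and `Conn` node classes, the marker and connective names, the textual operators in `TEMPORAL` and `LOGICAL`, and reading the connective description from the node itself rather than from a parent node. The source does not specify the mapping tables or the output syntax expected by PPTLSAT.

```python
from dataclasses import dataclass


@dataclass
class Prop:
    description: str
    marker: str = ""        # hypothetical temporal marker, e.g. "next"


@dataclass
class Conn:
    description: str        # hypothetical connective description, e.g. "when"
    left: object
    right: object


# Assumed mappings; the patent does not list the actual tables.
TEMPORAL = {"next": "next", "always": "always", "eventually": "sometimes"}
LOGICAL = {"when": "->", "and": "and", "or": "or", "then": ";"}


def generate(node, table):
    """Return a formula string; record variable -> description in `table`."""
    if isinstance(node, Prop):
        name = f"p{len(table)}"              # fresh proposition variable
        table[name] = node.description       # hash-table entry
        op = TEMPORAL.get(node.marker)
        return f"{op} {name}" if op else name
    op = LOGICAL[node.description]           # connective node
    return f"({generate(node.left, table)} {op} {generate(node.right, table)})"


tree = Conn("when",
            Prop("an anchor vehicle appears in front"),
            Prop("the system sets the time", marker="next"))
table = {}
print(generate(tree, table))
print(table)
```

The hash table keeps the link between each short proposition variable and its natural language description, so the generated formula stays readable to the user.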
Further, in step four, the satisfiability determination for the generated formula includes:
invoking the external satisfiability checking tool PPTLSAT, passing the generated formula and the negation of the formula as input parameters, parsing the files generated by the tool, and extracting the tool's result; wherein the determination process comprises:
(1) Checking the generated formula P: if no LNFG can be constructed for P, then P is a contradiction and is unsatisfiable; otherwise P is satisfiable and step (2) is executed;
(2) Checking the negation Q of the generated formula: if no LNFG can be constructed for Q, then Q is unsatisfiable, so P is a tautology and the determination fails; otherwise P is neither a contradiction nor a tautology, and the determination succeeds.
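The two-stage check on P and its negation can be illustrated with a toy example. This sketch does not call PPTLSAT or build an LNFG: the function `satisfiable` is a brute-force truth-table check over purely propositional formulas, used only as a stand-in for "an LNFG can be constructed", and the names `satisfiable` and `judge` are assumptions.

```python
from itertools import product


def satisfiable(formula, n_vars):
    """Toy stand-in for 'an LNFG exists': try every truth assignment."""
    return any(formula(*vals) for vals in product([False, True], repeat=n_vars))


def judge(formula, n_vars):
    # Stage 1: P itself must be satisfiable, otherwise it is a contradiction.
    if not satisfiable(formula, n_vars):
        return "contradiction"

    # Stage 2: the negation of P must also be satisfiable,
    # otherwise P holds everywhere and is a tautology.
    def negation(*vals):
        return not formula(*vals)

    if not satisfiable(negation, n_vars):
        return "tautology"
    return "ok"


print(judge(lambda p, q: p and not p, 2))     # contradiction
print(judge(lambda p, q: p or not p, 2))      # tautology
print(judge(lambda p, q: (not p) or q, 2))    # ok
```

Only the last outcome would lead on to model checking; the other two tell the user to revise the property description.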
Another object of the present invention is to provide a system for automatically generating a formal specification from natural language to PPTL, to which the above method is applied, the system comprising:
a preprocessing module, used for preprocessing the input property described in natural language, deleting redundant and meaningless information, removing the parts that interfere with subsequent text analysis and processing, extracting the sentence components, and rearranging the sentence components and the overall main-subordinate sentence structure into the defined syntactic form;
a grammar and semantic analysis module, used for carrying out grammar and semantic analysis on the tagged text using the JavaCC tool, identifying the tags in the sentences, extracting sentence component information and constructing a syntax tree;
a PPTL formula generation module, used for traversing the syntax tree according to the PPTL formula generation rules, combining all the atomic propositions with the logical connectives and temporal operators, and generating a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree;
and a PPTL satisfiability determination module, used for determining the satisfiability of the generated PPTL formula using the formula satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the result.
It is a further object of the present invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps:
preprocessing the input natural language text using Stanford NLP, deleting redundant information, and rearranging the identified sentence components and structures to generate a tagged text; carrying out grammar and semantic analysis on the tagged text using the JavaCC tool, identifying the tags in the sentences, extracting sentence component information and constructing a syntax tree;
traversing the syntax tree according to the PPTL formula generation rules, combining all the atomic propositions with the logical connectives and temporal operators, and generating a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree; and determining the satisfiability of the generated PPTL formula using the formula satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the result.
Another object of the present invention is to provide a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the following steps:
preprocessing the input natural language text using Stanford NLP, deleting redundant information, and rearranging the identified sentence components and structures to generate a tagged text; carrying out grammar and semantic analysis on the tagged text using the JavaCC tool, identifying the tags in the sentences, extracting sentence component information and constructing a syntax tree;
traversing the syntax tree according to the PPTL formula generation rules, combining all the atomic propositions with the logical connectives and temporal operators, and generating a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree; and determining the satisfiability of the generated PPTL formula using the formula satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the result.
Another object of the present invention is to provide an information data processing terminal for implementing the system for automatically generating a formal specification from natural language to PPTL.
Taking all the above technical schemes together, the advantages and positive effects of the invention are as follows: the method helps the user extract a formal specification for model checking from a property described in natural language, effectively converts the property text written by the user into a PPTL formula, and makes model checking more accessible to ordinary users.
The invention provides a method for converting a property described by the user in natural language into a PPTL formal specification. The user only needs to provide the property text in natural language and, without long-term study or rigorous training, obtains a property specification formula that conforms to the formal grammar and carries the intended semantics.
The method places no strict constraint on the grammatical structure of the property text provided by the user, supports property descriptions that use modifying phrases and complex sentences, and defines a dedicated grammar structure for describing the semantics of the projection operator prj, the core operator of PPTL.
The method integrates the publicly available satisfiability checking tool PPTLSAT. After a formula is generated, the method determines whether the formula is satisfiable and, according to the result, prompts the user either to proceed to model checking or to revise the formula.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method for automatically generating a formal specification from natural language to PPTL provided by an embodiment of the invention.
FIG. 2 is a block diagram of the system for automatically generating a formal specification from natural language to PPTL provided by an embodiment of the invention.
In FIG. 2: 1. preprocessing module; 2. grammar and semantic analysis module; 3. PPTL formula generation module; 4. PPTL formula satisfiability determination module.
FIG. 3 is an exemplary diagram of a grammar tree after the sentence component extraction and rearrangement operations of preprocessing, provided by an embodiment of the invention.
FIG. 3 (a) is a schematic diagram of the original grammar tree provided by an embodiment of the invention.
FIG. 3 (b) is a schematic diagram of the rearranged grammar tree provided by an embodiment of the invention.
FIG. 4 is an exemplary diagram of the syntax tree generated after grammar and semantic analysis of the tagged text, provided by an embodiment of the invention.
FIG. 5 is an exemplary diagram of the LNFG generated when the satisfiability of the generated PPTL formula is determined, provided by an embodiment of the invention.
FIG. 6 is an exemplary diagram of the LNFG generated when the satisfiability of the negation of the formula is determined, provided by an embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
To address the problems in the prior art, the invention provides a method and a system for automatically generating a formal specification from natural language to PPTL, which are described in detail below with reference to the drawings.
As shown in FIG. 1, the method for automatically generating a formal specification from natural language to PPTL provided by the embodiment of the present invention includes the following steps:
S101, preprocessing the input natural language text using Stanford NLP, deleting redundant information, and rearranging the identified sentence components and structures to generate a tagged text;
S102, carrying out grammar and semantic analysis on the tagged text using the JavaCC tool, identifying the tags in the sentences, extracting sentence component information and constructing a syntax tree;
S103, traversing the syntax tree according to the PPTL formula generation rules, combining all atomic propositions with the logical connectives and temporal operators, and generating a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree;
S104, determining the satisfiability of the generated PPTL formula using the formula satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the result.
As shown in FIG. 2, the system for automatically generating a formal specification from natural language to PPTL provided by the embodiment of the present invention includes:
the preprocessing module 1, used for preprocessing the input property described in natural language, deleting redundant and meaningless information, removing the parts that interfere with subsequent text analysis and processing, extracting the sentence components, and rearranging the sentence components and the overall main-subordinate sentence structure into the defined syntactic form;
the grammar and semantic analysis module 2, used for carrying out grammar and semantic analysis on the tagged text using the JavaCC tool, identifying the tags in the sentences, extracting sentence component information and constructing a syntax tree;
the PPTL formula generation module 3, used for traversing the syntax tree according to the PPTL formula generation rules, combining all the atomic propositions with the logical connectives and temporal operators, and generating a complete PPTL formal specification according to the logical connective semantics expressed by the structure of the syntax tree;
and the PPTL formula satisfiability determination module 4, used for determining the satisfiability of the generated PPTL formula using the formula satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the result.
The technical scheme of the invention is further described below by combining the embodiments.
PPTL is obtained from PTL by removing some of its components, such as quantifiers, predicates and variables. It is therefore a decidable subset of PTL, has a complete axiom system, and is equivalent in expressive power to regular expressions. PPTL is thus well suited to describing the properties to be verified in model checking and is now widely used in the safety verification of software systems.
The following English text is an example of a property described by the user in natural language:
When an anchor vehicle appears in front, the system sets the time in the next state,then the system will send the message,then when the driver presses the switch button, the vehicle switches modes.
This text defines a safety property of the take-over behavior of an L3 autonomous vehicle. The take-over request time is regarded as the main data indicator affecting the driver's take-over behavior. It is generally described as the expected time to collision between the vehicle and the obstacle or pedestrian identified by the system, as calculated by the system at the moment the autonomous driving system issues the take-over request warning. Academia regards the take-over time as an important parameter for assessing how sensitively the driver responds to a take-over request; it is generally defined as the interval from the moment the system starts to send the warning signal until the driver turns off the automatic driving mode. The driver exits the automatic driving mode mainly by braking or steering manually, or by pressing a button on the steering wheel (the system is preset so that turning the steering wheel beyond a certain angle, or depressing the brake pedal beyond a certain threshold, triggers the signal that turns off automatic driving). The invention processes the text as follows:
First, preprocessing: the flexible grammar, lexical ambiguity and other characteristics of natural language make it difficult for software to process. Moreover, the method targets ordinary users without a background in formal verification, and the property descriptions they provide inevitably contain some redundant information and vague wording, which raises the probability of generating an incorrect formal specification. The natural language sentence is therefore preprocessed to extract the information that helps generate the formal specification.
The natural language preprocessing flow comprises the following steps: first, the natural language sentence is converted into a grammar tree using Stanford NLP, and all further preprocessing operates on the data structure of this tree; then, the grammatical structure of the sentence is analyzed, the sentence components are extracted and rearranged into the syntactic forms defined for sentence component extraction, which removes the interference of structural ambiguity with subsequent sentence analysis; finally, the part-of-speech tags in the grammar tree are extracted and integrated into the text to generate the tagged text.
FIG. 3 shows the effect of sentence component extraction and rearrangement. Comparing (a) and (b) shows that, after preprocessing, the main components of the original sentence have been extracted and rearranged: the main clause and the subordinate clause are each reordered as subject, predicate, object and modifiers; the complex sentence structure is analyzed and identified; the subtree of the subordinate clause is extracted from the prepositional-phrase substructure; and the whole is rearranged in the order subordinating conjunction, subordinate clause, main clause. This completes the disambiguation of the grammar tree. Because the problem of recognizing nested substructures is resolved, the generated tagged text is free of structural ambiguity.
The generated markup text is as follows:
WRB When DT an NN anchor NN vehicle VBZ appears IN in NN front,DT the NN system VBZ sets DT the NN time IN in DT the JJ next NN state ,RB then DT the NN system MD will VB send DT the NN message ,RB then WRB when DT the NN driver VBZ presses DT the NN switch NN button ,DT the NN vehicle VBZ switches NNS modes .
As the output string shows, the sentence has been split into words, and each word is preceded by its part-of-speech tag.
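To make the layout of the markup text concrete, the following Python sketch splits such a string back into (tag, word) pairs. It is an illustration only: the function name `parse_tagged`, the `PUNCT` label, and the tag set (limited to the Penn Treebank tags that occur in the example above) are assumptions, not part of the described converter.

```python
import re

# Only the part-of-speech tags that appear in the example markup text.
PENN_TAGS = {"WRB", "DT", "NN", "NNS", "VBZ", "VB", "IN", "JJ", "RB", "MD"}

def parse_tagged(text):
    """Split 'TAG word TAG word ,' text into (tag, word) pairs.

    Untagged punctuation is returned with the hypothetical tag 'PUNCT'.
    """
    # Pad commas and periods so they become standalone tokens.
    tokens = re.sub(r"([,.])", r" \1 ", text).split()
    pairs, i = [], 0
    while i < len(tokens):
        if tokens[i] in PENN_TAGS and i + 1 < len(tokens):
            pairs.append((tokens[i], tokens[i + 1]))
            i += 2
        else:
            pairs.append(("PUNCT", tokens[i]))
            i += 1
    return pairs

sample = "WRB When DT an NN anchor NN vehicle VBZ appears IN in NN front,DT the NN system"
for tag, word in parse_tagged(sample):
    print(tag, word)
```

The tags are matched case-sensitively, so the preposition "in" is not confused with its tag IN.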
Second, grammar and semantic analysis. The preprocessed markup text is grammatically regular and structured, so a converter can readily accept and process it. The invention therefore uses JavaCC to write a converter that recognizes text conforming to a structured English grammar. The converter turns the input markup text into a syntax tree with a clear structure: it generates and links a clause tree node for each syntactic component it recognizes, and builds each recognized compound-sentence component into the corresponding compound-sentence tree structure.
The structured English grammar is defined on the basis of the rearranged syntactic structure produced by preprocessing. The invention does not redefine nouns, verbs, participles, adjectives, or adverbs. The grammar rules allow temporal semantic information to be extracted from tenses, adverbs, and prepositional phrases, and logical-connective semantics to be extracted from complex (main-subordinate) sentences and compound (coordinate) sentences; they also define a special syntactic structure that serves recognition of the prj operator. Under these rules, the structure of a simple sentence is defined by the clause rule, and a sentence consists of at least one clause, which means the markup text contains at least one simple sentence. A clause has at least two components, a subject and a predicate, which express the core meaning of the sentence; a simple sentence takes the form subject-predicate or subject-predicate-object. The meaning of a sentence can be extended with more complex structures such as complex or compound sentences. The structured grammar rules correspond to the sentence components and syntactic structures of the markup text produced by the rearrangement step.
FIG. 4 shows the clearly structured syntax tree generated from the original grammar tree after preprocessing and syntactic-semantic analysis. Compared with the original tree it is structurally optimized, and its substructures are redefined to reflect the structural semantics between clauses. Leaf nodes store atomic propositions and temporal description information; internal nodes and the root store subordinating and coordinating conjunctions. The invention uses node types and node labels to distinguish the roles of nodes. For the conjunction "when": the antecedent state triggers the consequent state, which corresponds to the logical symbol -> (implication), and the two states occur in order, so the subordinate clause and the main clause are attached as the left and right subtrees of the when node, which also carries the temporal operator () (next). For the adverb "then": the preceding and following states are triggered in sequence, so the left and right children of the current node are joined by the sequential operator ; (chop). For the phrase "in the next state": during token recognition the converter performs synonym matching on the phrase, finds the sense "next" in the synonym set, concludes that the current proposition holds in the next state, and modifies the current atomic proposition with the temporal operator () (next).
For the tense-marking auxiliary verb "will": the current proposition holds in some future state, so the current atomic proposition is modified with the temporal operator <> (sometimes).
Third, PPTL generation. After preprocessing and syntactic-semantic analysis, the key syntactic and semantic information has been extracted from the natural language text and stored in a clearly structured syntax tree, from which a PPTL formula is then generated. The basic building block of PPTL is the atomic proposition, so formula generation traverses the tree in two steps: first, the clause information of each clause node in the syntax tree is converted into an atomic proposition; then the temporal information in the tree nodes, the logical relations expressed by the tree structure, and the atomic propositions are combined into a complete PPTL formula.
The generated syntax tree is traversed with a post-order traversal algorithm. During traversal, a "when" conjunction node is mapped to the combination of a logical connective and a temporal operator, -> (). A "then" node is mapped to the operator ; (chop). The phrase "in the next state" is mapped to () (next). The tense auxiliary verb "will" is mapped to <> (sometimes). The sentence is split into several atomic propositions, which are assembled into a PPTL formula according to the logical-connective semantics mapped from the sentence structure. Because "when" and "in the next state" express the same temporal semantics in this text, the redundancy is simplified when the PPTL formula is generated. The generated atomic propositions and PPTL formula are as follows:
P: an anchor vehicle appears
Q: the system sets the time
R: the system send the message
S: the driver presses the switch button
T: the vehicle switches modes
PPTLformula: ( ( ( P ) -> ()( Q ) );( <>R ) );( ( S ) -> ()( T ) )
The generated atomic propositions contain no conjunctions or temporal description information, and each is labeled with a propositional variable; the generated formula is consistent with the node information in the syntax tree.
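The mapping from tree nodes to operators can be sketched as a post-order traversal in Python. This is a simplified illustration under stated assumptions, not the converter described in the patent: the `Node` class, its field names, the `under_when` flag used to model the simplification of "when" plus "in the next state", and the letter sequence starting at P are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    kind: str                      # "prop", "when", or "then" (hypothetical labels)
    text: str = ""                 # clause text for proposition nodes
    marker: Optional[str] = None   # "next", "will", or None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def generate(root):
    """Return (formula, table) where table maps each variable to its clause."""
    table = {}
    letters = iter("PQRSTUVWXYZ")

    def visit(node, under_when=False):
        if node.kind == "prop":
            name = next(letters)
            table[name] = node.text
            if node.marker == "will":
                return "<>" + name
            # "when" already contributes (), so a nested "next" is dropped.
            if node.marker == "next" and not under_when:
                return "()( " + name + " )"
            return name
        # Post-order: both children are processed before the node itself.
        left = visit(node.left)
        right = visit(node.right, under_when=(node.kind == "when"))
        if node.kind == "when":
            return "( " + left + " ) -> ()( " + right + " )"
        if node.kind == "then":
            return "( " + left + " );( " + right + " )"
        raise ValueError("unknown node kind: " + node.kind)

    return visit(root), table

tree = Node("then",
    left=Node("then",
        left=Node("when",
            left=Node("prop", "an anchor vehicle appears"),
            right=Node("prop", "the system sets the time", marker="next")),
        right=Node("prop", "the system send the message", marker="will")),
    right=Node("when",
        left=Node("prop", "the driver presses the switch button"),
        right=Node("prop", "the vehicle switches modes")))

formula, table = generate(tree)
print(formula)
```

For this hand-built tree the sketch yields the same formula string as the example above, and the dictionary plays the role of the hash table that stores each propositional variable with its description.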
Fourth, PPTL satisfiability determination. After the formula is generated, the externally invoked satisfiability checking tool PPTLSAT decides the satisfiability of the generated PPTL formula, outputs the result, and generates a labeled normal form graph (LNFG), that is, the state transition graph corresponding to the formula. The LNFG constructed by the tool, shown in Fig. 5, is the basis for deciding satisfiability; the formula is then negated, and the LNFG generated for the negation is shown in Fig. 6. The two figures show that an LNFG is constructed successfully for both the formula and its negation, so the generated formula is neither a contradiction nor a tautology. The determination therefore succeeds, and the subsequent model checking work can proceed.
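The two-step decision can be summarized in a short Python sketch. PPTLSAT's actual interface is not described here, so the sketch does not call it: `can_build_lnfg` is a hypothetical callback standing in for "the tool constructed an LNFG for this formula", and the `!( ... )` negation syntax is likewise an assumption.

```python
def judge(formula, can_build_lnfg):
    """Classify a formula using an LNFG-construction oracle.

    can_build_lnfg(f) is assumed to return True when an LNFG exists for f,
    i.e. when f is satisfiable.
    """
    # Step 1: no LNFG for the formula means it is a contradiction.
    if not can_build_lnfg(formula):
        return "contradiction"
    # Step 2: no LNFG for the negation means the formula is a tautology.
    negated = "!(" + formula + ")"
    if not can_build_lnfg(negated):
        return "tautology"
    # Both graphs exist: the formula is a meaningful property.
    return "ok"

# Toy oracle for illustration: a fixed set of unsatisfiable formulas.
UNSAT = {"p && !p", "!(q || !q)"}

def toy_oracle(f):
    return f not in UNSAT

print(judge("p && !p", toy_oracle))
print(judge("q || !q", toy_oracle))
print(judge("p", toy_oracle))
```

Only the "ok" outcome lets the flow continue to model checking, since a contradiction can never hold and a tautology constrains nothing.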
Demonstration (specific embodiments, experiments, simulations, and experimental data supporting the inventiveness of the invention)
The experiment covers a dataset of 678 properties described in natural language, drawn from four fields. The natural language texts are processed one by one to generate PPTL formal specifications, and the correctness of each generated PPTL formula is evaluated manually. Five structurally representative natural language property descriptions summarized from the dataset, with the corresponding generated PPTL formulas, are shown in the following table:
Five professionals in the field of formal verification cross-evaluated the formulas generated in the experiment, with each formula evaluated by more than two evaluators, and the evaluation results were tallied. Taking the correctness of the generated formula as the usability metric, the results of the evaluation are shown in the following table:
Total | Correct | Errors | Accuracy |
---|---|---|---|
678 | 635 | 43 | 93.66% |
Of the 678 natural language texts, 635 were correctly converted into PPTL formal specifications; for the rest, the generated formula was wrong or the conversion failed. The tool thus achieves an acceptable formula generation accuracy.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in whole or in part in software, they take the form of a computer program product comprising one or more computer instructions. When the computer instructions are loaded or executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).
The foregoing describes only specific embodiments of the invention, and the scope of protection is not limited thereto. Any modification, equivalent replacement, or improvement made by a person skilled in the art within the spirit and principles of the invention shall fall within the scope of protection of the invention.
Claims (7)
1. An automatic generation method for natural language to PPTL formal specification, characterized by comprising the following steps:
preprocessing the input natural language text using Stanford NLP, deleting redundant information, and rearranging the identified sentence components and structures to generate a markup text;
performing grammar and semantic analysis on the markup text using the JavaCC tool, identifying the tokens in the sentences, extracting sentence component information, and constructing a syntax tree;
traversing the syntax tree according to the PPTL formula generation rules, combining all atomic propositions with logical connective operators and temporal operators, and generating a complete PPTL formal specification according to the logical-connective semantics expressed by the syntax tree structure;
determining the satisfiability of the generated PPTL formula using the satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the determination result;
wherein the grammar and semantic analysis comprises: performing grammar and semantic analysis on the markup text using the JavaCC tool, identifying the tokens in the markup text, and extracting sentence component information, including extraction of the simple-sentence components used to form atomic propositions, extraction of the overall syntactic structure, and extraction of logical-connective semantics and temporal semantics; recording the extracted information; and constructing the syntax tree;
the PPTL formula generation process comprises: traversing the syntax tree and generating an atomic proposition for each clause in accordance with the PPTL formula generation rules; combining all atomic propositions with logical connective operators and temporal operators, and generating a complete PPTL formal specification according to the logical-connective semantics expressed by the syntax tree structure; wherein the traversing comprises:
(1) if a proposition node is encountered, adding a temporal operator to the proposition according to its marker, extracting the proposition description, and storing the propositional variable together with the proposition description in a hash table;
(2) if a connective node is encountered, examining the description information of the parent node, determining the logical operator from that information, and connecting the corresponding propositions;
the satisfiability determination process for the generated PPTL formula comprises: invoking the external satisfiability checking tool PPTLSAT, performing the satisfiability determination with the generated formula and the negation of the formula as input parameters, parsing the file generated by the tool, and extracting the tool's determination result; wherein the determination process comprises:
(1) letting P be the generated formula: if no LNFG corresponding to P can be constructed, P is a contradiction and is unsatisfiable; otherwise P is satisfiable, and step (2) is executed;
(2) letting Q be the negation of the generated formula: if no LNFG exists for Q, Q is unsatisfiable and P is a tautology, and the determination fails; otherwise P is neither a contradiction nor a tautology, and the determination succeeds.
2. The automatic generation method for natural language to PPTL formal specification of claim 1, wherein preprocessing the input natural language text comprises:
(1) processing the input natural language text using Stanford NLP to generate a grammar tree;
(2) traversing the grammar tree to extract sentence components, rearranging the sentence components, and removing structural ambiguity;
(3) extracting the part-of-speech tags in the grammar tree and integrating them into the text to generate the markup text.
3. The automatic generation method for natural language to PPTL formal specification of claim 2, wherein the extraction and rearrangement of sentence components comprises: a general form of sentence component extraction, in which the trunk of an English sentence is subject, predicate, and object, followed by adverbs or prepositional phrases that modify the sentence; a sentence structure extraction form for complex (main-subordinate) sentences, whose general form is subordinating conjunction, subordinate clause, main clause; and a general form for compound (coordinate) sentences, which is clause, coordinating conjunction, clause;
the sentence component rearrangement process comprises: first, taking the leaf nodes under each tree structure and the deepest nodes of each branch as the corresponding sentence components, the sentence component label being defined by the node meaning of the parent node of the leaf node; then, further optimizing the extracted sentence components by deleting the redundant information in them; and finally, rearranging the extracted sentence components and the complex and compound sentence structures according to the syntactic form defined for sentence component extraction.
4. A natural language to PPTL formal specification automatic generation system that implements the automatic generation method for natural language to PPTL formal specification of any one of claims 1 to 3, characterized in that the system comprises:
a preprocessing module, used for preprocessing the input property described in natural language, deleting redundant and meaningless information, removing the parts that interfere with subsequent text analysis and processing, extracting the sentence components, and rearranging the sentence components and the overall main-subordinate sentence structure into the defined syntactic form;
a grammar and semantic analysis module, used for performing grammar and semantic analysis on the markup text using the JavaCC tool, identifying the tokens in the sentences, extracting sentence component information, and constructing a syntax tree;
a PPTL formula generation module, used for traversing the syntax tree according to the PPTL formula generation rules, combining all atomic propositions with logical connective operators and temporal operators, and generating a complete PPTL formal specification according to the logical-connective semantics expressed by the syntax tree structure;
and a PPTL satisfiability determination module, used for determining the satisfiability of the generated PPTL formula using the satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the determination result.
5. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the automatic generation method for natural language to PPTL formal specification of any one of claims 1 to 3, comprising the steps of:
preprocessing the input natural language text using Stanford NLP, deleting redundant information, and rearranging the identified sentence components and structures to generate a markup text; performing grammar and semantic analysis on the markup text using the JavaCC tool, identifying the tokens in the sentences, extracting sentence component information, and constructing a syntax tree;
traversing the syntax tree according to the PPTL formula generation rules, combining all atomic propositions with logical connective operators and temporal operators, and generating a complete PPTL formal specification according to the logical-connective semantics expressed by the syntax tree structure; and determining the satisfiability of the generated PPTL formula using the satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the determination result.
6. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the automatic generation method for natural language to PPTL formal specification of any one of claims 1 to 3, comprising the steps of:
preprocessing the input natural language text using Stanford NLP, deleting redundant information, and rearranging the identified sentence components and structures to generate a markup text; performing grammar and semantic analysis on the markup text using the JavaCC tool, identifying the tokens in the sentences, extracting sentence component information, and constructing a syntax tree;
traversing the syntax tree according to the PPTL formula generation rules, combining all atomic propositions with logical connective operators and temporal operators, and generating a complete PPTL formal specification according to the logical-connective semantics expressed by the syntax tree structure; and determining the satisfiability of the generated PPTL formula using the satisfiability checking tool PPTLSAT, and parsing the generated file to obtain the determination result.
7. An information data processing terminal for implementing the natural language to PPTL formal specification automatic generation system of claim 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110457578.XA CN113255295B (en) | 2021-04-27 | 2021-04-27 | Automatic generation method and system for formal protocol from natural language to PPTL |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113255295A CN113255295A (en) | 2021-08-13 |
CN113255295B true CN113255295B (en) | 2024-04-09 |
Family
ID=77221894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110457578.XA Active CN113255295B (en) | 2021-04-27 | 2021-04-27 | Automatic generation method and system for formal protocol from natural language to PPTL |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113255295B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114282530B (en) * | 2021-12-24 | 2024-06-07 | 厦门大学 | Complex sentence emotion analysis method based on grammar structure and connection information trigger |
CN114896973A (en) * | 2022-03-10 | 2022-08-12 | 北京有竹居网络技术有限公司 | Text processing method and device and electronic equipment |
CN116629272B (en) * | 2023-07-24 | 2023-10-10 | 山东大学 | Text generation method and system controlled by natural language |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0744560A (en) * | 1993-08-02 | 1995-02-14 | Hitachi Ltd | Logical structure recognition processing system in document processor |
US5966686A (en) * | 1996-06-28 | 1999-10-12 | Microsoft Corporation | Method and system for computing semantic logical forms from syntax trees |
KR20120079930A (en) * | 2011-01-06 | 2012-07-16 | 에스케이 텔레콤주식회사 | Method for converting composite sentence including natural language and mathematical formula into logical expression, apparatus and computer-readable recording medium with program therefor |
CN102663190A (en) * | 2012-04-09 | 2012-09-12 | 西安电子科技大学 | PPTL (propositional projection temporal logic) symbolic model checking method |
JP2017111411A (en) * | 2015-12-20 | 2017-06-22 | 株式会社マルセイ | Method for education, study or research, computer program for the same and processor |
CN110705316A (en) * | 2019-09-29 | 2020-01-17 | 南京大学 | Method and device for generating linear time sequence logic protocol of smart home |
CN111581953A (en) * | 2019-01-30 | 2020-08-25 | 武汉慧人信息科技有限公司 | Method for automatically analyzing grammar phenomenon of English text |
CN111767739A (en) * | 2020-05-26 | 2020-10-13 | 西安电子科技大学 | Based on PPTL3WeChat cluster online monitoring method and system |
Non-Patent Citations (2)
Title |
---|
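Research on a cooperative generation method for Chinese sentence dependency networks; Guo Yanhua, Zhou Changle; Journal of Hangzhou Institute of Electronic Engineering (04); full text * |
Rule-based sentence structure analysis and transformation in English-Chinese machine translation; Wu Baomin; Guo Yonghui; Wang Bingxi; Journal of Information Engineering University (01); full text * |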
Also Published As
Publication number | Publication date |
---|---|
CN113255295A (en) | 2021-08-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||