CN114548057A

CN114548057A - Method, system, device, processor and storage medium for realizing interview text automatic labeling processing based on depression diagnosis and treatment standard

Info

Publication number: CN114548057A
Application number: CN202210155676.2A
Authority: CN
Inventors: 沈一峰; 魏宇梅; 盛钦润; 李华芳
Original assignee: Shanghai Mental Health Center Shanghai Psychological Counselling Training Center
Current assignee: Shanghai Mental Health Center Shanghai Psychological Counselling Training Center
Priority date: 2022-02-21
Filing date: 2022-02-21
Publication date: 2022-05-27

Abstract

The invention relates to a method for automatically labeling interview text based on depression diagnosis and treatment standard information, comprising the following steps: automatically labeling the interview text, marking the word segmentation and parts of speech by color, and placing each entity's label at the corresponding position in the text; selecting a label from a list of candidate parts of speech; and judging whether the automatic labeling has segmented or labeled the text incorrectly and, if so, labeling manually. The invention also relates to a system, a device, a processor and a computer-readable storage medium for the same purpose. The method, system, device, processor and computer-readable storage medium integrate manual and automatic labeling, are simple to operate, and support functions such as text pre-labeling and data analysis. The platform is used to label entities and relations for a medical knowledge graph in the depression field.

Description

Method, system, device, processor and storage medium for realizing interview text automatic labeling processing based on depression diagnosis and treatment standard

Technical Field

The invention relates to the field of artificial intelligence, in particular to the field of natural language processing, and specifically relates to a method, a system, a device, a processor and a computer readable storage medium for realizing interview text automatic labeling processing based on depression diagnosis and treatment standard information.

Background

With the rapid development of society and the spread of psychological knowledge, depression has drawn wide attention as a common mental illness. Because traditional depression diagnosis is strongly influenced by subjective factors and depends heavily on the skill of the doctor, many researchers have begun to use machine learning, deep learning and similar methods to study the relationship between depression and factors such as interview language and manner of expression. The linguistic symptoms of depression are diverse and may be correlated with one another, so a diagnosis that relies on a single empirical ICD-10 diagnostic symptom can hardly meet the high precision that diagnosis requires. This patent therefore proposes to simulate the doctor's diagnostic process and analyze the patient's condition comprehensively from multiple linguistic angles, thereby improving diagnostic accuracy and assisting or verifying the doctor's diagnosis.

A knowledge graph is a networked knowledge base that can effectively extract domain knowledge and the relations between items of knowledge. Its main application fields currently include intelligent search, intelligent recommendation and intelligent question answering, and its application to mental health care is also a current research focus. Based on existing data and historical manual annotations of depression diagnoses, a knowledge graph can be used to build intelligent depression diagnosis: it integrates the various kinds of information considered in diagnosing depression, simulates the doctor's diagnostic process, and supports an accurate and comprehensive evaluation of the patient from multiple angles. Promoting an intelligent diagnosis scheme usually depends on a system that is convenient for patients and doctors to use, so an intelligent depression diagnosis labeling system is designed here using knowledge graph techniques. This patent first constructs the entities and attributes of depression diagnosis according to the ICD-10 scale and expert opinion, and then implements the overall design and operation flow of the labeling system.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a method, a system, a device, a processor and a computer readable storage medium for realizing interview text automatic labeling processing based on depression diagnosis and treatment standard information, which have the advantages of high accuracy, simple and convenient operation and wide application range.

To achieve the above object, the method, system, device, processor and computer-readable storage medium of the present invention for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information are as follows:

The method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information is mainly characterized in that it comprises the following steps:

(1) automatically labeling the interview text, marking the word segmentation and parts of speech by color, and placing each entity's label at the corresponding position in the text;

(2) selecting a label from a part-of-speech list to be selected;

(3) judging whether the text was segmented or labeled incorrectly during automatic labeling; if so, continuing to step (4); otherwise, exiting this step;

(4) carrying out manual labeling;

(5) and saving the text content subjected to automatic labeling processing.
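Steps (1) to (5) can be sketched as a small control flow. The sketch below is illustrative only: the patent does not specify data structures, so `Annotation`, `AnnotatedText` and `run_labeling_round` are hypothetical names, and the candidate selection of step (2) is assumed to be folded into the `auto_label` callback.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    """One labeled span: character offsets, part of speech and entity label."""
    start: int
    end: int
    pos: str
    label: str
    source: str = "auto"  # "auto" or "manual"


@dataclass
class AnnotatedText:
    text: str
    annotations: List[Annotation] = field(default_factory=list)


def run_labeling_round(
    text: str,
    auto_label: Callable[[str], List[Annotation]],
    has_errors: Callable[[AnnotatedText], bool],
    manual_label: Callable[[AnnotatedText], AnnotatedText],
    save: Callable[[AnnotatedText], None],
) -> AnnotatedText:
    # Steps (1)-(2): automatic pre-labeling (candidate selection assumed inside auto_label).
    doc = AnnotatedText(text, auto_label(text))
    # Step (3): decide whether the automatic result needs correction.
    if has_errors(doc):
        # Step (4): manual labeling replaces or repairs the automatic result.
        doc = manual_label(doc)
    # Step (5): persist the labeled text.
    save(doc)
    return doc
```

Passing the four stages in as callables keeps the sketch independent of any particular segmenter, review interface or storage layer.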

Preferably, the step (1) specifically comprises the following steps:

(1.1) carrying out word segmentation and part-of-speech tagging on the interview text;

(1.2) performing text characteristic analysis;

(1.3) placing each entity's label at the corresponding position above the characters;

(1.4) automatically identifying disease vocabulary labels and marking the vocabulary by color.
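The sub-steps of step (1) can be illustrated with a dictionary-based sketch. The patent does not name a segmenter, so forward maximum matching against a small lexicon is used here as a stand-in; `LEXICON`, `COLORS`, the part-of-speech codes and the entity label names are all hypothetical, and the text feature analysis of step (1.2) is not modeled.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

# Hypothetical lexicon: term -> (part of speech, entity label or None).
LEXICON: Dict[str, Tuple[str, Optional[str]]] = {
    "我": ("r", None),
    "最近": ("t", None),
    "情绪低落": ("n", "core_symptom"),
    "失眠": ("v", "additional_symptom"),
}
# Hypothetical color scheme keyed by entity label.
COLORS: Dict[Optional[str], str] = {
    "core_symptom": "#d9534f",
    "additional_symptom": "#f0ad4e",
}
DEFAULT_COLOR = "#cccccc"


class Token(NamedTuple):
    text: str
    start: int
    pos: str
    label: Optional[str]
    color: str


def segment_and_label(
    text: str, lexicon: Dict[str, Tuple[str, Optional[str]]] = LEXICON
) -> List[Token]:
    """Segment by forward maximum matching, then attach POS, label and color."""
    max_len = max(len(term) for term in lexicon)
    tokens: List[Token] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a lexicon entry matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon:
                pos, label = lexicon[piece]
                break
        else:
            # No entry matched: emit a single unknown character.
            piece, pos, label = text[i], "x", None
        tokens.append(Token(piece, i, pos, label, COLORS.get(label, DEFAULT_COLOR)))
        i += len(piece)
    return tokens
```

Each `Token` carries its start offset, so a display layer could place the entity label above the matching characters and tint the span with the token's color.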

Preferably, the step (4) specifically includes the following steps:

(4.1) carrying out event annotation, selecting an event type and a sub-type, and loading a template of the event;

(4.2) selecting a corresponding event element, selecting a corresponding entity or deleting a selected entity.
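Steps (4.1) and (4.2) amount to loading a template of element roles and then filling or clearing those roles. The sketch below assumes a simple mapping from (event type, subtype) to role names; the event types come from the embodiment described later, while the subtypes, role names and the `EventAnnotation` class are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical templates: (event type, subtype) -> ordered element roles.
EVENT_TEMPLATES: Dict[Tuple[str, str], List[str]] = {
    ("course of disease", "duration"): ["symptom", "time_span"],
    ("severity", "degree"): ["symptom", "degree_word"],
}


@dataclass
class EventAnnotation:
    event_type: str
    subtype: str
    # Role name -> selected entity text, or None while the role is empty.
    elements: Dict[str, Optional[str]] = field(default_factory=dict)

    def select(self, role: str, entity: str) -> None:
        """Step (4.2): bind an entity to an event element."""
        if role not in self.elements:
            raise KeyError(f"role {role!r} is not part of this template")
        self.elements[role] = entity

    def delete(self, role: str) -> None:
        """Step (4.2): remove a previously selected entity."""
        if role not in self.elements:
            raise KeyError(f"role {role!r} is not part of this template")
        self.elements[role] = None


def load_template(event_type: str, subtype: str) -> EventAnnotation:
    """Step (4.1): load the template for the chosen type and subtype."""
    roles = EVENT_TEMPLATES[(event_type, subtype)]
    return EventAnnotation(event_type, subtype, {role: None for role in roles})
```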

Preferably, the step (3) is specifically:

judging whether the candidate part-of-speech list lacks the correct label, or the text was segmented incorrectly during automatic labeling, or an entity was labeled incorrectly; if so, continuing to step (4); otherwise, exiting this step.

Preferably, the method further comprises the steps of:

and carrying out data analysis on the labeled content and generating a labeled comparison report.
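The patent does not say what the labeling comparison report contains. One plausible reading is a comparison of the automatic labels with the labels left after manual review; the sketch below follows that assumption, and the function name, span representation and report fields are all hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

Span = Tuple[int, int]  # (start, end) character offsets


def comparison_report(auto: Dict[Span, str], manual: Dict[Span, str]) -> dict:
    """Compare automatic labels with manually reviewed labels, span by span."""
    agreed = [s for s in auto if manual.get(s) == auto[s]]
    relabeled = [s for s in auto if s in manual and manual[s] != auto[s]]
    removed = [s for s in auto if s not in manual]
    added = [s for s in manual if s not in auto]
    return {
        "agreed": len(agreed),
        "relabeled": len(relabeled),
        "removed": len(removed),
        "added": len(added),
        # Share of automatic labels that survived review unchanged.
        "agreement_rate": len(agreed) / len(auto) if auto else 0.0,
        # Distribution of labels in the reviewed result.
        "label_counts": dict(Counter(manual.values())),
    }
```

A low agreement rate for a given label would point to the dictionary entries or model that most need correction.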

The system for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information is mainly characterized in that it comprises:

the original text display module is used for displaying the text content of the doctor-patient interview;

the automatic labeling processing module is connected with the original text display module and is used for carrying out automatic entity identification on the interview text and carrying out automatic labeling;

and the manual marking module is connected with the original text display module and the automatic marking processing module and is used for modifying and perfecting the marked content after automatic marking.

Preferably, the automatic labeling processing module includes:

the segmentation and labeling unit is connected with the original text display module and is used for performing word segmentation and part-of-speech labeling on the interview text;

the color labeling unit is connected with the original text display module and is used for labeling the segmented and part-of-speech recognized vocabulary through colors;

and the label marking unit is connected with the original text display module and is used for marking the label of the entity at the corresponding position above the character.

Preferably, the manual labeling module comprises:

the event marking unit is connected with the original text display module and the automatic marking processing module and is used for selecting a content template of event marking and marking the text according to the template;

and the label modifying unit is connected with the original text display module and the automatic labeling processing module and is used for editing or deleting the labeled content.

Preferably, the manual labeling module comprises a data analysis unit connected with the original text display module and used for performing data analysis on the labeled content and generating a labeling comparison report.

The device for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information is mainly characterized in that it comprises:

a processor configured to execute computer-executable instructions;

and the memory stores one or more computer executable instructions, and when the computer executable instructions are executed by the processor, the steps of the method for realizing the automatic interview text labeling processing based on the depression diagnosis and treatment standard information are realized.

The processor for realizing the interview text automatic labeling processing based on the depression diagnosis and treatment standard information is mainly characterized in that the processor is configured to execute computer executable instructions, and the computer executable instructions are executed by the processor to realize the steps of the method for realizing the interview text automatic labeling processing based on the depression diagnosis and treatment standard information.

The computer readable storage medium is mainly characterized in that a computer program is stored on the computer readable storage medium, and the computer program can be executed by a processor to realize the steps of the method for realizing the automatic interview text labeling processing based on the depression diagnosis and treatment standard information.

With the method, system, device, processor and computer-readable storage medium of the invention for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information, once an interview text has passed through the automatic labeling system, its ICD-10 labels are summarized automatically and diagnosis labels can be assigned automatically. The invention integrates manual and automatic labeling, is simple to operate, and supports functions such as text pre-labeling and data analysis. The platform is used to label entities and relations for a medical knowledge graph in the depression field.

Drawings

Fig. 1 is a flow chart of automatic and manual labeling of the method for implementing interview text automatic labeling processing based on depression diagnosis and treatment standard information.

Fig. 2 is a schematic diagram of an interface display of the system for realizing interview text automatic labeling processing based on the depression diagnosis and treatment standard information.

Fig. 3 is a schematic diagram of the internal structure of the device for realizing the interview text automatic labeling processing based on the depression diagnosis and treatment standard information of the invention.

Fig. 4 is a labeling schematic diagram of an embodiment of the system for implementing interview text automatic labeling processing based on depression diagnosis and treatment standard information according to the invention.

Detailed Description

In order to more clearly describe the technical contents of the present invention, the following further description is given in conjunction with specific embodiments.

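The method of the invention for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information comprises the following steps:

(1) automatically labeling the interview text, marking the word segmentation and parts of speech by color, and placing each entity's label at the corresponding position in the text;

(2) selecting a label from a list of candidate parts of speech;

(3) judging whether the text was segmented or labeled incorrectly during automatic labeling; if so, continuing to step (4); otherwise, exiting this step;

(4) carrying out manual labeling;

(5) saving the text content that has undergone automatic labeling.

As a preferred embodiment of the present invention, step (1) specifically comprises the following steps:

(1.1) carrying out word segmentation and part-of-speech tagging on the interview text;

(1.2) performing text feature analysis;

(1.3) placing each entity's label at the corresponding position above the characters;

(1.4) automatically identifying disease vocabulary labels and marking the vocabulary by color.

As a preferred embodiment of the present invention, step (4) specifically comprises the following steps:

(4.1) carrying out event annotation, selecting an event type and a sub-type, and loading the template of the event;

(4.2) selecting a corresponding event element, and selecting a corresponding entity or deleting a selected entity.

As a preferred embodiment of the present invention, step (3) specifically comprises:

judging whether the candidate part-of-speech list lacks the correct label, or the text was segmented incorrectly during automatic labeling, or an entity was labeled incorrectly; if so, continuing to step (4); otherwise, exiting this step.

As a preferred embodiment of the present invention, the method further comprises the step of:

carrying out data analysis on the labeled content and generating a labeling comparison report.

The system of the invention for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information comprises:

the original text display module, which is used for displaying the text content of doctor-patient interviews;

the automatic labeling processing module, which is connected with the original text display module and is used for carrying out automatic entity identification on the interview text and labeling it automatically;

and the manual labeling module, which is connected with the original text display module and the automatic labeling processing module and is used for modifying and perfecting the labeled content after automatic labeling.

As a preferred embodiment of the present invention, the automatic labeling processing module includes:

the segmentation and labeling unit, which is connected with the original text display module and is used for performing word segmentation and part-of-speech labeling on the interview text;

the color labeling unit, which is connected with the original text display module and is used for marking by color the vocabulary that has been segmented and whose part of speech has been recognized;

and the label marking unit, which is connected with the original text display module and is used for placing each entity's label at the corresponding position above the characters.

As a preferred embodiment of the present invention, the manual labeling module includes:

the event marking unit, which is connected with the original text display module and the automatic labeling processing module and is used for selecting a content template for event labeling and labeling the text according to the template;

and the label modifying unit, which is connected with the original text display module and the automatic labeling processing module and is used for editing or deleting the labeled content.

As a preferred embodiment of the present invention, the manual labeling module includes a data analysis unit, which is connected with the original text display module and is used for performing data analysis on the labeled content and generating a labeling comparison report.

The device of the invention for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information comprises:

a processor configured to execute computer-executable instructions;

and a memory storing one or more computer-executable instructions which, when executed by the processor, realize the steps of the above method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information.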

The processor for realizing interview text automatic labeling processing based on the depression diagnosis and treatment standard information is configured to execute computer executable instructions, and when the computer executable instructions are executed by the processor, the steps of the method for realizing interview text automatic labeling processing based on the depression diagnosis and treatment standard information are realized.

The computer readable storage medium of the invention stores thereon a computer program executable by a processor to implement the steps of the above method for implementing interview text automatic labeling processing based on depression diagnosis and treatment standard information.

In a specific embodiment of the invention, an automatic labeling model supporting the mental health field is provided, which readily supports innovative research. Within the intelligent depression diagnosis and case generation system, the ICD-10 label system is processed and labeled automatically, and the interview results are analyzed in structured form. Through a web-based visual interface, the user can label the transcript automatically and edit the resulting labels.

The automatic labeling module mainly consists of a patient list page and a labeling detail page. The patient list page displays patient information so that the doctor can select a patient conveniently. To make selection quick, the interface supports queries by information such as organization name and telephone number. To make it easy to view both unlabeled and labeled patients, the page is divided into left and right parts: the left part lists the people who still need labeling, and the right part lists the people who have been labeled and whose reports can be viewed. The doctor can label an already labeled person again.
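The query and the left/right split of the patient list page can be sketched as a filter followed by a partition. The `Patient` fields and the `split_patient_list` function are hypothetical; the patent only states that organization name, telephone number and similar information can be used for the query.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Patient:
    name: str
    organization: str
    phone: str
    labeled: bool  # True once a report exists for this person


def split_patient_list(
    patients: List[Patient], organization: str = "", phone: str = ""
) -> Tuple[List[Patient], List[Patient]]:
    """Filter by substring match, then split into (to label, already labeled)."""
    matches = [
        p for p in patients
        if organization in p.organization and phone in p.phone
    ]
    pending = [p for p in matches if not p.labeled]  # left part of the page
    done = [p for p in matches if p.labeled]         # right part of the page
    return pending, done
```

Because an empty query string is a substring of every value, calling the function without filters returns the full two-part list.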

The labeling page is divided into three parts. The first part is the personal information and result display area, where the doctor can modify the detection result. The second part is the label area, which displays the attribute values calculated by the algorithms; by reviewing the automatically labeled state, the doctor can edit and verify them manually.

The main content labeled on the labeling page is the attribute values of the depression ICD-10 diagnosis entities. These attribute values are calculated by several algorithms, including the basic manually labeled data and the dictionary identification algorithm studied in this patent. The patient's descriptive text in a medical interview is semi-structured or unstructured data that is difficult to use directly. Labeling the entities and entity relations contained in depression interview text is an important means of structuring the text, and it is also the basis for research on named entity recognition and automatic relation extraction. Before an automatic labeling algorithm can be used, a large amount of data must be introduced for relation training; traditional manual labeling is laborious and time-consuming, and no historical medical records exist, so it can hardly meet the demands of big data development.

Driven by the task of constructing Chinese depression medical terminology, the invention builds a semi-automatic entity and relation labeling platform. The platform integrates manual and automatic labeling, is simple to operate, and supports functions such as text pre-labeling and data analysis. The platform is used to label entities and relations for a medical knowledge graph in the depression field. Because the algorithms of this patent are implemented as Python scripts, the scripts can be deployed and run on a server using the child process module of Node.js. To accelerate construction of the depression knowledge graph, the system is put into use while the algorithms are still being perfected. When labeling, the doctor can label the attribute values whose algorithms have not yet been integrated and correct the attribute values whose algorithms have been integrated. The data labeled and modified by doctors can then be used to train and further optimize the model. Based on these two operations, once calculation algorithms are available for the attribute values of the depression diagnosis labeling system built from the entities and attributes, labeling can be carried out automatically.
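As an illustration of how a Python labeling script might be made callable from a server-side child process, the sketch below reads a JSON request from standard input and writes a JSON response to standard output. The request and response shapes, the `TERM_TO_ATTRIBUTE` dictionary and the attribute names are assumptions; the patent states only that Python scripts are run through a child process module.

```python
import json
import sys
from typing import Any, Dict, List, TextIO

# Hypothetical dictionary: interview term -> diagnosis attribute name.
TERM_TO_ATTRIBUTE: Dict[str, str] = {
    "情绪低落": "depressed_mood",
    "失眠": "sleep_disturbance",
}


def label_attributes(text: str) -> List[Dict[str, Any]]:
    """Dictionary identification: report every occurrence of a known term."""
    hits: List[Dict[str, Any]] = []
    for term, attribute in TERM_TO_ATTRIBUTE.items():
        start = text.find(term)
        while start != -1:
            hits.append({
                "term": term,
                "attribute": attribute,
                "start": start,
                "end": start + len(term),
            })
            start = text.find(term, start + len(term))
    return sorted(hits, key=lambda hit: hit["start"])


def main(stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Read {"text": ...} as JSON from stdin and write {"labels": [...]} to stdout."""
    request = json.load(stdin)
    response = {"labels": label_attributes(request["text"])}
    json.dump(response, stdout, ensure_ascii=False)
```

A deployed script would call `main()` as its entry point. On the Node.js side, a parent process could start the script with `child_process.spawn`, write the request to the child's standard input and parse the JSON it prints.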

The operation steps of the invention are as follows:

Step 1: enter the visual operation interface and maintain the labels for the entity ICD indexes, the entities, and the relations and attributes; the operation is simple;

Step 2: open an interview text and manually select keys to match entities; various automatic recognition and extraction algorithms are embedded to assist labeling;

Step 3: click the labeling page to browse the data analysis function; generation of a labeling comparison report is also supported;

Step 4: carry out index management after labeling is finished; all keywords from the browsing and labeling process can be summarized and conveniently checked to ensure labeling quality;

Step 5: create a new index system for a new disease where needed; the platform is highly customizable, so it suits not only depression medical texts but also similar operations for various other diseases. The whole system is developed on a .NET Web framework, and is simple to configure and highly portable.
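Steps 4 and 5 suggest a per-disease index system that holds entity labels and their keywords and can be summarized for checking. The sketch below is one hypothetical way to model it; `IndexSystem`, `IndexRegistry` and the example labels are assumptions, not structures described in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndexSystem:
    """Label scheme for one disease: entity labels and their keywords."""
    disease: str
    entities: Dict[str, List[str]] = field(default_factory=dict)

    def add_entity(self, label: str, keywords: List[str]) -> None:
        # Merge keywords into the label, keeping first-seen order without duplicates.
        known = self.entities.setdefault(label, [])
        for word in keywords:
            if word not in known:
                known.append(word)

    def keyword_summary(self) -> List[str]:
        # Step 4: gather every keyword across labels for a quality check.
        return sorted({w for words in self.entities.values() for w in words})


class IndexRegistry:
    """Step 5: keeps one index system per disease."""

    def __init__(self) -> None:
        self._systems: Dict[str, IndexSystem] = {}

    def create(self, disease: str) -> IndexSystem:
        if disease in self._systems:
            raise ValueError(f"index system for {disease!r} already exists")
        system = IndexSystem(disease)
        self._systems[disease] = system
        return system

    def get(self, disease: str) -> IndexSystem:
        return self._systems[disease]
```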

During operation of this embodiment, clicking the labeling page for automatic ICD-10 labels shows a page divided into three areas: the original text display area, the labeling operation area (part-of-speech analysis), and the ICD-10 label display area, as shown in Fig. 2.

Automatic labeling centers on the labeling operation area. Here the pre-labeling model has already colored the automatically segmented and part-of-speech-labeled regions, and each entity's label is placed at the corresponding position above the characters. In the automatic labeling state, the colored regions mark words that have historical segmentation and part-of-speech labeling records; they work in the same way as the part-of-speech labels (standard version) of the historical ICD-10 labels. Clicking a region opens a part-of-speech selection pop-up whose drop-down list offers the candidate labels. If none of the candidate labels is correct, or the model's preprocessing segmented the text incorrectly, or an entity was labeled incorrectly, the user clicks "edit mode" on the right to enter the page for handling the problem.

After tasks such as segmentation and entity labeling are finished, the event labeling option box can be selected to perform event labeling (including "course of disease", "exclusion criterion" and "severity"). The content templates for event labeling are divided into 12 types on 2 levels, defined as in ICD-10. After the event type and sub-type are selected, the event template is loaded automatically. Clicking an event element and its role box activates that element, after which the corresponding entity can be selected in the operation area on the left, or a selected entity can be deleted. Once the event elements have been selected, clicking "save" completes the current round of automatic labeling.
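The drop-down of candidate labels and the fallback to "edit mode" can be sketched as follows. The patent says the colored regions draw on historical labeling records but not how candidates are ordered, so ranking by frequency is an assumption, and `build_candidates` and `needs_edit_mode` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Dict, Iterable, List, Tuple


def build_candidates(history: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each word to its historical labels, most frequent first."""
    counts: DefaultDict[str, Counter] = defaultdict(Counter)
    for word, label in history:
        counts[word][label] += 1
    return {
        word: [label for label, _ in counter.most_common()]
        for word, counter in counts.items()
    }


def needs_edit_mode(
    word: str, correct_label: str, candidates: Dict[str, List[str]]
) -> bool:
    """True when the drop-down for this word does not offer the correct label."""
    return correct_label not in candidates.get(word, [])
```

Segmentation errors and wrongly labeled entities, the other two triggers for "edit mode", would need checks against the text spans themselves and are not covered by this sketch.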

For a specific implementation of this embodiment, reference may be made to the relevant description in the above embodiments, which is not described herein again.

It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.

It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by suitable instruction execution devices. For example, if implemented in hardware, as in another embodiment, any one or combination of the following technologies, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps carried out in the method for implementing the above embodiment may be implemented by hardware related to instructions of a program, and the corresponding program may be stored in a computer readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

With the method, system, device, processor and computer-readable storage medium of the invention for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information, once an interview text has passed through the automatic labeling system, its ICD-10 labels are summarized automatically and diagnosis labels can be assigned automatically. The invention integrates manual and automatic labeling, is simple to operate, and supports functions such as text pre-labeling and data analysis. The platform is used to label entities and relations for a medical knowledge graph in the depression field.

In this specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

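1. A method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information, characterized by comprising the following steps:

(1) automatically labeling the interview text, marking the word segmentation and parts of speech by color, and placing each entity's label at the corresponding position in the text;

(2) selecting a label from a list of candidate parts of speech;

(3) judging whether the text was segmented or labeled incorrectly during automatic labeling; if so, continuing to step (4); otherwise, exiting this step;

(4) carrying out manual labeling;

(5) saving the text content that has undergone automatic labeling.

2. The method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 1, wherein step (1) specifically comprises the following steps:

(1.1) carrying out word segmentation and part-of-speech tagging on the interview text;

(1.2) performing text feature analysis;

(1.3) placing each entity's label at the corresponding position above the characters;

(1.4) automatically identifying disease vocabulary labels and marking the vocabulary by color.

3. The method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 1, wherein step (4) specifically comprises the following steps:

(4.1) carrying out event annotation, selecting an event type and a sub-type, and loading the template of the event;

(4.2) selecting a corresponding event element, and selecting a corresponding entity or deleting a selected entity.

4. The method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 1, wherein step (3) is specifically as follows:

judging whether the candidate part-of-speech list lacks the correct label, or the text was segmented incorrectly during automatic labeling, or an entity was labeled incorrectly; if so, continuing to step (4); otherwise, exiting this step.

5. The method for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 1, wherein the method further comprises the following step:

carrying out data analysis on the labeled content and generating a labeling comparison report.

6. A system for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information, for realizing the method of any one of claims 1 to 5, wherein the system comprises:

the original text display module, which is used for displaying the text content of doctor-patient interviews;

the automatic labeling processing module, which is connected with the original text display module and is used for carrying out automatic entity identification on the interview text and labeling it automatically;

and the manual labeling module, which is connected with the original text display module and the automatic labeling processing module and is used for modifying and perfecting the labeled content after automatic labeling.

7. The system for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 6, wherein the automatic labeling processing module comprises:

the segmentation and labeling unit, which is connected with the original text display module and is used for performing word segmentation and part-of-speech labeling on the interview text;

the color labeling unit, which is connected with the original text display module and is used for marking by color the vocabulary that has been segmented and whose part of speech has been recognized;

and the label marking unit, which is connected with the original text display module and is used for placing each entity's label at the corresponding position above the characters.

8. The system for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 6, wherein the manual labeling module comprises:

the event marking unit, which is connected with the original text display module and the automatic labeling processing module and is used for selecting a content template for event labeling and labeling the text according to the template;

and the label modifying unit, which is connected with the original text display module and the automatic labeling processing module and is used for editing or deleting the labeled content.

9. The system for realizing automatic labeling of interview text based on depression diagnosis and treatment standard information according to claim 1, wherein the manual labeling module comprises a data analysis unit, which is connected with the original text display module and is used for carrying out data analysis on the labeled content and generating a labeling comparison report.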

10. An apparatus for implementing an interview text automatic labeling process based on depression diagnosis and treatment standard information, the apparatus comprising:

a processor configured to execute computer-executable instructions;

a memory storing one or more computer-executable instructions that, when executed by the processor, perform the steps of the method of any one of claims 1 to 5 for performing interview text automatic labeling processing based on depression diagnosis and treatment standard information.

11. A processor for implementing interview text automatic labeling processing based on depression diagnosis and treatment standard information, characterized in that the processor is configured to execute computer executable instructions, and when the computer executable instructions are executed by the processor, the steps of the method for implementing interview text automatic labeling processing based on depression diagnosis and treatment standard information are implemented according to any one of claims 1 to 5.

12. A computer-readable storage medium, on which a computer program is stored, the computer program being executable by a processor to implement the steps of the method for implementing interview text automatic labeling processing based on depression diagnosis and treatment standard information according to any one of claims 1 to 5.