Text Semantic Recognition Method, Apparatus, Computer Device, and Storage Medium
This application claims priority to Chinese patent application No. 201910744603.5, filed with the Chinese Patent Office on August 13, 2019 and entitled "Text Semantic Recognition Method, Apparatus, Computer Device, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence, and in particular to a text semantic recognition method and apparatus, a computer device, and a storage medium.
Background
With the development of the Internet, text semantic recognition has found increasingly wide application. In the field of intelligent question answering in particular, accurately answering a user's question usually requires converting the user's speech input into text data and then performing semantic recognition on that text to determine the true meaning it expresses, so that the question can be answered accurately and quickly.
On online platforms, to keep online language civil and improve the user experience, text semantic recognition is commonly applied to text published on the Internet so as to identify text carrying semantics such as violence, vulgarity, sensitive topics, or commercial advertising.
At present, most text semantic analysis techniques rely on keyword matching: a keyword database must be built in advance, and the text to be recognized is matched against the keywords in that database to detect sensitive words. The inventors realized that the semantics of keywords not recorded in the database cannot be recognized accurately; in other words, keyword coverage limits the accuracy of text semantic recognition, which therefore remains low.
Summary
In view of this, it is necessary to provide, for the above technical problem, a text semantic recognition method and apparatus, a computer device, and a storage medium.
A text semantic recognition method, the method including:
computing a character vector for each text character in a target text and a word vector for each text word segment;
splicing the character vector of each text character with the word vector of the word segment to which the character belongs, to obtain a spliced vector for the corresponding text character;
inputting, in the forward order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a first neural network in sequence, to obtain a first text feature of the target text based on the forward order of appearance;
inputting, in the reverse order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a second neural network in sequence, to obtain a second text feature of the target text based on the reverse order of appearance; and
inputting a combined text feature, obtained by splicing the first text feature and the second text feature, into a third neural network to obtain the semantic type of the target text.
A text semantic recognition apparatus, the apparatus including:
a vector computation module configured to compute a character vector for each text character in a target text and a word vector for each text word segment;
a vector splicing module configured to splice the character vector of each text character with the word vector of the word segment to which the character belongs, to obtain a spliced vector for the corresponding text character;
a first text feature acquisition module configured to input, in the forward order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a first neural network in sequence, to obtain a first text feature of the target text based on the forward order of appearance;
a second text feature acquisition module configured to input, in the reverse order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a second neural network in sequence, to obtain a second text feature of the target text based on the reverse order of appearance; and
a semantic type acquisition module configured to input a combined text feature, obtained by splicing the first text feature and the second text feature, into a third neural network to obtain the semantic type of the target text.
A computer device, including a memory and a processor connected to each other, where the memory is configured to store a computer program including program instructions, and the processor is configured to execute the program instructions in the memory to perform the following:
computing a character vector for each text character in a target text and a word vector for each text word segment;
splicing the character vector of each text character with the word vector of the word segment to which the character belongs, to obtain a spliced vector for the corresponding text character;
inputting, in the forward order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a first neural network in sequence, to obtain a first text feature of the target text based on the forward order of appearance;
inputting, in the reverse order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a second neural network in sequence, to obtain a second text feature of the target text based on the reverse order of appearance; and
inputting a combined text feature, obtained by splicing the first text feature and the second text feature, into a third neural network to obtain the semantic type of the target text.
A computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, implement the following steps:
computing a character vector for each text character in a target text and a word vector for each text word segment;
splicing the character vector of each text character with the word vector of the word segment to which the character belongs, to obtain a spliced vector for the corresponding text character;
inputting, in the forward order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a first neural network in sequence, to obtain a first text feature of the target text based on the forward order of appearance;
inputting, in the reverse order in which the text characters appear in the target text, the character vectors and spliced vectors corresponding to the multiple text characters into different hidden layers of a second neural network in sequence, to obtain a second text feature of the target text based on the reverse order of appearance; and
inputting a combined text feature, obtained by splicing the first text feature and the second text feature, into a third neural network to obtain the semantic type of the target text.
With the above text semantic recognition method and apparatus, computer device, and storage medium, the character vector of each text character and the word vector of the word segment it belongs to are computed and spliced to obtain a spliced vector for that character; representing the text through multiple kinds of feature vectors in this way enhances the feature dimensionality of the text's linguistic representation. Further, feeding the character vectors and spliced vectors into different hidden layers of different neural networks in forward and reverse order captures information about the text characters more fully and mines the contextual semantics between them, so that the combined feature obtained by splicing the first and second text features output by the two networks expresses the semantic features of the target text more completely, improving the accuracy of text semantic recognition.
Description of the Drawings
FIG. 1 is a diagram of an application scenario of the text semantic recognition method in an embodiment;
FIG. 2 is a schematic flowchart of the text semantic recognition method in an embodiment;
FIG. 3 is a schematic flowchart of generating a preset file in an embodiment;
FIG. 4 is a structural block diagram of a text semantic recognition apparatus in an embodiment;
FIG. 5 is a structural block diagram of a text semantic recognition apparatus in another embodiment;
FIG. 6 is a diagram of the internal structure of a computer device in an embodiment.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain this application, not to limit it.
The text semantic recognition method provided in this application can be applied in the environment shown in FIG. 1, as part of a text semantics system that includes a terminal 102 and a server 104 communicating over a network. The method may be performed on either the terminal 102 or the server 104: the terminal 102 can collect the target text to be recognized and recognize its semantic type locally using the method, or it can obtain the target text and transmit it over the network to the server 104, which then recognizes the semantic type of the target text using the method. The terminal 102 may be, but is not limited to, a personal computer, a laptop, a smartphone, a tablet, or a portable wearable device; the server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, a text semantic recognition method is provided. Taking the method applied to the server in FIG. 1 as an example, it includes the following steps:
Step S202: compute a character vector for each text character in the target text and a word vector for each text word segment.
Here, text characters are the individual characters obtained by splitting the target text; a text character may be a letter, a digit, a Chinese character, or a symbol. Text word segmentation refers to splitting the target text into separate words, i.e., recombining a continuous character sequence into a word sequence according to certain conventions; it can be performed with string-matching-based, semantics-based, or statistics-based segmentation methods. The character vectors and word vectors provide multi-dimensional representations of the target text.
Specifically, from the acquired target text, the server determines each text character it contains and the word segment each character belongs to, and looks up the character vector of each text character and the word vector of each word segment in a pre-trained character vector library or word vector library. Alternatively, the server may encode the obtained text characters and word segments according to preset vector encoding rules to obtain the corresponding character vectors and word vectors.
In one embodiment, obtaining the target text includes: the terminal obtains the target text (there may be more than one), which may be recognized text produced by speech recognition or text entered by the user directly on the terminal, and transmits it to the server. The target text may also be obtained from an online platform, for example by crawling relevant text from the web.
In one embodiment, determining each text character contained in the target text and the word segment each character belongs to includes: the server splits the received target text character by character to obtain its text characters; arranges the characters in the order in which they appear in the target text to form the character sequence of the target text; and deletes from this sequence the characters that belong to a stop-word list, obtaining a preprocessed character sequence. Stop words are words or characters of no processing value that need to be filtered out in natural-language-processing tasks, such as English characters, digits, mathematical symbols, punctuation marks, and frequently used single Chinese characters.
The server inspects each character in the character sequence and attaches an identifier to repeated characters so as to distinguish the different words they belong to; using a pre-built segmentation lexicon, it segments the identifier-tagged character sequence into a word sequence carrying the same identifiers; and, based on the preprocessed character sequence, it determines from the word sequence the word segment each character belongs to.
In one embodiment, the segmentation lexicon may be built on the basis of the Xinhua Dictionary or other similar published works, or constructed for an intelligent customer-service scenario. The built lexicon can be stored in the server's database or uploaded to the cloud.
In one embodiment, the target text may also be obtained by the server itself; for example, the server may fetch the required text data from a web page as the target text and then determine its text characters and the word segment each character belongs to.
For example, suppose the target text is "深圳市的市政府在市民中心。" ("The Shenzhen city government is in the Civic Center."). The server first splits it into the character sequence 深/圳/市/的/市/政/府/在/市/民/中/心/。, deletes the characters belonging to the stop-word list to obtain the preprocessed sequence 深/圳/市/市/政/府/市/民/中/心, and then tags the repeated characters, yielding 深/圳/市01/市02/政/府/市03/民/中/心. Segmenting this sequence produces the word sequence 深圳市01/市02政府/市03民中心; although the character 市 corresponds to three different words, the identifiers make it possible to tell which word segment each occurrence belongs to.
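This preprocessing can be sketched as follows. It is a minimal illustration only: STOPWORDS and LEXICON are hypothetical stand-ins for the stop-word list and pre-built segmentation lexicon, and greedy longest-match stands in for whichever string-matching, semantic, or statistical segmentation method is actually used; the character identifiers (市01/市02/市03) can then be attached by walking the resulting word list position by position.

```python
STOPWORDS = set("的在。")                        # hypothetical stop-word list
LEXICON = {"深圳市", "市政府", "市民中心"}        # hypothetical segmentation lexicon

def char_sequence(text):
    """Split the target text into characters and drop stop words."""
    return [c for c in text if c not in STOPWORDS]

def segment(chars, lexicon, max_len=4):
    """Greedy longest-match segmentation of the character sequence."""
    words, i = [], 0
    while i < len(chars):
        for size in range(min(max_len, len(chars) - i), 0, -1):
            cand = "".join(chars[i:i + size])
            if size == 1 or cand in lexicon:
                words.append(cand)
                i += size
                break
    return words

chars = char_sequence("深圳市的市政府在市民中心。")
print(segment(chars, LEXICON))                   # ['深圳市', '市政府', '市民中心']
```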
Step S204: splice the character vector of each text character with the word vector of the word segment it belongs to, obtaining a spliced vector for the corresponding text character.
A spliced vector is a single vector formed by joining multiple text vectors according to a preset rule; it carries the representation dimensions of all the vectors it joins.
Specifically, based on the obtained character vectors and word vectors of the target text, the server joins the character vector of each text character with the word vector of the word segment that character belongs to, obtaining the spliced vector of that character, and thereby the spliced vectors of all text characters contained in the target text; the order in which the character vector and word vector are joined is not restricted.
In one embodiment, the server instead adds or multiplies, element-wise, the character vector of each text character and the word vector of the word segment it belongs to, obtaining the spliced vector of the corresponding text character.
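As a sketch of step S204, the splicing can be expressed as a simple vector concatenation (the embodiment above notes that element-wise addition or multiplication may be used instead); the vector values below are illustrative only, with real vectors coming from the trained character/word vector libraries described earlier:

```python
import numpy as np

# Illustrative 4-dimensional vectors (not real trained embeddings).
char_vecs = {"深": np.array([1., 1., 2., 2.])}
word_vecs = {"深圳市": np.array([0.5, 0.1, 0.3, 0.9])}

def splice(char, word):
    """Concatenate a character vector with the word vector of the word
    segment the character belongs to (the order of the two parts is free)."""
    return np.concatenate([char_vecs[char], word_vecs[word]])

print(splice("深", "深圳市"))   # 8-dimensional spliced vector for 深
```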
Step S206: in the forward order in which the text characters appear in the target text, input the character vectors and spliced vectors of the multiple text characters into different hidden layers of the first neural network in sequence, obtaining the first text feature of the target text based on the forward order of appearance.
The first neural network is mainly used to turn the features of the target text, input in the forward order of appearance of its text characters, into features carrying the contextual semantics of the target text in that order. It contains multiple hidden layers, each of which may have the same or a different number of neuron nodes. The first neural network is a recurrent network, such as a long short-term memory network (LSTM) or a recurrent neural network (RNN).
Specifically, the target text contains multiple text characters. The server computes the character vector and spliced vector of each character, sorts them in the forward order in which the characters appear in the target text, and then feeds them in that order into the different hidden layers of the first neural network for feature extraction, capturing the mutual information between characters and obtaining the first text feature based on the forward order of appearance.
In one embodiment, the character vectors, word vectors, and spliced vectors of the multiple text characters may all be input in sequence into the different hidden layers of the first neural network to obtain the first text feature of the target text based on the forward order of appearance.
Step S208: in the reverse order in which the text characters appear in the target text, input the character vectors and spliced vectors of the multiple text characters into different hidden layers of the second neural network in sequence, obtaining the second text feature of the target text based on the reverse order of appearance.
The second neural network is mainly used to turn the features of the target text, input in the reverse order of appearance of its text characters, into features carrying the contextual semantics of the target text in that order. It contains multiple hidden layers, each of which may have the same or a different number of neuron nodes, and is likewise a recurrent network such as an LSTM or RNN.
Specifically, based on the character vectors and spliced vectors of all text characters in the target text, the server inputs them, in the reverse order in which the characters appear in the target text, into the different hidden layers of the second neural network, which extracts features from the input vectors to produce the second text feature based on the reverse order of appearance.
In one embodiment, a maximum number of text characters accepted by the first or second neural network may be preset. If the target text contains fewer characters than this maximum, the character vector matrix formed from the target text is padded with zero vectors, and the padded matrix is used as the input of the first or second neural network.
Step S210: input the combined text feature, obtained by splicing the first text feature and the second text feature, into the third neural network, obtaining the semantic type of the target text.
The combined text feature is a single feature formed by joining the outputs of the first and second neural networks according to a preset rule. The third neural network is mainly used to classify the combined text feature of the input target text by semantic type. A semantic type is the category the target text belongs to, determined from its semantic relations.
Specifically, the server splices the obtained first and second text features of the target text into its combined text feature and passes it to the third neural network, which classifies it by semantic category to obtain the semantic type of the target text, taking full account of the semantic understanding of the text's context and implicit vocabulary. For example, when distinguishing profanity from polite language in the target text, two semantic types can be defined: class 1, the text is profanity; class 0, the text is polite language.
In the above embodiment, the character vector of each text character and the word vector of its word segment are computed and spliced into a per-character spliced vector, so the text is represented by multiple kinds of feature vectors, enhancing the feature dimensionality of its linguistic representation. Further, feeding the character vectors and spliced vectors into different hidden layers of different neural networks in forward and reverse order captures the information of the text characters more fully and mines the contextual semantics between them, so that the combined feature obtained by splicing the first and second text features expresses the semantic features of the target text more completely and improves the accuracy of text semantic recognition.
In one embodiment, as shown in FIG. 3, the method further includes steps for generating a preset file:
Step S302: obtain sample text.
Step S304: extract the character vectors and word vectors of the sample text based on the pre-trained first neural network.
Step S306: assign character numbers to the character vectors and the word vectors respectively.
Step S308: write the character vectors, word vectors, and their corresponding character numbers into the preset file.
Computing the character vector of a text character and the word vector of a word segment then includes: assigning a character number to each text character and each word segment, and, based on the character number, reading the character vector of each text character and the word vector of each word segment from the preset file.
Here, the preset file is a pre-built indexed text containing the character vectors and their indexes and the word vectors and their indexes.
Specifically, before the character vectors and word vectors are computed, a preset file supporting index lookup of both must be built. The server obtains sample text and its known semantic types from terminals or web pages, extracts the character vectors and word vectors of the sample text based on the pre-trained first neural network, and assigns character numbers to the extracted vectors, establishing a mapping between character vectors and numbers and between word vectors and numbers. The server writes the character vectors, word vectors, and corresponding character numbers into the preset file, forming character vectors and word vectors indexed by character number.
Based on the text characters of the target text and the word segments they belong to, the server assigns a character number to each text character and word segment, obtaining the mapping between text characters and character numbers and between word segments and character numbers. The character vector of each text character and the word vector of each word segment are then looked up in the preset file by their character numbers.
In one embodiment, the character numbers may include a number type, and the character vectors and word vectors are numbered according to their respective types, which may be the same or different. For example, character vectors may be numbered with natural numbers, while word vectors may be numbered with natural numbers or with letters.
For example, if the target text is 深圳市 and the character number of the text character 深 is 01, then looking up number 01 in the preset file yields its character vector (1, 1, 2, 2).
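A minimal sketch of generating and querying the preset file (steps S302 to S308) might look as follows; the JSON layout, file name, and numbering scheme are assumptions for illustration only:

```python
import json

def write_preset(char_vecs, word_vecs, path="preset.json"):
    """Number each character/word vector and write the vectors together
    with their character numbers into the preset file (S306-S308)."""
    preset = {
        "chars": {str(i): {"token": t, "vec": v}
                  for i, (t, v) in enumerate(char_vecs.items(), start=1)},
        "words": {str(i): {"token": t, "vec": v}
                  for i, (t, v) in enumerate(word_vecs.items(), start=1)},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(preset, f, ensure_ascii=False)

def lookup(path, kind, number):
    """Read the vector for a given character number from the preset file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)[kind][str(number)]["vec"]

write_preset({"深": [1, 1, 2, 2]}, {"深圳市": [0.5, 0.1, 0.3, 0.9]})
print(lookup("preset.json", "chars", 1))   # -> [1, 1, 2, 2]
```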
In this embodiment, with a pre-built preset file containing the character vectors and word vectors, the vectors of the target text are obtained by looking up the character numbers of its text characters and word segments in the preset file, so the character vectors and word vectors can be obtained accurately and quickly, improving both the speed and the accuracy of obtaining the semantic type of the target text.
In one embodiment, inputting the character vectors and spliced vectors of the multiple text characters into different hidden layers of the first neural network in forward order includes: inputting the character vector and spliced vector of the current character, taken in the forward order of appearance in the target text, into the current hidden layer of the first neural network; inputting the character feature output by the current hidden layer, together with the character vector and spliced vector of the next character in order, into the next hidden layer of the first neural network; and iterating with the next hidden layer as the current one until the last character in order, obtaining the first text feature of the target text based on the forward order of appearance.
Specifically, following the forward order of appearance, the server inputs the character vector and spliced vector of the first-order character into the first hidden layer of the first neural network, which projects them into the character feature of that character. Here the first-order character is the one ranked first after all text characters of the target text are sorted by the preset order of appearance, and the last-order character is the one ranked last.
The server takes the first hidden layer as the current hidden layer and obtains the character vector and spliced vector corresponding to it according to the forward order of appearance. The weights of the neuron nodes in the current hidden layer are preset and may be equal or different across nodes. The server then applies a nonlinear mapping to the features input to the current hidden layer according to these preset weights, obtaining the character feature the layer outputs; the nonlinear mapping may use activation functions such as sigmoid, tanh, or ReLU.
The server inputs the character feature output by the current hidden layer, together with the character vector and spliced vector of the next character in order, into the next hidden layer of the first neural network, makes that layer the current one, and repeats the following steps: obtain the character vector and spliced vector corresponding to the current layer according to the forward order of appearance; with the preset neuron weights of the current layer, apply a nonlinear mapping to the current layer's character vector and spliced vector and the character feature output by the previous layer, obtaining the character feature output by the current layer. This iteration continues until the character vector and spliced vector of the last-order character have been input into the current layer, and the character feature then output is taken as the first text feature of the target text based on the forward order of appearance.
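The forward recurrence just described can be sketched as below, assuming for illustration a single shared tanh projection with placeholder weights; a real implementation would use trained LSTM/RNN parameters rather than random ones:

```python
import numpy as np

def forward_features(char_vecs, splice_vecs, hidden=8, seed=0):
    """Run the forward-order recurrence over the hidden layers: each step
    maps the previous character feature h plus the current character's
    char vector x and spliced vector y through a tanh projection."""
    rng = np.random.default_rng(seed)
    d = hidden + len(char_vecs[0]) + len(splice_vecs[0])
    W = rng.normal(size=(hidden, d)) * 0.1    # placeholder weights
    b = np.zeros(hidden)                      # placeholder bias
    h = np.zeros(hidden)
    for x, y in zip(char_vecs, splice_vecs):  # forward order of appearance
        h = np.tanh(W @ np.concatenate([h, x, y]) + b)
    return h                                  # first text feature
```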
In one embodiment, the current hidden layer includes a first sub-hidden layer and a second sub-hidden layer, and inputting the character vector and spliced vector of the current character into the current hidden layer of the first neural network includes: taking the character vector and the previous hidden layer's output as the input of the first sub-hidden layer, which projects the character vector according to the weights of its neuron nodes to obtain a first sub-character feature; and taking the first sub-character feature and the spliced vector as the input of the second sub-hidden layer, which projects the spliced vector according to the weights of its neuron nodes to obtain a second sub-character feature, the output of the current hidden layer.
Specifically, the server inputs the character vector of the first character in order of appearance into the first sub-hidden layer, which projects it with the preset weights of its neuron nodes to produce the first sub-character feature; the server then feeds this feature together with the spliced vector of that character into the second sub-hidden layer, which applies a nonlinear mapping with the preset weights of its neuron nodes, producing the character feature that is the output of the first hidden layer.
In another embodiment with the same two sub-hidden layers, the roles of the two vectors are swapped: the server takes the spliced vector and the previous hidden layer's output as the input of the first sub-hidden layer, whose projection yields the first sub-character feature, and then takes the first sub-character feature and the character vector as the input of the second sub-hidden layer, whose projection yields the second sub-character feature as the output of the current hidden layer.
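A single hidden layer split into two sub-hidden layers, as in the two embodiments above, might be sketched as follows; the weight matrices are assumed to come from training, and swapping x and y gives the variant in which the roles of the two vectors are exchanged:

```python
import numpy as np

def hidden_layer(h_prev, x, y, W1, b1, W2, b2):
    """One hidden layer split into two sub-hidden layers: the first
    projects the char vector x together with the previous layer's output,
    the second projects that result together with the spliced vector y."""
    s1 = np.tanh(W1 @ np.concatenate([h_prev, x]) + b1)  # first sub-layer
    return np.tanh(W2 @ np.concatenate([s1, y]) + b2)    # second sub-layer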
In one embodiment, in the forward order of appearance of the text characters, the server inputs the character vectors, word vectors, and spliced vectors of the multiple text characters into the different hidden layers of the first neural network in sequence, obtaining the first text feature of the target text based on the forward order of appearance.
Specifically, in the forward order of appearance, the server inputs the character vector, word vector, and spliced vector of the current character into the current hidden layer of the first neural network and, from these vectors, the preset neuron weights of the current layer, and the character feature output by the previous layer, obtains the current layer's output character feature by nonlinear mapping. The server then inputs the current layer's output, together with the character vector, word vector, and spliced vector of the next character in order, into the next hidden layer of the first neural network, makes that layer the current one, and repeats this step until the last character in order, obtaining the first text feature of the target text based on the forward order of appearance.
In one embodiment, the first hidden layer includes a first, a second, and a third sub-hidden layer, and inputting the current character's character vector, word vector, and spliced vector into the current hidden layer of the first neural network includes: taking the character vector and the previous hidden layer's output as the input of the first sub-hidden layer, which projects the character vector with the weights of its neuron nodes to obtain the first sub-character feature; taking the first sub-character feature and the word vector as the input of the second sub-hidden layer, which projects the word vector with the weights of its neuron nodes to obtain the second sub-character feature; and taking the second sub-character feature and the spliced vector as the input of the third sub-hidden layer, which projects the spliced vector with the weights of its neuron nodes to obtain the third sub-character feature, the output of the first hidden layer. The order in which the character vector, word vector, and spliced vector are projected is not prescribed and can be set arbitrarily; for example, the word vector may be projected first to obtain the first sub-character feature of the first sub-hidden layer.
In one embodiment, the way the server obtains the next hidden layer's character feature by nonlinear mapping, from the current layer's character vector and spliced vector, the current layer's neuron weights, and the previous layer's output character feature, can be illustrated by the following example.
For example, let the neuron-node weights of the current hidden layer be denoted W_f, the character vector corresponding to the current hidden layer x_t, the corresponding spliced vector y_t, and the character feature output by the previous hidden layer h_{t-1}; with tanh as the nonlinear function, the character feature f_t output by the next hidden layer can be computed as:
f_t = tanh(W_f[h_{t-1}, x_t, y_t] + b_f), where b_f is the bias of the current hidden layer.
In one embodiment, the first neural network further includes a random-inactivation (dropout) layer, and the method further includes: taking the first text feature as the input of the random-inactivation layer, which projects each datum of the first text feature according to a preset sparse probability to obtain a sparse feature vector, the output of the first neural network.
The random-inactivation (dropout) layer mainly sparsifies the input first text feature, zeroing out some of its elements, which prevents overfitting of the neural network and also reduces its computational load.
Specifically, the server inputs the first text feature into the random-inactivation layer, which sparsifies it according to the set sparse probability, projecting each datum of the first text feature with that probability; the sparse probability is the probability that a datum survives the projection. For example, if the first text feature is the one-dimensional sequence [1, 2, 3, 4]^T and the sparse probability is set to 0.5, each number in the sequence survives projection with probability 0.5, so the output of the random-inactivation layer may be [0, 2, 0, 4]^T or [0, 0, 0, 4]^T.
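A sketch of such a random-inactivation layer, with the sparse probability as described above (the output varies with the random seed):

```python
import numpy as np

def random_inactivation(feature, sparse_prob=0.5, seed=None):
    """Keep each element of the feature with the preset sparse
    probability and zero out the rest."""
    rng = np.random.default_rng(seed)
    mask = rng.random(feature.shape) < sparse_prob
    return feature * mask

print(random_inactivation(np.array([1., 2., 3., 4.])))  # e.g. [0. 2. 0. 4.]
```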
In one embodiment, inputting the character vectors and spliced vectors of the multiple text characters into different hidden layers of the second neural network in reverse order includes: in the reverse order in which the text characters appear in the target text, inputting the first text feature and the character vector and spliced vector of the first-order character into the first hidden layer of the second neural network, obtaining that layer's output character feature; taking the second hidden layer of the second neural network as the current hidden layer and the second-order character as the current character; inputting the previous layer's output character feature and the current character's character vector and spliced vector into the current hidden layer of the second neural network; and iterating with the next hidden layer as the current one until the last character in order, obtaining the second text feature of the target text based on the reverse order of appearance.
Specifically, the first and second neural networks are connected in series, the output of the first serving as the input of the second. Following the reverse order of appearance of the text characters in the target text, the character vector and spliced vector of the first-order character are obtained; the server inputs the first text feature output by the first neural network, together with this character vector and spliced vector, into the first hidden layer of the second neural network, obtaining that layer's output character feature. It then takes the second hidden layer of the second network as the current hidden layer and the second-order character as the current character; inputs the previous layer's output character feature and the current character's character vector and spliced vector into the current hidden layer; applies a nonlinear mapping with the weights of the neurons set in that layer to obtain its output character feature; and feeds this output, together with the next character's character vector and spliced vector, into the next hidden layer of the second neural network, iterating with that layer as the current one until the last character in order, obtaining the second text feature of the target text based on the reverse order of appearance.
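Chaining the two networks can be sketched by seeding the reverse-order recurrence with the first network's output; as before, the weights are illustrative placeholders rather than trained parameters:

```python
import numpy as np

def second_features(first_feature, char_vecs, splice_vecs, seed=1):
    """Reverse-order recurrence of the second network, seeded with the
    first network's output (the two networks are connected in series)."""
    rng = np.random.default_rng(seed)
    hidden = first_feature.shape[0]
    d = hidden + len(char_vecs[0]) + len(splice_vecs[0])
    W = rng.normal(size=(hidden, d)) * 0.1    # placeholder weights
    b = np.zeros(hidden)                      # placeholder bias
    h = first_feature                         # input to the first hidden layer
    for x, y in zip(reversed(char_vecs), reversed(splice_vecs)):
        h = np.tanh(W @ np.concatenate([h, x, y]) + b)
    return h                                  # second text feature
```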
In one embodiment, the current hidden layer of the second neural network includes a first sub-hidden layer and a second sub-hidden layer, and inputting the previous layer's output character feature and the current character's character vector and spliced vector into the current hidden layer of the second neural network includes: taking the character vector and the previous layer's output character feature as the input of the first sub-hidden layer, which projects the character vector with the weights of its neuron nodes to obtain the first sub-character feature; and taking the first sub-character feature and the spliced vector as the input of the second sub-hidden layer, which projects the spliced vector with the weights of its neuron nodes to obtain the second sub-character feature, the output of the current hidden layer.
In one embodiment, inputting the second text feature into the third neural network to obtain the semantic type of the target text includes: taking the second text feature as the input of an attention-mechanism layer, which weights each datum of the second text feature to obtain a weighted feature; taking the weighted feature as the input of a random-inactivation layer, which projects each datum of the weighted feature according to a preset sparse probability to obtain a sparse feature; taking the sparse feature as the input of a fully connected layer, which performs a classification operation on it to obtain a predicted probability for each semantic type; and selecting the semantic type with the highest predicted probability as the semantic type of the target text.
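A sketch of this third network, with illustrative placeholder weights for the attention, random-inactivation, and fully connected layers (e.g. n_types = 2 for the profanity/polite-language example above):

```python
import numpy as np

def classify(feature, n_types=2, sparse_prob=0.5, seed=2):
    """Attention weighting, random inactivation, then a fully connected
    layer with softmax over the semantic types; returns the type with
    the highest predicted probability."""
    rng = np.random.default_rng(seed)
    weighted = feature * rng.random(feature.shape)        # attention layer
    sparse = weighted * (rng.random(feature.shape) < sparse_prob)
    W = rng.normal(size=(n_types, feature.shape[0]))      # fully connected layer
    logits = W @ sparse
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                  # predicted probabilities
    return int(np.argmax(probs))                          # semantic type
```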
In one embodiment, the server inputs the character vectors, word vectors, and spliced vectors of the multiple text characters, in the reverse order of appearance of the text characters in the target text, into the different hidden layers of the second neural network in sequence, obtaining the second text feature of the target text based on the reverse order of appearance.
In one embodiment, the second neural network also includes a random-inactivation layer, and the method further includes: the server takes the second text feature as the input of the random-inactivation layer, which projects each datum of the second text feature according to a preset sparse probability to obtain a sparse feature vector, the output of the second neural network.
下面,以一个具体的实施例来描述第一神经网络和第二神经网络对目标文本的处理过程。例如,对于目标文本为“深圳市”,文本字符“深”对应的字向量为“(1,1,2,2)”、拼接向量为“(1,1,1,1)”,相应的文本字符“圳”对应“(1,2,3,4)”、拼接向量为“(2,1,1,1)”,文本字符“市”对应“(0,0,2,5)”、拼接向量为“(3,1,1,1)”;In the following, a specific embodiment is used to describe the process of processing the target text by the first neural network and the second neural network. For example, for the target text "Shenzhen", the word vector corresponding to the text character "Deep" is "(1,1,2,2)", and the splicing vector is "(1,1,1,1)", and the corresponding The text character "Zhen" corresponds to "(1,2,3,4)", the splicing vector is "(2,1,1,1)", and the text character "City" corresponds to "(0,0,2,5)" , The splicing vector is "(3,1,1,1)";
According to the forward order in which the text characters appear in the target text, the word vectors of the target text are (1,1,2,2), (1,2,3,4), (0,0,2,5), and the splicing vectors are (1,1,1,1), (2,1,1,1), (3,1,1,1).
The word vector and splicing vector corresponding to the first-order text character are input into the first hidden layer, and the first hidden layer is taken as the current hidden layer, where the current hidden layer includes a first sub-hidden layer and a second sub-hidden layer. That is, the word vector (1,1,2,2) is input into the first sub-hidden layer of the current hidden layer to obtain the first sub-character feature output by the first sub-hidden layer; further, the first sub-character feature and the splicing vector (1,1,1,1) are input into the second sub-hidden layer, which performs a nonlinear mapping on the input features according to the preset weights of the respective neuron nodes in the second sub-hidden layer, thereby obtaining the character feature output by the second sub-hidden layer as the output of the current hidden layer.
Further, moving to the next text character in the forward order of appearance yields the corresponding word vector (1,2,3,4) and splicing vector (2,1,1,1), and the second hidden layer is taken as the current hidden layer. The output of the previous hidden layer and the word vector (1,2,3,4) are input into the first sub-hidden layer of the current hidden layer to obtain the output of that first sub-hidden layer; this output and the splicing vector (2,1,1,1) are input into the second sub-hidden layer of the current hidden layer, which performs a nonlinear mapping on the input features according to the preset weights of its respective neuron nodes, thereby obtaining the character feature output by the second sub-hidden layer as the output of the current hidden layer. This continues until the output of the previous hidden layer and the word vector (0,0,2,5) and splicing vector (3,1,1,1) corresponding to the last-order text character are input into the current hidden layer; the resulting output of the current hidden layer is taken as the first text feature output by the first neural network.
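Under the same illustrative assumptions as the sketch above, plus weights shared across hidden layers and a hidden size of 4 (neither of which is fixed by this embodiment), the forward pass over the example vectors might look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
H = 4  # assumed hidden size

# Word vectors and splicing vectors of "深圳市" in forward order of appearance.
char_vecs   = [np.array([1., 1., 2., 2.]), np.array([1., 2., 3., 4.]), np.array([0., 0., 2., 5.])]
splice_vecs = [np.array([1., 1., 1., 1.]), np.array([2., 1., 1., 1.]), np.array([3., 1., 1., 1.])]

W1 = rng.normal(size=(H, 4 + H))  # first sub-hidden layer weights (assumed shape)
W2 = rng.normal(size=(H, H + 4))  # second sub-hidden layer weights (assumed shape)

h = np.zeros(H)  # character feature entering the first hidden layer
for x, s in zip(char_vecs, splice_vecs):
    h1 = np.tanh(W1 @ np.concatenate([x, h]))   # first sub-hidden layer
    h  = np.tanh(W2 @ np.concatenate([h1, s]))  # second sub-hidden layer

first_text_feature = h  # output of the first neural network
```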
In one of the embodiments, the first neural network and the second neural network are connected in series, and the second text feature output by the second neural network is input into the third neural network; in this arrangement, the output of the first neural network serves as the input of the second neural network.
Based on the reverse order in which the text characters appear in the target text, the word vectors of the target text are (0,0,2,5), (1,2,3,4), (1,1,2,2), and the splicing vectors are (3,1,1,1), (2,1,1,1), (1,1,1,1).
According to the reverse order of appearance of the text characters, the first text feature and the word vector and splicing vector corresponding to the first-order text character are input into the first hidden layer of the second neural network to obtain the character feature output by the first hidden layer; the first hidden layer of the second neural network is then taken as the current hidden layer, and the second-order text character as the current-order text character. That is, the first text feature and the word vector (0,0,2,5) are input into the first sub-hidden layer to obtain the output of the first sub-hidden layer of the current hidden layer. This output and the splicing vector (3,1,1,1) are input into the second sub-hidden layer of the current hidden layer, which performs a nonlinear mapping on the input features according to the preset weights of its respective neuron nodes, thereby obtaining the character feature output by the second sub-hidden layer as the output of the current hidden layer.
Moving to the next text character in the reverse order of appearance yields the corresponding word vector (1,2,3,4) and splicing vector (2,1,1,1), and the second hidden layer is taken as the current hidden layer. The output of the previous hidden layer and the word vector (1,2,3,4) are input into the first sub-hidden layer of the current hidden layer to obtain the output of that first sub-hidden layer; this output and the splicing vector (2,1,1,1) are input into the second sub-hidden layer of the current hidden layer, which performs a nonlinear mapping on the input features according to the preset weights of its respective neuron nodes, thereby obtaining the character feature output by the second sub-hidden layer as the output of the current hidden layer. This continues until the output of the previous hidden layer and the word vector (1,1,2,2) and splicing vector (1,1,1,1) corresponding to the last-order text character are input into the current hidden layer; the resulting output of the current hidden layer is taken as the second text feature output by the second neural network. The second text feature is then input into the third neural network to obtain the semantic type of the target text. Through the serial connection of the first and second neural networks, the output of the first neural network is used as the input of the second neural network; that is, the first text feature based on the forward order, together with the word vectors and splicing vectors based on the reverse order of appearance, is input into the second neural network. This makes it possible to mine the mutual information between text characters more fully and to obtain good contextual information between characters even when they are far apart.
In one of the embodiments, the first neural network and the second neural network are connected in parallel, and the first text feature output by the first neural network and the second text feature output by the second neural network are spliced and input into the third neural network. Processing the target text in parallel through the first and second neural networks can increase the data processing rate.
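The difference between the two wirings can be sketched as follows; run_net is the same illustrative recurrent loop as above, and seeding the second network with the first network's output in the serial case is one assumed way of realizing the serial connection described in this embodiment:

```python
import numpy as np

def run_net(W1, W2, chars, splices, init):
    """Illustrative recurrent network with two sub-hidden layers per step."""
    h = init
    for x, s in zip(chars, splices):
        h1 = np.tanh(W1 @ np.concatenate([x, h]))
        h = np.tanh(W2 @ np.concatenate([h1, s]))
    return h

rng = np.random.default_rng(1)
H = 4
W1a, W2a = rng.normal(size=(H, 4 + H)), rng.normal(size=(H, H + 4))  # first network
W1b, W2b = rng.normal(size=(H, 4 + H)), rng.normal(size=(H, H + 4))  # second network

chars   = [np.array([1., 1., 2., 2.]), np.array([1., 2., 3., 4.]), np.array([0., 0., 2., 5.])]
splices = [np.array([1., 1., 1., 1.]), np.array([2., 1., 1., 1.]), np.array([3., 1., 1., 1.])]

# Serial: the first network's output seeds the second network,
# which consumes the sequence in reverse order of appearance.
f1 = run_net(W1a, W2a, chars, splices, np.zeros(H))
serial_feature = run_net(W1b, W2b, chars[::-1], splices[::-1], f1)

# Parallel: both networks run independently and their outputs
# are spliced before entering the third network.
f2 = run_net(W1b, W2b, chars[::-1], splices[::-1], np.zeros(H))
parallel_feature = np.concatenate([f1, f2])
```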
In this embodiment, multiple kinds of features, namely the word vectors, the word vectors of the text segmentations, and the splicing vectors corresponding to the text characters, are input into the first neural network and the second neural network, and the hidden layers within each network perform recurrent computation on the input features, thereby obtaining the first text feature and the second text feature that characterize the semantics of the target text. Through the recurrent computation of the neural networks, the correlation between text characters can be better captured, especially for text characters that are far apart. In other words, even when the interval between text characters is large, the relevant information at the predicted position can still be obtained well, and this information does not show a decaying trend as the number of recurrent steps increases.
In one embodiment, the third neural network includes an attention mechanism layer and a fully connected layer. Splicing the first text feature and the second text feature and inputting them into the third neural network to obtain the semantic type of the target text includes: splicing the first text feature and the second text feature to obtain an integrated text feature of the target text; taking the integrated text feature as the input of the attention mechanism layer, which weights each piece of data in the integrated text feature to obtain a weighted feature; taking the weighted feature as the input of the random inactivation layer, which projects each piece of data in the weighted feature according to a preset sparse probability to obtain a sparse feature; taking the sparse feature as the input of the fully connected layer, which performs a classification operation on the sparse feature to obtain the prediction probability corresponding to each semantic type; and selecting the semantic type with the largest prediction probability as the semantic type of the target text.
Specifically, the server splices the first text feature and the second text feature of the target text to obtain the integrated text feature of the target text. Further, the server inputs the integrated text feature into the attention mechanism layer, which computes a coefficient sequence from the pieces of data in the integrated text feature according to pre-trained coefficient weight parameters; a nonlinear activation function is applied to the coefficient sequence to obtain an activated coefficient sequence; the activated coefficient sequence is normalized by a logistic regression (softmax) function to obtain the coefficient probability corresponding to each piece of data in the integrated text feature, where each coefficient probability lies in the range [0,1]. The obtained coefficient probabilities are multiplied by the corresponding data of the integrated text feature to obtain the weighted feature, which serves as the output of the attention mechanism layer.
Further, the server inputs the weighted feature output by the attention mechanism layer into the random inactivation layer, which performs sparse processing on the weighted feature according to the set sparse probability, projecting each piece of data in the weighted feature according to the sparse probability to obtain the sparse feature; here, the sparse probability refers to the probability with which a piece of data appears after the projection.
Further, the server inputs the sparse feature into the fully connected layer, which performs a classification operation on the sparse feature and calculates the prediction probability corresponding to each semantic type according to the trained weight parameters of the fully connected layer, where each prediction probability output by the fully connected layer corresponds to one semantic type. The server selects the semantic type with the largest prediction probability as the semantic type of the target text.
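A minimal sketch of this attention, random inactivation, and fully connected pipeline is given below; the element-wise attention scoring, the inverted-dropout scaling, and all shapes and names are assumptions made for illustration, since the embodiment does not fix them:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(2)
combined = rng.normal(size=8)        # integrated text feature (spliced first + second)

# Attention mechanism layer: coefficient sequence, activation, normalization, weighting.
w_att  = rng.normal(size=8)          # pre-trained coefficient weights (assumed per-element)
scores = np.tanh(w_att * combined)   # nonlinear activation of the coefficient sequence
coeff  = softmax(scores)             # coefficient probabilities, each in [0, 1]
weighted = coeff * combined          # weighted feature output by the attention layer

# Random inactivation layer with a preset sparse probability.
keep_prob = 0.5
mask   = rng.random(8) < keep_prob
sparse = weighted * mask / keep_prob # inverted-dropout style projection (assumed)

# Fully connected layer: one prediction score per semantic type.
W_fc = rng.normal(size=(2, 8))       # trained weight parameters, 2 semantic types assumed
pred = W_fc @ sparse
semantic_type = int(np.argmax(pred)) # semantic type with the largest prediction
```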
In one of the embodiments, the third neural network further includes a logistic regression layer (softmax layer). Specifically, the prediction probability corresponding to each semantic type is taken as the input of the softmax layer, which normalizes the prediction probabilities to obtain the probability corresponding to each semantic type; the semantic type with the largest probability is selected as the semantic type of the target text.
For example, suppose the fully connected layer outputs the prediction scores a and b, where a corresponds to semantic type 1 and b corresponds to semantic type 0. After normalization with the softmax function, the output probabilities of the two semantic types are e^a/(e^a+e^b) and e^b/(e^a+e^b), respectively; the semantic type corresponding to the largest probability is selected as the semantic type of the target text.
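Since the concrete values of a and b were lost from the original example, the normalization step can be reproduced with placeholder scores; the values 2.0 and 1.0 below are hypothetical:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

a, b = 2.0, 1.0                            # hypothetical fully connected outputs
p = softmax(np.array([a, b]))              # approx. [0.731, 0.269]
semantic_type = [1, 0][int(np.argmax(p))]  # a maps to type 1, b to type 0 -> type 1
```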
In one of the embodiments, the neural network includes the first neural network, the second neural network, and the third neural network, and the training process of the neural network model includes: obtaining a sample text and its known label; determining the sample text characters contained in the sample text and the sample text segmentation to which each sample text character belongs; calculating the sample word vector corresponding to each sample text character and the sample word vector corresponding to each sample text segmentation, and splicing the sample word vector corresponding to each sample text character with the sample word vector of the sample text segmentation to which it belongs, to obtain the sample splicing vector of the corresponding sample text character; inputting the sample word vectors and sample splicing vectors corresponding to the multiple sample text characters sequentially into the first neural network to be trained, according to the forward order in which the sample text characters appear in the sample text, to obtain a sample first text feature; and inputting the sample word vectors and sample splicing vectors corresponding to the multiple sample text characters sequentially into the second neural network to be trained, according to the reverse order in which the sample text characters appear in the sample text, to obtain a sample second text feature;
inputting the sample integrated text feature, obtained by splicing the sample first text feature and the sample second text feature, into the third neural network to be trained, to obtain a predicted semantic type of the sample text; calculating a loss value according to the predicted semantic type and the known label, and propagating the loss value to each layer of the neural network model through a backward gradient propagation method to obtain the gradient of the parameters of each layer; and adjusting the parameters of each layer in the neural network model according to the gradients until the determined loss value reaches a training stop condition.
Specifically, adjusting the parameters of each layer in the neural network model includes adjusting the weight parameters of the fully connected layer and the weight parameters and bias parameters of each hidden layer in the first neural network and the second neural network. The function for calculating the loss value may be a cross-entropy loss function. The backward gradient propagation method may be batch gradient descent (BGD), mini-batch gradient descent (MBGD), or stochastic gradient descent (SGD).
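The following sketch shows one possible realization of this training loop with a cross-entropy loss and plain SGD, collapsing all trainable parameters into a single classification matrix for brevity; the learning rate, the stop threshold, and the placeholder feature are assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(3)
W = 0.1 * rng.normal(size=(2, 8))      # stand-in for all trainable layer parameters

feature = rng.normal(size=8)           # sample integrated text feature (placeholder)
label = 1                              # known label of the sample text

lr = 0.1                               # assumed learning rate
for step in range(1000):               # stochastic gradient descent (SGD)
    probs = softmax(W @ feature)
    loss = -np.log(probs[label])       # cross-entropy loss against the known label
    grad = probs.copy()
    grad[label] -= 1.0                 # gradient of the loss w.r.t. the logits
    W -= lr * np.outer(grad, feature)  # backward gradient parameter update
    if loss < 1e-3:                    # training stop condition on the loss value
        break
```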
In this embodiment, the attention mechanism layer of the third neural network weights the integrated text feature, highlighting the features that carry higher mutual information between text characters and weakening those that carry lower mutual information. Further, the random inactivation layer performs sparse processing on the weighted feature to obtain a sparse feature, and the fully connected layer performs a classification operation on the sparse feature to obtain the prediction probability corresponding to each semantic type; the semantic type corresponding to the largest prediction probability is selected as the semantic type of the target text. Using the integrated text feature to characterize the semantic features of the target text, together with the weighting and sparse processing, enriches the contextual semantics of the text characters, reduces the computation load of the computer device, and improves the classification accuracy for the target sample.
It should be understood that although the steps in the flowcharts of FIGS. 2 and 3 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in FIGS. 2 and 3 may include multiple sub-steps or multiple stages; these sub-steps or stages are not necessarily completed at the same time but may be executed at different times, and their execution order is not necessarily sequential, as they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 4, a text semantic recognition device 400 is provided, including: a vector calculation module 402, a vector splicing module 404, a first text feature acquisition module 406, a second text feature acquisition module 408, and a semantic type acquisition module 410, wherein:
The vector calculation module 402 is used to calculate the word vector of each text character in the target text and the word vector of each text segmentation.
The vector splicing module 404 is used to splice the word vector of each text character with the word vector of the text segmentation to which it belongs, to obtain the splicing vector of the corresponding text character.
The first text feature acquisition module 406 is used to sequentially input the word vectors and splicing vectors corresponding to the multiple text characters into different hidden layers of the first neural network according to the forward order in which the text characters appear in the target text, to obtain a first text feature of the target text based on the forward order of appearance.
The second text feature acquisition module 408 is used to sequentially input the word vectors and splicing vectors corresponding to the multiple text characters into different hidden layers of the second neural network according to the reverse order in which the text characters appear in the target text, to obtain a second text feature of the target text based on the reverse order of appearance.
The semantic type acquisition module 410 is used to input the integrated text feature, obtained by splicing the first text feature and the second text feature, into the third neural network to obtain the semantic type of the target text.
In one embodiment, the first neural network, according to the forward order in which the text characters appear in the target text, inputs the word vector and splicing vector corresponding to the current-order text character into the current hidden layer of the first neural network, inputs the character feature output by the current hidden layer together with the word vector and splicing vector corresponding to the next-order text character into the next hidden layer of the first neural network, and takes the next hidden layer as the current hidden layer to iterate until the last-order text character, obtaining the first text feature of the target text based on the forward order of appearance.
In one embodiment, as shown in FIG. 5, the device further includes a preset sample generation module 412, which extracts the word vectors and the word vectors of the text segmentations of the sample text based on the pre-trained first neural network layer, assigns character numbers to these vectors respectively, and writes the vectors and their corresponding character numbers into a preset file. Calculating the word vector of each text character and the word vector corresponding to each text segmentation includes: assigning a character number to each text character and each text segmentation; and, based on the character numbers, reading from the preset file the word vector corresponding to each text character and the word vector corresponding to each text segmentation.
In one embodiment, the first text feature acquisition module is further used to: input the word vector and splicing vector corresponding to the current-order text character into the current hidden layer of the first neural network according to the forward order in which the text characters appear in the target text; input the character feature output by the current hidden layer together with the word vector and splicing vector corresponding to the next-order text character into the next hidden layer of the first neural network; and take the next hidden layer as the current hidden layer and return to the step of inputting the character feature output by the current hidden layer together with the word vector and splicing vector corresponding to the next-order text character into the next hidden layer of the first neural network, until the last-order text character, to obtain the first text feature of the target text based on the forward order of appearance.
In one embodiment, the second text feature acquisition module is further used to: input the first text feature and the word vector and splicing vector corresponding to the first-order text character into the first hidden layer of the second neural network according to the reverse order in which the text characters appear in the target text; input the output of the first hidden layer and the word vector and splicing vector corresponding to the current-order text character into the current hidden layer of the second neural network; input the character feature output by the current hidden layer together with the word vector and splicing vector corresponding to the next-order text character into the next hidden layer of the second neural network; and take the next hidden layer as the current hidden layer and return to the step of inputting the character feature output by the current hidden layer together with the word vector and splicing vector corresponding to the next-order text character into the next hidden layer of the second neural network, until the last-order text character, to obtain the second text feature of the target text based on the reverse order of appearance.
In one embodiment, the first text feature acquisition module is further used to take the word vector and the output of the previous hidden layer as the input of the first sub-hidden layer, the first sub-hidden layer being used to project the word vector according to the weights of the respective neuron nodes of the first sub-hidden layer to obtain a first sub-character feature; and to take the first sub-character feature and the splicing vector as the input of the second sub-hidden layer, the second sub-hidden layer being used to project the splicing vector according to the weights of the respective neuron nodes of the second sub-hidden layer to obtain a second sub-character feature, which serves as the output of the current hidden layer.
In one embodiment, the first text feature acquisition module is further used to take the first text feature as the input of the random inactivation layer, which projects each piece of data in the first text feature according to a preset sparse probability to obtain a sparse feature vector that serves as the output of the first neural network.
In one embodiment, the semantic type acquisition module is further used to: splice the first text feature and the second text feature to obtain the integrated text feature of the target text; take the integrated text feature as the input of the attention mechanism layer, which weights each piece of data in the integrated text feature to obtain a weighted feature; take the weighted feature as the input of the random inactivation layer, which projects each piece of data in the weighted feature according to a preset sparse probability to obtain a sparse feature; take the sparse feature as the input of the fully connected layer, which performs a classification operation on the sparse feature to obtain the prediction probability corresponding to each semantic type; and select the semantic type with the largest prediction probability as the semantic type of the target text.
In the above embodiments, the word vector corresponding to each text character and the word vector of the text segmentation to which it belongs are obtained by calculation and spliced to obtain the splicing vector corresponding to the text character; representing the text through multiple kinds of feature vectors enhances the feature dimensions of the text representation. Further, by inputting the word vectors and splicing vectors into different hidden layers of different neural networks in forward and reverse order, the relevant information of the text characters can be obtained more fully and the contextual semantics between text characters can be mined, so that the integrated feature obtained by splicing the first text feature and the second text feature output by the neural networks expresses the semantic features of the target text more fully, improving the accuracy of text semantic recognition.
For the specific limitations of the text semantic recognition device, reference may be made to the limitations of the text semantic recognition method above, which will not be repeated here. Each module in the above text semantic recognition device may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of the processor of the computer device in the form of hardware, or stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store the preset file as well as the word vectors corresponding to the text characters contained in the target text and the word vectors corresponding to the text segmentations. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a text semantic recognition method.
Those skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the text semantic recognition method provided in any one of the embodiments of the present application.
A computer-readable storage medium stores a computer program thereon, wherein the computer program, when executed by a processor, implements the steps of the text semantic recognition method provided in any one of the embodiments of the present application.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through a computer program, and the computer program can be stored in a computer-readable storage medium, where the computer-readable storage medium may be non-volatile or volatile. When executed, the computer program may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that can easily be conceived by a person skilled in the art within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.