CN114758339B - Method and device for acquiring character recognition model, computer equipment and storage medium - Google Patents

Publication number
CN114758339B
Authority
CN
China
Prior art keywords
character
recognition model
pictures
fused
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210671644.8A
Other languages
Chinese (zh)
Other versions
CN114758339A (en)
Inventor
杨帆
刘枢
陈帅
王杰
李耀
徐威
孙宇君
吕江波
沈小勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Simou Intelligent Technology Co ltd
Shenzhen Smartmore Technology Co Ltd
Original Assignee
Suzhou Simou Intelligent Technology Co ltd
Shenzhen Smartmore Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Simou Intelligent Technology Co ltd, Shenzhen Smartmore Technology Co Ltd
Priority to CN202210671644.8A
Publication of CN114758339A
Application granted
Publication of CN114758339B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The application relates to a method and a device for acquiring a character recognition model, computer equipment and a storage medium. The method comprises: acquiring a group of pictures to be processed, a character position label group, a character table and character fonts; obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting them into a plurality of character string pictures; fusing, according to the character position label group, each picture to be processed with a corresponding character string picture to obtain a plurality of fused pictures, and obtaining a plurality of fused character position labels based on the character position label group and the groups of expected character strings; and training a preset character recognition model on the group of pictures to be processed, the fused pictures and the fused character position labels to obtain the character recognition model. The method enlarges the number of training samples and thereby improves the recognition accuracy of the resulting character recognition model.

Description

Method and device for acquiring character recognition model, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for acquiring a character recognition model, a computer device, and a storage medium.
Background
As demand for character recognition grows, so do the requirements on its efficiency and accuracy. Existing methods for acquiring a character recognition model train the model on roughly several hundred labeled pictures.
In industrial scenarios, however, most characters to be recognized are part information or serial numbers. It is often difficult to collect pictures covering all characters to be recognized, or only a small number of pictures are available for training, and a professional is needed to complete the training. Too little training data can cause the model to overfit, so that it does not generalize well enough to cope with noise, blurring, scratches, uneven illumination and the like, and it also struggles to recognize characters absent from the training set. Conventional methods for acquiring a character recognition model therefore suffer from low recognition accuracy when only a few sample pictures are available or the character set is incomplete.
Disclosure of Invention
In view of this, and to address the low recognition accuracy of conventional methods when only a few sample pictures are available or the characters are incomplete, it is necessary to provide a method, an apparatus, a computer device, a computer-readable storage medium and a computer program product for acquiring a character recognition model that can improve character recognition accuracy.
In a first aspect, the present application provides a method for acquiring a character recognition model. The method comprises the following steps:
acquiring a group of pictures to be processed, a character position marking group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position mark in the character position marking group;
obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting the plurality of groups of expected character strings into a plurality of character string pictures;
according to the character position labeling group, fusing each picture to be processed in the picture group to be processed with a corresponding character string picture in a plurality of character string pictures respectively to obtain a plurality of fused pictures, and obtaining a plurality of fused character position labels based on the character position labeling group and a plurality of groups of expected character strings, wherein one fused picture corresponds to at least one fused character label in the plurality of fused character position labels;
training is carried out based on the picture group to be processed, the multiple fused pictures, the multiple fused character position labels and a preset character recognition model, and a character recognition model is obtained.
In one embodiment, obtaining a plurality of sets of expected character strings according to a character table and a character font includes:
acquiring a character format and the number of fused pictures, wherein the character format comprises at least one of the following: character length, character pixel size, character mirror, character underline, and character shadow;
arranging and combining characters in a character table to obtain a plurality of groups of character strings with different sequences;
according to character fonts and character formats, performing format processing on each group of character strings in a plurality of groups of character strings in different orders respectively to obtain a plurality of groups of character strings after format processing;
and extracting, from the plurality of groups of format-processed character strings, a number of groups of character strings matching the number of fused pictures, to serve as the plurality of groups of expected character strings.
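The embodiment above (arrange characters, apply font and format, then extract as many strings as fused pictures are needed) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ExpectedString` record, the function name `generate_expected_strings`, the choice of pixel size and underline as the format attributes, and random sampling as the extraction rule are all hypothetical and are not specified by the application.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedString:
    """Hypothetical record for one formatted expected character string."""
    text: str         # the character sequence
    font: str         # font identifier (assumed to be a plain name)
    pixel_size: int   # one "character format" attribute
    underline: bool   # another "character format" attribute


def generate_expected_strings(char_table, fonts, length, pixel_sizes,
                              num_fused_pictures, seed=0):
    """Permute the character table, attach font and format, then sample."""
    rng = random.Random(seed)
    # Step 1: arrange and combine characters into differently ordered strings.
    orderings = ["".join(p) for p in itertools.permutations(char_table, length)]
    # Step 2: apply every font and format combination to every ordering.
    formatted = [
        ExpectedString(text, font, size, underline)
        for text in orderings
        for font in fonts
        for size in pixel_sizes
        for underline in (False, True)
    ]
    # Step 3: keep as many strings as fused pictures are requested.
    count = min(num_fused_pictures, len(formatted))
    return rng.sample(formatted, count)
```

The number of permutations grows very quickly with the size of the character table, so a practical variant would probably draw random strings directly instead of enumerating every ordering first.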
In one embodiment, training is performed based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels, and the preset character recognition model, so as to obtain the character recognition model, including:
acquiring a preset training round number and a model adjusting instruction;
adjusting the preset character recognition model based on the model adjusting instruction to obtain an adjusted character recognition model;
and inputting the group of pictures to be processed, the plurality of fused pictures and the plurality of fused character position labels into the adjusted character recognition model for training, and obtaining the character recognition model after training of the preset training round number.
In one embodiment, the model adjustment instruction includes at least one of: a large/small model switching instruction and a preset character range instruction;
adjusting the preset character recognition model based on the model adjustment instruction to obtain the adjusted character recognition model comprises:
if the model adjustment instruction comprises the large/small model switching instruction, performing model switching on the preset character recognition model according to the large/small model switching instruction to obtain the adjusted character recognition model;
if the model adjustment instruction comprises the preset character range instruction, adjusting the structure of the preset character recognition model according to the preset character range instruction to obtain the adjusted character recognition model;
if the model adjustment instruction comprises both the large/small model switching instruction and the preset character range instruction, performing model switching on the preset character recognition model according to the large/small model switching instruction to obtain an initially adjusted character recognition model, and adjusting the structure of the initially adjusted character recognition model according to the preset character range instruction to obtain the adjusted character recognition model.
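The three cases above reduce to applying the model switch first, when present, and the character range adjustment second, when present. A minimal sketch follows. `ModelConfig` and `AdjustInstruction` are hypothetical names, and treating "adjusting the structure" as resizing the output layer to the number of distinct characters is an assumption, not something the application states.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ModelConfig:
    size: str = "small"     # "small" or "large" variant (hypothetical)
    num_classes: int = 36   # output width (hypothetical default)


@dataclass(frozen=True)
class AdjustInstruction:
    switch_to: Optional[str] = None    # model switching instruction, if any
    char_range: Optional[str] = None   # preset character range, if any


def adjust_model(config, instruction):
    """Apply the switch first, then fit the output to the character range."""
    if instruction.switch_to is not None:
        if instruction.switch_to not in ("small", "large"):
            raise ValueError("switch_to must be 'small' or 'large'")
        config = replace(config, size=instruction.switch_to)
    if instruction.char_range is not None:
        # Assumption: one output class per distinct character in the range.
        config = replace(config, num_classes=len(set(instruction.char_range)))
    return config
```

Because both checks run in sequence, the combined case needs no separate branch: the switched configuration is simply passed on to the range adjustment.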
In one embodiment, training is performed based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels, and the preset character recognition model, so as to obtain the character recognition model, including:
respectively carrying out image preprocessing on each to-be-processed picture in the to-be-processed picture group and each fused picture in the fused pictures to obtain a plurality of preprocessed pictures, wherein the image preprocessing comprises at least one of the following steps: picture cutting processing, image rotation processing and image binarization processing;
dividing a plurality of preprocessed pictures into a training set and a test set;
inputting the training set and the corresponding character position labels into a preset character recognition model for training to obtain an initial character recognition model;
and inputting the test set into the initial character recognition model for testing to obtain the character recognition model.
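As a rough illustration of the preprocessing and splitting steps in this embodiment, the sketch below works on grayscale pictures stored as nested lists of pixel values. The function names, the fixed binarization threshold of 128 and the 20% test ratio are assumptions chosen for the example; the application does not specify them.

```python
import random


def binarize(gray, threshold=128):
    """Image binarization: map each grayscale pixel to 0 or 255."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def crop(gray, top, left, height, width):
    """Picture cutting: keep only a rectangular region of the picture."""
    return [row[left:left + width] for row in gray[top:top + height]]


def split_train_test(samples, test_ratio=0.2, seed=0):
    """Shuffle the preprocessed samples and split them into two sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one test sample whenever there is more than one sample.
    n_test = max(1, int(len(shuffled) * test_ratio)) if len(shuffled) > 1 else 0
    return shuffled[n_test:], shuffled[:n_test]
```

In use, the original pictures and the fused pictures would be preprocessed in the same way before the split, so that the training set and the test set share one input format.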
In one embodiment, inputting the test set into an initial character recognition model for testing to obtain a character recognition model, includes:
obtaining a model threshold, wherein the model threshold comprises at least one of the following: a confidence threshold and a character size threshold;
testing the initial character recognition model based on each sample in the test set to obtain a test result corresponding to each sample in the test set;
if the test result does not meet the preset accuracy, adjusting the confidence threshold and returning to the step of inputting the training set and the corresponding character position labels into the preset character recognition model for training to obtain an initial character recognition model;
if the test result does not meet the preset definition, adjusting the character size threshold corresponding to the plurality of fused pictures to obtain a plurality of threshold-adjusted pictures, taking the threshold-adjusted pictures as the plurality of fused pictures, and returning to the step of performing image preprocessing on each picture to be processed in the group of pictures to be processed and on each fused picture to obtain a plurality of preprocessed pictures;
and if the test result meets both the preset definition and the preset accuracy, taking the initial character recognition model as the character recognition model.
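The test-and-adjust procedure above is effectively a loop that ends once both the preset accuracy and the preset definition are met. The sketch below shows only that control flow. The callbacks `train`, `evaluate` and `regenerate`, the step sizes, the target values and the iteration cap are all hypothetical placeholders rather than values taken from the application.

```python
def fit_with_thresholds(train, evaluate, regenerate, conf_threshold=0.5,
                        size_threshold=32, target_accuracy=0.95,
                        target_definition=0.9, max_iterations=10):
    """Retrain or rebuild the fused pictures until both criteria are met."""
    for _ in range(max_iterations):
        model = train(conf_threshold)
        accuracy, definition = evaluate(model)
        if accuracy >= target_accuracy and definition >= target_definition:
            return model, conf_threshold, size_threshold
        if accuracy < target_accuracy:
            # Accuracy too low: adjust the confidence threshold, then retrain.
            conf_threshold = min(conf_threshold + 0.05, 0.95)  # assumed step
        if definition < target_definition:
            # Definition too low: adjust the character size threshold and
            # rebuild the fused pictures before preprocessing again.
            size_threshold += 8  # assumed step
            regenerate(size_threshold)
    raise RuntimeError("criteria not met within max_iterations")
```

The iteration cap is an addition for safety in the sketch; the application describes returning to the earlier steps without stating a limit.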
In one embodiment, obtaining a plurality of fused character position labels based on the character position label group and the plurality of groups of expected character strings includes:
acquiring the length of each expected character string in a plurality of groups of expected character strings;
cropping each character position label in the character position label group to the length of the corresponding expected character string, to obtain the plurality of fused character position labels.
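One possible reading of this step is that the fused label keeps the origin and height of the original label but takes its width from the rendered length of the expected character string. The sketch below follows that reading. The axis-aligned `Box` type, the fixed per-character width and the function name `fused_label` are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned label box in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


def fused_label(original, text, char_width):
    """Return a box anchored at the original label that spans `text`."""
    # Assumption: every character occupies the same horizontal extent.
    return Box(original.x, original.y, len(text) * char_width, original.height)
```

With a proportional font, the width would instead come from measuring the rendered character string picture.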
In one embodiment, the step of obtaining the character position label group includes:
determining, for each picture to be processed in the group of pictures to be processed, whether the picture corresponds to a character position label; if so, adding the at least one character position label corresponding to the picture to the character position label group; if not, labeling the picture with a character labeling algorithm to obtain at least one character position label corresponding to the picture, and adding that at least one character position label to the character position label group.
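The branch described above, reusing an existing label when there is one and otherwise falling back to automatic labeling, can be sketched as follows. The dictionary layout keyed by picture identifier and the `auto_label` callback standing in for the character labeling algorithm are hypothetical.

```python
def build_label_group(pictures, existing_labels, auto_label):
    """Map each picture id to its labels, labeling automatically if missing."""
    group = {}
    for picture_id, picture in pictures.items():
        labels = existing_labels.get(picture_id)
        if not labels:
            # No label supplied: fall back to the character labeling algorithm.
            labels = auto_label(picture)
        group[picture_id] = list(labels)
    return group
```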
In a second aspect, the application further provides an apparatus for acquiring a character recognition model. The device comprises:
the data acquisition module is used for acquiring a group of pictures to be processed, a character position marking group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position mark in the character position marking group;
the character string picture acquisition module is used for acquiring a plurality of groups of expected character strings according to the character table and the character fonts and converting the plurality of groups of expected character strings into a plurality of character string pictures;
the data fusion module is used for fusing each picture to be processed in the picture group to be processed with corresponding character string pictures in the character string pictures respectively according to the character position labeling group to obtain a plurality of fused pictures, and acquiring a plurality of fused character position labels based on the character position labeling group and a plurality of groups of expected character strings, wherein one fused picture corresponds to at least one fused character label in the plurality of fused character position labels;
and the model training module is used for training based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels and a preset character recognition model to obtain the character recognition model.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program:
acquiring a group of pictures to be processed, a character position marking group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position mark in the character position marking group; obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting the plurality of groups of expected character strings into a plurality of character string pictures; according to the character position labeling group, fusing each picture to be processed in the picture group to be processed with a corresponding character string picture in a plurality of character string pictures respectively to obtain a plurality of fused pictures, and obtaining a plurality of fused character position labels based on the character position labeling group and a plurality of groups of expected character strings, wherein one fused picture corresponds to at least one fused character label in the plurality of fused character position labels; training is carried out based on the picture group to be processed, the multiple fused pictures, the multiple fused character position labels and a preset character recognition model, and a character recognition model is obtained.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring a group of pictures to be processed, a character position marking group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position mark in the character position marking group; obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting the plurality of groups of expected character strings into a plurality of character string pictures; according to the character position labeling group, fusing each picture to be processed in the picture group to be processed with a corresponding character string picture in a plurality of character string pictures respectively to obtain a plurality of fused pictures, and obtaining a plurality of fused character position labels based on the character position labeling group and a plurality of groups of expected character strings, wherein one fused picture corresponds to at least one fused character label in the plurality of fused character position labels; training is carried out based on the picture group to be processed, the multiple fused pictures, the multiple fused character position labels and a preset character recognition model, and a character recognition model is obtained.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which when executed by a processor performs the steps of:
acquiring a group of pictures to be processed, a character position marking group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position mark in the character position marking group; obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting the plurality of groups of expected character strings into a plurality of character string pictures; according to the character position labeling group, fusing each picture to be processed in the picture group to be processed with a corresponding character string picture in a plurality of character string pictures respectively to obtain a plurality of fused pictures, and obtaining a plurality of fused character position labels based on the character position labeling group and a plurality of groups of expected character strings, wherein one fused picture corresponds to at least one fused character label in the plurality of fused character position labels; training is carried out based on the picture group to be processed, the multiple fused pictures, the multiple fused character position labels and a preset character recognition model, and a character recognition model is obtained.
With the above method, apparatus, computer device, storage medium and computer program product for acquiring a character recognition model, a plurality of groups of expected character strings are generated from the acquired character table and character fonts, and a plurality of fused pictures and fused character position labels are obtained by fusing one or more pictures to be processed, their character position labels and the groups of expected character strings. A large number of fused pictures can thus be generated even when only a few pictures are available or the characters are incomplete, producing ample samples for model training. Training the preset character recognition model on the group of pictures to be processed, the fused pictures and the fused character position labels means the model learns from a large sample set derived from a small number of pictures, which increases the number of samples and can improve the recognition accuracy of the resulting character recognition model.
Drawings
FIG. 1 is a diagram illustrating an application environment of a method for obtaining a character recognition model in one embodiment;
FIG. 2 is a flowchart illustrating a method for obtaining a character recognition model according to an embodiment;
FIG. 3 is a flowchart illustrating a method for obtaining a character recognition model according to another embodiment;
FIG. 4 is a flowchart illustrating a method for obtaining a character recognition model according to another embodiment;
FIG. 5 is a flowchart illustrating a method for acquiring a character recognition model according to still another embodiment;
FIG. 6 is a schematic diagram illustrating a sub-flow of S880 according to an embodiment;
FIG. 7 is a flowchart illustrating the steps for obtaining a character recognition model according to one embodiment;
FIG. 8 is a flowchart illustrating a character recognition model obtaining step in another embodiment;
FIG. 9 is a flowchart illustrating the character recognition model obtaining step in yet another embodiment;
FIG. 10 is a block diagram showing an arrangement for acquiring a character recognition model according to an embodiment;
FIG. 11 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The method for acquiring the character recognition model provided by the embodiments of the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be integrated on the server 104 or located on the cloud or another network server. The terminal 102 acquires a group of pictures to be processed, a character position label group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position label in the character position label group; obtains a plurality of groups of expected character strings according to the character table and the character fonts, and converts them into a plurality of character string pictures; fuses, according to the character position label group, each picture to be processed with a corresponding character string picture to obtain a plurality of fused pictures, and obtains a plurality of fused character position labels based on the character position label group and the groups of expected character strings, wherein one fused picture corresponds to at least one fused character position label; trains based on the group of pictures to be processed, the fused pictures, the fused character position labels and a preset character recognition model to obtain a character recognition model; and sends the character recognition model to the server 104.
The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in FIG. 2, a method for acquiring a character recognition model is provided. Taking the method as applied to the terminal 102 in FIG. 1 as an example, it includes the following steps:
S200, acquiring a group of pictures to be processed, a character position label group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position label in the character position label group.
The picture to be processed may be a picture containing characters, or a picture from which the characters have been removed (cut out). The group of pictures to be processed includes at least one picture to be processed. The character position labels are obtained by labeling, with a picture labeling tool, the picture containing characters or the picture as it was before the characters were removed. During labeling, each character may be labeled individually to obtain a character position label per character, or any combination of characters may be labeled to obtain a character position label per character combination. The character position labels form the character position label group, which includes at least one character position label, and each picture to be processed corresponds to at least one character position label in the group. The character table may consist of all characters that can appear in the pictures to be processed, or of characters chosen according to the task requirements; it may prescribe a fixed character order or leave the order unrestricted, and it includes at least one character. The character font may be the font appearing in the picture to be processed or a font required by the task, and there is at least one type of character font.
S400, obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting the plurality of groups of expected character strings into a plurality of character string pictures.
The characters in the character table are combined arbitrarily to generate a plurality of character string combinations. Each character string combination may be assigned one of the character fonts, producing character strings in each of the fonts; alternatively, each character within a combination may be assigned its own font, producing groups of character strings that mix several fonts. Together, these single-font and mixed-font character strings constitute the plurality of groups of expected character strings. A picture conversion method then converts each group of expected character strings into a corresponding character string picture, or converts any several groups of expected character strings into a single character string picture, so that each character string picture includes at least one group of expected character strings.
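To make the conversion of a character string into a character string picture concrete, the sketch below rasterizes text with a toy 3x5 bitmap font. The `GLYPHS` table and the function `render_string` are purely hypothetical stand-ins: a real implementation would rasterize the actual character fonts with an image library, which the application does not name.

```python
# Toy 3x5 glyphs standing in for a real font; "1" is ink, "0" is background.
GLYPHS = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}


def render_string(text, glyphs=GLYPHS, spacing=1):
    """Rasterize `text` into rows of 0/1 pixels (a character string picture)."""
    height = len(next(iter(glyphs.values())))
    rows = []
    for r in range(height):
        # Take row r of every glyph and join them with blank spacing columns.
        parts = [glyphs[ch][r] for ch in text]
        line = ("0" * spacing).join(parts)
        rows.append([int(px) for px in line])
    return rows
```

Mixed-font strings, as described above, would correspond to looking up each character in a different glyph table.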
S600, according to the character position labeling group, fusing each to-be-processed picture in the to-be-processed picture group with a corresponding character string picture in a plurality of character string pictures respectively to obtain a plurality of fused pictures, and obtaining a plurality of fused character position labels based on the character position labeling group and a plurality of groups of expected character strings, wherein one fused picture corresponds to at least one fused character label in the plurality of fused character position labels.
A character position label records the position of a character or character string in a picture to be processed. Fusing each picture to be processed with a corresponding character string picture according to the character position label group means embedding the character string picture at the position of the character or character string in that picture. The fusion method may be an image fusion algorithm, preferably a Poisson fusion algorithm, which, compared with conventional image fusion algorithms, can select the fusion region accurately and yields a picture that is seamless at the boundary. A character position label includes a character position label box, which may have any shape and has a specific size. A fused character position label is the character position label corresponding to a fused picture. Because the character string picture is fused at the character or character string position of the picture to be processed, the character string in the fused picture may extend beyond the label box or be much smaller than it. The fused character position labels therefore need to be obtained anew from the character position label group and the groups of expected character strings, and one fused picture corresponds to at least one fused character position label.
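The embedding step can be illustrated with a deliberately simplified sketch that copies a character string picture into the picture to be processed at the labeled position. Note that this is a plain pixel paste, not the Poisson fusion the description prefers, so it leaves visible seams at the boundary. For Poisson blending, a library routine such as OpenCV's `seamlessClone` would be one option. The function name and the nested-list picture format are assumptions.

```python
def paste_into_box(picture, patch, top, left):
    """Copy `patch` into `picture` at (top, left) and return a new picture."""
    fused = [row[:] for row in picture]  # leave the input picture untouched
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            y, x = top + r, left + c
            # Clip anything that would fall outside the target picture.
            if 0 <= y < len(fused) and 0 <= x < len(fused[y]):
                fused[y][x] = value
    return fused
```

The clipping also shows why the labels must be recomputed after fusion: the pasted character string need not have the same extent as the original label box.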
And S800, training based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels and a preset character recognition model to obtain the character recognition model.
The character recognition model is a machine learning model built with a character recognition algorithm. It recognizes characters in an image, and the recognition result generally includes the character content in the image. Common character recognition algorithms include template matching, neural network, and support vector machine character recognition algorithms. The preset character recognition model is a basic model prestored in the terminal and used for model training. Training based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels and the preset character recognition model yields a trained machine learning model, and this trained model is the character recognition model.
In the above method for acquiring the character recognition model, a plurality of groups of expected character strings are generated from the acquired character table and character font, and a plurality of fused pictures and a plurality of fused character position labels are obtained by a fusion method from one or more to-be-processed pictures, the character position labels corresponding to the to-be-processed pictures, and the plurality of groups of expected character strings. A large number of fused pictures can thus be generated even when only a small number of pictures are available or the characters in them are incomplete, so that a large number of samples can be produced for model training. The character recognition model is then obtained by training on the to-be-processed picture group, the large number of sample pictures obtained by fusion, and the corresponding fused character position labels, together with the preset character recognition model; the increased number of samples can improve the character recognition accuracy of the obtained character recognition model.
In one embodiment, as shown in fig. 3, obtaining multiple sets of expected character strings according to the character table and the character font includes:
S420, acquiring a character format and a number of fused pictures, wherein the character format comprises at least one of the following: character length, character pixel size, character mirror image, character underline, and character shadow;
S440, arranging and combining the characters in the character table to obtain a plurality of groups of character strings with different orders;
S460, according to the character font and the character format, performing format processing on each group of character strings in the plurality of groups of character strings with different orders respectively to obtain a plurality of groups of format-processed character strings;
S480, extracting, from the plurality of groups of format-processed character strings, a plurality of groups of character strings matching the number of fused pictures to serve as the plurality of groups of expected character strings.
In this embodiment, in order to obtain the plurality of groups of expected character strings, a character format and a number of fused pictures may be acquired in advance, and expected character strings conforming to both are then obtained. The character format comprises at least one of the following: character length, character pixel size, character mirror image, character underline, and character shadow. The character length is the number of characters in a character string; the character pixel size is the pixel size of the characters in a character string picture; character mirror image means performing mirror processing on the characters; character underline means adding an underline to the characters; and character shadow means adding a shadow to the characters. The number of fused pictures is the number of fused pictures generated from one to-be-processed picture. Arranging and combining the characters in the character table yields a plurality of groups of character strings with different orders.
Format processing is performed on each group of character strings in the plurality of groups of character strings with different orders according to the character font and the character format to obtain a plurality of groups of format-processed character strings. Specifically, the character font comprises one or more fonts, and font conversion is performed on each group of character strings according to the character font to obtain character strings corresponding to the different fonts. If the character format comprises a character length, character strings of that length are extracted from the character strings corresponding to the different fonts to obtain length-processed character strings, which are added to the format-processed character strings. If the character format comprises a character pixel size, each group of character strings corresponding to the different fonts is converted to that character pixel size to obtain size-processed character strings, which are added to the format-processed character strings. If the character format comprises a character mirror image, mirror processing is performed on each group of character strings corresponding to the different fonts to obtain mirror-processed character strings, which are added to the format-processed character strings. If the character format comprises a character underline, an underline is added to each group of character strings corresponding to the different fonts to obtain underline-processed character strings, which are added to the format-processed character strings. If the character format comprises a character shadow, a shadow is added to each group of character strings corresponding to the different fonts to obtain shadow-processed character strings, which are added to the format-processed character strings.
The number of fused pictures per to-be-processed picture is multiplied by the number of to-be-processed pictures to determine the total number of fused pictures, and for each to-be-processed picture in the to-be-processed picture group, character strings matching the number of fused pictures are extracted from the format-processed character strings to serve as the plurality of groups of expected character strings.
According to the scheme of this embodiment, arranging and combining the characters in the character table yields a plurality of groups of character strings with different orders, which increases the number of character strings and hence the number of samples for model training, and helps improve the character recognition accuracy of the obtained character recognition model. Different character fonts and character formats are then set for each group of character strings; since the character fonts and character formats can be flexibly selected and adjusted according to user requirements, character strings meeting those requirements can be generated. The number of fused pictures corresponding to one to-be-processed picture can also be set, so that the total number of fused pictures is determined and a matching number of character strings can be extracted as the plurality of groups of expected character strings.
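Steps S440 to S480 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this embodiment: the names `FormattedString` and `generate_expected_strings` are hypothetical, only the character length and character pixel size formats are modeled, strings are permutations without repeated characters, and the font is carried as a plain name rather than a loaded font file.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FormattedString:
    """A character string together with the format it should be rendered in."""
    text: str        # character content
    font: str        # font name (illustrative label, not a loaded TTF file)
    pixel_size: int  # character pixel size used when rendering


def generate_expected_strings(char_table, fonts, length, pixel_size,
                              fused_per_picture, num_pictures, seed=0):
    # S440: arrange the characters into strings with different orders.
    ordered = ["".join(p) for p in itertools.permutations(char_table, length)]
    # S460: attach every font and the character format to every string.
    formatted = [FormattedString(s, f, pixel_size) for f in fonts for s in ordered]
    # S480: total count = fused pictures per picture x number of pictures.
    total = fused_per_picture * num_pictures
    if total > len(formatted):
        raise ValueError("not enough distinct strings for the requested count")
    # Draw the required number of strings without replacement.
    return random.Random(seed).sample(formatted, total)


expected = generate_expected_strings(
    char_table=["a", "b", "c", "1", "2"], fonts=["Song"],
    length=4, pixel_size=30, fused_per_picture=100, num_pictures=1)
```

With a five-character table and a character length of 4 there are 120 distinct orderings, so 100 strings can be drawn without repetition; a larger request would need more fonts, more formats, or strings with repeated characters.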
In an embodiment, as shown in fig. 4, training based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels, and the preset character recognition model to obtain the character recognition model includes:
S810, acquiring a preset number of training rounds and a model adjustment instruction;
S830, adjusting the preset character recognition model based on the model adjustment instruction to obtain an adjusted character recognition model;
S850, inputting the to-be-processed picture group, the plurality of fused pictures and the plurality of fused character position labels into the adjusted character recognition model for training, and obtaining the character recognition model after training for the preset number of training rounds.
In this embodiment, before training the preset character recognition model, the preset training round number and the model adjustment instruction may also be obtained, the model parameter or the model structure of the preset character recognition model is adjusted according to the content of the model adjustment instruction, the adjusted character recognition model is obtained, the group of pictures to be processed, the plurality of fused pictures and the plurality of fused character position labels are input to the adjusted character recognition model for training, and after training of the preset training round number, the character recognition model is obtained.
According to the scheme of the embodiment, the preset character recognition model is adjusted through the obtained model adjusting instruction, the number of preset training rounds is obtained, the picture group to be processed, the multiple fused pictures and the multiple fused character position labels are input into the adjusted character recognition model for training, and the character recognition model is obtained after training of the number of the preset training rounds.
In one embodiment, the model adjustment instruction includes at least one of: a size model switching instruction and a preset character range instruction. Adjusting the preset character recognition model based on the model adjustment instruction to obtain the adjusted character recognition model includes: if the model adjustment instruction comprises a size model switching instruction, performing model switching on the preset character recognition model according to the size model switching instruction to obtain the adjusted character recognition model; if the model adjustment instruction comprises a preset character range instruction, adjusting the structure of the preset character recognition model according to the preset character range instruction to obtain the adjusted character recognition model; if the model adjustment instruction comprises both a size model switching instruction and a preset character range instruction, performing model switching on the preset character recognition model according to the size model switching instruction to obtain an initially adjusted character recognition model, and adjusting the structure of the initially adjusted character recognition model according to the preset character range instruction to obtain the adjusted character recognition model.
In this embodiment, the model adjustment instruction includes at least one of the following: a size model switching instruction and a preset character range instruction. The size model switching instruction is used to switch the preset character recognition model between a large model and a small model. If the model adjustment instruction comprises a size model switching instruction, model switching is performed on the preset character recognition model according to that instruction to obtain the adjusted character recognition model. Specifically, if the size model switching instruction is an instruction to switch to the large model, the preset character recognition model is switched to a machine learning model with high power consumption, long training time and high precision, which is taken as the adjusted character recognition model; if the size model switching instruction is an instruction to switch to the small model, the preset character recognition model is switched to a machine learning model with low power consumption, short training time and low precision, which is taken as the adjusted character recognition model.
The preset character range instruction is used to set the type of output data of the preset character recognition model, and specifies one of: character type, number type, special symbol type, character plus number type, character plus special symbol type, number plus special symbol type, and character plus number plus special symbol type. If the model adjustment instruction comprises a preset character range instruction, the structure of the preset character recognition model is adjusted according to it, so that the type of the output data of the adjusted character recognition model is the type corresponding to the preset character range instruction. For example, if the preset character range instruction is the character type, the structure of the preset character recognition model is adjusted so that the output data of the adjusted character recognition model is of the character type; if the preset character range instruction is the character plus number type, the structure is adjusted so that the output data is of the character plus number type. If the model adjustment instruction comprises both a size model switching instruction and a preset character range instruction, model switching is performed on the preset character recognition model according to the size model switching instruction to obtain an initially adjusted character recognition model, and the structure of the initially adjusted character recognition model is then adjusted according to the preset character range instruction to obtain the adjusted character recognition model.
According to the scheme of the foregoing embodiment, the model adjustment instruction includes at least one of a size model switching instruction and a preset character range instruction, and model switching or structure adjustment is performed on the preset character recognition model through the model adjustment instruction to obtain the adjusted character recognition model. This can match the customized requirements of a user: an operator can switch between the large and small models and adjust the model structure merely by changing the model adjustment instruction, which simplifies operation and improves the efficiency of generating the character recognition model. At the same time, adjusting the parameters and structure of the preset character recognition model improves the character recognition accuracy of the obtained character recognition model.
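The order of adjustment described above (size switching first, then character range) can be sketched as follows. This is a hedged illustration only: `ModelConfig`, `AdjustInstruction`, `adjust_model` and the `CHARSETS` table are hypothetical names, the "special" symbol set is an arbitrary choice, and the model is represented by a configuration record rather than an actual network.

```python
import string
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Illustrative character ranges; the contents of "special" are an assumption.
CHARSETS = {
    "character": string.ascii_letters,
    "number": string.digits,
    "special": "!@#$%&*-+/",
}


@dataclass(frozen=True)
class ModelConfig:
    size: str = "small"                                      # "large" or "small"
    output_chars: str = string.ascii_letters + string.digits  # output alphabet


@dataclass(frozen=True)
class AdjustInstruction:
    size_switch: Optional[str] = None              # size model switching instruction
    char_range: Optional[Tuple[str, ...]] = None   # preset character range instruction


def adjust_model(preset: ModelConfig, instruction: AdjustInstruction) -> ModelConfig:
    adjusted = preset
    # First apply the size model switching instruction, if present.
    if instruction.size_switch is not None:
        if instruction.size_switch not in ("large", "small"):
            raise ValueError("size_switch must be 'large' or 'small'")
        adjusted = replace(adjusted, size=instruction.size_switch)
    # Then restrict the output type according to the preset character range.
    if instruction.char_range is not None:
        chars = "".join(CHARSETS[name] for name in instruction.char_range)
        adjusted = replace(adjusted, output_chars=chars)
    return adjusted


preset = ModelConfig()
adjusted = adjust_model(preset, AdjustInstruction("large", ("character", "number")))
```

Keeping the configuration immutable means the preset model is left untouched and each instruction produces a new adjusted configuration, which mirrors the "initially adjusted" and "adjusted" models in the text.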
In one embodiment, training is performed based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels, and the preset character recognition model to obtain the character recognition model, and the method further includes: acquiring a data enhancement instruction; performing data enhancement processing on each fused picture in the multiple fused pictures according to the data enhancement instruction to obtain multiple data enhanced pictures; and inputting the multiple data-enhanced pictures and the multiple fused character position labels into the adjusted character recognition model for training, and obtaining the character recognition model after training of a preset training round number.
In this embodiment, data enhancement is a method for expanding the scale of sample data in machine learning so that the model generalizes better. The data enhancement instruction is used to perform data enhancement processing on each of the plurality of fused pictures and includes at least one of the following: cropping, flip transformation, rotation, color transformation, geometric transformation, noise injection, translation, random erasing, kernel filters, image mixing, scaling transformation, feature space enhancement, adversarial generation, neural style transfer, and meta-learning data enhancement. Data enhancement processing is performed on each fused picture in the plurality of fused pictures according to the data enhancement instruction to obtain a plurality of data-enhanced pictures. For example, if the data enhancement instruction includes cropping, each fused picture in the plurality of fused pictures is cropped, and the resulting cropped pictures are the plurality of data-enhanced pictures. The plurality of data-enhanced pictures and the plurality of fused character position labels are input into the adjusted character recognition model for training, and the character recognition model is obtained after training for the preset number of training rounds.
According to the scheme of the embodiment, the data enhancement instruction is obtained, and the data enhancement instruction is adopted to perform data enhancement processing on each fused picture in the multiple fused pictures, so that the robustness of the obtained character recognition model can be improved by the obtained multiple data-enhanced pictures, and the character recognition accuracy of the obtained character recognition model can be improved.
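A data enhancement instruction can be pictured as a list of operation names applied in order to every fused picture. The sketch below is an assumption-laden illustration: pictures are plain nested lists of grayscale values, only three of the listed operations are modeled, the function names are hypothetical, and the corresponding adjustment of character position labels under geometric operations is omitted.

```python
import random


def flip_horizontal(img):
    """Mirror every row of the picture left to right."""
    return [row[::-1] for row in img]


def crop(img, top, left, height, width):
    """Keep the rectangle of the given size starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def add_noise(img, rng, amplitude=10):
    """Add bounded random noise and clamp the result to 0..255."""
    return [[min(255, max(0, p + rng.randint(-amplitude, amplitude)))
             for p in row] for row in img]


def enhance(images, instruction, seed=0):
    rng = random.Random(seed)
    ops = {
        "flip": flip_horizontal,
        # Illustrative crop: drop the last row and the last column.
        "crop": lambda im: crop(im, 0, 0, max(1, len(im) - 1), max(1, len(im[0]) - 1)),
        "noise": lambda im: add_noise(im, rng),
    }
    enhanced = []
    for im in images:
        for name in instruction:  # an unknown name raises KeyError
            im = ops[name](im)
        enhanced.append(im)
    return enhanced


flipped = enhance([[[1, 2, 3], [4, 5, 6]]], ["flip"])
```

In practice the enhanced pictures would usually be added alongside the original fused pictures rather than replacing them, so that the sample set grows.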
In one embodiment, as shown in fig. 5, training based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels, and the preset character recognition model to obtain the character recognition model includes:
S820, performing image preprocessing on each to-be-processed picture in the to-be-processed picture group and each fused picture in the plurality of fused pictures respectively to obtain a plurality of preprocessed pictures, wherein the image preprocessing comprises at least one of the following: picture cropping processing, image rotation processing and image binarization processing;
S840, dividing the plurality of preprocessed pictures into a training set and a test set;
S860, inputting the training set and the corresponding character position labels into the preset character recognition model for training to obtain an initial character recognition model;
S880, inputting the test set into the initial character recognition model for testing to obtain the character recognition model.
In this embodiment, the image preprocessing includes at least one of: picture cropping processing, image rotation processing and image binarization processing. Image preprocessing is performed on each to-be-processed picture in the to-be-processed picture group and each fused picture in the plurality of fused pictures respectively to obtain a plurality of preprocessed pictures. For example, if the image preprocessing includes image rotation processing, each to-be-processed picture and each fused picture is rotated, and the resulting rotated pictures are the plurality of preprocessed pictures. As another example, if the image preprocessing includes image rotation processing and image binarization processing, each to-be-processed picture and each fused picture is rotated to obtain a plurality of rotated pictures, image binarization processing is performed on the rotated pictures, and the resulting binarized pictures are the plurality of preprocessed pictures. The plurality of preprocessed pictures are used as sample data for training the preset character recognition model and are divided into a training set and a test set; the division ratio may be 8:2 or 7:3.
According to the scheme of this embodiment, image preprocessing is performed on each to-be-processed picture in the to-be-processed picture group and each fused picture in the plurality of fused pictures to obtain a plurality of preprocessed pictures, which are taken as sample data and divided into a training set and a test set. The preset character recognition model is trained with the training set and the corresponding character position labels to obtain an initial character recognition model, and the initial character recognition model is tested and optimized with the test set to obtain the character recognition model. Preprocessing the to-be-processed pictures and the fused pictures can effectively eliminate irrelevant information in the pictures and enhance their detectability, which helps improve the character recognition accuracy of the obtained character recognition model; dividing the preprocessed pictures into a training set and a test set allows the obtained character recognition model to be optimized, which further improves its character recognition accuracy.
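Steps S820 and S840 can be sketched as follows, as a minimal illustration rather than the implementation of this embodiment. The assumptions are that pictures are nested lists of grayscale values, that rotation is by 90 degrees clockwise, that binarization uses a fixed threshold of 128, and that the split is random; the function names are hypothetical.

```python
import random


def rotate_90(img):
    """Rotate the picture by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def binarize(img, threshold=128):
    """Map every pixel to 255 if it reaches the threshold, otherwise to 0."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


def preprocess(img):
    # S820: image rotation followed by image binarization.
    return binarize(rotate_90(img))


def split_samples(samples, train_ratio=0.8, seed=0):
    # S840: shuffle a copy of the samples and cut it at the training ratio.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


pictures = [[[i, 200], [50, 255 - i]] for i in range(10)]
preprocessed = [preprocess(p) for p in pictures]
train_set, test_set = split_samples(preprocessed, train_ratio=0.8)
```

With ten preprocessed pictures and a ratio of 8:2, eight pictures go to the training set and two to the test set.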
In one embodiment, as shown in fig. 6, inputting the test set to the initial character recognition model for testing to obtain the character recognition model includes:
S881, obtaining a model threshold, where the model threshold includes at least one of: a confidence threshold and a character size threshold;
S882, testing the initial character recognition model based on each sample in the test set to obtain a test result corresponding to each sample in the test set;
S883, if the test result does not meet the preset accuracy, adjusting the confidence threshold, and returning to the step of inputting the training set and the corresponding character position labels into the preset character recognition model for training to obtain an initial character recognition model;
S884, if the test result does not meet the preset definition, adjusting the character size threshold corresponding to the plurality of fused pictures to obtain a plurality of threshold-adjusted pictures, using the plurality of threshold-adjusted pictures as the plurality of fused pictures, and returning to the step of performing image preprocessing on each to-be-processed picture in the to-be-processed picture group and each fused picture in the plurality of fused pictures respectively to obtain a plurality of preprocessed pictures;
S885, if the test result meets the preset definition and the preset accuracy, taking the initial character recognition model as the character recognition model.
In this embodiment, the test result includes character content and a character position label frame. The model threshold is a threshold set for a model parameter or for the sample data according to the test result of the character recognition model, and is used to optimize the character recognition model. The model threshold includes at least one of the following: a confidence threshold and a character size threshold. If the test result does not meet the preset accuracy, the confidence threshold is adjusted and the method returns to the step of inputting the training set and the corresponding character position labels into the preset character recognition model for training to obtain the initial character recognition model; the higher the confidence threshold, the higher the accuracy of the test result, but the lower the corresponding recall rate. The character size threshold is a threshold set for the pixel size of the characters in a character string picture. If the test result contains much noise, a plurality of noise frames may appear, and increasing the character size threshold can eliminate the noise frames in the test result; if the character position label frame in the test result is too small, reducing the character size threshold can enlarge the output character position label frame.
The initial character recognition model is tested based on each sample in the test set to obtain a test result corresponding to each sample, and the initial character recognition model is optimized according to the test results to obtain the character recognition model. Specifically, if the test result does not meet the preset accuracy, the confidence threshold is adjusted and the method returns to the step of inputting the training set and the corresponding character position labels into the preset character recognition model for training to obtain the initial character recognition model. If the test result does not meet the preset definition, the character size threshold corresponding to the plurality of fused pictures is adjusted to obtain a plurality of threshold-adjusted pictures, the threshold-adjusted pictures are used as the plurality of fused pictures, and the method returns to the step of performing image preprocessing on each to-be-processed picture in the to-be-processed picture group and each fused picture in the plurality of fused pictures to obtain a plurality of preprocessed pictures. The confidence threshold and the character size threshold are adjusted in this way until the test result meets the preset definition and the preset accuracy, and the initial character recognition model is then taken as the character recognition model.
According to the scheme of this embodiment, the confidence threshold and the character size threshold are adjusted respectively according to the accuracy and the definition of the test result corresponding to each sample in the test set, so that the initial character recognition model is optimized and the character recognition accuracy of the obtained character recognition model is improved.
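The effect of the two thresholds on a test result can be illustrated with a small sketch. It rests on assumptions not stated in this embodiment: each recognized result carries a confidence score and a frame height in pixels, results below either threshold are discarded, accuracy is measured as precision against a set of true character strings, and the names `Detection`, `filter_detections` and `precision_recall` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    text: str          # recognized character content
    confidence: float  # confidence score of the recognition
    box_height: int    # height of the character position label frame, in pixels


def filter_detections(detections, confidence_threshold, size_threshold):
    """Keep only results that reach both the confidence and the size threshold."""
    return [d for d in detections
            if d.confidence >= confidence_threshold and d.box_height >= size_threshold]


def precision_recall(kept, truth):
    """Precision and recall of the kept results against the true strings."""
    hits = sum(1 for d in kept if d.text in truth)
    precision = hits / len(kept) if kept else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


truth = {"abc1", "bc12", "cab2"}
detections = [
    Detection("abc1", 0.95, 30),
    Detection("bc12", 0.60, 28),
    Detection("x", 0.55, 6),      # small noise frame
    Detection("cab2", 0.40, 29),
]

loose = precision_recall(filter_detections(detections, 0.5, 0), truth)
strict = precision_recall(filter_detections(detections, 0.9, 0), truth)
sized = precision_recall(filter_detections(detections, 0.5, 10), truth)
```

In this toy data, raising the confidence threshold from 0.5 to 0.9 raises precision but lowers recall, while raising the character size threshold to 10 removes the small noise frame without losing any correct result.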
In one embodiment, as shown in fig. 7, obtaining a plurality of fused character position labels based on the character position label group and the plurality of groups of expected character strings includes:
S620, acquiring the length of each expected character string in the plurality of groups of expected character strings;
S640, intercepting the length of the expected character string from each character position label in the character position label group to obtain the plurality of fused character position labels.
In this embodiment, the length of each expected character string in the plurality of groups of expected character strings is acquired by a character string length acquisition method, and the length of the expected character string is intercepted from each character position label in the character position label group to obtain the plurality of fused character position labels. Specifically, for each character position label in the character position label group, the starting position of the character position label is acquired, and the length of the expected character string is intercepted from that starting position within the corresponding character position label frame. The starting position is used as the starting position of the corresponding fused character position label, and the starting position plus the length of the expected character string gives the ending position of the fused character position label. The corresponding fused character position label is thereby obtained, and in this way the plurality of fused character position labels are obtained.
According to the scheme of the embodiment, the length of each expected character string in the multiple groups of expected character strings is obtained, the length of the expected character string is intercepted from each character position label in the character position label group, the multiple fused character position labels are obtained, the character position label before the image fusion can be converted into the character position label more fitting the characters in the fused image, the optimization of sample data is facilitated, and the character recognition accuracy of the obtained character recognition model is further improved.
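Steps S620 and S640 can be sketched as follows. The sketch assumes an axis-aligned rectangular label frame, horizontal text, and a fixed width per character equal to the character pixel size; none of these is stated in this embodiment, and `BoxLabel` and `fuse_label` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxLabel:
    x: int       # starting position (left edge) of the label frame
    y: int       # top edge of the label frame
    width: int   # horizontal extent of the label frame
    height: int  # vertical extent of the label frame

    @property
    def x_end(self):
        return self.x + self.width


def fuse_label(label, expected_string, char_width):
    # S620: length of the expected string, converted to pixels.
    text_width = len(expected_string) * char_width
    # S640: keep the starting position; the end is start + string length.
    return BoxLabel(label.x, label.y, text_width, label.height)


original = BoxLabel(x=10, y=20, width=200, height=40)
fused = fuse_label(original, "abc1", char_width=30)
```

Here the original frame spans 200 pixels while the four-character string occupies 120 pixels, so the fused label keeps the starting position 10 and ends at 130.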
In one embodiment, as shown in fig. 8, the step of obtaining the annotation group of character positions includes:
s220, judging whether the picture to be processed corresponds to a character position label or not aiming at each picture to be processed in the picture group to be processed, and if so, adding at least one character position label corresponding to the picture to be processed into the character position label group; and if not, performing labeling processing on the picture to be processed by using a character labeling algorithm to obtain at least one character position label corresponding to the picture to be processed, and adding the at least one character position label corresponding to the picture to be processed into the character position label group.
In this embodiment, the to-be-processed picture may be a picture with characters or a picture from which the characters have been cut out, and the character position labels may be obtained by labeling a group of to-be-processed pictures with characters or by labeling the group of pictures before the characters were cut out. To obtain the character position label group, it is determined, for each to-be-processed picture in the to-be-processed picture group, whether the picture has a corresponding character position label. If so, at least one character position label corresponding to the to-be-processed picture is added to the character position label group; if not, labeling processing is performed on the to-be-processed picture with a character labeling algorithm to obtain at least one character position label corresponding to the picture, and the at least one character position label is added to the character position label group. The character labeling algorithm is an algorithm for adding labels to characters in the to-be-processed picture; a label may be added to a single character or to a plurality of characters in the picture, and the labels obtained are character position labels. It should be noted that the labeling processing may also be carried out by software that applies a character labeling algorithm; for example, the Labelme image labeling software is used to label the to-be-processed picture to obtain at least one character position label corresponding to it.
According to the scheme of this embodiment, it is determined whether each picture to be processed corresponds to a character position label, and any picture without a corresponding character position label is labeled with a character labeling algorithm to obtain at least one character position label. The at least one character position label corresponding to each picture to be processed in the group is then added to the character position label group, thereby obtaining the character position label group.
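The check-then-label step can be sketched in a few lines of Python. This is only an illustrative sketch: the `(x, y, width, height)` box format, the function name `build_label_group`, and the `fake_annotator` stand-in for a labeling tool are all assumptions and do not come from the source.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical label format: (x, y, width, height) of one character region.
Box = Tuple[int, int, int, int]


def build_label_group(
    pictures: List[str],
    existing: Dict[str, List[Box]],
    annotate: Callable[[str], List[Box]],
) -> Dict[str, List[Box]]:
    """Collect at least one character position label for every picture."""
    label_group: Dict[str, List[Box]] = {}
    for picture in pictures:
        labels = existing.get(picture)
        if not labels:
            # No label yet: fall back to the character labeling algorithm.
            labels = annotate(picture)
        if not labels:
            raise ValueError(f"no character position label for {picture!r}")
        label_group[picture] = list(labels)
    return label_group


def fake_annotator(picture: str) -> List[Box]:
    # Stand-in for a real labeling tool; always returns one fixed box.
    return [(10, 20, 120, 30)]


group = build_label_group(
    ["a.png", "b.png"],
    {"a.png": [(0, 0, 120, 30), (0, 40, 120, 30)]},
    fake_annotator,
)
```

Here `a.png` keeps its two existing labels, while `b.png`, which has none, receives the label produced by the stand-in annotator.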
To explain in detail the method for acquiring the character recognition model in this embodiment and its effect, a detailed embodiment is described below:
The method for acquiring the character recognition model is applied to a production platform for character recognition models. Fig. 9 is a schematic flow diagram of the steps by which the production platform acquires the character recognition model. The production platform comprises an image fusion module, an image processing module, a model training module and a model testing module.

In the image fusion module, one picture to be processed, 3 groups of character position labels, a character table {a, b, c, 1, 2} and a Song typeface TTF (TrueType Font) file are selected and input, where the 3 groups of character position labels are obtained by labeling the picture to be processed with the Labelme image labeling software. The character format is selected and input as a character length of 4 and a character pixel size of 30, and the number of fused pictures is 100. 300 groups of character strings in the Song typeface with a character pixel size of 30 are obtained, including abc1, abc2, bc12, cab2, 2b1a and the like. The 300 groups of character strings are converted into 300 corresponding character string pictures, and the 300 character string pictures are Poisson-fused with the picture to be processed to obtain 100 fused pictures, with 3 character string pictures fused at the 3 character position labels of each fused picture. A length corresponding to a character length of 4 is intercepted from each of the 3 groups of character position labels to obtain 3 fused character position labels, and each of the 100 fused pictures corresponds to the 3 fused character position labels.

In the image processing module, image rotation and image binarization are selected as the image preprocessing. Image rotation and image binarization are performed on the 100 fused pictures and the 1 picture to be processed to obtain 101 preprocessed pictures, of which 81 are taken as the training set and 20 as the test set.

In the model training module, a preset number of training rounds of 50, a model adjustment instruction selecting the small model, a preset character range instruction of the character-plus-number type, and a data enhancement instruction of image inversion can be input. The pictures in the training set are subjected to image inversion to obtain data-enhanced pictures. The preset character recognition model is adjusted to a model with low power consumption, short training time and low precision, and the output of the model is adjusted to the character-plus-number type, yielding the adjusted character recognition model. The data-enhanced pictures and the fused character position labels are input into the adjusted character recognition model for 50 rounds of training to obtain an initial character recognition model.

In the model testing module, model thresholds are obtained, the model thresholds comprising a confidence threshold and a character size threshold. The initial character recognition model is tested on each sample picture in the test set to obtain a test result corresponding to each sample picture. If the test result does not meet the preset accuracy, the confidence threshold is adjusted to 0.9, and the process returns to the step of inputting the data-enhanced pictures and the fused character position labels into the adjusted character recognition model for 50 rounds of training to obtain the initial character recognition model. If the test result does not meet the preset definition, the character size threshold corresponding to the 100 fused pictures is adjusted to 25 to obtain 100 threshold-adjusted pictures, the 100 threshold-adjusted pictures are assigned to the 100 fused pictures, and the process returns to the step of performing image rotation and image binarization on the 100 fused pictures and the 1 picture to be processed to obtain 101 preprocessed pictures. If the test result meets both the preset definition and the preset accuracy, the initial character recognition model is taken as the character recognition model.
According to the above method for acquiring the character recognition model, a plurality of groups of expected character strings are generated from the acquired character table and character font, and a plurality of fused pictures and a plurality of fused character position labels are obtained by a fusion method from one or more pictures to be processed, the character position labels corresponding to the pictures to be processed, and the plurality of groups of expected character strings. A plurality of fused pictures can therefore be generated even when only a small number of pictures are available or the characters are incomplete, so that a large number of samples can be produced for model training. Training is then performed based on the group of pictures to be processed, the plurality of fused pictures, the plurality of fused character position labels and the preset character recognition model; that is, the character recognition model is obtained by training on the large number of sample pictures fused from a small number of pictures together with their corresponding fused character position labels. This expands the number of samples and improves the accuracy of the obtained character recognition model. The method also accommodates the customization requirements of users: an operator can generate a character recognition model with only a few operations and parameter settings, and can switch between the large and small models and adjust the model structure merely by changing the model adjustment instruction, which simplifies operation and helps improve the efficiency of generating character recognition models.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not limited to the exact order illustrated and may be performed in other orders. Moreover, at least some of the steps in the flowcharts related to the embodiments described above may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed one after another, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application further provides an apparatus for acquiring a character recognition model, which implements the above method for acquiring a character recognition model. The solution provided by the apparatus is similar to that described for the method, so for the specific limitations in the one or more apparatus embodiments below, reference may be made to the limitations of the method above; they are not repeated here.
In one embodiment, as shown in fig. 10, there is provided an apparatus 100 for acquiring a character recognition model, including: a data acquisition module 120, a string picture acquisition module 140, a data fusion module 160, and a model training module 180, wherein:
the data obtaining module 120 is configured to obtain a group of pictures to be processed, a character position labeling group, a character table, and a character font, where the group of pictures to be processed includes at least one picture to be processed, and each picture to be processed corresponds to at least one character position label in the character position labeling group.
The character string picture acquiring module 140 is configured to acquire a plurality of groups of expected character strings according to the character table and the character font, and convert the plurality of groups of expected character strings into a plurality of character string pictures.
The data fusion module 160 is configured to fuse, according to the character position label group, each picture to be processed in the group of pictures to be processed with a corresponding character string picture among the plurality of character string pictures to obtain a plurality of fused pictures, and to obtain a plurality of fused character position labels based on the character position label group and the plurality of groups of expected character strings, where each fused picture corresponds to at least one fused character position label among the plurality of fused character position labels.
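The fusion method named in this document is Poisson fusion. As a rough illustration of the idea, the following pure-Python sketch blends a one-dimensional source signal into a destination signal by solving the discrete Poisson equation with Gauss-Seidel iteration: inside the blended region the result keeps the gradients of the source, while at the region boundary it matches the destination. The function name, the 1D simplification and the iteration count are assumptions made for illustration; for real images, OpenCV's `cv2.seamlessClone` provides Poisson (seamless) cloning, although the source does not state which implementation is used.

```python
from typing import List, Sequence


def poisson_blend_1d(
    dst: Sequence[float],
    src: Sequence[float],
    start: int,
    end: int,
    iters: int = 2000,
) -> List[float]:
    """Blend src[start:end] into dst, keeping src gradients and dst boundary values."""
    if len(src) != len(dst) or not (0 < start < end < len(dst)):
        raise ValueError("region must lie strictly inside equal-length signals")
    out = [float(v) for v in dst]
    for _ in range(iters):
        for i in range(start, end):
            # Discrete Laplacian of the source is the guidance term.
            lap = src[i - 1] - 2 * src[i] + src[i + 1]
            # Solve out[i-1] - 2*out[i] + out[i+1] = lap for out[i].
            out[i] = (out[i - 1] + out[i + 1] - lap) / 2
    return out


background = [10.0] * 8                       # flat destination
stroke = [0.0, 0.0, 5.0, 9.0, 5.0, 0.0, 0.0, 0.0]  # source with a bump
blended = poisson_blend_1d(background, stroke, start=2, end=5)
```

In this toy case the bump from `stroke` is transplanted onto the flat background with its shape intact and no jump at the edges of the region, which is the property that makes Poisson fusion suitable for pasting rendered character strings into a picture.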
The model training module 180 is configured to train based on the to-be-processed picture group, the plurality of fused pictures, the plurality of fused character position labels, and a preset character recognition model, so as to obtain a character recognition model.
The above apparatus for acquiring the character recognition model generates a plurality of groups of expected character strings from the acquired character table and character font, and obtains a plurality of fused pictures and a plurality of fused character position labels by a fusion method from one or more pictures to be processed, the character position labels corresponding to the pictures to be processed, and the plurality of groups of expected character strings. A plurality of fused pictures can therefore be generated even when only a small number of pictures are available or the characters are incomplete, so that a large number of samples can be produced for model training. Training is then performed based on the group of pictures to be processed, the plurality of fused pictures, the plurality of fused character position labels and the preset character recognition model; that is, the character recognition model is obtained by training on the large number of sample pictures fused from a small number of pictures together with their corresponding fused character position labels, which increases the number of samples and can improve the character recognition accuracy of the obtained character recognition model.
In one embodiment, the character string picture acquiring module 140 is further configured to acquire a character format and a number of fused pictures, where the character format includes at least one of: character length, character pixel size, character mirror, character underline, and character shadow; arrange and combine the characters in the character table to obtain a plurality of groups of character strings in different orders; perform format processing on each group of the character strings in different orders according to the character font and the character format to obtain a plurality of groups of format-processed character strings; and extract, from the plurality of groups of format-processed character strings, a number of groups of character strings matching the number of fused pictures as the plurality of groups of expected character strings.
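The arranging-and-extracting step can be sketched as follows, using the character table, character length and string count from the detailed embodiment. Note that if characters are not repeated within a string, a 5-character table yields only 5 × 4 × 3 × 2 = 120 distinct strings of length 4, fewer than the 300 strings of that embodiment; the source does not say how this is handled, so this sketch simply assumes that strings are reused once the distinct arrangements run out. The function name and the seeded random sampling are likewise assumptions, and font and format processing are left out.

```python
import itertools
import random
from typing import List, Sequence


def expected_strings(
    char_table: Sequence[str], length: int, count: int, seed: int = 0
) -> List[str]:
    """Return `count` strings, each an ordering of `length` distinct characters."""
    candidates = ["".join(p) for p in itertools.permutations(char_table, length)]
    if not candidates:
        raise ValueError("character table is shorter than the requested length")
    rng = random.Random(seed)
    if count <= len(candidates):
        return rng.sample(candidates, count)  # all strings distinct
    # Assumption: reuse strings once the distinct arrangements run out.
    return [rng.choice(candidates) for _ in range(count)]


strings = expected_strings(["a", "b", "c", "1", "2"], length=4, count=300)
few = expected_strings(["a", "b", "c", "1", "2"], length=4, count=50)
```

Each returned string would then be rendered with the chosen font and pixel size to produce one character string picture.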
In one embodiment, the model training module 180 is further configured to obtain a preset number of training rounds and a model adjustment instruction; adjusting the preset character recognition model based on the model adjusting instruction to obtain an adjusted character recognition model; and inputting the group of pictures to be processed, the plurality of fused pictures and the plurality of fused character position labels into the adjusted character recognition model for training, and obtaining the character recognition model after training of the preset training round number.
In one embodiment, the model adjustment instruction includes at least one of: a large/small model switching instruction and a preset character range instruction. The model training module 180 is further configured to: if the model adjustment instruction includes the large/small model switching instruction, perform model switching on the preset character recognition model according to the large/small model switching instruction to obtain an adjusted character recognition model; if the model adjustment instruction includes the preset character range instruction, adjust the structure of the preset character recognition model according to the preset character range instruction to obtain an adjusted character recognition model; and if the model adjustment instruction includes both the large/small model switching instruction and the preset character range instruction, perform model switching on the preset character recognition model according to the large/small model switching instruction to obtain an initially adjusted character recognition model, and adjust the structure of the initially adjusted character recognition model according to the preset character range instruction to obtain the adjusted character recognition model.
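The three branches above reduce to applying a model size switch first and a character range adjustment second, each only when the corresponding instruction is present. A minimal sketch of that dispatch follows; the `ModelConfig` and `AdjustInstruction` classes and their field names and values are hypothetical and stand in for whatever model representation an implementation actually uses.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ModelConfig:
    size: str = "large"        # hypothetical: "large" or "small" variant
    charset: str = "letters"   # hypothetical: range of output characters


@dataclass(frozen=True)
class AdjustInstruction:
    switch_size: Optional[str] = None  # model size switch, if any
    char_range: Optional[str] = None   # preset character range, if any


def adjust(preset: ModelConfig, instr: AdjustInstruction) -> ModelConfig:
    """Apply the size switch first, then the character range adjustment."""
    model = preset
    if instr.switch_size is not None:
        model = replace(model, size=instr.switch_size)
    if instr.char_range is not None:
        model = replace(model, charset=instr.char_range)
    return model


both = adjust(ModelConfig(), AdjustInstruction("small", "letters+digits"))
size_only = adjust(ModelConfig(), AdjustInstruction(switch_size="small"))
range_only = adjust(ModelConfig(), AdjustInstruction(char_range="digits"))
```

Keeping the preset configuration immutable and returning a new one makes it easy to retrain from the same preset with a different instruction.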
In an embodiment, the model training module 180 is further configured to perform image preprocessing on each to-be-processed picture in the to-be-processed picture group and each fused picture in the multiple fused pictures respectively to obtain multiple preprocessed pictures, where the image preprocessing includes at least one of: picture cutting processing, image rotation processing and image binarization processing; dividing a plurality of preprocessed pictures into a training set and a test set; inputting the training set and the corresponding character position labels into a preset character recognition model for training to obtain an initial character recognition model; and inputting the test set into the initial character recognition model for testing to obtain the character recognition model.
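The division into a training set and a test set can be illustrated with the numbers of the detailed embodiment, where 101 preprocessed pictures are divided into 81 training pictures and 20 test pictures. The source does not say how pictures are assigned to each set, so the seeded random shuffle here is an assumption, as are the function name and the placeholder file names.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], test_count: int, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the samples and set aside `test_count` of them for testing."""
    if not 0 < test_count < len(samples):
        raise ValueError("test_count must leave at least one training sample")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[test_count:], shuffled[:test_count]


# 100 fused pictures plus the 1 original picture, as placeholder names.
preprocessed = [f"fused_{i:03d}.png" for i in range(100)] + ["original.png"]
train_set, test_set = split_samples(preprocessed, test_count=20)
```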
In one embodiment, the model training module 180 is further configured to obtain model thresholds, the model thresholds including at least one of: a confidence threshold and a character size threshold; testing the initial character recognition model based on each sample in the test set to obtain a test result corresponding to each sample in the test set; if the test result does not meet the preset accuracy, adjusting a confidence threshold, returning to the step of inputting the training set and the corresponding character position labels to a preset character recognition model for training to obtain an initial character recognition model; if the test result does not meet the preset definition, adjusting character size thresholds corresponding to a plurality of groups of fused pictures to obtain a plurality of groups of threshold-adjusted pictures, assigning the plurality of groups of threshold-adjusted pictures to the plurality of groups of fused pictures, and returning to the step of respectively carrying out image preprocessing on each picture to be processed in the group of pictures to be processed and each fused picture in the plurality of fused pictures to obtain a plurality of preprocessed pictures; and if the test result meets the preset definition and the preset accuracy, taking the initial character recognition model as the character recognition model.
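The test-and-retry logic above can be sketched as a loop that adjusts whichever threshold corresponds to the failed check and repeats until both checks pass. In this sketch, `train_and_test` is a hypothetical callable standing in for preprocessing, training and testing; the `max_rounds` limit is an added assumption, since the source does not state any retry limit.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Thresholds:
    confidence: float
    char_size: int


def tune_thresholds(
    train_and_test: Callable[[Thresholds], Tuple[bool, bool]],
    start: Thresholds,
    adjust_confidence: Callable[[float], float],
    adjust_char_size: Callable[[int], int],
    max_rounds: int = 10,  # assumption: the source states no retry limit
) -> Thresholds:
    """Retrain and retest until accuracy and definition are both met."""
    current = start
    for _ in range(max_rounds):
        accuracy_ok, definition_ok = train_and_test(current)
        if accuracy_ok and definition_ok:
            return current  # the initial model is accepted as the final model
        current = Thresholds(
            confidence=current.confidence
            if accuracy_ok
            else adjust_confidence(current.confidence),
            char_size=current.char_size
            if definition_ok
            else adjust_char_size(current.char_size),
        )
    raise RuntimeError("checks still failing after max_rounds")


def toy_run(t: Thresholds) -> Tuple[bool, bool]:
    # Stand-in for preprocessing, training and testing a model.
    return t.confidence >= 0.9, t.char_size <= 25


final = tune_thresholds(toy_run, Thresholds(0.5, 30), lambda c: 0.9, lambda s: 25)
```

With the toy stand-in, the first round fails both checks, the thresholds are adjusted to 0.9 and 25 (the values used in the detailed embodiment), and the second round passes.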
In one embodiment, the data fusion module 160 is further configured to obtain a length of each expected character string in the plurality of sets of expected character strings; intercepting the length of the expected character string from each character position label in the character position label group to obtain a plurality of fused character position labels.
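One possible reading of this interception step is that each character position label is shortened so that it covers only the region occupied by the expected character string. The sketch below follows that reading under explicit assumptions that are not stated in the source: a label is an axis-aligned `(x, y, width, height)` box, characters are laid out left to right, and each character occupies `char_px` pixels of width.

```python
from typing import Tuple

# Hypothetical label format: (x, y, width, height).
Box = Tuple[int, int, int, int]


def crop_label(label: Box, string_length: int, char_px: int) -> Box:
    """Shorten a label box to the width needed by the expected string."""
    x, y, width, height = label
    needed = string_length * char_px  # assumed width of the rendered string
    return (x, y, min(width, needed), height)


# Character length 4 and pixel size 30, as in the detailed embodiment.
fused_label = crop_label((50, 80, 300, 30), string_length=4, char_px=30)
```

Under these assumptions a 300-pixel-wide label is reduced to 120 pixels, while a label already narrower than the string is left unchanged.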
In an embodiment, the data obtaining module 120 is further configured to determine, for each to-be-processed picture in the to-be-processed picture group, whether the to-be-processed picture corresponds to a character position label, and if so, add at least one character position label corresponding to the to-be-processed picture to the character position label group; and if not, performing labeling processing on the picture to be processed by using a character labeling algorithm to obtain at least one character position label corresponding to the picture to be processed, and adding the at least one character position label corresponding to the picture to be processed into the character position label group.
The modules in the apparatus for acquiring the character recognition model may be implemented wholly or partially in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 11. The computer device comprises a processor, a memory, an Input/Output (I/O) interface and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer equipment is used for storing a picture group to be processed, a character position marking group, a character table, a character font, a plurality of groups of expected character strings, a plurality of character string pictures, a plurality of fused character position marks, a preset character recognition model and a character recognition model. The input/output interface of the computer device is used for exchanging information between the processor and an external device. The communication interface of the computer device is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement a method of obtaining a character recognition model.
Those skilled in the art will appreciate that the architecture shown in fig. 11 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, displayed data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the relevant laws and regulations and standards of the relevant country and region.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetic Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others. The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum computing based data processing logic devices, etc., without limitation.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of these technical features should be considered to fall within the scope of this specification as long as it involves no contradiction.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method for acquiring a character recognition model is characterized by comprising the following steps:
acquiring a group of pictures to be processed, a character position marking group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position mark in the character position marking group;
obtaining a plurality of groups of expected character strings according to the character table and the character fonts, and converting the plurality of groups of expected character strings into a plurality of character string pictures;
according to the character position labeling group, fusing each picture to be processed in the picture group to be processed with a corresponding character string picture in the character string pictures respectively to obtain a plurality of fused pictures, wherein the adopted fusion method is a Poisson fusion algorithm, and a plurality of fused character position labels are obtained based on the character position labeling group and the plurality of groups of expected character strings, and the fused picture corresponds to at least one fused character label in the plurality of fused character position labels; the obtaining a plurality of fused character position labels based on the character position label group and the plurality of groups of expected character strings includes: acquiring the length of each expected character string in the multiple groups of expected character strings; intercepting the length of the expected character string from each character position label in the character position label group to obtain a plurality of fused character position labels;
training based on the group of pictures to be processed, the plurality of fused pictures, the plurality of fused character position labels and a preset character recognition model to obtain a character recognition model; training based on the to-be-processed picture group, the fused pictures, the fused character position labels and a preset character recognition model to obtain a character recognition model, comprising: acquiring a preset training round number and a model adjusting instruction; adjusting the model parameters or the model structure of the preset character recognition model according to the content of the model adjusting instruction to obtain an adjusted character recognition model; and inputting the group of pictures to be processed, the plurality of fused pictures and the plurality of fused character position labels into the adjusted character recognition model for training, and obtaining the character recognition model after training of the preset training round number.
2. The method for acquiring the character recognition model according to claim 1, wherein the acquiring a plurality of groups of expected character strings according to the character table and the character font comprises:
acquiring a character format and the number of fused pictures, wherein the character format comprises at least one of the following: character length, character pixel size, character mirror, character underline, and character shadow;
arranging and combining the characters in the character table to obtain a plurality of groups of character strings with different sequences;
according to the character font and the character format, performing format processing on each group of character strings in the multiple groups of character strings in different orders respectively to obtain multiple groups of character strings after format processing;
and extracting a plurality of groups of character strings matched with the number of the fusion pictures from the character strings after the plurality of groups of format processing to be used as the plurality of groups of expected character strings.
3. The method for acquiring the character recognition model according to claim 1, wherein the model adjustment instruction includes at least one of: a large/small model switching instruction and a preset character range instruction;
the adjusting the model parameter or the model structure of the preset character recognition model according to the content of the model adjusting instruction to obtain the adjusted character recognition model comprises:
if the model adjusting instruction comprises the large/small model switching instruction, performing model switching on the preset character recognition model according to the large/small model switching instruction to obtain an adjusted character recognition model;
if the model adjusting instruction comprises the preset character range instruction, adjusting the structure of the preset character recognition model according to the preset character range instruction to obtain an adjusted character recognition model;
if the model adjusting instruction comprises the large/small model switching instruction and the preset character range instruction, performing model switching on the preset character recognition model according to the large/small model switching instruction to obtain an initially adjusted character recognition model; and adjusting the structure of the initially adjusted character recognition model according to the preset character range instruction to obtain the adjusted character recognition model.
4. The method for acquiring a character recognition model according to claim 1, wherein the training based on the group of pictures to be processed, the plurality of fused pictures, the plurality of fused character position labels, and a preset character recognition model to acquire the character recognition model comprises:
respectively carrying out image preprocessing on each to-be-processed picture in the to-be-processed picture group and each fused picture in the fused pictures to obtain a plurality of preprocessed pictures, wherein the image preprocessing comprises at least one of the following steps: picture cutting processing, image rotation processing and image binarization processing;
dividing the plurality of preprocessed pictures into a training set and a test set;
inputting the training set and the corresponding character position labels into the preset character recognition model for training to obtain an initial character recognition model;
and inputting the test set into the initial character recognition model for testing to obtain the character recognition model.
5. The method for acquiring the character recognition model according to claim 4, wherein the inputting the test set into the initial character recognition model for testing to obtain the character recognition model comprises:
obtaining a model threshold, the model threshold comprising at least one of: a confidence threshold and a character size threshold;
testing the initial character recognition model based on each sample in the test set to obtain a test result corresponding to each sample in the test set;
if the test result does not meet the preset accuracy, adjusting the confidence threshold, and returning to the step of inputting the training set and the corresponding character position labels to the preset character recognition model for training to obtain an initial character recognition model;
if the test result does not meet the preset definition, adjusting the character size threshold corresponding to the multiple fused pictures to obtain multiple groups of threshold-adjusted pictures, assigning the multiple groups of threshold-adjusted pictures to the multiple fused pictures, and returning to the step of respectively carrying out image preprocessing on each picture to be processed in the group of pictures to be processed and each fused picture in the multiple fused pictures to obtain multiple preprocessed pictures;
and if the test result meets the preset definition and the preset accuracy, taking the initial character recognition model as the character recognition model.
6. The method for acquiring a character recognition model according to claim 1, wherein the step of acquiring the character position label group includes:
judging whether the picture to be processed corresponds to a character position label or not aiming at each picture to be processed in the picture group to be processed, and if so, adding at least one character position label corresponding to the picture to be processed into the character position label group; and if not, performing labeling processing on the picture to be processed by using a character labeling algorithm to obtain at least one character position label corresponding to the picture to be processed, and adding the at least one character position label corresponding to the picture to be processed into the character position label group.
7. An apparatus for acquiring a character recognition model, the apparatus comprising:
a data acquisition module, configured to acquire a group of pictures to be processed, a character position label group, a character table and character fonts, wherein the group of pictures to be processed comprises at least one picture to be processed, and each picture to be processed corresponds to at least one character position label in the character position label group;
a character string picture acquisition module, configured to acquire multiple groups of expected character strings according to the character table and the character fonts, and to convert the multiple groups of expected character strings into a plurality of character string pictures;
a data fusion module, configured to fuse, according to the character position label group, each picture to be processed in the group of pictures to be processed with the corresponding character string picture among the plurality of character string pictures to obtain a plurality of fused pictures, the fusion method being a Poisson fusion algorithm, and to acquire a plurality of fused character position labels based on the character position label group and the multiple groups of expected character strings, each fused picture corresponding to at least one fused character position label among the plurality of fused character position labels; wherein acquiring the plurality of fused character position labels based on the character position label group and the multiple groups of expected character strings comprises: acquiring the length of each expected character string in the multiple groups of expected character strings; and extracting, from each character position label in the character position label group, a portion corresponding to the length of the expected character string to obtain the plurality of fused character position labels;
a model training module, configured to perform training based on the group of pictures to be processed, the plurality of fused pictures, the plurality of fused character position labels and a preset character recognition model to obtain a character recognition model; wherein the training comprises: acquiring a preset number of training rounds and a model adjustment instruction; adjusting the model parameters or the model structure of the preset character recognition model according to the content of the model adjustment instruction to obtain an adjusted character recognition model; and inputting the group of pictures to be processed, the plurality of fused pictures and the plurality of fused character position labels into the adjusted character recognition model for training, and obtaining the character recognition model after the preset number of training rounds.
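One possible reading of the fused-label step in the data fusion module is sketched below. It assumes that each character position label is an ordered list of per-character boxes and that the step keeps the first `len(expected_string)` boxes; the claim does not fix either detail, and the function name `fused_position_labels` is hypothetical. The Poisson fusion of the pictures themselves is not shown.

```python
from typing import List, Tuple

# Assumed label format: (x, y, width, height) of one character region.
Box = Tuple[int, int, int, int]


def fused_position_labels(
    position_labels: List[List[Box]],
    expected_strings: List[str],
) -> List[List[Tuple[str, Box]]]:
    """Pair each expected character with a box taken from the original label."""
    fused: List[List[Tuple[str, Box]]] = []
    for boxes, text in zip(position_labels, expected_strings):
        if len(text) > len(boxes):
            raise ValueError("expected string is longer than the labeled region")
        # Keep only as many boxes as the expected string has characters.
        kept = boxes[: len(text)]
        fused.append(list(zip(text, kept)))
    return fused
```

Under this reading, a label with three character boxes fused with the two-character string "AB" yields two fused labels, one per synthetic character.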
8. The apparatus for acquiring a character recognition model according to claim 7, wherein the character string picture acquisition module is further configured to: acquire a character format and a number of fused pictures, the character format including at least one of: character length, character pixel size, character mirroring, character underline, and character shadow; permute and combine the characters in the character table to obtain multiple groups of character strings in different orders; perform format processing on each group of character strings in the multiple groups of character strings in different orders according to the character font and the character format to obtain multiple groups of format-processed character strings; and extract, from the multiple groups of format-processed character strings, multiple groups of character strings matching the number of fused pictures as the multiple groups of expected character strings.
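The string-generation steps of this claim can be sketched as below. Everything beyond the claim's wording is an assumption: the `FormattedString` record, the subset of format fields shown, the use of fixed-length permutations, and seeded random sampling as the way to extract strings matching the number of fused pictures.

```python
import itertools
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FormattedString:
    text: str
    font: str
    pixel_size: int
    underline: bool


def expected_strings(
    char_table: str,
    length: int,
    font: str,
    pixel_size: int,
    underline: bool,
    fused_picture_count: int,
    seed: int = 0,
) -> List[FormattedString]:
    """Permute characters, attach a format, and sample one string per fused picture."""
    # Every ordered arrangement of `length` distinct characters from the table.
    candidates = ["".join(p) for p in itertools.permutations(char_table, length)]
    formatted = [
        FormattedString(text, font, pixel_size, underline) for text in candidates
    ]
    if fused_picture_count > len(formatted):
        raise ValueError("not enough distinct strings for the requested count")
    # Seeded sampling keeps the selection reproducible (an assumed choice).
    return random.Random(seed).sample(formatted, fused_picture_count)
```

Each returned record would then be rendered into a character string picture with the given font and format before fusion.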
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210671644.8A CN114758339B (en) 2022-06-15 2022-06-15 Method and device for acquiring character recognition model, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210671644.8A CN114758339B (en) 2022-06-15 2022-06-15 Method and device for acquiring character recognition model, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114758339A (en) 2022-07-15
CN114758339B (en) 2022-09-20

Family

ID=82336541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210671644.8A Active CN114758339B (en) 2022-06-15 2022-06-15 Method and device for acquiring character recognition model, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114758339B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111402367A (en) * 2020-03-27 2020-07-10 维沃移动通信有限公司 Image processing method and electronic equipment
CN111767909A (en) * 2020-05-12 2020-10-13 合肥联宝信息技术有限公司 Character recognition method and device and computer readable storage medium
CN112085019A (en) * 2020-08-31 2020-12-15 深圳思谋信息科技有限公司 Character recognition model generation system, method and device and computer equipment
CN112508000A (en) * 2020-11-26 2021-03-16 上海展湾信息科技有限公司 Method and equipment for generating OCR image recognition model training data
CN112508003A (en) * 2020-12-18 2021-03-16 北京百度网讯科技有限公司 Character recognition processing method and device
CN113379001A (en) * 2021-07-16 2021-09-10 支付宝(杭州)信息技术有限公司 Processing method and device for image recognition model
CN113936187A (en) * 2021-10-14 2022-01-14 泰康保险集团股份有限公司 Text image synthesis method and device, storage medium and electronic equipment
CN113989484A (en) * 2021-11-02 2022-01-28 古联(北京)数字传媒科技有限公司 Ancient book character recognition method and device, computer equipment and storage medium
CN114004858A (en) * 2021-11-19 2022-02-01 清华大学 Method and device for identifying aviation cable surface code based on machine vision
CN114511041A (en) * 2022-04-01 2022-05-17 北京世纪好未来教育科技有限公司 Model training method, image processing method, device, equipment and storage medium
CN114565915A (en) * 2022-04-24 2022-05-31 深圳思谋信息科技有限公司 Sample text image acquisition method, text recognition model training method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958446B (en) * 2016-10-17 2023-04-07 索尼公司 Information processing apparatus, information processing method, and computer program
CN109685100B (en) * 2018-11-12 2024-05-10 平安科技(深圳)有限公司 Character recognition method, server and computer readable storage medium
CN110163285B (en) * 2019-05-23 2021-03-02 阳光保险集团股份有限公司 Ticket recognition training sample synthesis method and computer storage medium
CN114049501B (en) * 2021-11-22 2024-06-21 江苏科技大学 Image description generation method, system, medium and device for fusing bundle search
CN114549698A (en) * 2022-02-22 2022-05-27 上海云从企业发展有限公司 Text synthesis method and device and electronic equipment
CN114596566B (en) * 2022-04-18 2022-08-02 腾讯科技(深圳)有限公司 Text recognition method and related device

Also Published As

Publication number Publication date
CN114758339A (en) 2022-07-15

Similar Documents

Publication Publication Date Title
US20200410273A1 (en) Target detection method and apparatus, computer-readable storage medium, and computer device
WO2023202197A9 (en) Text recognition method and related apparatus
CN111832449A (en) Engineering drawing display method and related device
CN112395834B (en) Brain graph generation method, device and equipment based on picture input and storage medium
CN113780326A (en) Image processing method and device, storage medium and electronic equipment
CN110211032B (en) Chinese character generating method and device and readable storage medium
JP7320570B2 (en) Method, apparatus, apparatus, medium and program for processing images
CN110956133A (en) Training method of single character text normalization model, text recognition method and device
CN111833413B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN114758339B (en) Method and device for acquiring character recognition model, computer equipment and storage medium
CN114419621A (en) Method and device for processing image containing characters
CN110134920A (en) Draw the compatible display methods of text, device, terminal and computer readable storage medium
CN118095205A (en) Information extraction method, device and equipment of layout file and storage medium
CN111353493A (en) Text image direction correction method and device
CN111651969A (en) Style migration
US20220129423A1 (en) Method for annotating data, related apparatus and computer program product
CN116225956A (en) Automated testing method, apparatus, computer device and storage medium
CN116597293A (en) Multi-mode scene recognition method, device, computer equipment and storage medium
CN110780850B (en) Requirement case auxiliary generation method and device, computer equipment and storage medium
CN115017922A (en) Method and device for translating picture, electronic equipment and readable storage medium
CN114399782A (en) Text image processing method, device, equipment, storage medium and program product
CN111144066B (en) Adjusting method, device and equipment for font of font library and storage medium
CN111291758A (en) Method and device for identifying characters of seal
US10762607B2 (en) Method and device for sensitive data masking based on image recognition
CN115457572A (en) Model training method and device, computer equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant