CN111898606B - Night imaging identification method for superimposing transparent time characters in video image - Google Patents
- Publication number
- CN111898606B (application CN202010422410.0A)
- Authority
- CN
- China
- Prior art keywords
- time
- characters
- night
- character
- transparent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the field of computer vision and relates to a method for recognizing transparent time-stamp information in night video images, comprising the following steps: improving the training configuration of a CRNN; synthesizing transparent time characters superimposed on night background images as training input samples; introducing a two-dimensional attention module to train a recognition model suited to transparent time characters at night; and using the recognition model to output a time recognition result from the night image to be detected, followed by logic verification. Aimed at the requirement of recognizing transparent time-stamp information produced by mainstream camera equipment, the method exploits the night imaging environment to first increase the contrast between the transparent time characters and the background image, then strengthens feature extraction for the transparent characters, constructs a recognition model for "substrate-free" superimposed transparent text in natural scenes, and realizes transparent time-character recognition with feature-weighted extraction under night imaging.
Description
Technical Field
The invention belongs to the field of computer vision and can be used to detect time characters superimposed on the pictures of video surveillance systems in public security and related industries. In particular, it relates to a method for recognizing transparent time-stamp information in night video images.
Background
During the design of a scheme for recognizing time characters superimposed on video surveillance images, it was found that cameras of some mainstream brands may adopt a stroke-transparent (i.e., non-black, non-white, grey) display style when superimposing characters, which adds a new challenge to the already difficult problem of recognizing characters against a "substrate-free" background. Experimental data show that, in the same natural scene, a transparent stroke style lowers recognition accuracy by more than 20% compared with an opaque (pure-black or pure-white) style, seriously compromising the intended application effect.
To obtain a more ideal effect, recognition algorithms for time characters superimposed on images based on deep-learning technology generally assume a constraint on the application scene, namely that the superimposed characters are rendered in a standard, pure-black or pure-white, non-transparent style. Because of their inherently low contrast, transparent characters are easily confused with the natural background of the image, become considerably harder to recognize, and directly degrade the performance of the trained model.
Analysis of experimental data shows that improving the recognition accuracy of stroke-transparent characters through algorithm training alone has hit a bottleneck.
Disclosure of Invention
The technical problem the invention aims to solve is to provide a method for recognizing transparent time characters in night video images, addressing the recognition requirement for stroke-transparent time characters.
The basic technical concept of the invention is to improve the training configuration of the CRNN (a general image-sequence-based text recognition neural network); to synthesize transparent time characters and superimpose them on night background images as training input samples; to introduce a two-dimensional attention module and train a recognition model suited to transparent time characters at night; and to use the recognition model to output a time recognition result from the night image to be detected, followed by logic verification.
To solve this technical problem, the invention provides a method for recognizing transparent time characters in night imaging, comprising the following steps:
step i, improving the training method of the CRNN general text recognition network;
step ii, making CRNN night transparent-time-character training samples;
step iii, introducing an attention module to strengthen training of the night transparent-time-character recognition model;
step iv, recognizing the transparent time characters in the night image to be detected and checking the reasonableness of the output value.
Preferably, in step i, the specific steps of improving the training method of the CRNN general text recognition network include:
1-1) in the feature-extraction stage of the ResNet classification backbone (a mainstream CNN convolutional network), adopting 3 downsampling layers to retain more horizontal character features;
1-2) when extracting character-height features, using max pooling;
1-3) selecting a single-layer LSTM (long short-term memory network) and using its output hidden-layer vectors for the attention-vector calculation.
Preferably, in step ii, the specific steps of making the CRNN night transparent-time-character training samples include:
2-1) generating time characters of black or white color, multiple fonts and multiple time formats on a transparent-background image using a character pixel rendering function;
2-2) generating random speckles on the time-character stroke pixels using a Perlin noise map;
2-3) applying transparency processing to the speckled time characters and superimposing them on random night background images as training input samples;
2-4) taking the text form of the time characters from step 2-1) and applying ignore-and-force-replace processing to characters in non-standard time formats, yielding recognition target samples matched to step 2-3).
preferably, the step iii of introducing the attention module to intensively train the night transparent time character recognition model specifically includes:
3-1) recording the matching probability vector of each time character feature decoded from the feature map and the target time character text by adopting a CTC coding system;
3-2) when the matching probability vector is decoded and output, the hidden layer vector inside the LSTM is superposed on the processed feature map to generate an attention weight map, and the feature of the time character at the corresponding position is enhanced;
3-3) in the attention module, reducing the attention weight map into an attention vector, and simply adding the attention vector and the matching probability vector to obtain a final target character matching probability vector;
and 3-4) obtaining the time character recognition model when the matching probability is integrally optimal.
Preferably, in step iv, the specific steps of recognizing the time characters in the night image to be detected and checking the reasonableness of the output value include:
4-1) collecting an image to be detected from the video surveillance equipment at night and inputting it to the recognition model;
4-2) when the number of recognized numeric characters falls short of the standard, judging that verification is impossible and forcibly converting the output to a specific time value;
4-3) when the number of recognized numeric characters meets or exceeds the standard, filling in missing or deleting redundant time connector symbols "-" and ":";
4-4) completing time segmentation and correction according to the logical constraints of "year-month-day" and "hour-minute-second", combined with the matching probabilities recorded in 3-3), and converting to the final time value.
At this point, training of the CRNN night character recognition model is complete; the night image to be detected is input and the time-character recognition result is obtained, realizing the technical scheme of the invention.
The beneficial effects of the invention include:
1) Aiming at the requirement of recognizing transparent time-stamp information on mainstream camera equipment, a recognition model for "substrate-free" superimposed transparent-feature text in natural scenes is realized, forming a dedicated solution.
2) The feature-learning problem of transparent time characters is redirected to the new idea of first increasing the character-background contrast in a specific imaging mode, providing a key technique for recognition requirements in similar application fields.
Drawings
The technical solution of the present invention will be further specifically described with reference to the accompanying drawings and the detailed description.
FIG. 1 is a basic flow diagram of the process of the present invention.
Fig. 2 is a training input sample of the CRNN network night recognition model.
FIG. 3 is a diagram illustrating the display effect of representative time characters in the night image to be detected.
FIG. 4 is a diagram of an attention-based mechanism for enhancing character feature extraction.
Fig. 5 shows examples of time-character format standardization of the recognition result, in which 5(a) schematically shows the time-character format after a missing ":" symbol is filled in, and 5(b) the format after a redundant ":" symbol is deleted.
FIG. 6 is an exemplary pattern of time character logic verification of recognition results.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, the overall flow of the method for recognizing transparent time characters in night imaging provided by the invention mainly comprises the following steps:
step i, self-defining a CRNN text recognition network structure;
step ii, making a night transparent time character training sample;
step iii, introducing an attention module to strengthen training and recognizing the model;
and iv, identifying transparent time characters in the image to be detected at night and checking the reasonability of an output value.
More specifically, step i comprises the following subdivision steps:
1-1) in the feature-extraction stage of the ResNet classification backbone (a mainstream CNN convolutional network), adopting a 3-layer downsampling setting to retain more horizontal character features;
In this embodiment, the 3 downsampling layers produce a 64-pixel-wide feature map from an original image 512 pixels wide.
1-2) when extracting character-height features, using max pooling;
m_f = max({ f_{0,y} | y ∈ {0, 1, 2, 3} })
that is, the largest of the 4 feature values in the height direction is selected, improving character-recognition accuracy.
1-3) selecting a single-layer LSTM (long short-term memory network) and using its output hidden-layer vectors for the attention-vector calculation.
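As an illustration of this configuration, a minimal PyTorch sketch of the modified front end follows; the channel counts, strides and hidden size are assumptions for demonstration, not values specified by the embodiment.

```python
import torch
import torch.nn as nn

class CRNNBackbone(nn.Module):
    """Sketch of the modified CRNN front end: 3 width-downsampling stages
    (512 px -> 64 px), max pooling over the height axis, single-layer LSTM."""
    def __init__(self, in_ch=3, feat_ch=256, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),    # /2
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),      # /4
            nn.Conv2d(128, feat_ch, 3, stride=2, padding=1), nn.ReLU()  # /8: 512 -> 64
        )
        self.rnn = nn.LSTM(feat_ch, hidden, num_layers=1)  # single-layer LSTM

    def forward(self, x):                 # x: (B, C, H, W)
        f = self.cnn(x)                   # (B, feat_ch, H', 64)
        f = f.max(dim=2).values           # max pooling over height: m_f = max_y f_y
        f = f.permute(2, 0, 1)            # (W', B, feat_ch): one step per column
        out, (h, _) = self.rnn(f)         # hidden vectors reused later for attention
        return out, h
```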
Step ii comprises the following subdivision steps:
2-1) generating time characters of black or white color, multiple fonts and multiple time formats on a transparent-background image using a character pixel rendering function;
i ∈ {0, 1, ..., K}
K characters are drawn in total, where Ω_i denotes the pixel range drawn for the i-th character (containing only that character's stroke pixels) and Ω_0 denotes the background pixels outside all character pixels.
I[x, y] denotes the RGB average luminance value of pixel [x, y] in the image, |Ω_i| the number of background pixels covered by the i-th character, and D(i) the average luminance of the background pixels covered by the i-th character:
D(i) = (1 / |Ω_i|) · Σ_{[x,y] ∈ Ω_i} I[x, y]
The rendering function for character generation uses RGBA coding: [0,0,0,0] represents a colorless transparent pixel, [0,0,0,1] a black pixel, and [1,1,1,1] a white pixel.
2-2) generating random speckles on the time-character stroke pixels using a Perlin noise map;
where η is the generated noise map and η[x, y] ∈ {0, 1}.
The attenuation control coefficient satisfies k ≥ 0; the larger k is, the more pronounced the attenuation effect.
In this embodiment k = 8.
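The Perlin-noise formula itself is not reproduced here; as a sketch of the speckling step, the following uses the third-party noise package (pnoise2) to build a binary map η and erase stroke pixels. The mapping from the coefficient k to a threshold is an assumption made for illustration.

```python
import numpy as np
from noise import pnoise2  # third-party Perlin noise implementation

def speckle_strokes(char_rgba, k=8, scale=0.1):
    """Knock out random stroke pixels with a binarized Perlin noise map eta.
    k >= 0 is the attenuation coefficient (k = 8 in the embodiment); in this
    sketch a larger k lowers the threshold so more of each stroke is erased."""
    a = np.asarray(char_rgba).copy()               # (H, W, 4) RGBA array
    h, w = a.shape[:2]
    raw = np.array([[pnoise2(x * scale, y * scale) for x in range(w)]
                    for y in range(h)])
    eta = (raw > np.percentile(raw, 100 - 5 * k)).astype(np.uint8)  # eta[x,y] in {0,1}
    a[..., 3] = np.where(eta == 1, 0, a[..., 3])   # erase stroke alpha where eta = 1
    return a
```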
2-3) applying transparency processing to the speckled time characters and superimposing them on a random night background image as a training input sample;
The time-character pixels are transparency-blended with the background pixels b using the formula:
f'' = (1.0 - λα) ∘ b + λf'
where α denotes the character mask whose value is 1 where the transparency channel of f is 1, and λ is the blending parameter; the smaller λ is, the more transparent and the less visible the characters in the output image.
In this embodiment λ = 0.9.
As shown in fig. 2, the training-sample time-character strokes are transparent and carry a speckled-defect effect, simulating the way the transparent time-character strokes of an image to be detected mix with background pixels in practical applications, as shown in fig. 3; typical examples are the partial stroke areas of '0', '3' and '5'.
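A per-pixel sketch of the blending formula above follows; here f' is taken as α·f (so non-stroke pixels contribute nothing), which is an assumption about the form of f'.

```python
import numpy as np

def blend_transparent(char_rgba, background_rgb, lam=0.9):
    """f'' = (1 - lam * alpha) o b + lam * f', with lam = 0.9 as in the
    embodiment; smaller lam makes the superimposed strokes more transparent."""
    f = char_rgba[..., :3].astype(np.float32) / 255.0
    alpha = char_rgba[..., 3:4].astype(np.float32) / 255.0  # 1 on stroke pixels
    b = background_rgb.astype(np.float32) / 255.0
    out = (1.0 - lam * alpha) * b + lam * alpha * f          # f' = alpha * f here
    return (out * 255).astype(np.uint8)
```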
2-4) taking the text form of the time characters from step 2-1) and applying ignore-and-force-replace processing to characters in non-standard time formats, yielding recognition target samples matched to step 2-3);
Γ={"0","1","2","3","4","5","6","7","8","9","-",":"}
Γ 1 = year and month
Γ 2 = time and minute
Γ 3 =U-Γ-Γ 1 -Γ 2
Γ denotes the set of supported standard characters, Γ 1 Indicating that a character set should be replaced with "-",Γ 2 indicating the character set, Γ, which should be replaced by 3 Representing all unsupported character sets, including chinese characters.
The character replacement formulas are as follows:
s' = replace(s, Γ_1, "-")
s'' = replace(s', Γ_2, ":")
s''' = delete(s'', Γ_3)
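A Python sketch of this target-text normalization follows; the exact membership of Γ_1 and Γ_2 (the Chinese date and time unit characters) is an assumption, since the text only names them as year/month and hour/minute sets, and keeping the space separator is likewise assumed.

```python
GAMMA  = set("0123456789-:")   # supported standard characters
GAMMA1 = {"年", "月"}           # date-unit characters -> "-" (assumed membership)
GAMMA2 = {"时", "分"}           # time-unit characters -> ":" (assumed membership)

def normalize_label(s):
    """s -> s''': replace date/time unit characters, then drop every
    character outside the supported set (the patent's Gamma_3)."""
    s1 = "".join("-" if c in GAMMA1 else c for c in s)     # s'   = replace(s,  G1, "-")
    s2 = "".join(":" if c in GAMMA2 else c for c in s1)    # s''  = replace(s', G2, ":")
    return "".join(c for c in s2 if c in GAMMA or c == " ")  # s''' = delete(s'', G3)

# e.g. normalize_label("2020年05月19日 23时58分41秒") -> "2020-05-19 23:58:41"
```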
step iii comprises the following subdivision steps:
3-1) using the CTC coding scheme, recording the matching probability vector between each time-character feature decoded from the feature map and the target time-character text;
p[i, c]
where i indexes the i-th character of the predicted output string and p[i, c] is the probability that the i-th character is predicted as character c; for example, p[3, "8"] is the probability that the 3rd character is predicted as "8".
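A sketch of greedy CTC decoding that also records p[i, c] for each emitted character follows (the tensor shapes are assumptions):

```python
import torch.nn.functional as F

def ctc_greedy_decode(logits, blank=0):
    """Greedy CTC decode over (T, num_classes) logits, keeping the full
    probability vector p[i] for every emitted character i, so that
    p[i][c] matches the patent's p[i, c]."""
    probs = F.softmax(logits, dim=-1)
    best = probs.argmax(dim=-1)
    chars, p = [], []
    prev = blank
    for t in range(probs.size(0)):
        c = int(best[t])
        if c != blank and c != prev:   # CTC collapse: drop blanks and repeats
            chars.append(c)
            p.append(probs[t])         # probability vector for this character
        prev = c
    return chars, p
```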
3-2) when the matching probability vectors are decoded and output, superimposing the LSTM's internal hidden-layer vector on the processed feature map to generate an attention weight map, enhancing the features of the time character at the corresponding position;
where W_v, W_h and W̃ denote linear transformations to be learned, v the feature map, N_{i,j} the eight-neighborhood of coordinate (i, j) in the feature map (the eight cells surrounding the center cell of a 3×3 grid), and h_t the hidden-layer vector corresponding to the t-th character; a_{i,j} is the attention value at coordinate (i, j), and all a_{i,j} together constitute the attention weight map A.
This corresponds to the processing portion within the dashed box shown in fig. 4.
3-3) in the attention module, reducing the attention weight map to an attention vector and simply adding it to the matching probability vector to obtain the final target-character matching probability vector;
as shown in FIG. 4, the Attention module outputs an Attention vector.
3-4) the time-character recognition model is obtained when the overall matching probability is optimal.
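As an illustration of steps 3-2) and 3-3), a compact PyTorch sketch of the attention module follows. The exact linear maps are not reproduced in this text, so the layer shapes here are assumptions; the 3×3 convolution stands in for the eight-neighborhood N_{i,j}, and the final addition follows the additive fusion of attention vector and CTC matching probabilities described above.

```python
import torch
import torch.nn as nn

class TwoDAttention(nn.Module):
    """Combine the LSTM hidden vector h_t with the feature map v to build an
    attention weight map A, reduce it to an attention vector, and add the
    result to the CTC matching probability vector."""
    def __init__(self, feat_ch, hidden, num_classes):
        super().__init__()
        self.w_v = nn.Conv2d(feat_ch, hidden, 3, padding=1)  # W_v over the 8-neighborhood
        self.w_h = nn.Linear(hidden, hidden)                 # W_h on h_t
        self.score = nn.Conv2d(hidden, 1, 1)                 # scalar attention score
        self.out = nn.Linear(feat_ch, num_classes)

    def forward(self, v, h_t, ctc_probs):
        # v: (B, C, H, W); h_t: (B, hidden); ctc_probs: (B, num_classes)
        e = torch.tanh(self.w_v(v) + self.w_h(h_t)[:, :, None, None])
        a = torch.softmax(self.score(e).flatten(1), dim=1)   # attention weight map A
        ctx = (v.flatten(2) * a.unsqueeze(1)).sum(-1)        # map -> attention vector
        return ctc_probs + self.out(ctx)                     # final matching probabilities
```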
Step iv comprises the following subdivision steps:
4-1) collecting an image to be detected from the video surveillance equipment at night and inputting it to the recognition model;
In this embodiment, the night period is 20:00 to 04:00.
4-2) when the number of recognized numeric characters is less than the standard 14 (the standard format YYYY-MM-DD hh:mm:ss contains 14 digits), judging that verification is impossible and forcibly converting the output to a specific time value;
The night image to be detected is recognized with the model from 3-4), outputting a character string s.
z = |delete(s, U - N)|
where z is the number of numeric characters in the recognized output, N the set of numeric characters, and U - N the set of all characters other than digits.
When z < 14, verification is judged impossible and the output is converted to a specific time value; this embodiment uniformly designates the time information "2000-01-01 00:00:00" and outputs it.
4-3) for recognition outputs reaching 14 or more digits, the connector symbols "-" and ":" are filled in or deleted;
s' = pre(s, c_dash, 2)
deletes the redundant "-" connectors, where the pre function deletes occurrences of the designated character beyond the first 2.
A missing "-" connector is filled in by insertion, where the ist function inserts a character at a designated position in string s, c_dash denotes the "-" character, and the d(c) function determines whether character c is a digit.
Similarly,
s' = re(pre(re(s), c_colon, 2))
deletes the redundant ":" characters, where the re function reverses the order of a character string; reversing, pruning and reversing back retains the last two ":" characters, and c_colon denotes the ":" character.
m = re(s)
s' = re(n)
complete the ":" connector, where n is m with the missing ":" inserted; the two reversal operations make the completion take the last ":" as its reference.
As shown in FIG. 5, 5(a) shows the result after a missing connector symbol is filled in, and 5(b) the result after a redundant connector symbol is deleted.
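The pruning side of this normalization can be sketched as follows, with pre and re implemented as defined above (keep the first n occurrences; reverse a string); the fill-in case is left out because its insertion formula is not reproduced in this text.

```python
def pre(s, ch, n):
    """Keep only the first n occurrences of ch in s and delete the rest."""
    out, seen = [], 0
    for c in s:
        if c == ch:
            seen += 1
            if seen > n:
                continue       # drop redundant occurrences
        out.append(c)
    return "".join(out)

def re_(s):
    """Reverse the string (the patent's re function)."""
    return s[::-1]

def prune_connectors(s):
    """Keep the first two '-' and, via double reversal, the last two ':'."""
    s = pre(s, "-", 2)               # s'  = pre(s, c_dash, 2)
    return re_(pre(re_(s), ":", 2))  # s'' = re(pre(re(s), c_colon, 2))
```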
4-4) completing time segmentation and correction according to the logical constraints of "year-month-day" and "hour-minute-second", combined with the matching probabilities obtained in 3-3), and converting to the final time value.
N_1 = { c_1 | (10·c_1 + c_2) ∈ [a, b], c_1, c_2 ∈ {0, 1, ..., 9} }
This is the correction method for a two-digit number, where [a, b] is the valid range of the two-digit field, c_1 and c_2 are the tens and units digits of the current prediction, and p[i, c_1] is the probability that the i-th character is predicted as c_1. The correction process searches the reasonable digit set N_1 for the predicted character with the highest probability as the correction result. Once c_1 is corrected and taken as the correct tens digit, the corresponding reasonable digit set N_2 is determined and c_2 is corrected in the same way.
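A sketch of this two-digit correction; p[i] is taken as the probability vector recorded in 3-3) for output character i, indexed by digit class (an assumed indexing).

```python
def correct_two_digit(p, i, a, b):
    """Correct the digits at positions i (tens) and i+1 (units) of a
    two-digit field constrained to the range [a, b]."""
    n1 = {c1 for c1 in range(10)
          if any(a <= 10 * c1 + c2 <= b for c2 in range(10))}
    c1 = max(n1, key=lambda c: p[i][c])        # most probable valid tens digit
    n2 = {c2 for c2 in range(10) if a <= 10 * c1 + c2 <= b}
    c2 = max(n2, key=lambda c: p[i + 1][c])    # then correct the units digit
    return c1, c2

# e.g. the month field is corrected with correct_two_digit(p, i, 1, 12)
```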
After verification, the correct time value is finally output, as shown in fig. 6.
At this point, training of the night character recognition model is complete; the night image to be detected is input and the time-character recognition result is obtained, realizing the technical scheme of the invention.
In this embodiment, recognition results on 8215 random night images were evaluated, achieving an accuracy (full-match rate) of 91.86%.
The method achieves the desired application effect.
It will be clear to those skilled in the art that the specific values of the above parameters or thresholds can be adjusted according to the strictness of the sample training method and the specification implementation, and do not limit the present invention.
Finally, it should be noted that the above-mentioned embodiments are only preferred embodiments of the present invention, and not intended to limit the present invention, and although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications and equivalents can be made in the technical solutions described in the foregoing embodiments, or some technical features of the present invention may be substituted. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (3)
1. A character recognition method for transparent time characters superimposed in a night image, characterized by comprising the following steps:
step i, improving the training method of the CRNN general text recognition network;
step ii, making CRNN night-image transparent-time-character training samples;
step iii, introducing an attention module to strengthen training of the night-image transparent-time-character recognition model;
step iv, recognizing the transparent time characters in the night image to be detected and checking the reasonableness of the output value;
wherein in step i, the specific steps of improving the training method of the CRNN general text recognition network include:
1-1) in the feature-extraction stage of the ResNet classification backbone, adopting 3 downsampling layers to retain more horizontal character features;
1-2) when extracting character-height features, using max pooling;
1-3) selecting a single-layer LSTM and using its output hidden-layer vectors for the attention-vector calculation;
and in step iii, the specific steps of introducing the attention module to strengthen training of the night-image transparent-time-character recognition model include:
3-1) using the CTC coding scheme, recording the matching probability vector between each time-character feature decoded from the feature map and the target time-character text;
3-2) when the matching probability vectors are decoded and output, superimposing the LSTM's internal hidden-layer vector on the processed feature map to generate an attention weight map, enhancing the features of the time character at the corresponding position;
3-3) in the attention module, reducing the attention weight map to an attention vector and adding it to the matching probability vector to obtain the final target-character matching probability vector;
3-4) obtaining the time-character recognition model when the overall matching probability is optimal.
2. The method for recognizing transparent time characters superimposed in a night image according to claim 1, wherein the step ii of making the CRNN night-image transparent-time-character training samples comprises the specific steps of:
2-1) generating time characters of black or white color, multiple fonts and multiple time formats on a transparent-background image using a character pixel rendering function;
2-2) generating random speckles on the time-character stroke pixels using a Perlin noise map;
2-3) applying transparency processing to the speckled time characters and superimposing them on a randomly generated night background image as a training input sample;
2-4) taking the text form of the time characters from step 2-1) and applying ignore-and-force-replace processing to time characters in non-standard time formats, yielding recognition target samples matched to step 2-3).
3. The method for recognizing transparent time characters superimposed in a night image according to claim 2, wherein the step iv of recognizing the time characters in the night image to be detected and checking the reasonableness of the output value comprises the specific steps of:
4-1) collecting an image to be detected from the video surveillance equipment at night and inputting it to the recognition model;
4-2) when the number of recognized numeric characters falls short of the standard, judging that verification is impossible and forcibly converting the output to a specific time value;
4-3) when the number of recognized numeric characters meets or exceeds the standard, filling in missing or deleting redundant time connector symbols "-" and ":";
4-4) completing time segmentation and correction according to the logical constraints of "year-month-day" and "hour-minute-second", combined with the matching probabilities recorded in 3-3), and converting to the final time value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010422410.0A CN111898606B (en) | 2020-05-19 | 2020-05-19 | Night imaging identification method for superimposing transparent time characters in video image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010422410.0A CN111898606B (en) | 2020-05-19 | 2020-05-19 | Night imaging identification method for superimposing transparent time characters in video image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111898606A (en) | 2020-11-06
CN111898606B (en) | 2023-04-07
Family
ID=73207553
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010422410.0A Active CN111898606B (en) | 2020-05-19 | 2020-05-19 | Night imaging identification method for superimposing transparent time characters in video image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111898606B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112381057A (en) * | 2020-12-03 | 2021-02-19 | 上海芯翌智能科技有限公司 | Handwritten character recognition method and device, storage medium and terminal |
CN113157322A (en) * | 2020-12-29 | 2021-07-23 | 深圳微步信息股份有限公司 | Method and system for automatically displaying software compiling time |
CN113065550B (en) * | 2021-03-12 | 2022-11-11 | 国网河北省电力有限公司 | Text recognition method based on self-attention mechanism |
CN113065561A (en) * | 2021-03-15 | 2021-07-02 | 国网河北省电力有限公司 | Scene text recognition method based on fine character segmentation |
CN113723337B (en) * | 2021-09-07 | 2024-09-24 | 武汉东智科技股份有限公司 | Monitoring image place information identification method based on DDT deep nerve model structure |
CN113869426B (en) * | 2021-09-29 | 2024-07-26 | 北京搜狗科技发展有限公司 | Formula identification method and device |
CN113902804B (en) * | 2021-10-14 | 2024-10-25 | 北京达佳互联信息技术有限公司 | Method, device, equipment and medium for identifying transparent area in image |
- 2020-05-19: CN application CN202010422410.0A granted as patent CN111898606B (en), status Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102663675A (en) * | 2012-03-28 | 2012-09-12 | 广东威创视讯科技股份有限公司 | Method and apparatus for scaling image processed by character superimposition |
CN109344831A (en) * | 2018-08-22 | 2019-02-15 | 中国平安人寿保险股份有限公司 | A kind of tables of data recognition methods, device and terminal device |
CN110378334A (en) * | 2019-06-14 | 2019-10-25 | 华南理工大学 | A kind of natural scene text recognition method based on two dimensional character attention mechanism |
CN110533030A (en) * | 2019-08-19 | 2019-12-03 | 三峡大学 | Sun film image timestamp information extracting method based on deep learning |
CN110781881A (en) * | 2019-09-10 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Method, device, equipment and storage medium for identifying match scores in video |
CN111027562A (en) * | 2019-12-06 | 2020-04-17 | 中电健康云科技有限公司 | Optical character recognition method based on multi-scale CNN and RNN combined with attention mechanism |
Non-Patent Citations (4)
Title |
---|
Improving CRNN with EfficientNet-like feature extractor and multi-head attention for text recognition;Dinh Viet Sang 等;《SoICT 2019》;20191206;全文 * |
Natural Scene Text Recognition Based on Encoder-Decoder Framework;LING-QUN ZUO 等;《IEEE Access》;20190514;全文 * |
Natural scene text recognition algorithm based on Attention-CTC;He Wenjie et al.;Electronic Science and Technology (电子科技);20191215;Vol. 32, No. 12;full text * |
Scene text recognition based on attention-enhanced networks;Xu Fuyong;Modern Computer (现代计算机);20200325, No. 09;full text * |
Also Published As
Publication number | Publication date |
---|---|
CN111898606A (en) | 2020-11-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
PE01 | Entry into force of the registration of the contract for pledge of patent right | ||
PE01 | Entry into force of the registration of the contract for pledge of patent right |

Denomination of invention: A Nighttime Imaging Recognition Method for Overlaying Transparent Time Characters in Video Images
Effective date of registration: 20230615
Granted publication date: 20230407
Pledgee: Industrial Bank Limited by Share Ltd. Wuhan branch
Pledgor: WUHAN EASTWIT TECHNOLOGY CO.,LTD.
Registration number: Y2023420000235
Denomination of invention: A Nighttime Imaging Recognition Method for Overlaying Transparent Time Characters in Video Images Effective date of registration: 20230615 Granted publication date: 20230407 Pledgee: Industrial Bank Limited by Share Ltd. Wuhan branch Pledgor: WUHAN EASTWIT TECHNOLOGY CO.,LTD. Registration number: Y2023420000235 |