CN110533030B - Deep learning-based sun film image timestamp information extraction method - Google Patents
- Publication number
- CN110533030B (application CN201910765276.1A)
- Authority
- CN
- China
- Prior art keywords
- picture
- character
- image
- time stamp
- projection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
A deep learning-based method for extracting timestamp information from solar film images comprises the following steps. Step 1: locating and cropping the timestamp information area in the solar chromosphere film image; the timestamp information is the year, month, day, hour and minute information that represents the shooting time in the solar chromosphere film image. Step 2: single character segmentation, namely, further segmenting the characters in the timestamp information area to obtain single characters. Step 3: character recognition, namely, training a network with a large number of samples, then recognizing the single characters obtained by segmentation in step 2 with the trained network, and integrating and storing the recognition results. With this method, a machine automatically recognizes the digital timestamp in solar observation film images and outputs the recognized time information. This reduces the workload of manual recognition and time entry, speeds up the digitization of large batches of film data, and makes this valuable historical data easier to use for solar physics research.
Description
Technical Field
The invention relates to the technical field of solar observation image processing, in particular to a deep learning-based solar film image timestamp information extraction method.
Background
The solar chromosphere is a layer of atmosphere above the photosphere. As the transition region from the photosphere to the corona, its magnetic field is unstable, and violent flare eruptions often occur there. The radiation of solar flares in the chromosphere often appears as elongated ribbons on both sides of the polarity inversion line (PIL), which is considered evidence of the typical morphology of magnetic reconnection. To study flare eruptions, researchers need to photograph and record the solar chromosphere over long periods. Because much of the data is historical, the time information of a large batch of chromosphere images is still present only as part of the image and has not been converted into digital information that a computer can read directly. This greatly inconveniences scientific research that uses these materials.
Digitizing the shooting time of the images both deepens the mining of useful information from the historical data and greatly reduces the retrieval workload of researchers, so that they can obtain more valuable results from the data; this greatly benefits the progress of scientific research.
Historical solar observation images are mostly stored on film, with the shooting time printed on the film. For researchers to use these image data effectively, the timestamp information must be extracted from the film. The number of pictures is huge, and manual recognition and extraction are time-consuming and laborious. Whether a computer can automatically recognize the timestamp information in the images is therefore critical to whether the data can be used efficiently.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a deep learning-based solar film image timestamp information extraction method, in which a machine automatically recognizes the digital timestamps in solar observation film images and outputs the recognized time information, reducing the workload of manual recognition and time entry. The digitization of the film data can thus be accelerated, and this valuable historical data can be used more conveniently for solar physics research.
The technical scheme adopted by the invention is as follows:
the deep learning-based solar film image timestamp information extraction method comprises the following steps:
step 1: locating and cropping the timestamp information area in the solar chromosphere film image;
the timestamp information is the year, month, day, hour and minute information that represents the shooting time in the solar chromosphere film image;
step 2: single character segmentation, namely, further segmenting the characters in the timestamp information area to obtain single characters;
step 3: character recognition, namely, training a network with a large number of samples, then recognizing the single characters obtained by segmentation in step 2 with the trained network, and integrating and storing the recognition results.
The step1 comprises the following steps:
step 1.1, a sun sphere cutting step based on vertical projection:
the image is accumulated along the vertical direction to obtain a 1×n vector. If the size of the image is m×n and the pixel value of the pixel in row i and column j is $f_{ij}(x,y)$, then the projection in the vertical direction is:
$$S_{1j}=\sum_{i=1}^{m} f_{ij}(x,y)$$
where $S_{1j}$ represents the sum of the pixels in the j-th column of the image, and the size of $S_{1j}$ is 1×n. By calculating the projection of the picture in the vertical direction, the position of the solar sphere can be determined. In the vector $S_{1j}$, the interval [400, 1800] is the projection of the solar sphere. Because the sun is symmetrical, once the position of this projection within the vector $S_{1j}$ is known, the center of the sun can be located, and the part of the picture containing the solar sphere is then removed according to the pixel length occupied by the solar sphere in the vertical direction.
Step 1.2, judging the position of the timestamp and correcting the overturn based on the variance:
the variance of the picture containing the timestamp is much greater than that of the picture without it, which identifies the picture in which the timestamp is located. Once that picture is known, it must be rotated for correction: if it is the left picture, it is rotated 90 degrees clockwise; otherwise it is rotated 90 degrees counterclockwise.
Step 1.3, finely dividing the timestamp character area based on a projection method:
for a picture of size m×n in which the pixel value of the pixel in row i and column j is $x_{ij}$, the projections in the horizontal direction and the vertical direction are respectively:
$$S_{i1}=\sum_{j=1}^{n}x_{ij},\qquad S_{1j}=\sum_{i=1}^{m}x_{ij}$$
where $S_{1j}$ represents the sum of the pixels in the j-th column of the image and has size 1×n, and $S_{i1}$ represents the sum of the pixels in the i-th row of the image and has size m×1. By calculating the projections of the picture in the horizontal and vertical directions, the specific position of the timestamp area can be determined, so that the picture is segmented accurately.
Step 1.4, cutting the timestamp area:
the picture is cut based on its projections in the horizontal and vertical directions. To avoid breaking the continuity of the picture, the first point larger than the mean is taken as the starting point, the last point larger than the mean is taken as the end point, and the whole image between the starting point and the end point is retained. Assuming that the original image S has size m×n and the cut picture P has size m'×n', the cutting formula is:
P=S(a:b,c:d),(a,c>1,b<m,d<n)
wherein:
$$a=\min\{i \mid x_i>\bar{x}\},\quad b=\max\{i \mid x_i>\bar{x}\},\quad c=\min\{j \mid y_j>\bar{y}\},\quad d=\max\{j \mid y_j>\bar{y}\}$$
where S(a:b,c:d) represents rows a to b and columns c to d of picture S; $x_i$ represents the value at position i of the horizontal projection and $\bar{x}$ the mean of all points in the horizontal projection; $y_j$ represents the value at position j of the vertical projection and $\bar{y}$ the mean of all points in the vertical projection. Thus a is the smallest position in the horizontal projection vector at which x is greater than $\bar{x}$, b is the largest such position, and c and d are defined in the same way for the vertical projection.
In the step 2, the process of single character segmentation is as follows:
first, the background of the picture is removed by a top-hat operation, then noise is removed by a local binarization algorithm, and finally the character regions are extracted by a connected domain algorithm. The algorithm assumes by default that the characters are white; if the characters are black, no valid region can be extracted by the connected domain step, so if no valid region exists, the algorithm returns to the local binarization step and inverts the colors of the locally binarized picture.
The background is removed by the top-hat operation, whose principle is to take the difference between the original image and the result of its opening operation. After the top-hat operation, part of the background noise is eliminated from the picture and the characters in the image are highlighted.
Noise is removed with the Sauvola local binarization algorithm. Character cutting can then be completed simply by extracting the connected domains that match the size of a character. Connected domain extraction is also fast, which saves a lot of program running time.
Character regions are extracted with a connected domain algorithm: the picture is first locally binarized, and then connected domains that are too large or too small are removed to eliminate part of the interference. The effect is better when the local binarization threshold value is 16×16 and the area of a character's connected domain is in [500, 5000]. After this processing some invalid regions remain, so the invention further deletes them by checking whether each connected domain matches the height, width and aspect ratio of a standard character. According to the statistics, the height of a character is in [90, 110], its width is in [10, 60], and its aspect ratio is not less than 1.
Each single character area can then be cut out of the original image according to the position of its connected domain in the binary image. To keep the resulting pictures consistent in size, each picture is padded and resized to a 28×28 standard picture.
In the step 3, the character recognition process is as follows:
a convolutional neural network from deep learning is used for character recognition. The character recognition network built by the invention comprises two convolution layers, two pooling layers and one fully connected layer. The first convolution layer convolves the 28×28 input character picture with 6 different 5×5 convolution kernels; after this layer, the original character picture becomes a 24×24×6 feature map. The first pooling layer applies a pooling function with a 2×2 sliding window to the output of the first convolution layer, producing a 12×12×6 feature map. The second convolution layer applies 12 different 5×5 convolution kernels to the pooled feature map, and the extracted feature map has size 8×8×12. The second pooling layer pools the output of the second convolution layer, giving a 4×4×12 feature map. The feature map after the second pooling operation is fed to the fully connected layer to obtain the feature vector of the character. Finally, the feature vector is classified and mapped to the actual digit, completing the recognition of the time characters in the timestamp.
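The feature map sizes quoted above can be checked with a few lines of arithmetic. The sketch below assumes unpadded convolutions with stride 1 and non-overlapping pooling windows, which the patent does not state explicitly but which is the only reading consistent with the sizes it gives; the function names are illustrative.

```python
def conv_output(size, kernel, stride=1):
    """Spatial size after an unpadded convolution (assumed: no padding)."""
    return (size - kernel) // stride + 1


def pool_output(size, window):
    """Spatial size after non-overlapping pooling (assumed: stride == window)."""
    return size // window


def feature_map_shapes(input_size=28):
    """Trace (layer, height, width, channels) through the described network."""
    shapes = []
    size = conv_output(input_size, 5)
    shapes.append(("conv1", size, size, 6))
    size = pool_output(size, 2)
    shapes.append(("pool1", size, size, 6))
    size = conv_output(size, 5)
    shapes.append(("conv2", size, size, 12))
    size = pool_output(size, 2)
    shapes.append(("pool2", size, size, 12))
    return shapes


shapes = feature_map_shapes()
# Length of the vector handed to the fully connected layer.
flattened = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
```

Under these assumptions the fully connected layer receives 4×4×12 = 192 values per character.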
The trained convolutional neural network recognizes the single characters of the time information in the solar chromosphere film images; the recognized characters are combined in order, matched to the file name of the original image, and written into an Excel table for later manual checking and database construction.
The method further comprises step 4, manual date checking:
for the chromosphere images taken within a period of time, the shooting date of each picture can be calculated automatically once the year, month and day of the first image are entered. Occasionally there are dates in between on which no solar observation was made, in which case the incorrect automatically generated dates must be corrected by manual checking.
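The patent does not say how the dates are derived from the first one. One plausible rule, shown here purely as an assumption, is to advance the date by one day whenever the recognized time of day goes backwards between consecutive pictures. The function name `assign_dates` and the sample times are hypothetical.

```python
from datetime import date, timedelta


def assign_dates(first_date, times):
    """Give each (hour, minute) a date, starting a new day when the clock goes backwards."""
    dates = []
    current = first_date
    previous = None
    for hour, minute in times:
        stamp = hour * 60 + minute
        if previous is not None and stamp < previous:
            # Assumed rollover rule: an earlier time means the next observing day.
            current += timedelta(days=1)
        dates.append(current)
        previous = stamp
    return dates


# Hypothetical recognized times, in capture order.
times = [(14, 30), (15, 10), (8, 5), (9, 0), (7, 45)]
dates = assign_dates(date(1960, 3, 1), times)
```

A rule like this always advances by exactly one day, so it mislabels every picture taken after a skipped observing day, which is the case the manual check is meant to catch.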
The invention discloses a deep learning-based sun film image timestamp information extraction method, which has the following technical effects:
1) The invention provides a deep learning-based timestamp information extraction method for systematically recognizing and organizing the time information of more than 7 million scanned and digitized solar chromosphere film images taken by the U.S. National Solar Observatory from 1956 to 2003. First, the timestamp information area in the image is located and segmented; second, noise interference is removed by top-hat operation, local binarization and connected domain screening, and the timestamp area image is segmented into characters; then, 10000 classified character pictures are selected to train a convolutional neural network, and the recognition performance of the resulting network is tested; finally, the trained network is used to recognize the timestamp information in 10000 chromosphere images in batch, and the recognition results are analyzed quantitatively. The results show that the method can automatically, accurately and quickly locate and recognize the timestamp information in scanned solar film images.
2) A convolutional neural network method from deep learning is used to study the recognition of time information in the solar chromosphere film pictures taken over nearly 50 years by the U.S. National Solar Observatory. The results show that the method is well suited to character recognition in these pictures: the recognition accuracy exceeds 98 percent and the average processing time per picture does not exceed 0.1 second, which meets the practical requirements on recognition speed and quality. The method is also highly portable and is a useful reference for solving problems of the same type later.
Drawings
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
fig. 1 is a block diagram of a general convolutional neural network.
Fig. 2 is a time stamp character recognition flow chart.
FIG. 3 (a) is a first image of a solar ball film with time stamp information;
FIG. 3 (b) is a second image of a solar ball film with time stamp information;
fig. 3 (c) is a solar ball film image three with time stamp information.
Fig. 4 is a schematic diagram of a timestamp information area in an image.
Fig. 5 is a plot of the vector $S_{1j}$.
Fig. 6 is a view obtained by removing the solar spherical surface.
Fig. 7 is a picture including a time stamp.
FIG. 8 (a) is a projection vector diagram I;
fig. 8 (b) is a projection vector diagram two.
Fig. 9 is a time stamp area cutting result diagram.
Fig. 10 is a flowchart of a single character extraction algorithm.
Fig. 11 (a) is an intensity distribution diagram of a character (before top hat operation);
fig. 11 (b) is an intensity distribution diagram of a character (after top hat operation).
Fig. 12 is a binarized picture.
Fig. 13 is a binary image after noise removal.
Fig. 14 is a binary image after the invalid region is eliminated.
Fig. 15 is a character cutting result diagram.
Fig. 16 is a diagram of a character recognition convolutional neural network.
Fig. 17 is a date check graphical interface diagram.
Detailed Description
The following describes embodiments of the present invention by way of specific examples:
a general convolutional neural network architecture includes an input layer, convolution layers, pooling layers, fully connected layers, an output layer and so on; its structure is shown in fig. 1. The feature vector produced from the input data is classified by softmax regression at the output layer; when the input is character image data, the classification result of the output layer classifies the character image, which realizes character recognition. A convolutional neural network may contain several convolution, pooling and fully connected layers as needed; fig. 1 only represents the general form.
The timestamp information in the scanned solar chromosphere film images is extracted with a convolutional neural network (CNN); the process is divided into three main parts, as shown in fig. 2:
step 1, locating and cropping the timestamp information area of the picture;
step 2, single character segmentation, namely, further segmenting the characters in the timestamp information area to obtain single characters;
step 3, training a network with a large number of samples, recognizing the segmented characters with the trained network, and integrating and storing the recognition results.
Step 1: locating and cropping the picture timestamp information area.
As shown in fig. 3 (a), 3 (b) and 3 (c), each original solar chromosphere film picture has a resolution of 1600×2048, and the timestamp information is placed on either the left or the right side of the picture; the position is not fixed. The character formats of the time information fall roughly into two types, as shown in fig. 3 (a) and 3 (b); the character patterns of the two types differ, so the pictures must be classified before the characters are recognized. In addition, the sharpness varies from picture to picture: most pictures are dark and their characters are hard to recognize, as shown in fig. 3 (c). Each picture therefore needs preprocessing. The invention needs only the timestamp characters of the whole picture, i.e. the part in the red box in fig. 4. The timestamp records the year, month, day, hour, minute and second at which the photo was taken, i.e. the part in the yellow box in fig. 4. Given the time precision of the observations, only the year, month, day, hour and minute are needed. Since the location of the time information is not fixed, the timestamp area must first be located and cut out. The timestamp lies on the left or right side of the picture and not on the solar sphere, so the part containing the sun is cut off first, and the position of the timestamp is then found.
Step 1.1: first is a sun sphere removal step based on vertical projection.
The vertical projection method examines the distribution of pixels in the vertical direction according to the pixel information in an image. The image is accumulated along the vertical direction to obtain a 1×n vector. Assuming that the size of the picture is m×n and the pixel value of the pixel in row i and column j is $f_{ij}(x,y)$, the projection in the vertical direction is:
$$S_{1j}=\sum_{i=1}^{m} f_{ij}(x,y)$$
where $S_{1j}$ represents the sum of the pixels in the j-th column of the image, and the size of $S_{1j}$ is 1×n. By calculating the projection of the picture in the vertical direction, the position of the solar sphere can be determined. The vector $S_{1j}$ obtained by vertically projecting fig. 3 (a) is shown in fig. 5. As can be seen from fig. 5, the interval [400, 1800] of $S_{1j}$ is the projection of the solar sphere. Because the sun is symmetrical, once the position of this projection within the vector $S_{1j}$ is known, the center of the sun can be located, and the part of the picture containing the solar sphere is then removed according to the pixel length occupied by the solar sphere in the vertical direction. Fig. 6 shows the two small images obtained after removing the solar sphere.
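The projection step can be sketched in a few lines. The patent locates the solar sphere through its symmetry and its known pixel length; the sketch below swaps in a simpler criterion, keeping the columns whose projection exceeds the mean, and that criterion is an assumption of this example. The function names and the toy image are illustrative.

```python
def vertical_projection(image):
    """Column sums S_1j of a 2-D grayscale image given as a list of rows."""
    return [sum(column) for column in zip(*image)]


def bright_span(projection):
    """First and last index whose projection exceeds the mean (assumed criterion)."""
    mean = sum(projection) / len(projection)
    above = [j for j, value in enumerate(projection) if value > mean]
    return above[0], above[-1]


def split_off_sphere(image):
    """Return the left and right strips that remain after removing the bright columns."""
    start, end = bright_span(vertical_projection(image))
    left = [row[:start] for row in image]
    right = [row[end + 1:] for row in image]
    return left, right


# Toy 4 x 8 image: columns 2..5 play the role of the bright solar sphere.
toy = [
    [0, 1, 9, 9, 9, 9, 0, 2],
    [0, 0, 9, 9, 9, 9, 3, 0],
    [1, 0, 9, 9, 9, 9, 0, 0],
    [0, 0, 9, 9, 9, 9, 2, 1],
]
projection = vertical_projection(toy)
left_strip, right_strip = split_off_sphere(toy)
```

The two strips correspond to the two small images of fig. 6, one of which carries the timestamp.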
Step 1.2: the timestamp position and rollover correction are determined based on the variance.
A picture is essentially a matrix comprising a plurality of pixels, the magnitude of which is reflected in the color in the picture. For example: in the binary image, "0" represents black, and "1" represents white. In fig. 6, not a binary image but a 256-level gray scale image. The size of each pixel represents the brightness level of the pixel, and 256 levels are added, so that the higher the number of levels, the higher the brightness at the pixel position. As can be seen from fig. 6, in the picture containing the time stamp, the time stamp is represented by the high-brightness dot, whereas in the picture not containing the time stamp, most of the pixels are darker in brightness. The method adopts a time stamp position judging method based on variance, namely judging which picture the time stamp is on according to the variance of pixel values in an image matrix. For a picture of size m×n, the variance calculation formula is expressed as follows.
Wherein x is ij Representing pixel points in [ m, n ]]The pixel value at which it is located,represents the average value of the pixel values of the total, and n x m represents the total number of pixels.
Obviously, the variance of the picture containing the timestamp is much greater than that of the picture without it, which identifies the picture in which the timestamp is located. Once that picture is known, it must be rotated for correction: if it is the left picture, it is rotated 90 degrees clockwise; otherwise it is rotated 90 degrees counterclockwise. The image rotation formula is as follows:
$$\begin{bmatrix}x'\\ y'\end{bmatrix}=\begin{bmatrix}\cos\beta & -\sin\beta\\ \sin\beta & \cos\beta\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix}$$
where x, y represent the original pixel position, x', y' represent the pixel position after the rotation, and β represents the counterclockwise rotation angle. Taking fig. 6 as an example, the image after variance judgment and rotation is shown in fig. 7.
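A minimal sketch of the variance test and the 90-degree correction follows. For right-angle rotations the general rotation formula reduces to a transpose plus a flip, which is what the sketch uses; the function names and the two toy strips are assumptions made for illustration.

```python
def pixel_variance(image):
    """Population variance of all pixel values in a 2-D list image."""
    pixels = [value for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    return sum((value - mean) ** 2 for value in pixels) / len(pixels)


def rotate_cw(image):
    """Rotate 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_ccw(image):
    """Rotate 90 degrees counterclockwise: transpose, then reverse the row order."""
    return [list(row) for row in zip(*image)][::-1]


def pick_and_rotate(left, right):
    """Keep the strip with the larger variance and rotate it as the text describes."""
    if pixel_variance(left) > pixel_variance(right):
        return rotate_cw(left)   # timestamp on the left strip
    return rotate_ccw(right)     # timestamp on the right strip


# Toy strips: the left one holds a single bright "timestamp" pixel.
left = [[0, 0], [0, 255]]
right = [[10, 10], [10, 10]]
upright = pick_and_rotate(left, right)
```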
As can be seen from fig. 7, the picture containing the timestamp still holds a lot of useless information, which affects the calculation speed. Given the huge number of pictures, these useless areas would cause unnecessary memory consumption.
Step 1.3: the invention uses horizontal and vertical projection to segment the timestamp area precisely.
The horizontal and vertical projection method examines the distribution of pixels in the horizontal and vertical directions respectively, according to the pixel information in an image. It is often used to locate a target region accurately for later segmentation. The image is accumulated along the horizontal and vertical directions to obtain two vectors. For a picture of size m×n in which the pixel value of the pixel in row i and column j is $x_{ij}$, the projections in the horizontal direction and the vertical direction are respectively:
$$S_{i1}=\sum_{j=1}^{n}x_{ij},\qquad S_{1j}=\sum_{i=1}^{m}x_{ij}$$
where $S_{1j}$ represents the sum of the pixels in the j-th column of the image and has size 1×n, and $S_{i1}$ represents the sum of the pixels in the i-th row of the image and has size m×1. By calculating the projections of the picture in the horizontal and vertical directions, the specific position of the timestamp area can be determined, so that the picture is segmented accurately. Taking fig. 7 as an example, the vectors projected in the horizontal and vertical directions are shown in fig. 8 (a) and fig. 8 (b).
Step 1.4: the time stamp area is cut.
The picture is cut based on its projections in the horizontal and vertical directions. As can be seen from fig. 8 (a) and 8 (b), the projection values in the horizontal and vertical directions are high where the timestamp area lies. Based on this feature, the timestamp area is segmented by retaining only the portion where the projection is larger than the mean. To avoid breaking the continuity of the picture, the first point larger than the mean is taken as the starting point, the last point larger than the mean is taken as the end point, and the whole image between the starting point and the end point is retained. Assuming that the original image S has size m×n and the cut picture P has size m'×n', the cutting formula is:
P=S(a:b,c:d),(a,c>1,b<m,d<n) (5)
wherein:
$$a=\min\{i \mid x_i>\bar{x}\},\quad b=\max\{i \mid x_i>\bar{x}\},\quad c=\min\{j \mid y_j>\bar{y}\},\quad d=\max\{j \mid y_j>\bar{y}\}$$
where S(a:b,c:d) represents rows a to b and columns c to d of picture S; $x_i$ represents the value at position i of the horizontal projection and $\bar{x}$ the mean of all points in the horizontal projection; $y_j$ represents the value at position j of the vertical projection and $\bar{y}$ the mean of all points in the vertical projection. Thus a is the smallest position in the horizontal projection vector at which x is greater than $\bar{x}$, b is the largest such position, and c and d are defined in the same way for the vertical projection. Taking fig. 7 as an example, the result after cutting is shown in fig. 9.
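The cropping rule can be illustrated on a toy image. The sketch below computes both projections, takes the first and last positions above the mean in each, and keeps everything in between; it uses 0-based indices instead of the 1-based indices of the formula, and the function names and toy data are illustrative assumptions.

```python
def projections(image):
    """Return (horizontal, vertical) projections: row sums S_i1 and column sums S_1j."""
    horizontal = [sum(row) for row in image]
    vertical = [sum(column) for column in zip(*image)]
    return horizontal, vertical


def span_above_mean(values):
    """Indices of the first and last entries that exceed the mean of the vector."""
    mean = sum(values) / len(values)
    above = [k for k, value in enumerate(values) if value > mean]
    return above[0], above[-1]


def crop_timestamp(image):
    """P = S(a:b, c:d), with a..b taken from the row sums and c..d from the column sums."""
    horizontal, vertical = projections(image)
    a, b = span_above_mean(horizontal)
    c, d = span_above_mean(vertical)
    return [row[c:d + 1] for row in image[a:b + 1]]


# Toy picture: an "H"-like bright pattern surrounded by a dark border.
toy = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 8, 0, 8, 0],
    [0, 0, 8, 8, 8, 0],
    [0, 0, 8, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
cropped = crop_timestamp(toy)
```

Column 3 of the toy picture has a projection below the mean but is still kept, because it lies between the first and last columns above the mean; this is the continuity requirement described above.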
Step 2: single character segmentation.
As can be seen from fig. 9, the date characters are small and easily blurred in places, so they are easily deleted as noise in later processing, which severely affects the result. Therefore only the hour and minute information is considered in the recognition process, and the date is filled in manually, which involves little work.
In practice, cutting out individual characters is the most difficult step. Fig. 9 was cut from a picture of relatively good quality. In practice, however, most pictures resemble fig. 3 (c): the character portions are blurred, and some cannot be recognized even by the human eye. Obtaining individual characters is much harder than obtaining the character area, and it is also the step that most easily affects the final recognition result. In addition, as shown in fig. 3 (a), 3 (b) and 3 (c), the character formats of the timestamp area differ, so the generality of the algorithm must be considered. The single character cutting flow is shown in fig. 10: first the background of the picture is removed by a top-hat operation, then noise is removed by a local binarization algorithm, and finally the character regions are extracted by a connected domain algorithm. The characters fall into two types, shown in white and in black respectively, as in fig. 3 (a) and 3 (b). The algorithm assumes by default that the characters are white; if the characters are black, no valid region can be extracted by the connected domain step, so if no valid region exists, the algorithm returns to the local binarization step and inverts the colors of the locally binarized picture.
Step 2.1: background removal is based on top hat operations. The algorithm principle is that the original image and the original image are subjected to difference in operation result. The algorithm can be described as:
topimg=tophat(img,element)=img-open(img,element) (6)
wherein topimg represents an original image, img represents a core for top cap operation and on operation.
Taking a 29×29 kernel applied to fig. 9 as an example, fig. 11 (a) and 11 (b) are the intensity distribution diagrams before and after the top-hat operation, respectively. As can be seen from fig. 11 (b), the top-hat operation eliminates part of the background noise and highlights the characters in the image. Although the noise cannot be removed completely, this reduces the workload of later operations and keeps excessive noise from affecting the local binarization result.
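Formula (6) can be tried out on a small image with a plain-Python opening. The sketch assumes a square kernel and shrinks the window at the image border, which are choices of this example and not details given in the patent; the toy image uses a 3×3 kernel where the text uses 29×29.

```python
def _window_op(image, size, op):
    """Apply op (min or max) over a size x size window, clipped at the borders."""
    radius = size // 2
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(op(window))
        result.append(out_row)
    return result


def opening(image, size):
    """Grayscale opening: erosion (min filter) followed by dilation (max filter)."""
    return _window_op(_window_op(image, size, min), size, max)


def top_hat(image, size):
    """topimg = img - open(img, element), as in formula (6)."""
    opened = opening(image, size)
    return [[p - o for p, o in zip(row, opened_row)]
            for row, opened_row in zip(image, opened)]


# Toy image: uniform background of 10 with one bright pixel that is smaller than the kernel.
toy = [[10] * 5 for _ in range(5)]
toy[2][2] = 50
result = top_hat(toy, 3)
```

The flat background is removed entirely, while the structure smaller than the kernel survives with its contrast against the background.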
Step 2.2: the noise removal is based on a Sauvola local binarization algorithm.
The Sauvola algorithm is explained as follows:
step 1: calculating the mean MEAN and the standard deviation STD of the pixels in the n×n neighborhood of the pixel f(x, y);
step 2: calculating the threshold T(x, y) of the pixel f(x, y) according to the formula:
$$T(x,y)=\mathrm{MEAN}\cdot\left[1+k\left(\frac{\mathrm{STD}}{N}-1\right)\right]$$
where k is a custom parameter with 0<k<1, and N is the dynamic range of the standard deviation;
the picture after processing with n=35 and k=0.08 is shown in fig. 12.
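The two Sauvola steps can be sketched directly from the definitions above, using the standard Sauvola threshold T = MEAN·(1 + k·(STD/N − 1)). The parameter values in the toy run (a 3×3 window, k = 0.2 and a dynamic range of 128) are illustrative assumptions and are not taken from the patent; the window is clipped at the image border, which is also a choice of this example.

```python
import math


def sauvola_threshold(mean, std, k, dynamic_range):
    """Sauvola threshold for one pixel: mean * (1 + k * (std / dynamic_range - 1))."""
    return mean * (1 + k * (std / dynamic_range - 1))


def sauvola_binarize(image, window, k, dynamic_range):
    """Binarize a 2-D list image; a pixel becomes 1 if it exceeds its local threshold."""
    radius = window // 2
    rows, cols = len(image), len(image[0])
    binary = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            patch = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            mean = sum(patch) / len(patch)
            std = math.sqrt(sum((v - mean) ** 2 for v in patch) / len(patch))
            threshold = sauvola_threshold(mean, std, k, dynamic_range)
            out_row.append(1 if image[r][c] > threshold else 0)
        binary.append(out_row)
    return binary


# Toy image: dark background with one bright pixel in the middle.
toy = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
binary = sauvola_binarize(toy, window=3, k=0.2, dynamic_range=128)
```

Because the threshold follows the local mean and contrast, the bright pixel is kept while its dark neighbors fall below their own local thresholds.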
From the binarized picture, it can be seen that the characters can be separated from the background, but the background still contains many noise regions that are either too large or too small. Character cutting can therefore be completed simply by extracting the connected domains that match the character size. Connected domain extraction is also fast, which saves considerable running time.
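A minimal sketch of Sauvola thresholding, using the standard form of the threshold, T = MEAN * (1 + k * (STD / R - 1)), is given below. The function name, the clipped window at the border, and the default values of `window`, `k` and `dyn_range` are illustrative assumptions and are not taken from the patent; white characters on a dark background are assumed, so a pixel is kept when it is brighter than its local threshold.

```python
import math
from typing import List

Image = List[List[float]]


def sauvola_binarize(img: Image, window: int = 15, k: float = 0.2,
                     dyn_range: float = 128.0) -> List[List[int]]:
    """Return a 0/1 picture: 1 where the pixel exceeds its local Sauvola threshold."""
    h, w, r = len(img), len(img[0]), window // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            values = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(values) / len(values)
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            # Sauvola threshold for this pixel (dyn_range plays the role of R).
            threshold = mean * (1 + k * (std / dyn_range - 1))
            row.append(1 if img[y][x] > threshold else 0)
        out.append(row)
    return out
```

Because the threshold follows the local mean and contrast, a uniformly dark background stays at 0 even when the overall brightness varies across the picture.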
Step 2.3: character region extraction is based on a connected domain algorithm.
In order to remove as much noise as possible while retaining the effective information, pictures of type 1 and type 3 are first locally binarized, and then oversized or undersized connected domains are removed to eliminate part of the interference. Pictures of type 2 are binarized with a threshold of 0.25 and inverted so that the characters are represented in white, after which they are processed in the same way as the other types. Experiments show that the effect is better when the local binarization window is 16×16 and the area of a character's connected domain lies in [500, 5000]. The processing results are shown in fig. 13.
As can be seen from fig. 13, some invalid regions are still not eliminated after the above processing, so the present invention further deletes invalid regions by judging whether each connected domain matches the height, width and aspect ratio of a standard character. According to statistics, the height of a character lies in [90, 110], the width lies in [10, 60], and the aspect ratio is not less than 1. The results obtained are shown in fig. 14.
As can be seen from fig. 14, all invalid regions of the picture are eliminated, so each single character region can be cut out of the original picture at the position corresponding to its connected domain in the binary picture. To ensure consistent picture sizes, the present invention pads each picture and resizes it to a standard 28 × 28 picture. The final single character cutting result is shown in fig. 15.
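The connected domain filtering of step 2.3 can be sketched as below, using the area, height and width ranges quoted in the text. The sketch makes several assumptions that the text does not state: regions are 4-connected, the aspect ratio is taken as height divided by width, and the surviving regions are sorted from left to right. The function names and the dictionary layout are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

Binary = List[List[int]]


def connected_regions(binary: Binary) -> List[Dict[str, int]]:
    """Label 4-connected foreground regions; return area and bounding box of each."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            area, top, bottom, left, right = 0, y0, y0, x0, x0
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append({"area": area, "top": top, "left": left,
                            "height": bottom - top + 1,
                            "width": right - left + 1})
    return regions


def keep_character_regions(regions: List[Dict[str, int]],
                           area: Tuple[int, int] = (500, 5000),
                           height: Tuple[int, int] = (90, 110),
                           width: Tuple[int, int] = (10, 60)) -> List[Dict[str, int]]:
    """Keep regions whose area, height, width and aspect ratio fit a character."""
    kept = [r for r in regions
            if area[0] <= r["area"] <= area[1]
            and height[0] <= r["height"] <= height[1]
            and width[0] <= r["width"] <= width[1]
            and r["height"] / r["width"] >= 1]  # assumed: ratio = height / width
    return sorted(kept, key=lambda r: r["left"])  # reading order, left to right
```

Each kept bounding box can then be used to crop the corresponding character from the original picture before padding and resizing it to 28 × 28.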
Step 3: character recognition.
A convolutional neural network, a deep learning algorithm, is employed for character recognition. The convolutional neural network for character recognition constructed in the present invention comprises two convolution layers, two pooling layers and one fully connected layer, as shown in fig. 16. The first convolution layer convolves the input character picture of size 28 × 28 with 6 different convolution kernels of size 5 × 5; after the first convolution, the original character picture becomes a 24 × 24 × 6 feature map. The first pooling layer uses a pooling function with a 2 × 2 sliding window to re-extract features from the result of the first convolution layer; after this pooling, the result becomes a 12 × 12 × 6 feature map. The second convolution layer uses 12 different convolution kernels of size 5 × 5 to re-extract features from the feature map of the pooling layer, and the extracted feature map has size 8 × 8 × 12. The second pooling layer pools the feature map produced by the second convolution layer, and the pooled feature map has size 4 × 4 × 12. The feature map after the second pooling operation is fed into the fully connected layer to obtain the feature vector of the character. Finally, the feature vectors are classified and mapped to the actual digits, completing the recognition of the time characters in the timestamp.
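The feature map sizes quoted above follow from simple arithmetic, sketched here. The sketch assumes unpadded convolutions with stride 1 and non-overlapping pooling windows, which is what the stated sizes imply; the function names are hypothetical.

```python
from typing import List, Tuple


def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output width of an unpadded convolution."""
    return (size - kernel) // stride + 1


def pool_out(size: int, window: int) -> int:
    """Output width of non-overlapping pooling."""
    return size // window


def feature_map_shapes(input_size: int = 28) -> List[Tuple[int, int, int]]:
    """Shapes after conv1 (6 kernels), pool1, conv2 (12 kernels) and pool2."""
    shapes = []
    size = conv_out(input_size, 5)
    shapes.append((size, size, 6))
    size = pool_out(size, 2)
    shapes.append((size, size, 6))
    size = conv_out(size, 5)
    shapes.append((size, size, 12))
    size = pool_out(size, 2)
    shapes.append((size, size, 12))
    return shapes


print(feature_map_shapes())
# [(24, 24, 6), (12, 12, 6), (8, 8, 12), (4, 4, 12)]
```

Flattening the final 4 × 4 × 12 map gives 192 values as input to the fully connected layer.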
In the invention, the training steps of the convolutional neural network for time character recognition are divided into the following 3 steps:
Step 3.1: labels are added to the single character pictures obtained from the chromosphere images, which serve as the data samples required for training the network.
Step 3.2: the sample data are integrated into a 28 × 28 × N matrix as the X vector of the input layer, where N is the number of character samples. The digit label corresponding to each 28 × 28 matrix in the X vector is taken as the Y vector of the input layer.
Step 3.3: the network is trained through forward propagation and backward propagation (BP), and the coefficients of the network are updated iteratively. A ReLU activation function and a max pooling function are adopted, finally yielding a network with high recognition accuracy. The input vectors X and Y are fed through 100 iterations.
The trained convolutional neural network is used to identify the single characters of the time information in the solar chromosphere film images. The identified characters are combined in order, matched to the file name of the original image, and automatically filled into an Excel table for later manual checking and database creation.
Step 4: manual date checking, that is, manually checking whether the automatically generated picture date information is wrong.
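Combining the recognized characters and recording them against the file name can be sketched as follows. Note the substitution: the text fills an Excel table, whereas this sketch writes CSV (which Excel can open) so that it needs only the Python standard library. The function names, column names and the file name `img_001.jpg` are hypothetical.

```python
import csv
import io
from typing import Iterable, List, Sequence, TextIO, Tuple


def build_rows(recognized: Iterable[Tuple[str, Sequence[int]]]) -> List[Tuple[str, str]]:
    """Join the recognized digits of each picture, in order, into one time string."""
    return [(name, "".join(str(d) for d in digits)) for name, digits in recognized]


def write_table(rows: List[Tuple[str, str]], stream: TextIO) -> None:
    """Write one line per picture: file name and recognized time."""
    writer = csv.writer(stream)
    writer.writerow(["file_name", "time"])
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_table(build_rows([("img_001.jpg", [0, 9, 3, 0])]), buffer)
    print(buffer.getvalue())
```

The time is kept as a string so that leading zeros such as "0930" are preserved.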
When the automatic identification of the time information such as "hour, minute" in the timestamp is completed, an important remaining step is to manually check the date information (year, month, day), referring to fig. 17. Since the shooting times are mostly continuous and the 24-hour clock is used, it is easy to determine whether the shooting date has changed. For example, if the first recognition result is "2359" and the second recognition result is "0000", the shooting date of the second picture is the shooting date of the first plus one day. Therefore, for the chromosphere images within a period of time, the shooting date of every picture can be calculated from the start date alone. The interface of fig. 17 is used to check the first few records of each day; if a date is wrong, only the shooting date of the first picture of that day needs to be modified, and the dates of the subsequent pictures are then updated automatically by a recurrence algorithm.
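The date recurrence can be sketched as below. It assumes that each recognized time is a four-digit "HHMM" string, that the pictures are listed in shooting order, and that the date advances by exactly one day whenever the time of day decreases; the function name and the example dates are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def assign_dates(start: date, times: List[str]) -> List[date]:
    """Give each picture a date, advancing one day when the clock wraps around."""
    current = start
    previous_minutes = None
    dates = []
    for hhmm in times:
        minutes = int(hhmm[:2]) * 60 + int(hhmm[2:])
        if previous_minutes is not None and minutes < previous_minutes:
            current += timedelta(days=1)  # time went backwards: a new day began
        previous_minutes = minutes
        dates.append(current)
    return dates


print(assign_dates(date(2000, 1, 31), ["2358", "2359", "0000", "0010"]))
```

A recurrence of this kind cannot detect days on which no observation was made, which is why the manual check of the first pictures of each day remains necessary.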
To perform date checking, the user opens the interface and fills in the path of the original pictures and the path of the corresponding Excel table. After the Open button is clicked, the program opens the pictures in turn according to the number of the first picture of each date in the Excel table and fills the date into the text box on the right. If the date recorded in the Excel table is correct and the previous picture belongs to the previous date, the user clicks the Next Day button directly to check the next date. If the date is wrong, the user finds the picture at which the date jumps by means of the Last and Next buttons, fills the correct date into the text box on the right, and clicks the Update button, whereupon the program automatically updates all subsequent dates. Checking proceeds in sequence until the program reaches the last date. One person needs about 10 minutes to complete the date checking of ten thousand pictures. In the invention, the time information such as "hour, minute" of the timestamp is extracted automatically by steps 1-3, so the time is mainly spent on step 4, the manual date check. In the traditional approach, where the "year, month, day, hour and minute" information is entered manually picture by picture, one person needs at least two days to enter the timestamp information of ten thousand pictures. The invention therefore offers significant benefits in improving timestamp entry efficiency and saving manpower.
Claims (7)
1. The sun film image timestamp information extraction method based on deep learning is characterized by comprising the following steps of:
step 1: locating and cutting out the timestamp information area in the solar chromosphere film image;
step 2: the single character segmentation, namely, further segmenting characters in the time stamp information area to obtain single characters;
step 3: character recognition, namely training a network by adopting a large number of samples, then recognizing the single characters obtained by segmentation in the step 2 by using the trained network, and integrating and storing recognition results;
the step1 comprises the following steps:
step 1.1, a sun sphere cutting step based on vertical projection:
the image is accumulated along the vertical direction to obtain a 1 × n vector; the size of the image is m × n, and the pixel value of the pixel in row i and column j is $f_{ij}(x,y)$; the projection in the vertical direction is then:
$$S_{1j}=\sum_{i=1}^{m} f_{ij}(x,y),\quad j=1,\dots,n$$
wherein $S_{1j}$ represents the sum of the pixel points in the j-th column of the image, and the size of $S_{1j}$ is 1 × n; the position of the solar spherical surface can be further judged by calculating the projection of the picture in the vertical direction; the part of the vector $S_{1j}$ in the range [400, 1800] is the projection result of the solar chromosphere part; since the sun is symmetrical, the position of the center of the sun can be located from the vector $S_{1j}$ alone, and the picture containing the solar spherical surface part is then removed according to the number of pixels occupied by the solar spherical surface in the vertical direction;
step 1.2, judging the position of the timestamp and correcting the overturn based on the variance:
the variance of a picture containing the timestamp is much greater than the variance of a picture not containing it; the picture in which the timestamp is located is judged on this basis; once that picture is known, flip correction is applied to it: if it is the left picture, it is rotated 90 degrees clockwise; otherwise, it is rotated 90 degrees anticlockwise;
step 1.3, finely dividing the timestamp character area based on a projection method:
for a picture of size m × n in which the pixel value of the pixel in row i and column j is $x_{ij}$, the projections in the horizontal direction and the vertical direction are respectively:
$$S_{i1}=\sum_{j=1}^{n} x_{ij},\qquad S_{1j}=\sum_{i=1}^{m} x_{ij}$$
wherein $S_{1j}$ represents the sum of the pixel points in the j-th column of the image, and the size of $S_{1j}$ is 1 × n; $S_{i1}$ represents the sum of the pixel points in the i-th row of the image, and the size of $S_{i1}$ is m × 1; by calculating the projections of the picture in the horizontal direction and the vertical direction, the specific position of the timestamp area can be further calculated, so that accurate segmentation of the picture is realized;
in the step 3, the character recognition process is as follows:
a convolutional neural network algorithm in deep learning is adopted for character recognition; the constructed convolutional neural network for character recognition comprises two convolution layers, two pooling layers and one fully connected layer; the first convolution layer convolves the input character picture of size 28 × 28 with 6 different convolution kernels of size 5 × 5, and after the first convolution the original character picture becomes a 24 × 24 × 6 feature map; the first pooling layer uses a pooling function with a 2 × 2 sliding window to re-extract features from the result of the first convolution layer, and after this pooling the result becomes a 12 × 12 × 6 feature map; the second convolution layer uses 12 different convolution kernels of size 5 × 5 to re-extract features from the feature map of the pooling layer, and the extracted feature map has size 8 × 8 × 12; the second pooling layer pools the feature map produced by the second convolution layer, and the pooled feature map has size 4 × 4 × 12; the feature map after the second pooling operation is input into the fully connected layer to obtain the feature vector of the character; finally, the feature vectors of the characters are classified and mapped to the actual digits to complete the recognition of the time characters in the timestamp; the trained convolutional neural network is used to identify the single characters of the time information in the solar chromosphere film images, and the identified characters are combined in order, matched to the file names of the original images, and filled into an Excel table for later manual checking and database creation.
2. The deep learning-based sun film image time stamp information extraction method of claim 1, wherein: based on the projection results of the pictures in the horizontal and vertical directions obtained in the step 1.3, cutting the pictures: in order to ensure that the continuity of the pictures is not damaged, taking the first point which is larger than the average value as a starting point, the last point which is larger than the average value as an end point, and reserving all images from the starting point to the end point region; assuming that the original image S has a size of m×n, the cut picture P has a size of m '×n', and the cutting formula is:
P=S(a:b,c:d),(a,c>1,b<m,d<n)
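wherein:
$$a=\min\{\,i \mid x_i>\bar{x}\,\},\quad b=\max\{\,i \mid x_i>\bar{x}\,\},\quad c=\min\{\,j \mid y_j>\bar{y}\,\},\quad d=\max\{\,j \mid y_j>\bar{y}\,\}$$
wherein S (a: b, c: d) represents rows a to b and columns c to d of the picture S; $x_i$ represents the value at point i of the horizontal projection, and $\bar{x}$ represents the average value of the points of the horizontal projection; $y_j$ represents the value at point j of the vertical projection, and $\bar{y}$ represents the average value of the points of the vertical projection; a and b are respectively the minimum and the maximum of the positions in the horizontal projection vector at which x is greater than $\bar{x}$, and c and d are respectively the minimum and the maximum of the positions in the vertical projection vector at which y is greater than $\bar{y}$.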
3. The deep learning-based sun film image time stamp information extraction method of claim 1, wherein: in the step 2, the process of single character segmentation is as follows:
first, the background of the picture is removed by a top hat operation; then noise is removed by a local binarization algorithm; finally, the character regions are extracted by a connected domain algorithm; the algorithm assumes by default that the characters are white; if the characters are black, no effective region can be extracted by the connected domain step, so when no effective region is found the algorithm returns to the local binarization step and inverts the colors of the locally binarized picture.
4. The deep learning-based sun film image time stamp information extraction method of claim 3, wherein:
background removal is carried out by a top hat operation, whose principle is to take the difference between the original image and the result of its opening operation; after the top hat operation, part of the background noise is eliminated from the picture and the characters in the image are highlighted.
5. The deep learning-based sun film image time stamp information extraction method of claim 3, wherein:
adopting a Sauvola local binarization algorithm to remove noise; the cutting of the characters can be completed only by extracting the connected domain conforming to the size of the characters.
6. The deep learning-based sun film image time stamp information extraction method of claim 3, wherein:
character region extraction is carried out by adopting a connected domain algorithm, and an invalid region is further deleted by judging whether each connected domain accords with the length, the width and the length-width ratio of a standard character; the height of the character is between [90,110], the width of the character is between [10,60], and the length-width ratio of the character is not less than 1;
according to the position of each connected domain in the binary image corresponding to the position in the original image, the single character areas in the original image can be cut out respectively; to ensure consistent picture sizes, the algorithm fills each picture separately and transforms its size into a standard picture of 28 x 28.
7. The deep learning-based sun film image time stamp information extraction method of claim 1, wherein: the method further comprises the step 4 of manually checking the date:
for the chromosphere images within a period of time, only the year, month and day of the first image need to be entered, and the shooting date of each picture can then be calculated automatically; occasionally there are dates in between on which no solar observation was performed, and the resulting erroneous automatically generated date information is corrected by manual checking.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910765276.1A CN110533030B (en) | 2019-08-19 | 2019-08-19 | Deep learning-based sun film image timestamp information extraction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110533030A CN110533030A (en) | 2019-12-03 |
CN110533030B true CN110533030B (en) | 2023-07-14 |
Family
ID=68663766
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910765276.1A Active CN110533030B (en) | 2019-08-19 | 2019-08-19 | Deep learning-based sun film image timestamp information extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110533030B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111898606B (en) * | 2020-05-19 | 2023-04-07 | 武汉东智科技股份有限公司 | Night imaging identification method for superimposing transparent time characters in video image |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6026177A (en) * | 1995-08-29 | 2000-02-15 | The Hong Kong University Of Science & Technology | Method for identifying a sequence of alphanumeric characters |
CN101751568A (en) * | 2008-12-12 | 2010-06-23 | 汉王科技股份有限公司 | ID No. locating and recognizing method |
CN105528606A (en) * | 2015-10-30 | 2016-04-27 | 小米科技有限责任公司 | Region identification method and device |
CN105528600A (en) * | 2015-10-30 | 2016-04-27 | 小米科技有限责任公司 | Region identification method and device |
WO2017020723A1 (en) * | 2015-08-04 | 2017-02-09 | 阿里巴巴集团控股有限公司 | Character segmentation method and device and electronic device |
CN106611174A (en) * | 2016-12-29 | 2017-05-03 | 成都数联铭品科技有限公司 | OCR recognition method for unusual fonts |
CN108921163A (en) * | 2018-06-08 | 2018-11-30 | 南京大学 | A kind of packaging coding detection method based on deep learning |
CN109359695A (en) * | 2018-10-26 | 2019-02-19 | 东莞理工学院 | A kind of computer vision 0-O recognition methods based on deep learning |
CN109784342A (en) * | 2019-01-24 | 2019-05-21 | 厦门商集网络科技有限责任公司 | A kind of OCR recognition methods and terminal based on deep learning model |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101246551A (en) * | 2008-03-07 | 2008-08-20 | 北京航空航天大学 | Fast license plate locating method |
CN102402686B (en) * | 2011-12-07 | 2016-04-27 | 北京云星宇交通科技股份有限公司 | A kind of registration number character dividing method based on connected domain analysis |
JP6268023B2 (en) * | 2014-03-31 | 2018-01-24 | 日本電産サンキョー株式会社 | Character recognition device and character cutout method thereof |
CN108734189A (en) * | 2017-04-20 | 2018-11-02 | 天津工业大学 | Vehicle License Plate Recognition System based on atmospherical scattering model and deep learning under thick fog weather |
CN107590498B (en) * | 2017-09-27 | 2020-09-01 | 哈尔滨工业大学 | Self-adaptive automobile instrument detection method based on character segmentation cascade two classifiers |
CN109657665B (en) * | 2018-10-31 | 2023-01-20 | 广东工业大学 | Invoice batch automatic identification system based on deep learning |
Non-Patent Citations (4)
Title |
---|
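Imagenet classification with deep convolutional neural networks; A. Krizhevsky, I. Sutskever, G. E. Hinton; Proceedings of the 25th International Conference on Neural Information Processing Systems; full text * |
Background extraction method for hand-drawn sunspot images based on SVM; 曾祥云, 郑胜 et al.; 《微型机与应用》; full text * |
Chip surface character detection and recognition system based on neural networks; 唐铭豆; 陶青川; 冯谦; 现代计算机(专业版) (09); full text * |
Research on a handwritten character segmentation method for hand-drawn sunspot drawings; 朱道远; 郑胜; 曾祥云; 徐高贵; 微型机与应用 (20); full text * |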
Also Published As
Publication number | Publication date |
---|---|
CN110533030A (en) | 2019-12-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||