CN111368840A - Certificate picture processing method and device - Google Patents
- Publication number
- CN111368840A
- Authority
- CN
- China
- Prior art keywords
- seal
- picture
- certificate
- stamp
- connected domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/273—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion removing elements interfering with the pattern to be recognised
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Probability & Statistics with Applications (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a method and a device for processing certificate pictures, and relates to the technical field of computers. One embodiment of the method comprises: obtaining a sample set comprising a plurality of samples; training a seal filtering model by adopting a Unet network based on the sample set; and carrying out seal filtering processing on the certificate picture containing the seal based on the seal filtering model. According to the embodiment, the seal in the certificate picture can be removed on the premise of not damaging the information in the certificate picture, so that the subsequent identification of the column in the certificate picture is facilitated.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for processing a certificate picture.
Background
When the certificate picture contains the seal, if the seal covers the column of the certificate, the seal added in the certificate picture needs to be removed before certificate identification. In the prior art, the position of a seal in a certificate picture is mainly determined by using target detection, and the seal on the certificate picture is wiped off according to the identified position of the seal before image identification.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
information in the certificate picture is often damaged, and the position of the stamp is blurred after image processing.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for processing a certificate picture, which can remove a stamp from the certificate picture without destroying information in the certificate picture, so as to facilitate subsequent identification of a field in the certificate picture.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method for certificate picture processing, including:
obtaining a sample set comprising a plurality of samples;
training a seal filtering model by adopting a Unet network based on the sample set;
and carrying out seal filtering processing on the certificate picture containing the seal based on the seal filtering model.
Optionally, obtaining a sample set comprising a plurality of samples comprises: acquiring a plurality of certificate pictures without a seal, and adding the seal to each certificate picture without the seal to obtain the certificate picture with the seal;
training a seal filtering model by adopting a Unet network based on the sample set comprises: training the seal filtering model with the certificate picture containing the seal as the model input and the certificate picture without the seal as the model output.
Optionally, adding a stamp to each certificate picture without a stamp to obtain a certificate picture with a stamp, including:
cropping a stamp picture from a stamp template picture with PS (Photoshop) software, and adding the stamp picture to each certificate picture without a stamp with an opencv program, to obtain the certificate picture with the stamp corresponding to each certificate picture without the stamp.
Optionally, after the seal filtering process is performed on the certificate picture containing the seal, the method further includes:
clustering all pixel points in the certificate picture to obtain background pixel points and foreground pixel points, and setting the pixel value of the background pixel points to be 0 and the pixel value of the foreground pixel points to be 1 to obtain a binary image;
determining each connected domain in the binary image and the position information of each connected domain by adopting a connected domain algorithm;
and determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain.
Optionally, clustering all pixel points in the certificate picture by adopting a K-means algorithm; the connected domain algorithm is as follows: the Two-Pass method or the Seed-Filling method.
Optionally, the certificate configuration information includes: the position information of each column in the certificate picture;
determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain comprises: for any connected domain, taking the column whose column position information matches the position information of that connected domain as the column of that connected domain.
Optionally, the certificate is an identity card.
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for processing a document picture, including:
a sample acquisition module for acquiring a sample set containing a plurality of samples;
the model training module is used for training a seal filtering model by adopting a Unet network based on the sample set;
and the seal filtering module is used for carrying out seal filtering processing on the certificate picture containing the seal based on the seal filtering model.
Optionally, the obtaining a sample set including a plurality of samples by the sample obtaining module includes: acquiring a plurality of certificate pictures without a seal, and adding the seal to each certificate picture without the seal to obtain the certificate picture with the seal;
the model training module training the seal filtering model by adopting the Unet network based on the sample set comprises: training the seal filtering model with the certificate picture containing the seal as the model input and the certificate picture without the seal as the model output.
Optionally, the sample obtaining module adding a seal to each certificate picture without a seal to obtain a certificate picture with a seal includes:
cropping a stamp picture from a stamp template picture with PS (Photoshop) software, and adding the stamp picture to each certificate picture without a stamp with an opencv program, to obtain the certificate picture with the stamp corresponding to each certificate picture without the stamp.
Optionally, the apparatus in the embodiment of the present invention further includes: a column identification module configured to perform the following after the seal filtering processing is performed on the certificate picture containing the seal:
clustering all pixel points in the certificate picture to obtain background pixel points and foreground pixel points, and setting the pixel value of the background pixel points to be 0 and the pixel value of the foreground pixel points to be 1 to obtain a binary image;
determining each connected domain in the binary image and the position information of each connected domain by adopting a connected domain algorithm;
and determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain.
Optionally, the column recognition module clusters all pixel points in the certificate picture by adopting a K-means algorithm; the connected domain algorithm is as follows: the Two-Pass method or the Seed-Filling method.
Optionally, the certificate configuration information includes: the position information of each column in the certificate picture;
the column identification module determining the column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain comprises: for any connected domain, taking the column whose column position information matches the position information of that connected domain as the column of that connected domain.
Optionally, the certificate is an identity card.
According to a third aspect of the embodiments of the present invention, there is provided an electronic device for certificate picture processing, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present invention.
According to a fourth aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method provided by the first aspect of embodiments of the present invention.
One embodiment of the above invention has the following advantages or benefits: the seal filtering model is trained through the Unet network, the seal filtering model obtained based on training is used for carrying out seal filtering processing on the certificate picture containing the seal, the seal in the certificate picture can be removed on the premise of not damaging information in the certificate picture, and subsequent identification of columns in the certificate picture is facilitated.
Further effects of the optional implementations mentioned above are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic flow chart of a method for processing a document picture in an alternative embodiment of the invention;
FIG. 2 is a schematic flow chart of the main process for performing stamp filtering processing according to an alternative embodiment of the present invention;
FIG. 3 is a schematic diagram of a Unet network architecture in an alternative embodiment of the present invention;
FIG. 4 is a schematic diagram of a main flow of field recognition according to an alternative embodiment of the present invention;
FIG. 5 is a schematic diagram of the main modules of an apparatus for credential picture processing in accordance with an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
According to one aspect of the embodiment of the invention, a method for processing a certificate picture is provided.
The certificate picture processing method of the embodiment of the invention may comprise only seal filtering processing of the certificate picture, or may comprise both seal filtering processing and field identification processing of the certificate picture.
Fig. 1 is a schematic main flow chart of a method for processing a certificate picture in an alternative embodiment of the present invention. As shown in Fig. 1, the method includes: step S101, acquiring a certificate picture containing a seal; step S102, carrying out seal filtering processing on the certificate picture containing the seal; and step S103, carrying out field identification processing on the certificate picture processed in step S102.
The main flow of the seal filtering processing is described below by way of example with reference to Fig. 2. As shown in Fig. 2, the flow includes step S201, step S202, and step S203.
Step S201, a sample set including a plurality of samples is obtained.
The samples are used for training the seal filtering model, and each sample comprises two pictures: certificate pictures without seals and certificate pictures with seals.
Optionally, obtaining a sample set comprising a plurality of samples comprises: obtaining a plurality of certificate pictures without stamps, and adding the stamps in each certificate picture without stamps to obtain the certificate picture with the stamps. Therefore, the consistency of the certificate picture containing the seal and the certificate picture without the seal in the same sample can be ensured, and the accuracy of model training is further improved.
Optionally, adding a stamp to each certificate picture without a stamp to obtain a certificate picture with a stamp includes: cropping a stamp picture from a stamp template picture with PS (Photoshop) software, and adding the stamp picture to each certificate picture without a stamp with an opencv (open source computer vision library) program, to obtain the certificate picture containing the stamp corresponding to each certificate picture without the stamp. Obtaining the stamp picture with PS software can improve the definition of the stamp picture. Adding the stamp picture to each certificate picture without a stamp with an opencv program can speed up sample acquisition.
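The sample construction described above can be sketched as follows. This is a minimal illustration in plain Python rather than the opencv program mentioned in the text; the function names add_stamp and make_sample, the pixel representation (rows of RGB tuples, and RGBA tuples for the stamp), and the alpha-blending rule are all assumptions made for the example and are not taken from the source.

```python
def add_stamp(certificate, stamp, top, left):
    """Return a new picture with `stamp` alpha-blended onto `certificate`.

    certificate: list of rows of (r, g, b) tuples, values 0-255.
    stamp: list of rows of (r, g, b, a) tuples; a = 0 is fully transparent.
    top, left: position of the stamp's upper-left corner (hypothetical parameters).
    """
    stamped = [list(row) for row in certificate]  # copy, so the clean picture is kept
    height, width = len(certificate), len(certificate[0])
    for dy, stamp_row in enumerate(stamp):
        for dx, (r, g, b, a) in enumerate(stamp_row):
            y, x = top + dy, left + dx
            if not (0 <= y < height and 0 <= x < width):
                continue  # clip stamp pixels that fall outside the picture
            alpha = a / 255.0
            base = stamped[y][x]
            stamped[y][x] = tuple(
                round(alpha * s + (1.0 - alpha) * c)
                for s, c in zip((r, g, b), base)
            )
    return stamped


def make_sample(certificate, stamp, top, left):
    """Build one sample: (picture with seal as model input, picture without seal as target)."""
    return add_stamp(certificate, stamp, top, left), certificate
```

Because the stamped picture is derived from the clean one, the two pictures in a sample differ only where the stamp was drawn, which is the consistency property the text relies on.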
Step S202, training a seal filtering model by adopting a Unet network (a segmentation network based on fully convolutional networks, FCNs) based on the sample set. Training the seal filtering model with the Unet network gives good accuracy.
Optionally, training the seal filtering model by using the Unet network includes: training the seal filtering model with the certificate picture containing the seal as the model input and the certificate picture without the seal as the model output. In this embodiment, the certificate picture with the seal is fed to the input end of the model and the certificate picture without the seal is used as the learning target, which can improve the accuracy of model training.
The specific structure of the Unet network may be set according to actual conditions and is not limited in the embodiment of the present invention. In the alternative embodiment shown in Fig. 3, the Unet network is described with the Keras deep learning framework (a high-level neural network API), and the operators and code are, for example, as follows:
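from keras.layers import (Input, Conv2D, Conv2DTranspose, BatchNormalization,
                          Activation, Dropout, MaxPooling2D, concatenate)

# Conv2d_BN and Conv2dT_BN are not defined in the listing. The two helpers
# below are assumed definitions (convolution, or stride-2 transposed
# convolution, followed by batch normalization and ReLU), for illustration only.
def Conv2d_BN(x, filters, kernel_size, strides=(1, 1), padding='same'):
    x = Conv2D(filters, kernel_size, strides=strides, padding=padding)(x)
    x = BatchNormalization(axis=3)(x)
    return Activation('relu')(x)

def Conv2dT_BN(x, filters, kernel_size, strides=(2, 2), padding='same'):
    x = Conv2DTranspose(filters, kernel_size, strides=strides, padding=padding)(x)
    x = BatchNormalization(axis=3)(x)
    return Activation('relu')(x)

# input_size_1 and input_size_2 (picture height and width) are set by the caller.
# Encoder: two convolutions per level, then 2x2 max pooling.
inpt = Input(shape=(input_size_1, input_size_2, 3))
conv1 = Conv2d_BN(inpt, 8, (3, 3))
conv1 = Conv2d_BN(conv1, 8, (3, 3))
pool1 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same')(conv1)
conv2 = Conv2d_BN(pool1, 16, (3, 3))
conv2 = Conv2d_BN(conv2, 16, (3, 3))
pool2 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same')(conv2)
conv3 = Conv2d_BN(pool2, 32, (3, 3))
conv3 = Conv2d_BN(conv3, 32, (3, 3))
pool3 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same')(conv3)
conv4 = Conv2d_BN(pool3, 64, (3, 3))
conv4 = Conv2d_BN(conv4, 64, (3, 3))
pool4 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same')(conv4)
# Bottleneck
conv5 = Conv2d_BN(pool4, 128, (3, 3))
#conv5 = Dropout(0.1)(conv5)
conv5 = Conv2d_BN(conv5, 128, (3, 3))
#conv5 = Dropout(0.1)(conv5)
# Decoder: upsample, concatenate with the matching encoder level, convolve twice.
convt1 = Conv2dT_BN(conv5, 64, (3, 3))
concat1 = concatenate([conv4, convt1], axis=3)
#concat1 = Dropout(0.1)(concat1)
conv6 = Conv2d_BN(concat1, 64, (3, 3))
conv6 = Conv2d_BN(conv6, 64, (3, 3))
convt2 = Conv2dT_BN(conv6, 32, (3, 3))
concat2 = concatenate([conv3, convt2], axis=3)
#concat2 = Dropout(0.1)(concat2)
conv7 = Conv2d_BN(concat2, 32, (3, 3))
conv7 = Conv2d_BN(conv7, 32, (3, 3))
convt3 = Conv2dT_BN(conv7, 16, (3, 3))
concat3 = concatenate([conv2, convt3], axis=3)
#concat3 = Dropout(0.1)(concat3)
conv8 = Conv2d_BN(concat3, 16, (3, 3))
conv8 = Conv2d_BN(conv8, 16, (3, 3))
convt4 = Conv2dT_BN(conv8, 8, (3, 3))
concat4 = concatenate([conv1, convt4], axis=3)
#concat4 = Dropout(0.1)(concat4)
conv9 = Conv2d_BN(concat4, 8, (3, 3))
conv9 = Conv2d_BN(conv9, 8, (3, 3))
#conv9 = Dropout(0.1)(conv9)
# Output: 1x1 convolution to 3 channels, sigmoid keeps values in [0, 1].
outpt = Conv2D(filters=3, kernel_size=(1, 1), strides=(1, 1), padding='same', activation='sigmoid')(conv9)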
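One practical consequence of the structure listed above is a constraint on the input size. The sketch below traces feature-map sizes through the four pooling levels and the four upsampling levels to check whether each concatenate call would receive tensors of equal height and width. It assumes that Conv2dT_BN upsamples by a factor of 2 (its definition is not given in the source); the function names unet_shapes and is_compatible are hypothetical.

```python
import math


def unet_shapes(height, width, depth=4):
    """Trace feature-map sizes through the encoder and decoder.

    Returns (skips, upsampled), ordered so that skips[i] is the encoder
    output concatenated with upsampled[i] (conv4 first, conv1 last).
    """
    encoder = []
    size = (height, width)
    for _ in range(depth):
        encoder.append(size)  # size of conv1..conv4 before pooling
        # MaxPooling2D with pool 2, stride 2, padding='same' yields ceil(n / 2)
        size = (math.ceil(size[0] / 2), math.ceil(size[1] / 2))
    upsampled = []
    for _ in range(depth):
        size = (size[0] * 2, size[1] * 2)  # assumed stride-2 transposed convolution
        upsampled.append(size)
    skips = list(reversed(encoder))
    return skips, upsampled


def is_compatible(height, width, depth=4):
    """True if every skip connection matches its upsampled counterpart."""
    skips, upsampled = unet_shapes(height, width, depth)
    return skips == upsampled
```

Under this assumption, the sizes match only when input_size_1 and input_size_2 are multiples of 16, so a certificate picture of another size would need to be resized or padded before being fed to the model.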
Step S203, performing seal filtering processing on the certificate picture containing the seal based on the seal filtering model. Because the seal filtering processing uses the seal filtering model trained in step S202, the seal in the certificate picture can be removed without damaging the information in the certificate picture, which facilitates subsequent identification of the fields in the certificate picture.
In an optional embodiment, field identification processing may further be performed after the seal filtering processing. A field (also rendered as "column" in this text) is an information item in the certificate. The field identification processing identifies the information of each field in the certificate picture.
The main flow of the field recognition processing is exemplarily described below with reference to fig. 4. As shown in fig. 4, after the seal filtering process is performed on the certificate picture containing the seal, the process of performing the field identification process includes:
step S401, clustering all pixel points in the certificate picture to obtain background pixel points and foreground pixel points, and setting the pixel value of the background pixel points to be 0 and the pixel value of the foreground pixel points to be 1 to obtain a binary image;
step S402, determining each connected domain in the binary image and the position information of each connected domain by adopting a connected domain algorithm;
step S403, determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain.
The process of dividing a collection of physical or abstract objects into classes composed of similar objects is called clustering. Through clustering and binarization processing, the situation that identification cannot be carried out or the identification accuracy is low due to overexposure of the certificate picture can be avoided.
The clustering method can be chosen according to the actual situation, for example the K-medoids algorithm or the CLARANS algorithm. Optionally, a K-means algorithm is adopted to cluster all pixel points in the certificate picture. The K-means algorithm comprises the following steps: (1) selecting k samples as the initial clustering centers; (2) calculating the distance from each sample in the data set to the k clustering centers and assigning the sample to the class of the nearest clustering center; (3) for each class, recalculating its clustering center (i.e., the centroid of all samples belonging to the class); (4) repeating steps (2) and (3) until a stopping condition is reached (number of iterations, minimum error change, etc.). The K-means algorithm clusters well, scales well to large data sets, and has low algorithmic complexity.
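The four K-means steps above, applied to binarization, can be sketched as follows. This is a simplified illustration that clusters grayscale intensities with k = 2; the function name kmeans_binarize, the choice of the darkest and brightest intensities as initial centers, and the rule that the darker cluster is the foreground (text) are assumptions made for the example, not details given in the source.

```python
def kmeans_binarize(gray, max_iter=20):
    """Cluster grayscale pixels into two classes and return a 0/1 image.

    gray: list of rows of intensities (0-255).
    Returns an image of the same shape: 1 = foreground, 0 = background.
    """
    pixels = [v for row in gray for v in row]
    # (1) initial clustering centers: darkest and brightest intensity
    centers = [float(min(pixels)), float(max(pixels))]
    for _ in range(max_iter):
        # (2) assign each pixel to the nearest center
        groups = [[], []]
        for v in pixels:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # (3) recompute each center as the mean of its members
        new_centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
        # (4) stop once the centers no longer move
        if new_centers == centers:
            break
        centers = new_centers
    dark, bright = min(centers), max(centers)
    # Assumption: the darker cluster is the foreground (printed text).
    return [
        [1 if abs(v - dark) <= abs(v - bright) else 0 for v in row]
        for row in gray
    ]
```

Because the split point is derived from the picture's own intensity distribution rather than a fixed threshold, a uniformly brightened (overexposed) picture can still be separated into text and background.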
A connected domain (Connected Component) generally refers to an image region (Blob) composed of adjacent foreground pixels that have the same pixel value. A connected domain algorithm is an algorithm for determining connected domains. The connected domain algorithm may be chosen according to the actual situation; optionally, it is the Two-Pass method or the Seed-Filling method. The connected domain algorithm can quickly determine each connected domain in each certificate picture. Each connected domain corresponds to one field.
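A minimal sketch of the Two-Pass method is given below. It labels 4-connected foreground regions with a union-find structure and returns a bounding box per region as the position information. The function name two_pass_components, the use of 4-connectivity, and the bounding-box representation (top, left, bottom, right) are assumptions for illustration; the source does not specify them.

```python
def two_pass_components(binary):
    """Label 4-connected foreground regions of a 0/1 image.

    Returns (labels, boxes): a label image and a dict mapping each label
    to its bounding box (top, left, bottom, right), inclusive.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # union-find forest; index 0 is reserved for background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # First pass: assign provisional labels and record label equivalences.
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            neighbours = [n for n in (up, left) if n]
            if not neighbours:
                parent.append(len(parent))  # create a new label
                labels[y][x] = len(parent) - 1
            else:
                smallest = min(neighbours)
                labels[y][x] = smallest
                for n in neighbours:
                    ra, rb = find(smallest), find(n)
                    if ra != rb:
                        parent[max(ra, rb)] = min(ra, rb)

    # Second pass: replace each label by its root and collect bounding boxes.
    boxes = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = root
                t, l, b, r = boxes.get(root, (y, x, y, x))
                boxes[root] = (min(t, y), min(l, x), max(b, y), max(r, x))
    return labels, boxes
```

The Seed-Filling method would reach the same regions by flood-filling from each unlabeled foreground pixel; the Two-Pass form is shown here only because it visits the picture in two plain raster scans.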
A field generally includes a field name and field content. Taking the name field in the identity card as an example, "name" is the field name and the "XXX" following it is the field content; taking the gender field in the identity card as an example, "gender" is the field name and "male" or "female" is the field content. When determining each connected domain, the embodiment of the invention may acquire a connected domain comprising both the field name and the field content, or may acquire a connected domain comprising only the field content.
The certificate configuration information is information describing the position of each field in the certificate picture. Taking the front picture of the identity card as an example, the configuration information of the identity card can be as follows: the first line is the name field, the left side of the second line is the gender field, the right side of the second line is the ethnicity field, the left side of the third line is the birth year field, the middle of the third line is the birth month field, the right side of the third line is the birth date field, the fourth line is the address field, and the fifth line is the citizen identification number field. The field identification method of the embodiment of the invention is particularly suitable for certificates with a fixed format, such as identity cards, household registration booklets, bank cards, passports, license plates and the like.
Optionally, the certificate configuration information includes: the position information of each field in the certificate picture. Determining the field corresponding to each connected domain according to the preset certificate configuration information and the position information of each connected domain includes: for any connected domain, taking the field whose position information matches the position information of that connected domain as the field of that connected domain.
For example, with the certificate configuration information described above, the connected domain in the first line may be taken as the name field, the connected domain on the left side of the second line as the gender field, the connected domain on the right side of the second line as the ethnicity field, the connected domain on the left side of the third line as the birth year field, the connected domain in the middle of the third line as the birth month field, the connected domain on the right side of the third line as the birth date field, the connected domain in the fourth line as the address field, and the connected domain in the fifth line as the citizen identification number field. Subsequent analysis can then be performed on the identified fields, for example analyzing the content of each field.
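The position matching described above can be sketched as follows. The source does not say how a connected domain's position is compared with the configured field positions; this example assumes both are rectangles given as (top, left, bottom, right) and matches each connected domain to the field region it overlaps most. The function names overlap_area and assign_fields and the example coordinates are hypothetical.

```python
def overlap_area(a, b):
    """Area shared by two boxes given as (top, left, bottom, right), inclusive."""
    height = min(a[2], b[2]) - max(a[0], b[0]) + 1
    width = min(a[3], b[3]) - max(a[1], b[1]) + 1
    return max(height, 0) * max(width, 0)


def assign_fields(component_boxes, field_regions):
    """Map each field name to the connected-domain boxes that fall in its region.

    component_boxes: iterable of connected-domain bounding boxes.
    field_regions: dict of field name -> configured region (the certificate
        configuration information, in an assumed rectangle form).
    Boxes that overlap no configured region are left unassigned.
    """
    assigned = {}
    for box in component_boxes:
        best_name, best_area = None, 0
        for name, region in field_regions.items():
            area = overlap_area(box, region)
            if area > best_area:
                best_name, best_area = name, area
        if best_name is not None:
            assigned.setdefault(best_name, []).append(box)
    return assigned


# Hypothetical configuration for part of an identity card front, in pixels.
ID_CARD_FRONT = {
    "name": (10, 20, 39, 199),
    "gender": (50, 20, 79, 99),
    "ethnicity": (50, 110, 79, 199),
}
```

For instance, assign_fields([(12, 30, 35, 90)], ID_CARD_FRONT) places that connected domain under "name", since it overlaps only the first-line region.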
According to the field identification method provided by the embodiment of the invention, all pixel points in the certificate picture are clustered, and field identification is carried out according to the binary picture obtained by clustering, so that the situation that the field identification is inaccurate or can not be identified due to higher exposure of the picture can be avoided, and the accuracy of certificate field identification is improved.
The certificate of the embodiment of the invention can be an identity card, a household registration booklet, a bank card, a passport, a license plate and the like. The certificate picture refers to a photocopy, a scan, or the like of the certificate.
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for implementing the above method.
Fig. 5 is a schematic diagram of main modules of an apparatus for processing a document picture according to an embodiment of the present invention, and as shown in fig. 5, the apparatus 500 for processing a document picture includes:
a sample acquiring module 501, configured to acquire a sample set including a plurality of samples;
a model training module 502 for training a seal filtering model by using a Unet network based on the sample set;
and the seal filtering module 503 is used for performing seal filtering processing on the certificate picture containing the seal based on the seal filtering model.
Optionally, the obtaining a sample set including a plurality of samples by the sample obtaining module includes: acquiring a plurality of certificate pictures without a seal, and adding the seal to each certificate picture without the seal to obtain the certificate picture with the seal;
the model training module training the seal filtering model by adopting the Unet network based on the sample set comprises: training the seal filtering model with the certificate picture containing the seal as the model input and the certificate picture without the seal as the model output.
Optionally, the sample obtaining module adding a seal to each certificate picture without a seal to obtain a certificate picture with a seal includes:
cropping a stamp picture from a stamp template picture with PS (Photoshop) software, and adding the stamp picture to each certificate picture without a stamp with an opencv program, to obtain the certificate picture with the stamp corresponding to each certificate picture without the stamp.
Optionally, the apparatus in the embodiment of the present invention further includes: a column identification module configured to perform the following after the seal filtering processing is performed on the certificate picture containing the seal:
clustering all pixel points in the certificate picture to obtain background pixel points and foreground pixel points, and setting the pixel value of the background pixel points to be 0 and the pixel value of the foreground pixel points to be 1 to obtain a binary image;
determining each connected domain in the binary image and the position information of each connected domain by adopting a connected domain algorithm;
and determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain.
Optionally, the column recognition module clusters all pixel points in the certificate picture by adopting a K-means algorithm; the connected domain algorithm is as follows: the Two-Pass method or the Seed-Filling method.
Optionally, the certificate configuration information includes: the position information of each column in the certificate picture;
the column identification module determining the column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain comprises: for any connected domain, taking the column whose column position information matches the position information of that connected domain as the column of that connected domain.
Optionally, the certificate is an identity card.
According to a third aspect of the embodiments of the present invention, there is provided an electronic device for certificate picture processing, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present invention.
According to a fourth aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method provided by the first aspect of embodiments of the present invention.
Fig. 6 illustrates an exemplary system architecture 600 to which the method or apparatus for certificate picture processing of the embodiments of the present invention can be applied.
As shown in Fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves as a medium for providing communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. The terminal devices 601, 602, 603 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603. The background management server may analyze and otherwise process received data such as a stamp filtering request or a field identification request, and feed back a processing result (for example, a certificate picture with the stamp removed, or field identification result information; these are examples only) to the terminal device.
It should be noted that the method for processing the certificate picture provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the device for processing the certificate picture is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising: a sample acquisition module for acquiring a sample set containing a plurality of samples; a model training module for training a seal filtering model by adopting a Unet network based on the sample set; and a seal filtering module for performing seal filtering processing on the certificate picture containing the seal based on the seal filtering model. The names of these modules do not, under certain circumstances, limit the modules themselves; for example, the seal filtering module may also be described as a "module for performing seal filtering processing on a certificate picture containing a seal based on the seal filtering model".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform: obtaining a sample set comprising a plurality of samples; training a seal filtering model by adopting a Unet network based on the sample set; and carrying out seal filtering processing on the certificate picture containing the seal based on the seal filtering model.
According to the technical solution of the embodiments of the present invention, a seal filtering model is trained by adopting a Unet network, and seal filtering processing is performed on the certificate picture containing the seal based on the trained seal filtering model. In this way, the seal can be removed from the certificate picture without damaging the information in the certificate picture, which facilitates subsequent identification of the columns in the certificate picture.
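The construction of training samples (adding a seal to each seal-free certificate picture, then using the picture with the seal as the model input and the original picture as the model output) can be sketched as follows. This is a minimal, library-free illustration under stated assumptions, not the implementation of the embodiments, which use an opencv program for this step. Pictures are represented as nested lists of RGB tuples, and the names `add_stamp` and `make_pair` and the per-pixel `alpha` mask are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def add_stamp(clean: Image, stamp: Image, alpha: List[List[float]],
              top: int, left: int) -> Image:
    """Return a copy of `clean` with `stamp` alpha-blended at (top, left).

    alpha[y][x] is the stamp opacity in [0, 1]; stamp pixels that fall
    outside the picture are ignored. `clean` itself is not modified.
    """
    out = [row[:] for row in clean]
    for y, stamp_row in enumerate(stamp):
        for x, stamp_pixel in enumerate(stamp_row):
            ty, tx = top + y, left + x
            if 0 <= ty < len(out) and 0 <= tx < len(out[ty]):
                a = alpha[y][x]
                out[ty][tx] = tuple(
                    round(a * s + (1 - a) * c)
                    for s, c in zip(stamp_pixel, out[ty][tx])
                )
    return out


def make_pair(clean: Image, stamp: Image, alpha: List[List[float]],
              top: int, left: int) -> Tuple[Image, Image]:
    """Build one training sample: (model input, expected model output)."""
    return add_stamp(clean, stamp, alpha, top, left), clean
```

Varying `top`, `left` and the opacity mask across samples is one possible way to diversify the sample set; the source does not specify how seal positions are chosen.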
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (16)
1. A method of certificate picture processing, comprising:
obtaining a sample set comprising a plurality of samples;
training a seal filtering model by adopting a Unet network based on the sample set;
and carrying out seal filtering processing on the certificate picture containing the seal based on the seal filtering model.
2. The method of claim 1, wherein obtaining a sample set comprising a plurality of samples comprises: acquiring a plurality of certificate pictures without a seal, and adding the seal to each certificate picture without the seal to obtain the certificate picture with the seal;
and training a seal filtering model by adopting a Unet network based on the sample set comprises: training the seal filtering model by taking the certificate picture containing the seal as the model input and taking the corresponding certificate picture without the seal as the model output.
3. The method according to claim 1, wherein adding a stamp to each of the document pictures without a stamp to obtain a document picture with a stamp comprises:
using PS software to crop a stamp picture from a stamp template picture, and using an opencv program to add the stamp picture to each certificate picture without a stamp, to obtain the certificate picture with the stamp corresponding to each certificate picture without the stamp.
4. The method of claim 1, wherein after the stamp filtering process is performed on the document image containing the stamp, the method further comprises:
clustering all pixel points in the certificate picture to obtain background pixel points and foreground pixel points, and setting the pixel value of the background pixel points to be 0 and the pixel value of the foreground pixel points to be 1 to obtain a binary image;
determining each connected domain in the binary image and the position information of each connected domain by adopting a connected domain algorithm;
and determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain.
5. The method of claim 4, wherein all pixel points in the certificate picture are clustered using a K-means algorithm; and the connected domain algorithm is the Two-Pass method or the Seed-Filling method.
6. The method of claim 4, wherein the certificate configuration information comprises: the position information of each column in the certificate picture;
and determining a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain comprises: for any connected domain, taking the column whose column position information matches the position information of that connected domain as the column of that connected domain.
7. The method of any of claims 1-6, wherein the certificate is an identification card.
8. An apparatus for certificate picture processing, comprising:
a sample acquisition module for acquiring a sample set containing a plurality of samples;
the model training module is used for training a seal filtering model by adopting a Unet network based on the sample set;
and the seal filtering module is used for carrying out seal filtering processing on the certificate picture containing the seal based on the seal filtering model.
9. The apparatus of claim 8, wherein the sample acquisition module acquires a sample set comprising a plurality of samples, comprising: acquiring a plurality of certificate pictures without a seal, and adding the seal to each certificate picture without the seal to obtain the certificate picture with the seal;
and the model training module training the seal filtering model by adopting the Unet network based on the sample set comprises: training the seal filtering model by taking the certificate picture containing the seal as the model input and taking the corresponding certificate picture without the seal as the model output.
10. The apparatus of claim 8, wherein the sample acquisition module adds a stamp to each of the non-stamp containing document pictures to obtain a stamp containing document picture, comprising:
using PS software to crop a stamp picture from a stamp template picture, and using an opencv program to add the stamp picture to each certificate picture without a stamp, to obtain the certificate picture with the stamp corresponding to each certificate picture without the stamp.
11. The apparatus of claim 8, further comprising a column identification module configured to, after the seal filtering processing is performed on the certificate picture containing the seal:
cluster all pixel points in the certificate picture to obtain background pixel points and foreground pixel points, and set the pixel value of the background pixel points to 0 and the pixel value of the foreground pixel points to 1 to obtain a binary image;
determine each connected domain in the binary image and the position information of each connected domain by adopting a connected domain algorithm;
and determine a column corresponding to each connected domain according to preset certificate configuration information and the position information of each connected domain.
12. The apparatus of claim 11, wherein the column identification module clusters all pixel points in the certificate picture using a K-means algorithm; and the connected domain algorithm is the Two-Pass method or the Seed-Filling method.
13. The apparatus of claim 11, wherein the certificate configuration information comprises: the position information of each column in the certificate picture;
and the column identification module determining the column corresponding to each connected domain according to the preset certificate configuration information and the position information of each connected domain comprises: for any connected domain, taking the column whose column position information matches the position information of that connected domain as the column of that connected domain.
14. The apparatus of any of claims 8-13, wherein the certificate is an identification card.
15. An electronic device for certificate picture processing, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
16. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
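As an illustration of the post-processing recited in claims 4 and 5, the following is a minimal sketch, not the claimed implementation. It clusters grayscale pixel values into two groups with K-means and then labels connected domains with a Two-Pass scan. The function names, the grayscale input, the assumption that the darker cluster is the foreground (dark text on a light background), and the use of 4-connectivity are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def kmeans_binarize(gray: List[List[int]], iters: int = 20) -> List[List[int]]:
    """Two-cluster K-means on intensities: darker cluster -> 1, lighter -> 0."""
    pixels = [v for row in gray for v in row]
    lo, hi = float(min(pixels)), float(max(pixels))  # initial cluster centres
    if lo == hi:  # uniform picture: treat everything as background
        return [[0] * len(row) for row in gray]
    for _ in range(iters):
        dark = [v for v in pixels if abs(v - lo) <= abs(v - hi)]
        light = [v for v in pixels if abs(v - lo) > abs(v - hi)]
        new_lo = sum(dark) / len(dark) if dark else lo
        new_hi = sum(light) / len(light) if light else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return [[1 if abs(v - lo) <= abs(v - hi) else 0 for v in row] for row in gray]


def two_pass_label(binary: List[List[int]]) -> Dict[int, Box]:
    """4-connected Two-Pass labelling; returns one bounding box per domain."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # union-find forest over provisional labels (0 is unused)

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # First pass: assign provisional labels and record label equivalences.
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if not up and not left:
                parent.append(len(parent))
                labels[y][x] = len(parent) - 1
            else:
                labels[y][x] = min(l for l in (up, left) if l)
                if up and left:
                    root_a, root_b = find(up), find(left)
                    parent[max(root_a, root_b)] = min(root_a, root_b)

    # Second pass: resolve each label to its root and grow the bounding boxes.
    boxes: Dict[int, Box] = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                l, t, r, b = boxes.get(root, (x, y, x + 1, y + 1))
                boxes[root] = (min(l, x), min(t, y), max(r, x + 1), max(b, y + 1))
    return boxes
```

The bounding box of each connected domain serves here as its position information; other representations, such as a centroid, would be equally compatible with the claim wording.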
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010104574.9A CN111368840A (en) | 2020-02-20 | 2020-02-20 | Certificate picture processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111368840A (en) | 2020-07-03 |
Family
ID=71211484
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010104574.9A Pending CN111368840A (en) | 2020-02-20 | 2020-02-20 | Certificate picture processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111368840A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016154314A (en) * | 2015-02-20 | 2016-08-25 | シャープ株式会社 | Image processing apparatus, television receiver, control method, program, and recording medium |
CN108171237A (en) * | 2017-12-08 | 2018-06-15 | 众安信息技术服务有限公司 | A kind of line of text image individual character cutting method and device |
CN109308476A (en) * | 2018-09-06 | 2019-02-05 | 邬国锐 | Billing information processing method, system and computer readable storage medium |
CN109886974A (en) * | 2019-01-28 | 2019-06-14 | 北京易道博识科技有限公司 | A kind of seal minimizing technology |
CN109977935A (en) * | 2019-02-27 | 2019-07-05 | 平安科技(深圳)有限公司 | A kind of text recognition method and device |
CN110619642A (en) * | 2019-09-05 | 2019-12-27 | 四川大学 | Method for separating seal and background characters in bill image |
Non-Patent Citations (3)
Title |
---|
凌蓝风: "《CCF2019-基于OCR的身份证识别(三)印章去除 - 知乎》", 19 December 2019 * |
凌蓝风: "《CCF2019-基于OCR的身份证识别(四)文字定位》", 8 December 2019 * |
易显维: "《CCF2019-基于OCR的身份证识别方案文档》", 13 December 2019 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116129456A (en) * | 2023-02-09 | 2023-05-16 | 广西壮族自治区自然资源遥感院 | Method and system for identifying and inputting property rights and interests information |
CN116129456B (en) * | 2023-02-09 | 2023-07-25 | 广西壮族自治区自然资源遥感院 | Method and system for identifying and inputting property rights and interests information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20220920. Address after: 25 Financial Street, Xicheng District, Beijing 100033. Applicant after: CHINA CONSTRUCTION BANK Corp. Address before: 25 Financial Street, Xicheng District, Beijing 100033. Applicant before: CHINA CONSTRUCTION BANK Corp.; Jianxin Financial Science and Technology Co.,Ltd. |