CN113255807B - Face analysis model training method, electronic device and storage medium - Google Patents
- Publication number
- CN113255807B (application CN202110620497.7A)
- Authority
- CN
- China
- Prior art keywords
- face
- picture
- training
- enhanced
- analysis model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The embodiment of the invention relates to the field of data processing, and discloses a face analysis model training method, an electronic device and a storage medium. The face analysis model training method comprises the following steps: acquiring a picture training set, wherein the picture training set comprises unlabeled face pictures; performing data enhancement on the unlabeled face pictures to obtain an enhanced picture training set, wherein the enhanced picture training set comprises the unlabeled face pictures, enhanced face pictures and the transformation parameters corresponding to the enhanced face pictures; and training a preset pre-training model with the unlabeled face pictures, the enhanced face pictures and the transformation parameters in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model. The invention can reduce the training cost of the face analysis model and improve its generalization.
Description
Technical Field
The embodiment of the invention relates to the field of data processing, and in particular to a face analysis model training method, an electronic device and a storage medium.
Background
With the rapid development of deep learning and hardware technology in recent years, face analysis has also begun to rely on deep-learning models. Current deep-learning-based face analysis methods learn an image-to-image face analysis model from a large number of labeled training pictures.
However, existing training methods for face analysis models depend on a large number of labeled pictures, and manual labeling is expensive, so the training cost of a face analysis model is high. Moreover, the publicly available labeled picture sets are based mainly on European and American face data, which differ considerably from the face pictures encountered in practice in both ethnicity and picture style, so the generalization of a face analysis model trained on them is poor.
Disclosure of Invention
The embodiment of the invention aims to provide a face analysis model training method, an electronic device and a storage medium that train a face analysis model with unlabeled face pictures, enhanced face pictures and transformation parameters, thereby reducing the training cost of the face analysis model and improving its generalization.
In order to solve the above technical problem, an embodiment of the present invention provides a face analysis model training method, including: acquiring a picture training set, wherein the picture training set comprises unlabeled face pictures; performing data enhancement on the unlabeled face pictures to obtain an enhanced picture training set, wherein the enhanced picture training set comprises the unlabeled face pictures, enhanced face pictures and the transformation parameters corresponding to the enhanced face pictures; and training a preset pre-training model with the unlabeled face pictures, the enhanced face pictures and the transformation parameters in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model.
An embodiment of the present invention further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above-described face analysis model training method.
The embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above face analysis model training method.
Compared with the prior art, in the process of training the face analysis model, the embodiment of the invention obtains enhanced face pictures and transformation parameters by processing unlabeled pictures, and trains the face analysis network with the unlabeled pictures, the enhanced face pictures and the transformation parameters. This reduces the training cost of the face analysis model and improves its generalization, solving the prior-art problems of poor generalization and the high cost of labeling new training data that arise from training face analysis models mainly on labeled pictures of European and American face data.
In addition, before performing data enhancement on the unlabeled face pictures, the face analysis model training method according to the embodiment of the present invention further includes: cutting the unlabeled face pictures to a preset size. This avoids the large amount of computation caused by overly large irrelevant areas in the unlabeled face pictures, reducing the data-processing load and improving face analysis efficiency.
In addition, according to the face analysis model training method provided by the embodiment of the invention, the picture training set comprises the face key points of the unlabeled face pictures, and the method further comprises: training the face analysis model with the unlabeled face pictures based on a predefined key point loss function, wherein the key point loss function is obtained from the face key points and the face key point prediction results of the unlabeled face pictures. This gives the face analysis model the ability to predict the key points of a face picture, improving its practicability.
In addition, the face analysis model training method provided by the embodiment of the present invention further includes: acquiring a test picture set comprising test pictures; processing the test picture set with the trained face analysis model to obtain the face prediction results and the face key point prediction results of the test picture set; obtaining the analysis performance of the trained face analysis model from the face prediction results and the face key point prediction results; stopping training the face analysis model if the analysis performance meets a preset condition; and continuing to train the face analysis model if it does not. Automatically deciding whether to stop training according to the measured analysis performance makes the method more intelligent.
In addition, before data enhancement is performed on the unlabeled face pictures to obtain the enhanced picture training set, the method according to the embodiment of the present invention further includes: acquiring labeled face pictures; and training an initial model with the labeled face pictures based on a predefined classification loss function to obtain the pre-training model, wherein the initial model is a neural network model constructed with a divide-and-conquer approach, and the classification loss function is obtained from adaptive weights and face type losses. Pre-training the model with a small number of labeled face pictures before the face analysis model is trained improves the classification accuracy of the subsequent training with unlabeled and enhanced face pictures and shortens the training time of the face analysis model.
In addition, according to the face analysis model training method provided by the embodiment of the invention, the adaptive weight is obtained from the distribution probability of each classification result of the pre-training model, and the face type loss is obtained from the labeling results and the analysis results of the labeled face pictures. Because the classified face types occupy regions of different sizes, subdivided types with small areas are hard to optimize under uniform weights; deriving the adaptive weight from the classification result of each picture addresses this and improves the analysis of all face region types.
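The patent does not give a closed-form expression for the adaptive weight. As one plausible reading of "obtained through the distribution probability of each classification result", the sketch below (hypothetical names, Python/NumPy) uses inverse-frequency weights so that small-area face types such as eyebrows are weighted more heavily than large skin regions:

```python
import numpy as np

def adaptive_weights(class_probs, eps=1e-6):
    """Hypothetical inverse-frequency weighting: face types with a low
    distribution probability (small image area) get larger weights, so
    subdivided small regions are not dominated by the skin region."""
    p = np.asarray(class_probs, dtype=np.float64)
    w = 1.0 / (p + eps)
    return w / w.sum()  # normalize so the weights sum to 1

# e.g. skin covers 80% of the pixels, eyes 15%, eyebrows 5%
w = adaptive_weights([0.80, 0.15, 0.05])
```

Any monotonically decreasing function of the distribution probability would serve the same purpose; the inverse-frequency form is only an illustration.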
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings; like reference numerals in the figures denote similar elements, and the figures are not to scale unless otherwise specified.
FIG. 1 is a first flowchart of a face analysis model training method according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a face analysis model training method according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a face analysis model training method according to an embodiment of the present invention;
FIG. 4 is a fourth flowchart of a face analysis model training method according to an embodiment of the present invention;
FIG. 5 is a fifth flowchart of a face analysis model training method according to an embodiment of the present invention;
FIG. 6 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to provide a better understanding of the present application; however, the technical solutions claimed in the present application can be implemented without these technical details, and with various changes and modifications based on the following embodiments. The embodiments are divided for convenience of description only, which shall not limit the specific implementation of the present invention, and the embodiments may be combined with and refer to each other where they do not contradict.
The embodiment of the invention relates to a face analysis model training method, as shown in fig. 1, specifically comprising:
Step 101, acquiring a picture training set. Specifically, the acquired picture training set comprises a plurality of unlabeled face pictures, which carry no pixel labels. In addition, the picture training set may further include the face key points of each unlabeled face picture; the face key points may appear on each unlabeled face picture in labeled form. A key point may be the position coordinates of the face center point in the picture, or the position coordinates of a specified face part in the picture, and may be set according to user requirements. The key points are detected before the unlabeled face pictures form the picture training set; the specific detection method is not limited, and any method capable of detecting face key points can be used. The form in which the key points exist in the picture training set is likewise not limited: they may be stored at a fixed position on each unlabeled picture, or stored in a separate data set, and so on.
It should be noted here that the acquired picture training set can also contain a small number of labeled face pictures. There are certain requirements on the numbers of unlabeled and labeled face pictures it contains (a large number of unlabeled face pictures and a small number of labeled face pictures): if it comprises N unlabeled face pictures and M labeled face pictures, then M and N are integers larger than 0 and N is far larger than M, for example N equals M multiplied by a preset multiple. The labeled face pictures in the picture training set serve two purposes: first, processing the labeled face pictures before the model processes the unlabeled face pictures improves the processing of the unlabeled face pictures; second, processing the labeled face pictures while the model processes the unlabeled face pictures prevents the face analysis model from degrading and preserves its analysis capability.
And 102, performing data enhancement on the non-labeled face picture to obtain an enhanced picture training set, wherein the enhanced picture training set comprises the non-labeled face picture, the enhanced face picture and conversion parameters corresponding to the enhanced face picture.
Specifically, after the picture training set is obtained, data enhancement is performed on each unlabeled face picture in it to obtain the enhanced picture training set. The enhanced picture training set comprises the unlabeled face pictures, the enhanced face pictures and the transformation parameters corresponding to the enhanced face pictures; the enhanced face pictures include unlabeled enhanced face pictures (and, when the picture training set contains a small number of labeled face pictures, labeled enhanced face pictures as well), and the transformation parameters indicate which data enhancement was applied to each unlabeled face picture. The data enhancement may comprise one or more of mirror processing, cutting processing, stretching processing, rotation processing, blurring processing, brightness processing, contrast processing and noise processing, and different unlabeled face pictures may be enhanced with different methods. Mirror processing mirrors a picture left-right, top-bottom or diagonally; cutting processing cuts a partial face picture out of the original picture at random and resizes it to a fixed size; stretching processing stretches the original picture horizontally or vertically and pads the result to a fixed size; rotation processing randomly selects one of several groups of template key points at different angles for an affine transformation; and blurring, brightness, contrast and noise processing apply a chosen algorithm to blur the original picture, adjust its brightness, adjust its contrast, or add noise to it.
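As an illustrative sketch (not the patent's implementation; all names are hypothetical), a data-enhancement step that returns both the enhanced picture and the transformation parameters recording which enhancement was applied might look like this:

```python
import random
import numpy as np

def enhance(image, rng):
    """Apply one randomly chosen enhancement to an H x W x C picture and
    return the enhanced picture plus its transformation parameters."""
    op = rng.choice(["mirror", "brightness", "noise"])
    if op == "mirror":          # left-right mirror
        return image[:, ::-1].copy(), {"op": "mirror", "axis": 1}
    if op == "brightness":      # additive brightness shift
        delta = rng.uniform(-30.0, 30.0)
        out = np.clip(image.astype(np.float64) + delta, 0, 255)
        return out.astype(image.dtype), {"op": "brightness", "delta": delta}
    # gaussian noise with a fixed sigma
    noise = np.random.default_rng(0).normal(0.0, 5.0, image.shape)
    out = np.clip(image.astype(np.float64) + noise, 0, 255)
    return out.astype(image.dtype), {"op": "noise", "sigma": 5.0}

picture = np.full((8, 8, 3), 128, dtype=np.uint8)
enhanced, params = enhance(picture, random.Random(0))
```

The returned `params` dictionary plays the role of the transformation parameters: it lets the training loop later apply the same transformation to the model's prediction on the original picture.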
Step 103, training a preset pre-training model with the unlabeled face pictures, the enhanced face pictures and the transformation parameters in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model.
Specifically, each unlabeled face picture, each enhanced face picture and each transformation parameter are input into the pre-training model for processing (the pre-training model is a neural network model constructed with a divide-and-conquer approach and trained on a small number of labeled face pictures), yielding the face prediction results of each unlabeled face picture and each enhanced face picture. The face prediction result of each unlabeled face picture is then transformed according to its corresponding transformation parameters, yielding the enhanced face prediction result of that unlabeled face picture. The consistency loss function is computed from the enhanced face prediction result of each unlabeled face picture and the face prediction result of the corresponding enhanced face picture, and whether to stop training is decided according to the value of the consistency loss function; the model when training stops is the face analysis model. The output of the face analysis model is a picture divided into regions, each region labeled H × W × N, where H is the length of the region, W is its width and N is the face type of the region; the value of N is the probability vector obtained by normalizing the pixel values of the region with a softmax function. The face types may include the nose, eyes, mouth and other face parts.
The expression of the consistency loss function is:
Loss_pair = |Transform(FaceParsingNet(pic)) - FaceParsingNet(Transform(pic))|
where Transform(FaceParsingNet(pic)) is the enhanced face prediction result of the unlabeled face picture, and FaceParsingNet(Transform(pic)) is the face prediction result of the enhanced face picture.
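A minimal sketch of the consistency loss above (hypothetical names; a real FaceParsingNet would be a neural network, here stubbed by any array-to-array function):

```python
import numpy as np

def consistency_loss(parsing_net, pic, transform):
    """Loss_pair = |Transform(Net(pic)) - Net(Transform(pic))|:
    the prediction of the original picture, transformed, should match
    the prediction of the transformed picture."""
    a = transform(parsing_net(pic))   # enhanced face prediction result
    b = parsing_net(transform(pic))   # prediction of the enhanced picture
    return float(np.abs(a - b).mean())

mirror = lambda x: x[:, ::-1].copy()           # one possible Transform
identity_net = lambda x: x.astype(np.float64)  # stand-in for FaceParsingNet
pic = np.arange(16, dtype=np.float64).reshape(4, 4)
loss = consistency_loss(identity_net, pic, mirror)  # 0 for an equivariant net
```

Minimizing this loss pushes the network toward being equivariant under the chosen transformations, which is what lets unlabeled pictures supervise training.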
Compared with the prior art, in the process of training the face analysis model, the embodiment of the invention obtains enhanced face pictures and transformation parameters by processing unlabeled pictures, and trains the face analysis network with the unlabeled pictures, the enhanced face pictures and the transformation parameters. This reduces the training cost of the face analysis model and improves its generalization, solving the prior-art problems of poor generalization and the high cost of labeling new training data that arise from training face analysis models mainly on labeled pictures of European and American face data.
The embodiment of the invention relates to a training method of a face analysis model, as shown in fig. 2, specifically comprising:
Step 201, acquiring a picture training set. Specifically, this step is substantially the same as step 101 of the foregoing embodiment, and is not described herein again.
Step 202, cutting each unlabeled face picture to a preset size.
Specifically, after the unlabeled face pictures are obtained, they can first be cut so that they all become pictures of the same size, which is convenient for later processing. The user can set the preset size according to his own requirements or the sizes of the pictures in the picture training set; the invention does not limit the picture cutting tool.
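A sketch of cutting every picture to a preset size (the 256 × 256 target and center-cut strategy are assumptions for illustration; the patent leaves both to the user):

```python
import numpy as np

PRESET_SIZE = (256, 256)  # assumed target (height, width)

def cut_to_preset(image, size=PRESET_SIZE):
    """Center-cut an image to a fixed size; pictures smaller than the
    target are zero-padded so every output has identical dimensions."""
    h, w = size
    top = max((image.shape[0] - h) // 2, 0)
    left = max((image.shape[1] - w) // 2, 0)
    out = image[top:top + h, left:left + w]
    pad_h, pad_w = h - out.shape[0], w - out.shape[1]
    if pad_h or pad_w:
        out = np.pad(out, ((0, pad_h), (0, pad_w)) + ((0, 0),) * (out.ndim - 2))
    return out
```

Giving every training picture identical dimensions is what allows them to be batched through the network and keeps irrelevant border area from inflating the computation.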
It should be noted here that when the picture training set also contains a small number of labeled face pictures, the same operation is performed on the labeled face pictures.
Step 203, sequentially performing data enhancement on each cut unlabeled face picture to obtain an enhanced picture training set, wherein the enhanced picture training set comprises the unlabeled face pictures, the enhanced face pictures and the transformation parameters corresponding to the enhanced face pictures.
Specifically, this step is substantially the same as step 102 in the embodiment of the present invention, and is not described herein again.
Step 204, training a preset pre-training model with the unlabeled face pictures, the enhanced face pictures and the transformation parameters in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model.
Specifically, this step is substantially the same as step 103 in the embodiment of the present invention, and is not described herein again.
Compared with the prior art, on the basis of the beneficial effects of the other embodiments, this embodiment cuts the unlabeled face pictures to a preset size and avoids the large amount of computation caused by overly large irrelevant areas in the unlabeled face pictures, reducing the data-processing load and improving face analysis efficiency.
The embodiment of the invention relates to a training method of a face analysis model, as shown in fig. 3, specifically comprising:
Step 301, acquiring a picture training set. Specifically, this step is substantially the same as step 101 of the foregoing embodiment, and is not described herein again.
Step 302, performing data enhancement on the unlabeled face pictures to obtain an enhanced picture training set, wherein the enhanced picture training set comprises the unlabeled face pictures, the enhanced face pictures and the transformation parameters corresponding to the enhanced face pictures.
Specifically, this step is substantially the same as step 102 in the embodiment of the present invention, and is not described herein again.
Step 303, training a preset pre-training model with the unlabeled face pictures, the enhanced face pictures and the transformation parameters in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model.
Specifically, this step is substantially the same as step 103 in the embodiment of the present invention, and is not described herein again.
Step 304, training the face analysis model with the unlabeled face pictures based on a predefined key point loss function, wherein the key point loss function is obtained from the face key points and the face key point prediction results of the unlabeled face pictures.
Specifically, after the face analysis model is obtained, each unlabeled face picture can be input into the face analysis model again for processing, yielding the face key point prediction result of each unlabeled face picture. The key point loss function is computed from the face key point prediction results and the face key points of the unlabeled face pictures contained in the picture training set, and whether to stop training the face analysis model is decided according to the key point loss function; the model when training stops has the function of predicting face key points. The face analysis model can process the unlabeled face pictures with any face key point detection algorithm. The expression of the key point loss function is:
Loss_cri = |Coor_pred - Coor_gt|
where Coor_gt denotes the face key points labeled on the unlabeled face picture, and Coor_pred denotes the face key point prediction result of the unlabeled face picture.
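The key point loss is a plain absolute difference between predicted and labeled coordinates; a minimal sketch (hypothetical names):

```python
import numpy as np

def keypoint_loss(coor_pred, coor_gt):
    """Loss_cri = |Coor_pred - Coor_gt|, averaged over all coordinates."""
    return float(np.abs(np.asarray(coor_pred) - np.asarray(coor_gt)).mean())

coor_gt = np.array([[10.0, 20.0], [30.0, 40.0]])    # labeled key points (x, y)
coor_pred = np.array([[12.0, 18.0], [30.0, 44.0]])  # model predictions
loss = keypoint_loss(coor_pred, coor_gt)  # (2 + 2 + 0 + 4) / 4 = 2.0
```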
In addition, it should be noted that training the face key point prediction function after the face analysis function, as described in the above steps, is only an example; the training of the face key point prediction function may also be placed before the face analysis training step. In that case, when the pre-training model is trained, the acquired labeled picture training set may also include the face key points of each labeled face picture, and the pre-training model may be trained on the face key point prediction function in addition to classification.
In addition, it should be noted that the invention can also use the unlabeled face pictures and the enhanced face pictures together to train the face key point prediction function. When the enhanced face pictures are needed for this training, data enhancement methods that would affect the face key points, such as stretching, rotation and mirroring, must not be selected when enhancing the unlabeled face pictures, so as not to affect the accuracy of the trained face analysis model in predicting face key points. The loss function used when the unlabeled and enhanced face pictures are trained together on the face key point prediction function differs little from the consistency loss function.
Compared with the prior art, on the basis of the beneficial effects of the other embodiments, this embodiment trains the face analysis model so that it also has the function of predicting key points of a face picture, improving the practicability of the face analysis model.
The embodiment of the invention relates to a training method of a face analysis model, as shown in fig. 4, specifically comprising:
Step 401, acquiring a picture training set. Specifically, this step is substantially the same as step 101 of the foregoing embodiments, and is not described herein again.
Step 402, performing data enhancement on the unlabeled face pictures to obtain an enhanced picture training set. Specifically, this step is substantially the same as step 102 of the foregoing embodiments, and is not described herein again.
Step 403, training a preset pre-training model with the unlabeled face pictures, the enhanced face pictures and the transformation parameters in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model.
Specifically, this step is substantially the same as step 103 in the embodiment of the present invention, and is not described herein again.
Step 404, training the face analysis model with the unlabeled face pictures based on a predefined key point loss function, wherein the key point loss function is obtained from the face key points and the face key point prediction results of the unlabeled face pictures.
Specifically, this step is substantially the same as step 304 of the foregoing embodiment, and is not described herein again.
Step 405, acquiring a test picture set, wherein the test picture set comprises test pictures. Specifically, a test picture is any picture containing a face image. After the test pictures are obtained, they can be cut so that all test pictures have the same size; the test picture set may also contain the correct face prediction result and the correct face key points of each test picture.
Step 406, processing the test picture set with the trained face analysis model to obtain the face prediction results of the test picture set and the face key point prediction results of the test picture set.
Specifically, each test picture in the test picture set is processed with the trained face analysis model to obtain the face prediction result and the face key point prediction result of each test picture.
And step 407, acquiring the analysis performance of the trained face analysis model according to the face prediction result and the face key point prediction result.
Specifically, the face prediction result and the face key point prediction result of each test picture acquired by the face analysis model are compared with the correct face prediction result and the correct face key points of each test picture in the test picture set, so as to acquire the analysis performance of the trained face analysis model; the analysis performance can be reported as an accuracy rate or in other forms.
Step 408, judging whether the analysis performance meets a preset condition. Specifically, the obtained analysis performance is compared with the preset condition, for example: the preset condition is that the accuracy is greater than or equal to 90%. When the preset condition is met, the face analysis model is mature and can be put into use without further training, and step 409 is then executed; when the preset condition is not met, the accuracy of the face analysis model still needs to be improved through further training, and step 410 is then executed.
And step 409, stopping training the face analysis model.
And step 410, training the face analysis model.
It should be noted here that this embodiment determines the analysis performance from both the face prediction result and the face key point prediction result; when the face analysis model has not been trained for the face key point prediction function, the analysis performance is determined using the face prediction result only.
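Steps 406 to 410 amount to an evaluate-then-decide loop. A minimal sketch, assuming exact-match accuracy as the "analysis performance" metric and the 90% threshold from the example above (both function names are illustrative):

```python
def parsing_accuracy(predictions, ground_truth):
    """Fraction of test pictures whose face prediction result matches the
    correct result exactly -- a simple proxy for 'analysis performance'."""
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)

def should_stop_training(accuracy, threshold=0.9):
    """Steps 408-410: stop training (step 409) when the analysis performance
    meets the preset condition, otherwise keep training (step 410)."""
    return accuracy >= threshold
```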
Compared with the prior art, on the basis of the beneficial effects brought by other embodiments, this embodiment can automatically decide whether to stop training the face analysis model according to the acquired analysis performance, which makes the method more intelligent.
The embodiment of the invention relates to a training method of a face analysis model, as shown in fig. 5, specifically comprising:
Step 501, acquiring a picture training set, wherein the picture training set comprises unlabeled face pictures and labeled face pictures.
Specifically, this step is the same as step 101 in the embodiment of the present invention and is not repeated here.
Step 502, training an initial model by using the labeled face pictures based on a predefined classification loss function to obtain the pre-training model.
Specifically, after the labeled face pictures in the picture training set are obtained, the pre-training model can be obtained by pre-training the initial model with the labeled face pictures, so that the pre-training model used by the method can better process unlabeled face pictures. The initial model of the application is a neural network model constructed based on a divide-and-conquer idea: an input labeled face picture is first divided into three coarse classes (background, skin and hair), and the predicted skin region is then further subdivided (face, nose, eyes, mouth, eyebrows, ears, neck and the like), completing the classification of the labeled face picture. Based on the classification result, the pixel values of each face type region are analyzed to obtain the face types of the labeled face picture, the classification loss function is calculated from the face types, whether to stop training is judged, and the pre-training model is obtained after training stops.
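The divide-and-conquer labeling described above — coarse classification into background/skin/hair, then refinement of skin pixels only — can be sketched per pixel. The function and class names here are illustrative, not from the patent:

```python
def coarse_to_fine_type(coarse, fine):
    """Divide-and-conquer labeling: a pixel is first assigned one of three
    coarse classes; only pixels predicted as 'skin' are refined into finer
    face types such as face, nose, eyes, mouth, eyebrows, ears or neck."""
    if coarse != "skin":
        return coarse  # background or hair: no refinement needed
    return fine        # skin pixels take the finer face type
```

Splitting the decision this way keeps each classifier's task small, which is the stated reason the divide-and-conquer construction makes model learning simpler and training faster.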
The classification loss function is expressed as:

wherein L_type is the face type loss function, W_i is the adaptive weight function, and i is a pixel subscript referring to any pixel in the picture; the pixel value is a probability vector normalized by a softmax function and can be used to represent the face type.
the expression for face type loss is:
wherein,for the analysis result of the model with the labeled face picture,labeling results of the faces with labeled face pictures;
the expression of the adaptive weight is:
wherein the facetypeiFrequency (facetype) as a face type0) The probability is the face type i, and a is a preset constant;
the expression of the face type probability is:
among them, numPixel (facetype)i) The number of pixels of the current face type, h is the length of the current face picture, and w is the width of the current face picture.
Step 503, performing data enhancement on the unlabeled face pictures to obtain an enhanced picture training set.
Specifically, this step is substantially the same as step 102 in the embodiment of the present invention and is not described herein again.
And step 504, training the pre-training model by using the unlabeled face picture, the enhanced face picture and the transformation parameter in the enhanced picture training set based on the predefined consistency loss function to obtain a face analysis model.
Specifically, this step is substantially the same as step 103 in the embodiment of the present invention, and is not described herein again.
Compared with the prior art, on the basis of the beneficial effects brought by other embodiments, this embodiment can pre-train the face analysis model with a small number of labeled pictures before the face analysis model is trained; the adaptive weight used in pre-training is obtained from the distribution probability of each classification result of the pre-training model. The invention can improve the classification accuracy of training with the unlabeled face pictures and the enhanced face pictures, reduce the training time of the face analysis model, and improve the analysis effect on face region types. Meanwhile, constructing the neural network model with the divide-and-conquer idea makes model learning simpler and accelerates model training.
An embodiment of the present invention relates to an electronic device, as shown in fig. 6, including:
at least one processor 601; and
a memory 602 communicatively coupled to the at least one processor 601; wherein,
the memory 602 stores instructions executable by the at least one processor 601 to enable the at least one processor 601 to perform any one of the above methods for training a face parsing model of the present invention.
Where the memory and processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting together one or more of the various circuits of the processor and the memory. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And the memory may be used to store data used by the processor in performing operations.
The present invention relates to a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.
Claims (9)
1. A face analysis model training method is characterized by comprising the following steps:
acquiring a picture training set, wherein the picture training set comprises a non-labeled face picture;
performing data enhancement on the non-labeled face picture to obtain an enhanced picture training set, wherein the enhanced picture training set comprises the non-labeled face picture, an enhanced face picture and conversion parameters corresponding to the enhanced face picture;
training a preset pre-training model by using the unlabelled face picture, the enhanced face picture and the transformation parameter in the enhanced picture training set based on a predefined consistency loss function to obtain a face analysis model, wherein,
the consistency loss function is obtained through the enhanced face prediction result of the non-labeled face picture and the face prediction result of the enhanced face picture, and the enhanced face prediction result is obtained through converting the face prediction result of the non-labeled face picture according to the conversion parameters.
2. The training method for the face analytic model of claim 1, wherein the picture training set comprises face key points of the label-free face picture; the training method of the face analysis model further comprises the following steps:
and training the face analysis model by using the label-free face picture based on a predefined key point loss function, wherein the key point loss function is obtained by the face key point and a face key point prediction result of the label-free face picture.
3. The training method of the face analytic model according to claim 2, further comprising:
acquiring a test picture set, wherein the test picture set comprises test pictures;
processing the test picture set by using the trained face analysis model to obtain a face prediction result of the test picture set and a face key point prediction result of the test picture set;
acquiring the analysis performance of the trained face analysis model according to the face prediction result and the face key point prediction result;
if the analysis performance meets a preset condition, stopping training the face analysis model;
and if the analysis performance does not meet the preset condition, training the face analysis model.
4. The training method of the face analytic model according to any one of claims 1-2, wherein before the data enhancement of the unlabeled face picture, the method further comprises:
and cutting the non-labeled face picture to a preset size.
5. The training method of the face analysis model according to any one of claims 1-2, wherein the data enhancement comprises one or a combination of the following:
mirror processing, clipping processing, stretching processing, rotation processing, blurring processing, brightness processing, contrast processing and noise processing.
6. The training method of the face analysis model according to any one of claims 1-2, wherein before the training of the preset pre-training model, the method further comprises:
acquiring a marked face picture;
based on a predefined classification loss function, training an initial model by using the marked face picture to obtain the pre-training model, wherein the initial model is a neural network model constructed by using a divide and conquer thought, and the classification loss function is obtained through self-adaptive weight and face type loss.
7. The training method for the face analysis model according to claim 6, wherein the adaptive weight is obtained by a distribution probability of each classification result of the pre-training model;
the face type loss is obtained through the labeling result of the face picture with the label and the analysis result of the face picture with the label.
8. An electronic device, comprising: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of training a face analytical model according to any one of claims 1 to 7.
9. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the face analysis model training method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110620497.7A CN113255807B (en) | 2021-06-03 | 2021-06-03 | Face analysis model training method, electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110620497.7A CN113255807B (en) | 2021-06-03 | 2021-06-03 | Face analysis model training method, electronic device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113255807A CN113255807A (en) | 2021-08-13 |
CN113255807B true CN113255807B (en) | 2022-03-25 |
Family
ID=77186254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110620497.7A Active CN113255807B (en) | 2021-06-03 | 2021-06-03 | Face analysis model training method, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113255807B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106980641A (en) * | 2017-02-09 | 2017-07-25 | 上海交通大学 | The quick picture retrieval system of unsupervised Hash and method based on convolutional neural networks |
CN107729885A (en) * | 2017-11-23 | 2018-02-23 | 中电科新型智慧城市研究院有限公司 | A kind of face Enhancement Method based on the study of multiple residual error |
CN110163235A (en) * | 2018-10-11 | 2019-08-23 | 腾讯科技(深圳)有限公司 | Training, image enchancing method, device and the storage medium of image enhancement model |
CN111507155A (en) * | 2020-01-17 | 2020-08-07 | 长江大学 | U-Net + + and UDA combined microseism effective signal first-arrival pickup method and device |
CN112052818A (en) * | 2020-09-15 | 2020-12-08 | 浙江智慧视频安防创新中心有限公司 | Unsupervised domain adaptive pedestrian detection method, unsupervised domain adaptive pedestrian detection system and storage medium |
CN112734037A (en) * | 2021-01-14 | 2021-04-30 | 清华大学 | Memory-guidance-based weakly supervised learning method, computer device and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275784B (en) * | 2020-01-20 | 2023-06-13 | 北京百度网讯科技有限公司 | Method and device for generating image |
2021-06-03 — CN CN202110620497.7A: patent CN113255807B/en granted, status Active
Also Published As
Publication number | Publication date |
---|---|
CN113255807A (en) | 2021-08-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
TR01 | Transfer of patent right |
Effective date of registration: 2023-04-10
Address after: Room 611-217, R&D Center Building, China (Hefei) International Intelligent Voice Industrial Park, 3333 Xiyou Road, High-tech Zone, Hefei, Anhui Province, 230091
Patentee after: Hefei lushenshi Technology Co.,Ltd.
Address before: Room 3032, North B, bungalow, Building 2, A5 Xueyuan Road, Haidian District, Beijing, 100083
Patentee before: BEIJING DILUSENSE TECHNOLOGY CO.,LTD.
Patentee before: Hefei lushenshi Technology Co.,Ltd.