Detailed Description of Embodiments
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely intended to explain the related invention, rather than to limit the invention. It should also be noted that, for ease of description, only the parts related to the invention are shown in the accompanying drawings.
It should be noted that, in the case of no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating a model, the apparatus for generating a model, the method for processing a face image, or the apparatus for processing a face image of the present disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104, and a server 105. The network 104 serves as a medium providing a communication link between the terminal devices 101, 102 and 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102 and 103 to interact with the server 105 via the network 104 to receive or send messages. Various communication client applications, such as image processing software, web browser applications, search applications, instant messaging tools and social platform software, may be installed on the terminal devices 101, 102 and 103.
The terminal devices 101, 102 and 103 may be hardware or software. When being hardware, they may be various electronic devices, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and the like. When the terminal devices 101, 102 and 103 are software, they may be installed in the electronic devices listed above, and may be implemented as a plurality of pieces of software or software modules (e.g., for providing distributed services) or as a single piece of software or software module, which is not specifically limited herein.
The server 105 may be a server providing various services, for example, a model training server that performs model training using a training sample set uploaded by the terminal devices 101, 102 and 103. The model training server may perform model training using the acquired training sample set to generate a face key point recognition model. In addition, after the face key point recognition model is obtained through training, the server may send the face key point recognition model to the terminal devices 101, 102 and 103, or may use the face key point recognition model to perform face key point recognition on a face image.
It should be noted that the method for generating a model provided by the embodiments of the present disclosure may be executed by the server 105, or may be executed by the terminal devices 101, 102 and 103. Accordingly, the apparatus for generating a model may be provided in the server 105, or may be provided in the terminal devices 101, 102 and 103. In addition, the method for processing a face image provided by the embodiments of the present disclosure may be executed by the server 105, or may be executed by the terminal devices 101, 102 and 103; accordingly, the apparatus for processing a face image may be provided in the server 105, or may be provided in the terminal devices 101, 102 and 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server is software, it may be implemented as a plurality of pieces of software or software modules (e.g., for providing distributed services), or as a single piece of software or software module, which is not specifically limited herein.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation requirements. In the case where the training sample set required for training the model does not need to be acquired remotely, or the target face image to be processed does not need to be acquired remotely, the above system architecture may not include the network, and may include only the server or the terminal device.
With continued reference to Fig. 2, a flow 200 of an embodiment of the method for generating a model according to the present disclosure is shown. The method for generating a model includes the following steps:
Step 201, a training sample set is acquired.
In the present embodiment, an executing body of the method for generating a model (e.g., the server shown in Fig. 1) may acquire the training sample set through a wired connection or a wireless connection. A training sample includes a sample face image and sample face key point information pre-annotated for the sample face image. The sample face image may be an image obtained by photographing a sample face. The sample face key point information is used to characterize the position of a sample face key point in the sample face image, and may include but is not limited to at least one of the following: a number, text, a symbol, or an image. For example, the sample face key point information may be a key point coordinate characterizing the position of the sample face key point in the sample face image. Here, the key point coordinate may be a coordinate in a coordinate system pre-established based on the sample face image.
In practice, a face key point may be a key point in a face, specifically, a point that affects the face contour or the shape of the facial features. As an example, a face key point may be a point corresponding to the nose, a point corresponding to an eye, or the like.
Specifically, the above executing body may acquire a training sample set pre-stored locally, or may acquire a training sample set sent by a communicatively connected electronic device (e.g., a terminal device shown in Fig. 1).
Step 202, a training sample is selected from the training sample set, and the following training steps are executed: inputting the sample face image in the selected training sample into a feature extraction layer of an initial neural network to obtain image features; inputting the obtained image features into a first sub-network of the initial neural network to generate face key point information of the sample face image; inputting the generated face key point information and the image features into a second sub-network of the initial neural network to obtain a deviation corresponding to the face key point information; determining, based on the face key point information and the sample face key point information in the training sample, an expected deviation corresponding to the generated face key point information; determining, based on the deviation corresponding to the face key point information and the expected deviation, whether the training of the initial neural network is completed; and in response to determining that the training is completed, determining the trained initial neural network as a face key point recognition model.
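The data flow of the training steps just summarized can be sketched as the following toy pipeline. To be clear, the functions `feature_extraction_layer`, `first_subnetwork` and `second_subnetwork` are hypothetical placeholders introduced here for illustration only, and their bodies are trivial stand-ins rather than the disclosure's actual network layers.

```python
def feature_extraction_layer(sample_face_image):
    # Placeholder: a real layer would be e.g. convolutional; here we just
    # average each pixel row as a toy "feature".
    return [sum(row) / len(row) for row in sample_face_image]

def first_subnetwork(image_features):
    # Placeholder prediction of one key point coordinate (x, y).
    return (round(image_features[0]), round(image_features[-1]))

def second_subnetwork(key_point_info, image_features):
    # Placeholder predicted deviation for the key point coordinate.
    return (0.0, 0.0)

def training_step(sample_face_image, sample_key_point_info, threshold=1.0):
    features = feature_extraction_layer(sample_face_image)      # extract image features
    predicted = first_subnetwork(features)                      # predict key point info
    deviation = second_subnetwork(predicted, features)          # predict deviation
    expected = tuple(p - s for p, s in
                     zip(predicted, sample_key_point_info))     # expected deviation
    difference = sum((d - e) ** 2 for d, e in
                     zip(deviation, expected)) ** 0.5           # loss between the two
    return difference <= threshold                              # training completed?
```

The return value corresponds to the completion check: when the predicted deviation is close enough to the expected deviation, the network would be kept as the recognition model.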
In the present embodiment, based on the training sample set obtained in step 201, the above executing body may select a training sample from the training sample set and execute the following training steps (steps 2021-2026):
Step 2021, the sample face image in the selected training sample is input into the feature extraction layer of the initial neural network to obtain image features.
The image features may be features of the image such as color and shape. The initial neural network is a predetermined neural network of any of various types (e.g., a convolutional neural network) used for generating the face key point recognition model. The face key point recognition model may be used to recognize the face key points corresponding to a face image. Here, the initial neural network may be an untrained neural network, or a neural network that has been trained but for which training is not yet completed. Specifically, the initial neural network includes a feature extraction layer. The feature extraction layer is used to extract the image features of an input face image. Specifically, the feature extraction layer includes a structure capable of extracting image features (e.g., a convolutional layer), and may further include other structures (e.g., a pooling layer), which is not limited herein.
Step 2022, the obtained image features are input into the first sub-network of the initial neural network to generate the face key point information of the sample face image.
In the present embodiment, the initial neural network further includes a first sub-network. The first sub-network is connected to the feature extraction layer, and is used to generate face key point information based on the image features output by the feature extraction layer. The generated face key point information is the face key point information predicted by the first sub-network and corresponding to the sample face image.
It can be appreciated that, in practice, there is often an error between a measured value and a true value, so there is also typically a difference between the face key point information predicted using the initial neural network and the actual face key point information.
It should be noted that, here, the first sub-network may include a structure for generating a result (the face key point information), such as a classifier or a fully connected layer, and may further include a structure for outputting the result (the face key point information), such as an output layer.
Step 2023, the generated face key point information and the image features are input into the second sub-network of the initial neural network to obtain a deviation corresponding to the face key point information.
In the present embodiment, the initial neural network further includes a second sub-network. The second sub-network is connected to the first sub-network and the feature extraction layer respectively, and is used to determine, based on the image features output by the feature extraction layer, the deviation corresponding to the face key point information output by the first sub-network. The deviation corresponding to the face key point information is used to characterize the difference between the generated face key point information and the actual face key point information of the sample face image. Here, the deviation generated by the second sub-network is a deviation predicted based on the image features.
It should be noted that, here, the second sub-network may include a structure for generating a result (the deviation corresponding to the face key point information), such as a classifier or a fully connected layer, and may further include a structure for outputting the result (the deviation corresponding to the face key point information), such as an output layer.
In some optional implementations of the present embodiment, the second sub-network includes a first generation layer and a second generation layer, and the above executing body may obtain the deviation corresponding to the face key point information through the following steps:
First, the generated face key point information is input into the first generation layer of the second sub-network to obtain a heat map corresponding to the face key point information.
The first generation layer is connected to the first sub-network, and is used to generate, based on the face key point information output by the first sub-network, the heat map corresponding to the face key point information. Here, the image region of the heat map includes a value set. For a value in the value set, the value is used to characterize the probability that the face key point is at the position of the value. It should be noted that the heat map has the same shape as the sample face image but a different size, so that the position of a value in the heat map may correspond to a position in the sample face image; therefore, the heat map may be used to indicate the position of the face key point in the sample face image.
It should be noted that the heat map may include at least one value set, where each of the at least one value set may correspond to a piece of face key point information.
Specifically, in the heat map, the value at the position corresponding to the position in the sample face image characterized by the face key point information may be 1. The value corresponding to each other position in the heat map may decrease gradually according to the distance between that position and the position corresponding to the value 1; that is, the farther a position is from the position corresponding to the value 1, the smaller the corresponding value.
It should be noted that the position of a value in the heat map may be determined by the minimum rectangle enclosing the value. Specifically, the center of the above minimum rectangle may be determined as the position of the value, or an endpoint position of the minimum rectangle may be determined as the position of the value.
Then, the obtained heat map and the image features are input into the second generation layer of the second sub-network to obtain the deviation corresponding to the face key point information.
The second generation layer is connected to the first generation layer and the feature extraction layer respectively, and is used to determine, based on the image features output by the feature extraction layer and the heat map output by the first generation layer, the deviation corresponding to the face key point information input into the first generation layer.
In this implementation, the heat map can indicate the position range of the face key point. Compared with the face key point information, the heat map can be used to regress the position features of the face key point more accurately and conveniently, so that the deviation corresponding to the face key point information can be generated more quickly and accurately.
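As a toy illustration of the heat map described above, the sketch below places the value 1 at the key point position and lets values decay with distance from it. The linear decay rate used here is purely an assumption for illustration; an actual generation layer learns this mapping rather than computing it by a fixed formula.

```python
def make_heat_map(width, height, key_point):
    """Build a toy heat map: value 1 at key_point, decreasing with distance."""
    kx, ky = key_point
    heat_map = []
    for y in range(height):
        row = []
        for x in range(width):
            distance = ((x - kx) ** 2 + (y - ky) ** 2) ** 0.5
            # The value characterizes how likely the key point is at (x, y):
            # 1 at the key point itself, smaller the farther away we are.
            row.append(max(0.0, 1.0 - 0.25 * distance))
        heat_map.append(row)
    return heat_map
```

For example, `make_heat_map(5, 5, (2, 2))` yields a 5x5 grid whose center value is 1.0, with the surrounding values shrinking toward the edges.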
Step 2024, the expected deviation corresponding to the generated face key point information is determined based on the face key point information and the sample face key point information in the training sample.
Here, the above executing body may determine the difference between the face key point information generated by the first sub-network and the pre-annotated sample face key point information, and then determine the determined difference as the expected deviation.
As an example, the face key point information generated by the first sub-network is the coordinate "(10, 19)", and the sample face key point information is the coordinate "(11, 18)". The expected deviation may then be (-1, 1), where -1 = 10 - 11 characterizes the difference in the abscissa, and 1 = 19 - 18 characterizes the difference in the ordinate.
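The expected deviation in the example above is simply the coordinate-wise difference between the predicted and the annotated key point coordinates, which can be sketched as:

```python
def expected_deviation(predicted, annotated):
    # Coordinate-wise difference: predicted minus annotated.
    # E.g. predicted (10, 19) and annotated (11, 18) give (-1, 1).
    return tuple(p - a for p, a in zip(predicted, annotated))
```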
Step 2025, whether the training of the initial neural network is completed is determined based on the deviation corresponding to the face key point information and the expected deviation.
Here, the above executing body may use a preset loss function to calculate the difference between the obtained deviation corresponding to the face key point information and the expected deviation, then determine whether the calculated difference is less than or equal to a preset difference threshold, and determine that the training of the initial neural network is completed in response to determining that the calculated difference is less than or equal to the preset difference threshold. The preset loss function may be any of various loss functions, for example, the L2 norm or the Euclidean distance.
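The completion check in step 2025 can be sketched as follows, using the Euclidean distance as the preset loss function; the threshold value chosen here is an assumption for illustration only.

```python
import math

def training_completed(predicted_deviation, expected_deviation, threshold=0.5):
    # Euclidean distance between the deviation predicted by the second
    # sub-network and the expected deviation derived from the annotation.
    difference = math.dist(predicted_deviation, expected_deviation)
    # Training counts as completed when the difference does not exceed
    # the preset difference threshold.
    return difference <= threshold
```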
Step 2026, in response to determining that the training is completed, the trained initial neural network is determined as the face key point recognition model.
In the present embodiment, the above executing body may, in response to determining that the training of the initial neural network is completed, determine the trained initial neural network as the face key point recognition model.
In some optional implementations of the present embodiment, the above executing body may further, in response to determining that the training of the initial neural network is not completed, adjust the relevant parameters in the initial neural network, select a training sample from the unselected training samples in the training sample set, and continue to execute the above training steps (steps 2021-2026) using the most recently adjusted initial neural network and the most recently selected training sample.
Specifically, the above executing body may, in response to determining that the training of the initial neural network is not completed, adjust the relevant parameters of the initial neural network based on the calculated difference. Here, the relevant parameters of the initial neural network may be adjusted in various manners based on the calculated difference. For example, the BP (Back Propagation) algorithm or the SGD (Stochastic Gradient Descent) algorithm may be used to adjust the relevant parameters of the initial neural network.
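A single SGD-style parameter update of the kind the passage refers to can be sketched as below: each parameter is moved against its gradient, scaled by a learning rate. The concrete learning rate and values are illustrative only, not parameters prescribed by the disclosure.

```python
def sgd_step(parameters, gradients, learning_rate=0.1):
    # Move each parameter a small step opposite to its gradient.
    return [p - learning_rate * g for p, g in zip(parameters, gradients)]
```

In an actual training loop, the gradients would come from back-propagating the loss (the difference between the predicted and expected deviations) through the network.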
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating a model according to the present embodiment. In the application scenario of Fig. 3, the server 301 may first acquire a training sample set 302, where a training sample includes a sample face image and sample face key point information pre-annotated for the sample face image. The sample face key point information is used to characterize the position of a sample face key point in the sample face image. For example, the sample face key point information may be the coordinate of the sample face key point.
Then, the server 301 may select a training sample 3021 from the training sample set 302 and execute the following training steps: inputting the sample face image 30211 in the selected training sample 3021 into the feature extraction layer 3031 of the initial neural network 303 to obtain image features 304; inputting the obtained image features 304 into the first sub-network 3032 of the initial neural network 303 to generate face key point information 305 of the sample face image 30211; inputting the generated face key point information 305 and the image features 304 into the second sub-network 3033 of the initial neural network 303 to obtain a deviation 306 corresponding to the face key point information 305; determining, based on the face key point information 305 and the sample face key point information 30212 in the training sample 3021, an expected deviation 307 corresponding to the generated face key point information 305; determining, based on the deviation 306 corresponding to the face key point information 305 and the expected deviation 307, whether the training of the initial neural network 303 is completed; and in response to determining that the training is completed, determining the trained initial neural network 303 as a face key point recognition model 308.
The face key point recognition model generated by the method provided in the above embodiment of the present disclosure can simultaneously predict the face key point information of a face image and the deviation of the predicted face key point information. This helps to generate more accurate result face key point information using the face key point information and the deviation predicted by the model, thereby realizing more accurate face key point detection.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for processing a face image is shown. The flow 400 of the method for processing a face image includes the following steps:
Step 401, a target face image is acquired.
In the present embodiment, an executing body of the method for processing a face image (e.g., the server shown in Fig. 1) may acquire the target face image through a wired connection or a wireless connection. The target face image may be a face image on which face key point recognition is to be performed. A face key point is a key point in a face, specifically, a point that affects the face contour or the shape of the facial features.
In the present embodiment, the above executing body may acquire the target face image using various methods. Specifically, the above executing body may acquire a target face image pre-stored locally, or may acquire a target face image sent by a communicatively connected electronic device (e.g., a terminal device shown in Fig. 1).
Step 402, the target face image is input into a pre-trained face key point recognition model to generate face key point information corresponding to the target face image and a deviation corresponding to the face key point information.
In the present embodiment, based on the target face image obtained in step 401, the above executing body may input the target face image into the pre-trained face key point recognition model to generate the face key point information corresponding to the target face image and the deviation corresponding to the face key point information. The face key point information is used to characterize the position of a face key point in the target face image, and may include but is not limited to at least one of the following: text, a number, a symbol, or an image. The deviation corresponding to the face key point information may be used to characterize the prediction error when the face key point information corresponding to the target face image is predicted using the face key point recognition model.
In the present embodiment, the face key point recognition model is generated according to the method described in the embodiment corresponding to Fig. 2 above, and details are not repeated herein.
Step 403, result face key point information corresponding to the target face image is generated based on the generated face key point information and deviation.
In the present embodiment, based on the face key point information and the deviation generated in step 402, the above executing body may generate the result face key point information corresponding to the target face image. The result face key point information is the face key point information obtained after performing error compensation on the face key point information output by the face key point recognition model.
It can be appreciated that, in practice, a prediction error usually exists when information is predicted using a model. Performing error compensation on the prediction result can make the compensated result closer to the true result, which helps to improve the accuracy of information prediction.
As an example, the face key point information output by the face key point recognition model may be the coordinate (10, 19) of a face key point, and the deviation output by the face key point recognition model may be (0.2, -0.5), where "0.2" may be used to characterize the error of the abscissa in the coordinate of the face key point output by the face key point recognition model, and "-0.5" may be used to characterize the error of the ordinate in that coordinate. In turn, the above executing body may subtract the error "0.2" of the abscissa from the abscissa "10" in the coordinate of the face key point to obtain the error-compensated abscissa "9.8", and subtract the error "-0.5" of the ordinate from the ordinate "19" to obtain the error-compensated ordinate "19.5". Finally, the above executing body may compose the result face key point information "(9.8, 19.5)" (i.e., the coordinate of the face key point after error compensation) from the error-compensated abscissa "9.8" and ordinate "19.5".
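The error compensation in the example above amounts to subtracting the deviation output by the model, coordinate by coordinate, from the predicted key point coordinate, and can be sketched as:

```python
def compensate(key_point, deviation):
    # Subtract each error component from the corresponding coordinate.
    # E.g. (10, 19) with deviation (0.2, -0.5) gives (9.8, 19.5).
    return tuple(k - d for k, d in zip(key_point, deviation))
```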
In the method provided by the embodiments of the present disclosure, a target face image is acquired, the target face image is then input into a pre-trained face key point recognition model to generate face key point information corresponding to the target face image and a deviation corresponding to the face key point information, and finally, result face key point information corresponding to the target face image is generated based on the generated face key point information and deviation. Thus, more accurate result face key point information can be generated based on the face key point information and the deviation output by the face key point recognition model, improving the accuracy of face key point recognition.
With further reference to Fig. 5, as an implementation of the method shown in Fig. 2 above, the present disclosure provides an embodiment of an apparatus for generating a model. The apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a model of the present embodiment includes: an acquiring unit 501 and a training unit 502. The acquiring unit 501 is configured to acquire a training sample set, where a training sample includes a sample face image and sample face key point information pre-annotated for the sample face image. The training unit 502 is configured to select a training sample from the training sample set, and execute the following training steps: inputting the sample face image in the selected training sample into a feature extraction layer of an initial neural network to obtain image features; inputting the obtained image features into a first sub-network of the initial neural network to generate face key point information of the sample face image; inputting the generated face key point information and the image features into a second sub-network of the initial neural network to obtain a deviation corresponding to the face key point information; determining, based on the face key point information and the sample face key point information in the training sample, an expected deviation corresponding to the generated face key point information; determining, based on the deviation corresponding to the face key point information and the expected deviation, whether the training of the initial neural network is completed; and in response to determining that the training is completed, determining the trained initial neural network as a face key point recognition model.
In the present embodiment, the acquiring unit 501 of the apparatus 500 for generating a model may acquire the training sample set through a wired connection or a wireless connection. A training sample includes a sample face image and sample face key point information pre-annotated for the sample face image. The sample face image may be an image obtained by photographing a sample face. The sample face key point information is used to characterize the position of a sample face key point in the sample face image, and may include but is not limited to at least one of the following: a number, text, a symbol, or an image.
In practice, a face key point may be a key point in a face, specifically, a point that affects the face contour or the shape of the facial features.
In the present embodiment, based on the training sample set acquired by the acquiring unit 501, the training unit 502 may select a training sample from the training sample set and execute the following training steps (steps 5021-5026):
Step 5021, the sample face image in the selected training sample is input into the feature extraction layer of the initial neural network to obtain image features.
The image features may be features of the image such as color and shape. The initial neural network is a predetermined neural network of any of various types (e.g., a convolutional neural network) used for generating the face key point recognition model. The face key point recognition model may be used to recognize the face key points corresponding to a face image. Here, the initial neural network may be an untrained neural network, or a neural network that has been trained but for which training is not yet completed. Specifically, the initial neural network includes a feature extraction layer. The feature extraction layer is used to extract the image features of an input face image.
Step 5022, the obtained image features are input into the first sub-network of the initial neural network to generate the face key point information of the sample face image.
In the present embodiment, the initial neural network further includes a first sub-network. The first sub-network is connected to the feature extraction layer, and is used to generate face key point information based on the image features output by the feature extraction layer. The generated face key point information is the face key point information predicted by the first sub-network and corresponding to the sample face image.
Step 5023, the generated face key point information and the image features are input into the second sub-network of the initial neural network to obtain a deviation corresponding to the face key point information.
In the present embodiment, the initial neural network further includes a second sub-network. The second sub-network is connected to the first sub-network and the feature extraction layer respectively, and is used to determine, based on the image features output by the feature extraction layer, the deviation corresponding to the face key point information output by the first sub-network. The deviation corresponding to the face key point information is used to characterize the difference between the generated face key point information and the actual face key point information of the sample face image. Here, the deviation generated by the second sub-network is a deviation predicted based on the image features.
Step 5024, the expected deviation corresponding to the generated face key point information is determined based on the face key point information and the sample face key point information in the training sample.
Here, the training unit 502 may determine the difference between the face key point information generated by the first sub-network and the pre-annotated sample face key point information, and then determine the determined difference as the expected deviation.
Step 5025, whether the training of the initial neural network is completed is determined based on the deviation corresponding to the face key point information and the expected deviation.
Here, the training unit 502 may use a preset loss function to calculate the difference between the obtained deviation corresponding to the face key point information and the expected deviation, then determine whether the calculated difference is less than or equal to a preset difference threshold, and determine that the training of the initial neural network is completed in response to determining that the calculated difference is less than or equal to the preset difference threshold.
Step 5026, in response to determining that the training is completed, the trained initial neural network is determined as the face key point recognition model.
In the present embodiment, the training unit 502 may, in response to determining that the training of the initial neural network is completed, determine the trained initial neural network as the face key point recognition model.
In some optional implementations of the present embodiment, the second sub-network includes a first generation layer and a second generation layer, and the training unit 502 may be further configured to: input the generated face key point information into the first generation layer of the second sub-network to obtain a heatmap corresponding to the face key point information, where the image region of the heatmap includes a set of values, and each value in the set characterizes the probability that a face key point is located at the position of that value; and input the obtained heatmap and the image features into the second generation layer of the second sub-network to obtain the deviation corresponding to the face key point information.
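The heatmap produced by the first generation layer can be pictured as follows. This is a hand-written sketch rather than the learned layer: the Gaussian parameterization, grid size and normalization are assumptions, used only to illustrate a map whose values characterize the probability of a keypoint at each position.

```python
import numpy as np

def keypoint_heatmap(x, y, size=8, sigma=1.0):
    """Build a heatmap whose value at each position characterizes the
    probability that the face keypoint lies at that position (x = column,
    y = row; a Gaussian bump is assumed for illustration)."""
    ys, xs = np.mgrid[0:size, 0:size]
    heat = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return heat / heat.sum()  # normalize so the values form a probability map

# A keypoint at column 3, row 5 yields a map peaked at that position.
heat = keypoint_heatmap(3.0, 5.0)
```

The second generation layer would then take such a heatmap together with the image features and regress the deviation; that mapping is learned, so it is not reproduced here.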
In some optional implementations of the present embodiment, the device 500 may also include an adjustment unit (not shown in the figure) configured to, in response to determining that training of the initial neural network is not complete, adjust the relevant parameters in the initial neural network, choose a training sample from the unselected training samples in the training sample set, and continue to execute the training steps using the most recently adjusted initial neural network and the most recently chosen training sample.
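The adjustment unit's control flow can be sketched as follows. The replenishment policy when all samples have been used and the bounded round count are assumptions; the disclosure only specifies adjusting parameters, choosing an unselected sample, and repeating the training steps.

```python
import random

def train(samples, run_training_step, adjust_parameters, max_rounds=100):
    """Repeat the training steps until they report completion, each time
    adjusting parameters and drawing a previously unselected sample."""
    unselected = list(samples)
    random.shuffle(unselected)
    sample = unselected.pop()
    for _ in range(max_rounds):
        if run_training_step(sample):   # steps 5021-5025: training complete?
            return True                 # step 5026: keep the trained network
        adjust_parameters()             # adjust the relevant parameters
        if not unselected:              # replenish once all samples are used
            unselected = list(samples)
            random.shuffle(unselected)
        sample = unselected.pop()       # choose an unselected training sample
    return False
```

With stub callables, a step that succeeds on its third invocation triggers exactly two parameter adjustments before the loop reports completion.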
It is understood that all units recorded in the device 500 correspond to the respective steps in the method described with reference to Fig. 2. Accordingly, the operations, features and beneficial effects described above with respect to the method are equally applicable to the device 500 and the units included therein, and details are not described herein again.
The face key point identification model generated by the device 500 provided by the above embodiment of the disclosure can simultaneously predict the face key point information of a facial image and the deviation of the predicted face key point information. This helps to use the model-predicted face key point information and deviation to generate more accurate result face key point information, realizing more accurate face key point detection.
With further reference to Fig. 6, as an implementation of the method shown in Fig. 4 above, the present disclosure provides an embodiment of a device for processing facial images. This device embodiment corresponds to the method embodiment shown in Fig. 4, and the device may specifically be applied in various electronic equipment.
As shown in Fig. 6, the device 600 for processing facial images of the present embodiment includes an image acquisition unit 601, a first generation unit 602 and a second generation unit 603. The image acquisition unit 601 is configured to obtain a target facial image. The first generation unit 602 is configured to input the target facial image into the face key point identification model generated using the method described in the embodiment corresponding to Fig. 2, generating the face key point information corresponding to the target facial image and the deviation corresponding to the face key point information. The second generation unit 603 is configured to generate the result face key point information corresponding to the target facial image based on the generated face key point information and deviation.
In the present embodiment, the image acquisition unit 601 of the device 600 for processing facial images may obtain the target facial image through a wired connection or a wireless connection. The target facial image may be a facial image on which face key point identification is to be performed. A face key point is a key point in a face; specifically, it may be a point that influences the facial contour or the shape of the facial features.
In the present embodiment, based on the target facial image obtained by the image acquisition unit 601, the first generation unit 602 may input the target facial image into the pre-trained face key point identification model to generate the face key point information corresponding to the target facial image and the deviation corresponding to the face key point information. The face key point information characterizes the position of a face key point in the target facial image and may include, but is not limited to, at least one of the following: text, numbers, symbols, images. The deviation corresponding to the face key point information may characterize the prediction error when the face key point identification model is used to predict the face key point information corresponding to the target facial image.
In the present embodiment, the face key point identification model is generated according to the method described in the embodiment corresponding to Fig. 2 above, which is not described herein again.
In the present embodiment, based on the face key point information and deviation generated by the first generation unit 602, the second generation unit 603 may generate the result face key point information corresponding to the target facial image. The result face key point information is the face key point information obtained after performing error compensation on the face key point information output by the face key point identification model.
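The error compensation performed by the second generation unit can be sketched as follows. The arithmetic is an assumption: the disclosure does not fix how the deviation is applied, so a convention is adopted here in which the predicted deviation estimates generated-minus-actual positions and is therefore subtracted.

```python
import numpy as np

def compensate(pred_keypoints, pred_deviation):
    """Produce result face keypoint information by compensating the model's
    keypoint output with its own predicted deviation (prediction error),
    assuming the deviation is generated-minus-actual."""
    return pred_keypoints - pred_deviation

# e.g. a keypoint predicted at (10.5, 20.0) with a predicted error of
# (+0.5, -1.0) yields the compensated result keypoint (10.0, 21.0).
result = compensate(np.array([[10.5, 20.0]]), np.array([[0.5, -1.0]]))
```

If the deviation prediction were exact, the compensated result would coincide with the actual keypoint position, which is the "more accurate result face key point information" described above.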
It is understood that all units recorded in the device 600 correspond to the respective steps in the method described with reference to Fig. 4. Accordingly, the operations, features and beneficial effects described above with respect to the method are equally applicable to the device 600 and the units included therein, and details are not described herein again.
The device 600 provided by the above embodiment of the disclosure obtains a target facial image, then inputs the target facial image into the pre-trained face key point identification model to generate the face key point information corresponding to the target facial image and the deviation corresponding to the face key point information, and finally generates the result face key point information corresponding to the target facial image based on the generated face key point information and deviation. Thus, more accurate result face key point information can be generated based on the face key point information and deviation output by the face key point identification model, improving the accuracy of face key point identification.
Referring now to Fig. 7, it shows a structural schematic diagram of an electronic equipment 700 (such as the server or terminal device shown in Fig. 1) suitable for implementing the embodiments of the disclosure. The terminal devices in the embodiments of the disclosure may include, but are not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (such as vehicle-mounted navigation terminals), and fixed terminals such as digital TVs and desktop computers. The terminal device or server shown in Fig. 7 is only an example and should not impose any restrictions on the functions and scope of use of the embodiments of the disclosure.
As shown in Fig. 7, the electronic equipment 700 may include a processing device (such as a central processing unit, a graphics processor, etc.) 701, which can execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. Various programs and data required for the operation of the electronic equipment 700 are also stored in the RAM 703. The processing device 701, the ROM 702 and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 707 including, for example, a liquid crystal display (LCD), a loudspeaker, a vibrator, etc.; a storage device 708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic equipment 700 to communicate wirelessly or by wire with other equipment to exchange data. Although Fig. 7 shows the electronic equipment 700 with various devices, it should be understood that it is not required to implement or have all the devices shown; more or fewer devices may alternatively be implemented or provided. Each box shown in Fig. 7 may represent one device, or may represent multiple devices as needed.
In particular, according to the embodiments of the disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 709, or installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the disclosure are executed.
It should be noted that the computer-readable medium described in the embodiments of the disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the embodiments of the disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus or device. In the embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted with any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above-mentioned computer-readable medium may be included in the above-mentioned electronic equipment, or may exist separately without being assembled into the electronic equipment. The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic equipment, the electronic equipment is caused to: obtain a training sample set, where a training sample includes a sample facial image and sample face key point information pre-labeled for the sample facial image; choose a training sample from the training sample set, and execute the following training steps: input the sample facial image in the chosen training sample into the feature extraction layer of the initial neural network to obtain image features; input the obtained image features into the first sub-network of the initial neural network to generate the face key point information of the sample facial image; input the generated face key point information and the image features into the second sub-network of the initial neural network to obtain the deviation corresponding to the face key point information; based on the face key point information and the sample face key point information in the training sample, determine the expected deviation corresponding to the generated face key point information; based on the deviation corresponding to the face key point information and the expected deviation, determine whether training of the initial neural network is complete; and, in response to determining that training is complete, determine the trained initial neural network as the face key point identification model.
In addition, when the above-mentioned one or more programs are executed by the electronic equipment, the electronic equipment may also be caused to: obtain a target facial image; input the target facial image into the pre-trained face key point identification model to generate the face key point information corresponding to the target facial image and the deviation corresponding to the face key point information; and generate the result face key point information corresponding to the target facial image based on the generated face key point information and deviation.
The computer program code for executing the operations of the embodiments of the disclosure may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In situations involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the internet using an internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of the systems, methods and computer program products according to the various embodiments of the disclosure. In this regard, each box in a flowchart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for realizing the specified logic functions. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two successively represented boxes may actually be executed substantially in parallel, and they may sometimes be executed in the opposite order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be realized by a dedicated hardware-based system that executes the specified functions or operations, or may be realized by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the disclosure may be realized by means of software, or may be realized by means of hardware. The described units may also be set in a processor; for example, they may be described as: a processor including an acquisition unit and a training unit. Under certain conditions, the names of these units do not constitute a limitation of the units themselves; for example, the acquisition unit may also be described as "a unit for obtaining a training sample set".
The above description is only a preferred embodiment of the disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the invention scope involved in the embodiments of the disclosure is not limited to the technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by mutually replacing the above features with (but not limited to) the technical features with similar functions disclosed in the embodiments of the disclosure.