CN111354077B - Binocular vision-based three-dimensional face reconstruction method - Google Patents

Binocular vision-based three-dimensional face reconstruction method

Info

Publication number
CN111354077B
Authority
CN
China
Prior art keywords
face
parallax
points
point
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010134543.8A
Other languages
Chinese (zh)
Other versions
CN111354077A (en)
Inventor
Da Feipeng
Xia Ying
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202010134543.8A priority Critical patent/CN111354077B/en
Publication of CN111354077A publication Critical patent/CN111354077A/en
Application granted granted Critical
Publication of CN111354077B publication Critical patent/CN111354077B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/90: Dynamic range modification of images or parts thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/13: Edge detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10004: Still image; Photographic image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20092: Interactive image processing based on input by user
    • G06T2207/20101: Interactive definition of point of interest, landmark or seed

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a binocular vision-based three-dimensional face reconstruction method, which comprises the following steps: a left camera and a right camera simultaneously capture a face image each from different viewing angles; epipolar rectification is performed on the face images using the intrinsic and extrinsic camera parameters, and the brightness of the rectified images is equalized; the face is divided into six regions according to the contour feature points on the image, and the disparity range of each region is computed; Canny feature points are extracted from the images and matched within the disparity-range limits of the respective regions, with a left-right consistency check; the quality and quantity of the seed points are optimized by fitting a general quadric surface according to the local shape characteristics of the human face; a dense face disparity map is obtained through region growing and disparity optimization; a three-dimensional face point cloud is then computed from the camera parameters and the dense disparity map, and a three-dimensional face model is obtained through point-cloud meshing and texture mapping. The method is simple and efficient, places few constraints on the scene, requires only simple equipment, and is highly robust.

Description

Binocular vision-based three-dimensional face reconstruction method
Technical Field
The invention relates to a method for reconstructing a three-dimensional face model by using face images shot by a binocular camera, belonging to the technical field of computer vision.
Background
At present, the common methods for obtaining a three-dimensional face model are laser three-dimensional scanning and structured-light three-dimensional scanning. Although the face models reconstructed by these methods have high precision, the hardware is expensive and the acquisition setup is complex, so they are difficult to deploy widely in ordinary environments. A stereo matching algorithm based on binocular vision can also obtain three-dimensional face point cloud data and build a fairly accurate three-dimensional face model, without projecting a grating pattern onto the measured object.
The principle of stereo matching is to find pairs of mutually matching pixels in the left and right images, obtain the disparity value of each pixel, and then convert the disparity to depth in the world coordinate system by triangulation, thereby reconstructing a three-dimensional model. Stereo matching algorithms fall into two main categories: global and local. Global methods build a global energy function and solve for the optimal assignment using algorithms such as dynamic programming, image segmentation and belief propagation; every pixel is matched under global constraints on disparity and pixel information, so matching precision is high, but the computational cost is also high and the methods are hard to apply in practical scenes. Local methods compute matching costs between pixels with cost functions such as the sum of absolute differences, the sum of squared differences and normalized cross-correlation, aggregate the costs over a local window around each pixel according to some aggregation strategy, and finally select the best match point by point under the Winner-Takes-All criterion. Because only constraints within a local neighborhood are used, local methods have a higher mismatch rate, especially in low-texture image areas, but they are computationally simple and fast. Unlike ordinary scene images, face images contain large low-texture areas in which most pixels have many highly similar candidate matches; since traditional stereo matching searches the whole epipolar line of the other image for every pixel to be matched, both global and local algorithms mismatch easily on faces.
Among binocular-stereo-matching-based three-dimensional face reconstruction methods, region growing algorithms are widely used and give good practical results. In addition, because face images have little texture, matching accuracy can be improved by narrowing the disparity search range of each pixel; a common approach is to extract the contour feature points of the face with a cascaded regression tree algorithm and compute the disparity of each contour feature point to determine the disparity range of the whole face, as in "Face stereo matching and disparity calculation in stereo vision system" (2010 2nd International Conference on Industrial and Information Systems, Dalian: IEEE Press, 2010). The region-growing-based method of "Three-dimensional face reconstruction method based on binocular stereo vision" (CAAI Transactions on Intelligent Systems, 2009, 4(6): 513-520) does not limit the disparity search range and matches pixels over the whole epipolar line to obtain seed points, so the seed points are neither accurate nor numerous enough. When many seed points are mismatched, the errors propagate to the surrounding pixels to be matched during region growing; when the seed points are too few, accumulated mismatches build up during growing, the stereo matching of the whole face is hard to guide, and it is difficult to generate a face disparity map with a high signal-to-noise ratio. Traditional region-growing face stereo matching therefore suffers from too few seed points and low seed-point accuracy, which degrades the final disparity growing result and makes it difficult to reconstruct an accurate three-dimensional face model.
Disclosure of Invention
The technical problem is as follows: in view of the problems in the prior art, the invention aims to provide a binocular vision-based three-dimensional face reconstruction method that divides the face into different regions, limits the disparity range of each region separately, and optimizes the quality and quantity of the seed points by fitting a general quadric surface.
The technical scheme is as follows: a binocular vision-based three-dimensional face reconstruction method comprising the following steps:
Step 1: use a left camera and a right camera to capture one face image each from different viewing angles at the same time, and perform binocular calibration on the two cameras to obtain the intrinsic and extrinsic camera parameters; perform epipolar rectification on the two captured face images using these parameters, equalize the brightness of the two rectified images, and denote the brightness-equalized image of the left camera by I_L and that of the right camera by I_R;
Step 2: extract the 68 contour feature points on I_L and I_R respectively using a cascaded regression tree algorithm; according to the extracted contour feature points, divide each of I_L and I_R into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region, mark the remaining face area outside these five regions as the non-facial-feature region, and compute the disparity range of each of the six regions;
Step 3: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face; match the Canny feature points of I_L and I_R within the limits of the per-region disparity ranges obtained in step 2, screen out mismatched Canny feature points with a left-right consistency check, mark the Canny feature points that pass the check as initial seed points, and treat the remaining unmatched face pixels as non-seed points;
Step 4: according to the local shape characteristics of the human face, select a fitting window centered on each initial seed point obtained in step 3 and fit a general quadric surface to the disparity values of all initial seed points inside the window; compare the disparity value of each initial seed point in the window with its fitted disparity value, and screen out the seed point if the difference exceeds a threshold; estimate the disparities of the non-seed points of step 3 from the fitted quadric surfaces and promote a certain proportion of the non-seed points to newly added seed points; perform a left-right consistency check on the remaining initial seed points and the newly added seed points, and the seed points that pass the check become the seed points for region growing;
Step 5: use a region growing algorithm to spread outward from the seed points obtained in step 4 and match the remaining unmatched face pixels, obtaining a disparity value for every face pixel; optimize the obtained disparities with a disparity optimization algorithm to obtain a dense face disparity map;
Step 6: compute the three-dimensional point cloud of the face pixels from the intrinsic and extrinsic camera parameters and the disparity value of each face pixel in the disparity map, and obtain the three-dimensional face model through point-cloud meshing and texture mapping.
The step 1 specifically comprises the following steps:
Step 1.1: use the left camera and the right camera to capture one face image each from different viewing angles;
Step 1.2: perform binocular calibration of the left and right cameras and compute their intrinsic matrices K_L, K_R, extrinsic rotation matrices R_L, R_R and translation vectors t_L, t_R;
Step 1.3: perform epipolar rectification on the two face images captured in step 1.1 using the intrinsic matrices and extrinsic rotation matrices of the two cameras;
Step 1.4: equalize the brightness of the two rectified face images.
The step 2 is specifically as follows:
Step 2.1: extract on I_L and I_R respectively, with the cascaded regression tree algorithm, the 68 contour feature points around the eyebrows, eyes, nose, upper and lower lips, and the outer contour of the face;
Step 2.2: compute the disparity values of the 68 contour feature points. The disparity of the m-th contour feature point f_m on I_L is
d(f_m) = x_m - x_m',
where m ∈ {1, …, 68}, (x_m, y_m) are the pixel coordinates of f_m and (x_m', y_m') are the pixel coordinates of the m-th contour feature point f_m' on I_R;
Step 2.3: divide the face into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region using the 68 contour feature points; the part of the face outside these five regions is the non-facial-feature region; all regions are closed, and the boundary of each region is a continuous, closed contour line of single-pixel width;
Step 2.4: determine the disparity ranges of the six regions of step 2.3. The disparity range D_face of the non-facial-feature region is:
D_face = [min(d(f_m)), max(d(f_m))], m ∈ {1, …, 68},
the disparity range D_leye of the left-eye region is:
D_leye = [min(d(f_m)), max(d(f_m))], m ∈ {37, …, 42},
the disparity range D_reye of the right-eye region is:
D_reye = [min(d(f_m)), max(d(f_m))], m ∈ {43, …, 48},
the disparity range D_nose of the nose region is:
D_nose = [min(d(f_m)), max(d(f_m))], m ∈ {28, …, 36},
the disparity range D_umouth of the upper-lip region is:
D_umouth = [min(d(f_m)), max(d(f_m))], m ∈ {49, …, 55, 62, 63, 64},
and the disparity range D_dmouth of the lower-lip region is:
D_dmouth = [min(d(f_m)), max(d(f_m))], m ∈ {56, …, 61, 65, …, 68}.
The step 3 specifically comprises:
Step 3.1: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face;
Step 3.2: when matching the Canny feature points of I_L against those of I_R, measure the matching cost with the NCC cost function:

$$C(x,y,d)=\frac{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)}{\sqrt{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)^{2}\,\sum_{(i,j)\in N(x,y)}\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)^{2}}},$$

where C(x, y, d) is the matching cost of the Canny feature point at pixel (x, y) on I_L, d is the disparity of (x, y), N(x, y) is a neighborhood window centered on (x, y), I_L(i, j) is the gray value of pixel (i, j) inside N(x, y) on I_L, \bar{I}_L(x, y) is the mean gray value of all pixels inside N(x, y) on I_L, I_R(i-d, j) is the gray value of pixel (i-d, j) on I_R, and \bar{I}_R(x-d, y) is the mean gray value of all pixels in the neighborhood window centered on (x-d, y) on I_R;
Step 3.3: for each Canny feature point c(x, y) on I_L, compute its NCC matching cost against all Canny feature points of I_R that lie within the disparity range of the face region containing c(x, y), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.4: for each Canny feature point c'(x', y') on I_R, compute its NCC matching cost against all Canny feature points of I_L that lie within the disparity range of the face region containing c'(x', y'), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.5: following the Winner-Takes-All criterion, select for each Canny feature point on I_L and I_R the pixel whose matching cost is closest to 1 as its matching point, which completes the matching of that point;
Step 3.6: after all Canny feature points have been matched, perform a left-right consistency check on the matching results; the Canny feature points that pass the check become the initial seed points.
The step 4 is specifically as follows:
Step 4.1: according to the local shape characteristics of the human face, select a fitting window W_n centered on the n-th initial seed point s_n obtained in step 3.6;
Step 4.2: collect all initial seed points inside W_n into a set G_n, and fit a general quadric surface with the coordinates and disparity values of all initial seed points in G_n as the data set;
Step 4.3: screen out the initial seed points in G_n whose disparity value differs from the fitted disparity value by more than a threshold T;
Step 4.4: after all fitting windows have been fitted, compute a fitted disparity value for each non-seed point from the fitted quadric surfaces;
Step 4.5: compute the NCC matching cost of each non-seed point of step 4.4 at all of its fitted disparity values, select the optimal matching point following the Winner-Takes-All criterion, and record the corresponding matching cost as its optimal matching cost;
Step 4.6: sort the optimal matching costs of all non-seed points of step 4.5 in descending order, and select the top 40% of the non-seed points as newly added seed points;
Step 4.7: perform a left-right consistency check on the matching results of the remaining screened initial seed points and of the newly added seed points; the seed points that pass the check become the region-growing seed points.
The step 5 is specifically as follows:
Step 5.1: for each face pixel p_k to be matched on I_L, determine the final disparity search range of p_k according to the region growing principle and the actual disparity range of each region from step 2;
Step 5.2: search for the optimal matching point of p_k within its disparity search range;
Step 5.3: select a neighborhood window centered on p_k, fit a general quadric surface to the disparity values of the already-matched pixels in the window, and let d_fit be the fitted disparity of p_k; compute the matching cost of the point only at d_fit and d_fit ± 1, and select the optimal matching point following the Winner-Takes-All criterion;
Step 5.4: optimize the disparity values with a disparity optimization algorithm to obtain the final dense face disparity map.
The step 6 is specifically as follows:
Step 6.1: from the camera focal length f, the baseline distance b between the cameras, the disparity value d of a pixel and the pixel coordinates (u_1, v_1), compute with the stereo triangulation formula the three-dimensional coordinates (X, Y, Z) of every pixel of the face disparity map of step 5 that has a disparity value:

$$X=\frac{b\,(u_1-u_0)}{d},\qquad Y=\frac{b\,(v_1-v_0)}{d},\qquad Z=\frac{b\,f}{d},$$

where (u_0, v_0) are the coordinates of the image center;
Step 6.2: based on texture mapping and point-cloud meshing, use I_L to texture the reconstructed three-dimensional face point cloud and obtain the final three-dimensional face model.
Advantageous effects
Compared with the prior art, the invention has the following notable advantages: it overcomes the difficulty traditional stereo matching methods have in reconstructing a three-dimensional face model of high accuracy; the division of the face into regions based on contour feature points limits the disparity range of each region simply and efficiently and improves matching accuracy; and a new method is introduced for optimizing the quality and quantity of the seed points based on surface fitting.
Drawings
FIG. 1 is a complete flow chart of the present invention;
FIG. 2 is a schematic view of an image acquisition system;
FIG. 3 shows distribution positions and sequence numbers of 68 contour feature points extracted from a cascade regression tree;
FIG. 4 is a schematic view of a facial region partition;
FIG. 5 is a schematic diagram of searching matching points by the stereo matching algorithm, wherein (a) is a left image of a face image, and (b) is a right image of the face image;
FIG. 6 is a schematic diagram of the stereo matching left and right consistency check;
FIG. 7 is a schematic view of a surface fitting window;
FIG. 8 is a graph of the results of a fit to the initial seed points within the fitting window of FIG. 7;
FIG. 9 is a schematic diagram of overlapping fitting windows corresponding to different seed points;
FIG. 10 is a schematic diagram of the parallax search range limited by a single seed point during region growing, wherein (a) corresponds to the left face image and (b) to the right face image;
FIG. 11 is a schematic diagram of the parallax search range limited by multiple seed points during region growing, wherein (a) corresponds to the left face image and (b) to the right face image;
FIG. 12 is a schematic diagram of the three-dimensional face reconstruction experiment results, wherein (a) is the left face image, (b) to (d) are the face disparity maps obtained by the ST, GF and NL algorithms respectively, (e) is the face disparity map obtained by the algorithm of the present invention, and (f) is the three-dimensional face model reconstructed by the algorithm of the present invention.
Detailed Description
The technical solution of the present invention will be further described in detail with reference to the following examples and accompanying drawings.
1. The whole experimental process is as follows:
a binocular vision-based three-dimensional face reconstruction method is shown in figure 1 and comprises the following specific steps:
Step 1: use a left camera and a right camera to capture one face image each from different viewing angles at the same time, and perform binocular calibration on the two cameras to obtain the intrinsic and extrinsic camera parameters; perform epipolar rectification on the two captured face images using these parameters, equalize the brightness of the two rectified images, and denote the brightness-equalized image of the left camera by I_L and that of the right camera by I_R;
Step 2: extract the 68 contour feature points on I_L and I_R respectively using the cascaded regression tree algorithm; according to the extracted contour feature points, divide each of I_L and I_R into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region, mark the remaining face area outside these five regions as the non-facial-feature region, and compute the disparity range of each of the six regions;
Step 3: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face; match the Canny feature points of I_L and I_R within the limits of the per-region disparity ranges obtained in step 2, screen out mismatched Canny feature points with a left-right consistency check, mark the Canny feature points that pass the check as initial seed points, and treat the remaining unmatched face pixels as non-seed points;
Step 4: according to the local shape characteristics of the human face, select a fitting window centered on each initial seed point obtained in step 3 and fit a general quadric surface to the disparity values of all initial seed points inside the window; compare the disparity value of each initial seed point in the window with its fitted disparity value, and screen out the seed point if the difference exceeds a threshold; estimate the disparities of the non-seed points of step 3 from the fitted quadric surfaces and promote a certain proportion of the non-seed points to newly added seed points; perform a left-right consistency check on the remaining initial seed points and the newly added seed points, and the seed points that pass the check become the seed points for region growing;
Step 5: use a region growing algorithm to spread outward from the seed points obtained in step 4 and match the remaining unmatched face pixels, obtaining a disparity value for every face pixel; optimize the obtained disparities with a disparity optimization algorithm to obtain a dense face disparity map;
Step 6: compute the three-dimensional point cloud of the face pixels from the intrinsic and extrinsic camera parameters and the disparity value of each face pixel in the disparity map, and obtain the three-dimensional face model through point-cloud meshing and texture mapping.
2. Acquiring the face image pair:
The method uses a left camera and a right camera of the same model to capture one face image each from different viewing angles. The optical centers of the two cameras are O_L and O_R, the straight-line distance between them is the baseline b, and the imaging planes of the two cameras are S_L and S_R. A point p on the face, in world coordinates, projects onto S_L and S_R at p_L and p_R respectively; p_L and p_R are then a pair of mutually matching pixels, as shown in fig. 2. Binocular calibration of the two cameras yields the intrinsic matrices K_L, K_R, the extrinsic rotation matrices R_L, R_R and the translation vectors t_L, t_R of the left and right cameras. The intrinsic matrix has the form

$$K=\begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix},$$

where f is the focal length of the camera, dx and dy are the pixel sizes, and u_0 and v_0 are the coordinates of the image center. Epipolar rectification is then applied to the two captured face images using the calibrated intrinsic and extrinsic parameters. Finally, brightness equalization is performed on the two rectified face images according to the method of "Midway image equalization" [J]. Journal of Mathematical Imaging and Vision, 2004, 21(2): 119-134; the brightness-equalized image of the left camera is denoted I_L and that of the right camera I_R.
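As a concrete illustration of this acquisition and rectification step, the following Python sketch uses OpenCV's stereo rectification API. The calibration values (K_L, K_R, distortion vectors, R, t) and the placeholder images are hypothetical stand-ins for the results of an actual binocular calibration and capture:

```python
import cv2
import numpy as np

# Hypothetical calibration results; a real pipeline would obtain these
# from cv2.stereoCalibrate on checkerboard captures of both cameras.
K_L = K_R = np.array([[1200., 0., 640.],
                      [0., 1200., 480.],
                      [0., 0., 1.]])
d_L = d_R = np.zeros(5)                 # assume negligible lens distortion
R = np.eye(3)                           # rotation between the two cameras
t = np.array([[-60.], [0.], [0.]])      # 60 mm baseline along x (assumed)

# Placeholder arrays standing in for the captured left/right face photos
img_l = np.zeros((960, 1280, 3), np.uint8)
img_r = np.zeros((960, 1280, 3), np.uint8)
h, w = img_l.shape[:2]

# Epipolar rectification: after remapping, matching points share the same row
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K_L, d_L, K_R, d_R, (w, h), R, t)
mlx, mly = cv2.initUndistortRectifyMap(K_L, d_L, R1, P1, (w, h), cv2.CV_32FC1)
mrx, mry = cv2.initUndistortRectifyMap(K_R, d_R, R2, P2, (w, h), cv2.CV_32FC1)
I_L = cv2.remap(img_l, mlx, mly, cv2.INTER_LINEAR)
I_R = cv2.remap(img_r, mrx, mry, cv2.INTER_LINEAR)
# Brightness equalization (the midway method cited above) would follow here.
```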
3. Dividing the face into regions and computing the disparity ranges of the different regions:
The 68 contour feature points are extracted on I_L and I_R respectively with the cascaded regression tree algorithm, as shown in fig. 3. The disparity of the m-th contour feature point f_m on I_L is
d(f_m) = x_m - x_m',
where m ∈ {1, …, 68}, (x_m, y_m) are the pixel coordinates of f_m and (x_m', y_m') are the pixel coordinates of the m-th contour feature point f_m' on I_R. The face is divided into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region using the 68 contour feature points; the part of the face outside these five regions is the non-facial-feature region, giving six face regions in total, as shown in fig. 4. All regions are closed, and the boundary of each region is a continuous, closed contour line of single-pixel width.
The disparity ranges of the six face regions are computed separately. The disparity range D_face of the non-facial-feature region is:
D_face = [min(d(f_m)), max(d(f_m))], m ∈ {1, …, 68},
the disparity range D_leye of the left-eye region is:
D_leye = [min(d(f_m)), max(d(f_m))], m ∈ {37, …, 42},
the disparity range D_reye of the right-eye region is:
D_reye = [min(d(f_m)), max(d(f_m))], m ∈ {43, …, 48},
the disparity range D_nose of the nose region is:
D_nose = [min(d(f_m)), max(d(f_m))], m ∈ {28, …, 36},
the disparity range D_umouth of the upper-lip region is:
D_umouth = [min(d(f_m)), max(d(f_m))], m ∈ {49, …, 55, 62, 63, 64},
and the disparity range D_dmouth of the lower-lip region is:
D_dmouth = [min(d(f_m)), max(d(f_m))], m ∈ {56, …, 61, 65, …, 68}.
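The per-region ranges follow directly from the landmark disparities. The sketch below assumes the standard 68-point landmark ordering (e.g. as produced by a dlib-style cascade-regression shape predictor, an assumption here); the landmark arrays are synthetic placeholders:

```python
import numpy as np

# (68, 2) landmark pixel coordinates on I_L and I_R; placeholders here,
# in practice produced by the cascaded-regression-tree shape predictor.
lm_left = np.random.rand(68, 2) * [1280, 960]
lm_right = lm_left - [15, 0]        # rectified pair: same row, shifted column

d = lm_left[:, 0] - lm_right[:, 0]  # d(f_m) = x_m - x_m'

# The patent's 1-based index sets, converted to 0-based NumPy indices
regions = {
    "face":   np.arange(0, 68),     # m in {1, ..., 68}
    "leye":   np.arange(36, 42),    # m in {37, ..., 42}
    "reye":   np.arange(42, 48),    # m in {43, ..., 48}
    "nose":   np.arange(27, 36),    # m in {28, ..., 36}
    "umouth": np.r_[48:55, 61:64],  # m in {49..55, 62, 63, 64}
    "dmouth": np.r_[55:61, 64:68],  # m in {56..61, 65..68}
}
disparity_ranges = {name: (int(np.floor(d[idx].min())),
                           int(np.ceil(d[idx].max())))
                    for name, idx in regions.items()}
```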
4. Obtaining the initial seed point pairs:
The Canny feature points are extracted on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the face.
For each Canny feature point c(x, y) on I_L, the NCC matching cost C(x, y, d) is computed against all Canny feature points of I_R that lie within the disparity range of the face region containing c(x, y), and against the two pixels adjacent to the left and right of each such Canny feature point:

$$C(x,y,d)=\frac{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)}{\sqrt{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)^{2}\,\sum_{(i,j)\in N(x,y)}\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)^{2}}},$$

where C(x, y, d) is the matching cost of the Canny feature point at pixel (x, y) on I_L, d is the disparity of (x, y), N(x, y) is a neighborhood window centered on (x, y), I_L(i, j) is the gray value of pixel (i, j) inside N(x, y) on I_L, \bar{I}_L(x, y) is the mean gray value of all pixels inside N(x, y) on I_L, I_R(i-d, j) is the gray value of pixel (i-d, j) on I_R, and \bar{I}_R(x-d, y) is the mean gray value of all pixels in the neighborhood window centered on (x-d, y) on I_R.
For each Canny feature point c'(x', y') on I_R, the NCC matching cost C(x', y', d) is likewise computed against all Canny feature points of I_L that lie within the disparity range of the face region containing c'(x', y'), and against the two pixels adjacent to the left and right of each such Canny feature point.
The matching point of each Canny feature point on I_L and I_R is determined by the Winner-Takes-All criterion. As shown in (a) and (b) of fig. 5, if c(x, y) and c'(x', y) match each other, the disparity of the point c(x, y) is d = x - x'; D in the figure is the disparity range of the face region containing c(x, y), and c''(x, y) marks the position of the left-image point c(x, y) in the right image, the two points having the same coordinates.
A left-right consistency check is applied to the matching results of the Canny feature points, and the points that pass become the initial seed points. Fig. 6 illustrates the principle of the check: the two solid lines are a pair of corresponding epipolar lines on I_L and I_R. When matching the point p(x, y) of I_L, the candidate matching points lie on the segment AB, where AB is the disparity range of the face region containing p(x, y); let the optimal match of p(x, y) be p'(x', y). Then p'(x', y) is matched in turn, its candidates lying on the segment EF, where EF is the disparity range of the face region containing p'(x', y). If the optimal match of p'(x', y) is again p(x, y), the match of p(x, y) passes the left-right consistency check.
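A minimal NumPy sketch of this matching scheme follows. For brevity it scores every disparity in the region's range rather than only Canny points and their left/right neighbors, ignores window-boundary handling, and assumes grayscale rectified images; all function and variable names are illustrative:

```python
import numpy as np

def ncc(I_L, I_R, x, y, d, r=5):
    """NCC cost between the window around (x, y) on I_L and (x - d, y) on I_R."""
    wl = I_L[y-r:y+r+1, x-r:x+r+1].astype(np.float64)
    wr = I_R[y-r:y+r+1, x-d-r:x-d+r+1].astype(np.float64)
    a, b = wl - wl.mean(), wr - wr.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else -1.0

def match_left(I_L, I_R, x, y, d_range):
    """Winner-Takes-All: the disparity whose NCC cost is closest to 1 wins."""
    lo, hi = d_range
    return max((ncc(I_L, I_R, x, y, d), d) for d in range(lo, hi + 1))[1]

def match_right(I_L, I_R, xr, y, d_range):
    """Best disparity for a right-image point (xr, y), searching I_L at xr + d."""
    lo, hi = d_range
    return max((ncc(I_L, I_R, xr + d, y, d), d) for d in range(lo, hi + 1))[1]

def lr_consistent(I_L, I_R, x, y, d_range):
    """Left-right consistency: forward and backward matches must agree."""
    d = match_left(I_L, I_R, x, y, d_range)
    return d == match_right(I_L, I_R, x - d, y, d_range), d
```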
5. Screening out mismatched seed points:
According to the local shape characteristics of the human face, a fitting window W_n is selected centered on the n-th initial seed point s_n; all initial seed points inside W_n are collected into a set G_n, and a general quadric surface is fitted with the coordinates and disparity values of all initial seed points in G_n as the data set; the initial seed points in G_n whose disparity differs from the fitted disparity by more than a threshold T are screened out. Fig. 7 shows a fitting window centered on an initial seed point of the left face: the fitting window is the white square frame, and the non-black pixels inside the frame are initial seed points. Fig. 8 shows the fitting result for the initial seed points in the window of fig. 7; the gray points are the screened-out, mismatched initial seed points.
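The patent does not spell out the parameterization of the "general quadric surface"; the sketch below assumes the common choice of a full quadratic polynomial in the image coordinates, d = a·x² + b·y² + c·xy + e·x + g·y + h, fitted by linear least squares, and a hypothetical threshold T:

```python
import numpy as np

def fit_quadric(seeds):
    """Least-squares fit of d = a*x^2 + b*y^2 + c*x*y + e*x + g*y + h
    to seed points given as an (n, 3) array of (x, y, d) rows."""
    x, y, d = seeds[:, 0], seeds[:, 1], seeds[:, 2]
    A = np.column_stack([x*x, y*y, x*y, x, y, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, d, rcond=None)
    return coef

def eval_quadric(coef, x, y):
    return (coef[0]*x*x + coef[1]*y*y + coef[2]*x*y
            + coef[3]*x + coef[4]*y + coef[5])

def screen_seeds(seeds, T=1.0):  # T is an assumed, not published, value
    """Drop seeds whose disparity deviates from the fitted surface by more than T."""
    coef = fit_quadric(seeds)
    resid = np.abs(seeds[:, 2] - eval_quadric(coef, seeds[:, 0], seeds[:, 1]))
    return seeds[resid <= T], coef
```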
6. Adding new seed points:
When the mismatched initial seed points are screened out, a non-seed point lying inside a single fitting window obtains one fitted disparity value from the fitted quadric surface; because fitting windows centered on different initial seed points may overlap, a non-seed point lying inside several fitting windows obtains several fitted disparity values. As shown in fig. 9, let a non-seed point q_j lie inside both fitting window W_1 and fitting window W_2, and let the fitted disparities it obtains in W_1 and W_2 be d_1j and d_2j respectively. The NCC matching cost of each non-seed point is computed at all of its fitted disparity values, the optimal matching point is selected following the Winner-Takes-All criterion, the optimal matching costs of all non-seed points are sorted, and a certain proportion of the non-seed points are selected to become newly added seed points.
7. Generating the dense face disparity map:
The principle of region growing in stereo matching is to add range and smoothness constraints to the matching of pixels, using the disparities of the seed points to limit the disparity search range of the remaining pixels to be matched. Let s_L be a seed point of I_L and s_R its matching point on I_R; the matching point p_R of a point p_L to be matched adjacent to s_L must then lie in a neighborhood of s_R, so the known disparity d_L of s_L narrows the disparity search range of p_L. The region growing algorithm is implemented in two forms, growth constrained by a single seed point and growth constrained by multiple seed points. Single-seed-point constrained growth means that the four-neighborhood of p_L contains only one seed point s_L, so the disparity search range is limited by s_L alone, as shown in (a) and (b) of fig. 10. Multiple-seed-point constrained growth means that the four-neighborhood of p_L contains several seed points s_L1, s_L2, etc.; the disparity search window of p_L is then the intersection of the ranges imposed by s_L1 and s_L2, as shown in (a) and (b) of fig. 11. The intersection of the disparity search window of p_L with the epipolar line is the final disparity search range of p_L; the matching cost between p_L and every pixel in this range is computed, the optimal match is selected by the Winner-Takes-All criterion, and it becomes a new seed point appended to the seed point queue, until the whole face has been matched. For each point p_k to be matched, the optimal matching point is searched within its disparity search range. If accumulated matching errors occur during matching, a neighborhood window centered on the current point p_k is selected, a general quadric surface is fitted to the disparities of the already-matched pixels in the window, and with d_fit denoting the fitted disparity of p_k, the matching cost is computed only at d_fit and d_fit ± 1 and the optimal matching point is selected by the Winner-Takes-All criterion. Finally, the disparity values are optimized according to the disparity refinement method of "High-quality single-shot capture of facial geometry" [J]. ACM Transactions on Graphics, 2010, 29(4): Article 40, to obtain the final dense face disparity map.
8. Obtaining the three-dimensional face model:
From the camera focal length f, the baseline distance b between the cameras, the disparity value d of a pixel and the pixel coordinates (u_1, v_1), the three-dimensional coordinates of every pixel of the face disparity map that has a disparity value are computed with the stereo triangulation formula:

$$X=\frac{b\,(u_1-u_0)}{d},\qquad Y=\frac{b\,(v_1-v_0)}{d},\qquad Z=\frac{b\,f}{d},$$

where (u_0, v_0) are the coordinates of the image center.
based on texture mapping and point cloud meshing, utilizing I L And reconstructing textures of the three-dimensional face point cloud to obtain a final three-dimensional face model.
Experimental comparison: to verify the feasibility and effectiveness of the invention, its experimental results are compared with the existing ST, GF and NL stereo matching algorithms, as shown in (a) to (f) of fig. 12, where (a) is the left face image, (b) to (d) are the face disparity maps obtained by the ST, GF and NL algorithms respectively, (e) is the face disparity map obtained by the algorithm of the present invention, and (f) is the three-dimensional face model reconstructed by the algorithm of the present invention.
The invention provides a face region division algorithm based on the contour feature points extracted by a cascaded regression tree, and improves the matching accuracy of both seed point acquisition and disparity region growing by limiting the disparity range within each face region. According to the local shape characteristics of the human face, the quality and quantity of the seed points are optimized by fitting a general quadric surface, which adds local connectivity and smoothness constraints among the seed points and supplies sufficient, reliable seed points for region growing. A more accurate three-dimensional face model can thus be obtained, with a smoother reconstruction, higher precision and better robustness.
The above description is only an embodiment of the present invention, and the scope of the present invention is not limited thereto. Any modification or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein shall fall within the scope of the present invention; the protection scope of the present invention shall therefore be subject to the protection scope of the claims.

Claims (7)

1. A binocular vision-based three-dimensional face reconstruction method is characterized by comprising the following steps:
Step 1: use a left camera and a right camera to capture one face image each from different viewing angles at the same time, and perform binocular calibration on the two cameras to obtain the intrinsic and extrinsic camera parameters; perform epipolar rectification on the two captured face images using these parameters, equalize the brightness of the two rectified images, and denote the brightness-equalized image of the left camera by I_L and that of the right camera by I_R;
Step 2: extract the 68 contour feature points on I_L and I_R respectively using a cascaded regression tree algorithm; according to the extracted contour feature points, divide each of I_L and I_R into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region, mark the remaining face area outside these five regions as the non-facial-feature region, and compute the disparity range of each of the six regions;
Step 3: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face; match the Canny feature points of I_L and I_R within the limits of the per-region disparity ranges obtained in step 2, screen out mismatched Canny feature points with a left-right consistency check, mark the Canny feature points that pass the check as initial seed points, and treat the remaining unmatched face pixels as non-seed points;
Step 4: according to the local shape characteristics of the human face, select a fitting window centered on each initial seed point obtained in step 3 and fit a general quadric surface to the disparity values of all initial seed points inside the window; compare the disparity value of each initial seed point in the window with its fitted disparity value, and screen out the seed point if the difference exceeds a threshold; estimate the disparities of the non-seed points of step 3 from the fitted quadric surfaces and promote a certain proportion of the non-seed points to newly added seed points; perform a left-right consistency check on the remaining initial seed points and the newly added seed points, and the seed points that pass the check become the seed points for region growing;
Step 5: use a region growing algorithm to spread outward from the seed points obtained in step 4 and match the remaining unmatched face pixels, obtaining a disparity value for every face pixel; optimize the obtained disparities with a disparity optimization algorithm to obtain a dense face disparity map;
Step 6: compute the three-dimensional point cloud of the face pixels from the intrinsic and extrinsic camera parameters and the disparity value of each face pixel in the disparity map, and obtain the three-dimensional face model through point-cloud meshing and texture mapping.
2. The binocular vision-based three-dimensional face reconstruction method according to claim 1, wherein the step 1 specifically comprises:
Step 1.1: use the left camera and the right camera to capture one face image each from different viewing angles;
Step 1.2: perform binocular calibration of the left and right cameras and compute their intrinsic matrices K_L, K_R, extrinsic rotation matrices R_L, R_R and translation vectors t_L, t_R;
Step 1.3: perform epipolar rectification on the two face images captured in step 1.1 using the intrinsic matrices and extrinsic rotation matrices of the two cameras;
Step 1.4: equalize the brightness of the two rectified face images.
3. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 2 specifically comprises:
Step 2.1: extract on I_L and I_R respectively, with the cascaded regression tree algorithm, the 68 contour feature points around the eyebrows, eyes, nose, upper and lower lips, and the outer contour of the face;
Step 2.2: compute the disparity values of the 68 contour feature points. The disparity of the m-th contour feature point f_m on I_L is
d(f_m) = x_m - x_m',
where m ∈ {1, …, 68}, (x_m, y_m) are the pixel coordinates of f_m and (x_m', y_m') are the pixel coordinates of the m-th contour feature point f_m' on I_R;
Step 2.3: divide the face into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region using the 68 contour feature points; the part of the face outside these five regions is the non-facial-feature region; all regions are closed, and the boundary of each region is a continuous, closed contour line of single-pixel width;
Step 2.4: determine the disparity ranges of the six regions of step 2.3. The disparity range D_face of the non-facial-feature region is:
D_face = [min(d(f_m)), max(d(f_m))], m ∈ {1, …, 68},
the disparity range D_leye of the left-eye region is:
D_leye = [min(d(f_m)), max(d(f_m))], m ∈ {37, …, 42},
the disparity range D_reye of the right-eye region is:
D_reye = [min(d(f_m)), max(d(f_m))], m ∈ {43, …, 48},
the disparity range D_nose of the nose region is:
D_nose = [min(d(f_m)), max(d(f_m))], m ∈ {28, …, 36},
the disparity range D_umouth of the upper-lip region is:
D_umouth = [min(d(f_m)), max(d(f_m))], m ∈ {49, …, 55, 62, 63, 64},
and the disparity range D_dmouth of the lower-lip region is:
D_dmouth = [min(d(f_m)), max(d(f_m))], m ∈ {56, …, 61, 65, …, 68}.
4. the binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 3 specifically comprises:
Step 3.1: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face;
Step 3.2: when matching the Canny feature points of I_L against those of I_R, measure the matching cost with the NCC cost function:

$$C(x,y,d)=\frac{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)}{\sqrt{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)^{2}\,\sum_{(i,j)\in N(x,y)}\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)^{2}}},$$

where C(x, y, d) is the matching cost of the Canny feature point at pixel (x, y) on I_L, d is the disparity of (x, y), N(x, y) is a neighborhood window centered on (x, y), I_L(i, j) is the gray value of pixel (i, j) inside N(x, y) on I_L, \bar{I}_L(x, y) is the mean gray value of all pixels inside N(x, y) on I_L, I_R(i-d, j) is the gray value of pixel (i-d, j) on I_R, and \bar{I}_R(x-d, y) is the mean gray value of all pixels in the neighborhood window centered on (x-d, y) on I_R;
Step 3.3: for each Canny feature point c(x, y) on I_L, compute its NCC matching cost against all Canny feature points of I_R that lie within the disparity range of the face region containing c(x, y), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.4: for each Canny feature point c'(x', y') on I_R, compute its NCC matching cost against all Canny feature points of I_L that lie within the disparity range of the face region containing c'(x', y'), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.5: following the Winner-Takes-All criterion, select for each Canny feature point on I_L and I_R the pixel whose matching cost is closest to 1 as its matching point, which completes the matching of that point;
Step 3.6: after all Canny feature points have been matched, perform a left-right consistency check on the matching results; the Canny feature points that pass the check become the initial seed points.
5. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 4 specifically comprises:
Step 4.1: according to the local shape characteristics of the human face, select a fitting window W_n centered on the n-th initial seed point s_n obtained in step 3.6;
Step 4.2: collect all initial seed points inside W_n into a set G_n, and fit a general quadric surface with the coordinates and disparity values of all initial seed points in G_n as the data set;
Step 4.3: screen out the initial seed points in G_n whose disparity value differs from the fitted disparity value by more than a threshold T;
Step 4.4: after all fitting windows have been fitted, compute a fitted disparity value for each non-seed point from the fitted quadric surfaces;
Step 4.5: compute the NCC matching cost of each non-seed point of step 4.4 at all of its fitted disparity values, select the optimal matching point following the Winner-Takes-All criterion, and record the corresponding matching cost as its optimal matching cost;
Step 4.6: sort the optimal matching costs of all non-seed points of step 4.5 in descending order, and select the top 40% of the non-seed points as newly added seed points;
Step 4.7: perform a left-right consistency check on the matching results of the remaining screened initial seed points and of the newly added seed points; the seed points that pass the check become the region-growing seed points.
6. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 5 specifically comprises:
Step 5.1: for each face pixel p_k to be matched on I_L, determine the final disparity search range of p_k according to the region growing principle and the actual disparity range of each region from step 2;
Step 5.2: search for the optimal matching point of p_k within its disparity search range;
Step 5.3: select a neighborhood window centered on p_k, fit a general quadric surface to the disparity values of the already-matched pixels in the window, and let d_fit be the fitted disparity of p_k; compute the matching cost of the point only at d_fit and d_fit ± 1, and select the optimal matching point following the Winner-Takes-All criterion;
Step 5.4: optimize the disparity values with a disparity optimization algorithm to obtain the final dense face disparity map.
7. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 6 specifically comprises:
Step 6.1: from the camera focal length f, the baseline distance b between the cameras, the disparity value d of a pixel and the pixel coordinates (u_1, v_1), compute with the stereo triangulation formula the three-dimensional coordinates (X, Y, Z) of every pixel of the face disparity map of step 5 that has a disparity value:

$$X=\frac{b\,(u_1-u_0)}{d},\qquad Y=\frac{b\,(v_1-v_0)}{d},\qquad Z=\frac{b\,f}{d},$$

where (u_0, v_0) are the coordinates of the image center;
Step 6.2: based on texture mapping and point-cloud meshing, use I_L to texture the reconstructed three-dimensional face point cloud and obtain the final three-dimensional face model.
CN202010134543.8A 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method Active CN111354077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010134543.8A CN111354077B (en) 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010134543.8A CN111354077B (en) 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method

Publications (2)

Publication Number Publication Date
CN111354077A CN111354077A (en) 2020-06-30
CN111354077B true CN111354077B (en) 2022-11-18

Family

ID=71197198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010134543.8A Active CN111354077B (en) 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method

Country Status (1)

Country Link
CN (1) CN111354077B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633096B (en) * 2020-12-14 2024-08-23 深圳云天励飞技术股份有限公司 Passenger flow monitoring method and device, electronic equipment and storage medium
CN113965742B (en) * 2021-02-28 2022-04-19 北京中科慧眼科技有限公司 Dense disparity map extraction method and system based on multi-sensor fusion and intelligent terminal
CN113592791B (en) * 2021-07-16 2024-02-13 华中科技大学 Contour stereo matching method and system based on local energy minimization
CN113450460A (en) * 2021-07-22 2021-09-28 四川川大智胜软件股份有限公司 Phase-expansion-free three-dimensional face reconstruction method and system based on face shape space distribution
CN116309442B (en) * 2023-03-13 2023-10-24 北京百度网讯科技有限公司 Method for determining picking information and method for picking target object

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101866497A (en) * 2010-06-18 2010-10-20 北京交通大学 Binocular stereo vision based intelligent three-dimensional human face rebuilding method and system
CN106910222A (en) * 2017-02-15 2017-06-30 中国科学院半导体研究所 Face three-dimensional rebuilding method based on binocular stereo vision

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101866497A (en) * 2010-06-18 2010-10-20 北京交通大学 Binocular stereo vision based intelligent three-dimensional human face rebuilding method and system
CN106910222A (en) * 2017-02-15 2017-06-30 中国科学院半导体研究所 Face three-dimensional rebuilding method based on binocular stereo vision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Three-dimensional face reconstruction method based on binocular stereo vision; Jia Beibei et al.; CAAI Transactions on Intelligent Systems; 2009-12-15 (No. 06); full text *

Also Published As

Publication number Publication date
CN111354077A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
CN111354077B (en) Binocular vision-based three-dimensional face reconstruction method
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN108564041B (en) Face detection and restoration method based on RGBD camera
CN106780726A (en) The dynamic non-rigid three-dimensional digital method of fusion RGB D cameras and colored stereo photometry
WO2018000752A1 (en) Monocular image depth estimation method based on multi-scale cnn and continuous crf
CN104463899B (en) A kind of destination object detection, monitoring method and its device
CN102665086B (en) Method for obtaining parallax by using region-based local stereo matching
CN112733950A (en) Power equipment fault diagnosis method based on combination of image fusion and target detection
CN112884682B (en) Stereo image color correction method and system based on matching and fusion
Correal et al. Automatic expert system for 3D terrain reconstruction based on stereo vision and histogram matching
CN117036641A (en) Road scene three-dimensional reconstruction and defect detection method based on binocular vision
CN110232389A (en) A kind of stereoscopic vision air navigation aid based on green crop feature extraction invariance
CN104021548A (en) Method for acquiring scene 4D information
CN115272271A (en) Pipeline defect detecting and positioning ranging system based on binocular stereo vision
CN110246151B (en) Underwater robot target tracking method based on deep learning and monocular vision
CN108010075B (en) Local stereo matching method based on multi-feature combination
WO2018053952A1 (en) Video image depth extraction method based on scene sample library
CN112990063B (en) Banana maturity grading method based on shape and color information
KR20110014067A (en) Method and system for transformation of stereo content
CN108470178B (en) Depth map significance detection method combined with depth credibility evaluation factor
CN107800965A (en) Image processing method, device, computer-readable recording medium and computer equipment
CN109889799B (en) Monocular structure light depth perception method and device based on RGBIR camera
CN113538569A (en) Weak texture object pose estimation method and system
CN114996814A (en) Furniture design system based on deep learning and three-dimensional reconstruction
JP5561786B2 (en) Three-dimensional shape model high accuracy method and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant