CN111354077B - Binocular vision-based three-dimensional face reconstruction method - Google Patents

Binocular vision-based three-dimensional face reconstruction method

Info

Publication number
CN111354077B
Authority
CN
China
Prior art keywords
face
parallax
points
point
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010134543.8A
Other languages
Chinese (zh)
Other versions
CN111354077A (en)
Inventor
Da Feipeng
Xia Ying
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202010134543.8A priority Critical patent/CN111354077B/en
Publication of CN111354077A publication Critical patent/CN111354077A/en
Application granted granted Critical
Publication of CN111354077B publication Critical patent/CN111354077B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/90: Dynamic range modification of images or parts thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/13: Edge detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10004: Still image; Photographic image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20092: Interactive image processing based on input by user
    • G06T2207/20101: Interactive definition of point of interest, landmark or seed

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a binocular vision-based three-dimensional face reconstruction method, which comprises the following steps: a left camera and a right camera simultaneously capture a face image each from different viewing angles; epipolar rectification is performed on the face images using the intrinsic and extrinsic camera parameters, and the brightness of the rectified images is equalized; the face is divided into six regions according to the contour feature points on the image, and the disparity range of each region is computed; Canny feature points are extracted from the images and matched within the disparity-range limits of the respective regions, with a left-right consistency check; the quality and quantity of the seed points are optimized by fitting a general quadric surface according to the local shape characteristics of the human face; a dense face disparity map is obtained through region growing and disparity optimization; a three-dimensional face point cloud is then computed from the camera parameters and the dense disparity map, and a three-dimensional face model is obtained through point-cloud meshing and texture mapping. The method is simple and efficient, places few constraints on the scene, requires only simple equipment, and is highly robust.

Description

Binocular vision-based three-dimensional face reconstruction method
Technical Field
The invention relates to a method for reconstructing a three-dimensional face model by using face images shot by a binocular camera, belonging to the technical field of computer vision.
Background
At present, the common methods for obtaining a three-dimensional face model are laser three-dimensional scanning and structured-light three-dimensional scanning. Although the face models reconstructed by these methods have high precision, the hardware is expensive and the acquisition setup is complex, so they are difficult to deploy widely in ordinary environments. A stereo matching algorithm based on binocular vision can also obtain three-dimensional face point cloud data and build a fairly accurate three-dimensional face model, without projecting a grating pattern onto the measured object.
The principle of stereo matching is to find pairs of mutually matching pixels in the left and right images, obtain the disparity value of each pixel, and then convert the disparity to depth in the world coordinate system by triangulation, thereby reconstructing a three-dimensional model. Stereo matching algorithms fall into two main categories: global and local. Global methods build a global energy function and solve for the optimal assignment using algorithms such as dynamic programming, image segmentation and belief propagation; every pixel is matched under global constraints on disparity and pixel information, so matching precision is high, but the computational cost is also high and the methods are hard to apply in practical scenes. Local methods compute matching costs between pixels with cost functions such as the sum of absolute differences, the sum of squared differences and normalized cross-correlation, aggregate the costs over a local window around each pixel according to some aggregation strategy, and finally select the best match point by point under the Winner-Takes-All criterion. Because only constraints within a local neighborhood are used, local methods have a higher mismatch rate, especially in low-texture image areas, but they are computationally simple and fast. Unlike ordinary scene images, face images contain large low-texture areas in which most pixels have many highly similar candidate matches; since traditional stereo matching searches the whole epipolar line of the other image for every pixel to be matched, both global and local algorithms mismatch easily on faces.
Among binocular-stereo-matching-based three-dimensional face reconstruction methods, region growing algorithms are widely used and give good practical results. In addition, because face images have little texture, matching accuracy can be improved by narrowing the disparity search range of each pixel; a common approach is to extract the contour feature points of the face with a cascaded regression tree algorithm and compute the disparity of each contour feature point to determine the disparity range of the whole face, as in "Face stereo matching and disparity calculation in stereo vision system" (2010 2nd International Conference on Industrial and Information Systems, Dalian: IEEE Press, 2010). The region-growing-based method of "Three-dimensional face reconstruction method based on binocular stereo vision" (CAAI Transactions on Intelligent Systems, 2009, 4(6): 513-520) does not limit the disparity search range and matches pixels over the whole epipolar line to obtain seed points, so the seed points are neither accurate nor numerous enough. When many seed points are mismatched, the errors propagate to the surrounding pixels to be matched during region growing; when the seed points are too few, accumulated mismatches build up during growing, the stereo matching of the whole face is hard to guide, and it is difficult to generate a face disparity map with a high signal-to-noise ratio. Traditional region-growing face stereo matching therefore suffers from too few seed points and low seed-point accuracy, which degrades the final disparity growing result and makes it difficult to reconstruct an accurate three-dimensional face model.
Disclosure of Invention
The technical problem is as follows: in view of the problems in the prior art, the invention aims to provide a binocular vision-based three-dimensional face reconstruction method that divides the face into different regions, limits the disparity range of each region separately, and optimizes the quality and quantity of the seed points by fitting a general quadric surface.
The technical scheme is as follows: a binocular vision-based three-dimensional face reconstruction method comprising the following steps:
Step 1: use a left camera and a right camera to capture one face image each from different viewing angles at the same time, and perform binocular calibration on the two cameras to obtain the intrinsic and extrinsic camera parameters; perform epipolar rectification on the two captured face images using these parameters, equalize the brightness of the two rectified images, and denote the brightness-equalized image of the left camera by I_L and that of the right camera by I_R;
Step 2: extract the 68 contour feature points on I_L and I_R respectively using a cascaded regression tree algorithm; according to the extracted contour feature points, divide each of I_L and I_R into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region, mark the remaining face area outside these five regions as the non-facial-feature region, and compute the disparity range of each of the six regions;
Step 3: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face; match the Canny feature points of I_L and I_R within the limits of the per-region disparity ranges obtained in step 2, screen out mismatched Canny feature points with a left-right consistency check, mark the Canny feature points that pass the check as initial seed points, and treat the remaining unmatched face pixels as non-seed points;
Step 4: according to the local shape characteristics of the human face, select a fitting window centered on each initial seed point obtained in step 3 and fit a general quadric surface to the disparity values of all initial seed points inside the window; compare the disparity value of each initial seed point in the window with its fitted disparity value, and screen out the seed point if the difference exceeds a threshold; estimate the disparities of the non-seed points of step 3 from the fitted quadric surfaces and promote a certain proportion of the non-seed points to newly added seed points; perform a left-right consistency check on the remaining initial seed points and the newly added seed points, and the seed points that pass the check become the seed points for region growing;
Step 5: use a region growing algorithm to spread outward from the seed points obtained in step 4 and match the remaining unmatched face pixels, obtaining a disparity value for every face pixel; optimize the obtained disparities with a disparity optimization algorithm to obtain a dense face disparity map;
Step 6: compute the three-dimensional point cloud of the face pixels from the intrinsic and extrinsic camera parameters and the disparity value of each face pixel in the disparity map, and obtain the three-dimensional face model through point-cloud meshing and texture mapping.
The step 1 specifically comprises the following steps:
Step 1.1: use the left camera and the right camera to capture one face image each from different viewing angles;
Step 1.2: perform binocular calibration of the left and right cameras and compute their intrinsic matrices K_L, K_R, extrinsic rotation matrices R_L, R_R and translation vectors t_L, t_R;
Step 1.3: perform epipolar rectification on the two face images captured in step 1.1 using the intrinsic matrices and extrinsic rotation matrices of the two cameras;
Step 1.4: equalize the brightness of the two rectified face images.
The step 2 is specifically as follows:
Step 2.1: extract on I_L and I_R respectively, with the cascaded regression tree algorithm, the 68 contour feature points around the eyebrows, eyes, nose, upper and lower lips, and the outer contour of the face;
Step 2.2: compute the disparity values of the 68 contour feature points. The disparity of the m-th contour feature point f_m on I_L is
d(f_m) = x_m - x_m',
where m ∈ {1, …, 68}, (x_m, y_m) are the pixel coordinates of f_m and (x_m', y_m') are the pixel coordinates of the m-th contour feature point f_m' on I_R;
Step 2.3: divide the face into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region using the 68 contour feature points; the part of the face outside these five regions is the non-facial-feature region; all regions are closed, and the boundary of each region is a continuous, closed contour line of single-pixel width;
Step 2.4: determine the disparity ranges of the six regions of step 2.3. The disparity range D_face of the non-facial-feature region is:
D_face = [min(d(f_m)), max(d(f_m))], m ∈ {1, …, 68},
the disparity range D_leye of the left-eye region is:
D_leye = [min(d(f_m)), max(d(f_m))], m ∈ {37, …, 42},
the disparity range D_reye of the right-eye region is:
D_reye = [min(d(f_m)), max(d(f_m))], m ∈ {43, …, 48},
the disparity range D_nose of the nose region is:
D_nose = [min(d(f_m)), max(d(f_m))], m ∈ {28, …, 36},
the disparity range D_umouth of the upper-lip region is:
D_umouth = [min(d(f_m)), max(d(f_m))], m ∈ {49, …, 55, 62, 63, 64},
and the disparity range D_dmouth of the lower-lip region is:
D_dmouth = [min(d(f_m)), max(d(f_m))], m ∈ {56, …, 61, 65, …, 68}.
The step 3 specifically comprises:
Step 3.1: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face;
Step 3.2: when matching the Canny feature points of I_L against those of I_R, measure the matching cost with the NCC cost function:

$$C(x,y,d)=\frac{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)}{\sqrt{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)^{2}\,\sum_{(i,j)\in N(x,y)}\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)^{2}}},$$

where C(x, y, d) is the matching cost of the Canny feature point at pixel (x, y) on I_L, d is the disparity of (x, y), N(x, y) is a neighborhood window centered on (x, y), I_L(i, j) is the gray value of pixel (i, j) inside N(x, y) on I_L, \bar{I}_L(x, y) is the mean gray value of all pixels inside N(x, y) on I_L, I_R(i-d, j) is the gray value of pixel (i-d, j) on I_R, and \bar{I}_R(x-d, y) is the mean gray value of all pixels in the neighborhood window centered on (x-d, y) on I_R;
Step 3.3: for each Canny feature point c(x, y) on I_L, compute its NCC matching cost against all Canny feature points of I_R that lie within the disparity range of the face region containing c(x, y), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.4: for each Canny feature point c'(x', y') on I_R, compute its NCC matching cost against all Canny feature points of I_L that lie within the disparity range of the face region containing c'(x', y'), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.5: following the Winner-Takes-All criterion, select for each Canny feature point on I_L and I_R the pixel whose matching cost is closest to 1 as its matching point, which completes the matching of that point;
Step 3.6: after all Canny feature points have been matched, perform a left-right consistency check on the matching results; the Canny feature points that pass the check become the initial seed points.
The step 4 is specifically as follows:
Step 4.1: according to the local shape characteristics of the human face, select a fitting window W_n centered on the n-th initial seed point s_n obtained in step 3.6;
Step 4.2: collect all initial seed points inside W_n into a set G_n, and fit a general quadric surface with the coordinates and disparity values of all initial seed points in G_n as the data set;
Step 4.3: screen out the initial seed points in G_n whose disparity value differs from the fitted disparity value by more than a threshold T;
Step 4.4: after all fitting windows have been fitted, compute a fitted disparity value for each non-seed point from the fitted quadric surfaces;
Step 4.5: compute the NCC matching cost of each non-seed point of step 4.4 at all of its fitted disparity values, select the optimal matching point following the Winner-Takes-All criterion, and record the corresponding matching cost as its optimal matching cost;
Step 4.6: sort the optimal matching costs of all non-seed points of step 4.5 in descending order, and select the top 40% of the non-seed points as newly added seed points;
Step 4.7: perform a left-right consistency check on the matching results of the remaining screened initial seed points and of the newly added seed points; the seed points that pass the check become the region-growing seed points.
The step 5 is specifically as follows:
Step 5.1: for each face pixel p_k to be matched on I_L, determine the final disparity search range of p_k according to the region growing principle and the actual disparity range of each region from step 2;
Step 5.2: search for the optimal matching point of p_k within its disparity search range;
Step 5.3: select a neighborhood window centered on p_k, fit a general quadric surface to the disparity values of the already-matched pixels in the window, and let d_fit be the fitted disparity of p_k; compute the matching cost of the point only at d_fit and d_fit ± 1, and select the optimal matching point following the Winner-Takes-All criterion;
Step 5.4: optimize the disparity values with a disparity optimization algorithm to obtain the final dense face disparity map.
The step 6 is specifically as follows:
Step 6.1: from the camera focal length f, the baseline distance b between the cameras, the disparity value d of a pixel and the pixel coordinates (u_1, v_1), compute with the stereo triangulation formula the three-dimensional coordinates (X, Y, Z) of every pixel of the face disparity map of step 5 that has a disparity value:

$$X=\frac{b\,(u_1-u_0)}{d},\qquad Y=\frac{b\,(v_1-v_0)}{d},\qquad Z=\frac{b\,f}{d},$$

where (u_0, v_0) are the coordinates of the image center;
Step 6.2: based on texture mapping and point-cloud meshing, use I_L to texture the reconstructed three-dimensional face point cloud and obtain the final three-dimensional face model.
Advantageous effects
Compared with the prior art, the invention has the following notable advantages: it overcomes the difficulty traditional stereo matching methods have in reconstructing a three-dimensional face model of high accuracy; the division of the face into regions based on contour feature points limits the disparity range of each region simply and efficiently and improves matching accuracy; and a new method is introduced for optimizing the quality and quantity of the seed points based on surface fitting.
Drawings
FIG. 1 is a complete flow chart of the present invention;
FIG. 2 is a schematic view of an image acquisition system;
FIG. 3 shows distribution positions and sequence numbers of 68 contour feature points extracted from a cascade regression tree;
FIG. 4 is a schematic view of a facial region partition;
FIG. 5 is a schematic diagram of searching matching points by the stereo matching algorithm, wherein (a) is a left image of a face image, and (b) is a right image of the face image;
FIG. 6 is a schematic diagram of the stereo matching left and right consistency check;
FIG. 7 is a schematic view of a surface fitting window;
FIG. 8 is a graph of the results of a fit to the initial seed points within the fitting window of FIG. 7;
FIG. 9 is a schematic diagram of overlapping fitting windows corresponding to different seed points;
FIG. 10 is a schematic diagram of the parallax search range limited by a single seed point during region growing, wherein (a) corresponds to the left face image and (b) to the right face image;
FIG. 11 is a schematic diagram of the parallax search range limited by multiple seed points during region growing, wherein (a) corresponds to the left face image and (b) to the right face image;
FIG. 12 is a schematic diagram of the three-dimensional face reconstruction experiment results, wherein (a) is the left face image, (b) to (d) are the face disparity maps obtained by the ST, GF and NL algorithms respectively, (e) is the face disparity map obtained by the algorithm of the present invention, and (f) is the three-dimensional face model reconstructed by the algorithm of the present invention.
Detailed Description
The technical solution of the present invention will be further described in detail with reference to the following examples and accompanying drawings.
1. The whole experimental process is as follows:
a binocular vision-based three-dimensional face reconstruction method is shown in figure 1 and comprises the following specific steps:
Step 1: use a left camera and a right camera to capture one face image each from different viewing angles at the same time, and perform binocular calibration on the two cameras to obtain the intrinsic and extrinsic camera parameters; perform epipolar rectification on the two captured face images using these parameters, equalize the brightness of the two rectified images, and denote the brightness-equalized image of the left camera by I_L and that of the right camera by I_R;
Step 2: extract the 68 contour feature points on I_L and I_R respectively using the cascaded regression tree algorithm; according to the extracted contour feature points, divide each of I_L and I_R into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region, mark the remaining face area outside these five regions as the non-facial-feature region, and compute the disparity range of each of the six regions;
Step 3: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face; match the Canny feature points of I_L and I_R within the limits of the per-region disparity ranges obtained in step 2, screen out mismatched Canny feature points with a left-right consistency check, mark the Canny feature points that pass the check as initial seed points, and treat the remaining unmatched face pixels as non-seed points;
Step 4: according to the local shape characteristics of the human face, select a fitting window centered on each initial seed point obtained in step 3 and fit a general quadric surface to the disparity values of all initial seed points inside the window; compare the disparity value of each initial seed point in the window with its fitted disparity value, and screen out the seed point if the difference exceeds a threshold; estimate the disparities of the non-seed points of step 3 from the fitted quadric surfaces and promote a certain proportion of the non-seed points to newly added seed points; perform a left-right consistency check on the remaining initial seed points and the newly added seed points, and the seed points that pass the check become the seed points for region growing;
Step 5: use a region growing algorithm to spread outward from the seed points obtained in step 4 and match the remaining unmatched face pixels, obtaining a disparity value for every face pixel; optimize the obtained disparities with a disparity optimization algorithm to obtain a dense face disparity map;
Step 6: compute the three-dimensional point cloud of the face pixels from the intrinsic and extrinsic camera parameters and the disparity value of each face pixel in the disparity map, and obtain the three-dimensional face model through point-cloud meshing and texture mapping.
2. Acquiring the face image pair:
The method uses a left camera and a right camera of the same model to capture one face image each from different viewing angles. The optical centers of the two cameras are O_L and O_R, the straight-line distance between them is the baseline b, and the imaging planes of the two cameras are S_L and S_R. A point p on the face, in world coordinates, projects onto S_L and S_R at p_L and p_R respectively; p_L and p_R are then a pair of mutually matching pixels, as shown in fig. 2. Binocular calibration of the two cameras yields the intrinsic matrices K_L, K_R, the extrinsic rotation matrices R_L, R_R and the translation vectors t_L, t_R of the left and right cameras. The intrinsic matrix has the form

$$K=\begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix},$$

where f is the focal length of the camera, dx and dy are the pixel sizes, and u_0 and v_0 are the coordinates of the image center. Epipolar rectification is then applied to the two captured face images using the calibrated intrinsic and extrinsic parameters. Finally, brightness equalization is performed on the two rectified face images according to the method of "Midway image equalization" [J]. Journal of Mathematical Imaging and Vision, 2004, 21(2): 119-134; the brightness-equalized image of the left camera is denoted I_L and that of the right camera I_R.
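As a concrete illustration of this acquisition and rectification step, the following Python sketch uses OpenCV's stereo rectification API. The calibration values (K_L, K_R, distortion vectors, R, t) and the placeholder images are hypothetical stand-ins for the results of an actual binocular calibration and capture:

```python
import cv2
import numpy as np

# Hypothetical calibration results; a real pipeline would obtain these
# from cv2.stereoCalibrate on checkerboard captures of both cameras.
K_L = K_R = np.array([[1200., 0., 640.],
                      [0., 1200., 480.],
                      [0., 0., 1.]])
d_L = d_R = np.zeros(5)                 # assume negligible lens distortion
R = np.eye(3)                           # rotation between the two cameras
t = np.array([[-60.], [0.], [0.]])      # 60 mm baseline along x (assumed)

# Placeholder arrays standing in for the captured left/right face photos
img_l = np.zeros((960, 1280, 3), np.uint8)
img_r = np.zeros((960, 1280, 3), np.uint8)
h, w = img_l.shape[:2]

# Epipolar rectification: after remapping, matching points share the same row
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K_L, d_L, K_R, d_R, (w, h), R, t)
mlx, mly = cv2.initUndistortRectifyMap(K_L, d_L, R1, P1, (w, h), cv2.CV_32FC1)
mrx, mry = cv2.initUndistortRectifyMap(K_R, d_R, R2, P2, (w, h), cv2.CV_32FC1)
I_L = cv2.remap(img_l, mlx, mly, cv2.INTER_LINEAR)
I_R = cv2.remap(img_r, mrx, mry, cv2.INTER_LINEAR)
# Brightness equalization (the midway method cited above) would follow here.
```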
3. Dividing the face into regions and computing the disparity ranges of the different regions:
The 68 contour feature points are extracted on I_L and I_R respectively with the cascaded regression tree algorithm, as shown in fig. 3. The disparity of the m-th contour feature point f_m on I_L is
d(f_m) = x_m - x_m',
where m ∈ {1, …, 68}, (x_m, y_m) are the pixel coordinates of f_m and (x_m', y_m') are the pixel coordinates of the m-th contour feature point f_m' on I_R. The face is divided into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region using the 68 contour feature points; the part of the face outside these five regions is the non-facial-feature region, giving six face regions in total, as shown in fig. 4. All regions are closed, and the boundary of each region is a continuous, closed contour line of single-pixel width.
The disparity ranges of the six face regions are computed separately. The disparity range D_face of the non-facial-feature region is:
D_face = [min(d(f_m)), max(d(f_m))], m ∈ {1, …, 68},
the disparity range D_leye of the left-eye region is:
D_leye = [min(d(f_m)), max(d(f_m))], m ∈ {37, …, 42},
the disparity range D_reye of the right-eye region is:
D_reye = [min(d(f_m)), max(d(f_m))], m ∈ {43, …, 48},
the disparity range D_nose of the nose region is:
D_nose = [min(d(f_m)), max(d(f_m))], m ∈ {28, …, 36},
the disparity range D_umouth of the upper-lip region is:
D_umouth = [min(d(f_m)), max(d(f_m))], m ∈ {49, …, 55, 62, 63, 64},
and the disparity range D_dmouth of the lower-lip region is:
D_dmouth = [min(d(f_m)), max(d(f_m))], m ∈ {56, …, 61, 65, …, 68}.
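The per-region ranges follow directly from the landmark disparities. The sketch below assumes the standard 68-point landmark ordering (e.g. as produced by a dlib-style cascade-regression shape predictor, an assumption here); the landmark arrays are synthetic placeholders:

```python
import numpy as np

# (68, 2) landmark pixel coordinates on I_L and I_R; placeholders here,
# in practice produced by the cascaded-regression-tree shape predictor.
lm_left = np.random.rand(68, 2) * [1280, 960]
lm_right = lm_left - [15, 0]        # rectified pair: same row, shifted column

d = lm_left[:, 0] - lm_right[:, 0]  # d(f_m) = x_m - x_m'

# The patent's 1-based index sets, converted to 0-based NumPy indices
regions = {
    "face":   np.arange(0, 68),     # m in {1, ..., 68}
    "leye":   np.arange(36, 42),    # m in {37, ..., 42}
    "reye":   np.arange(42, 48),    # m in {43, ..., 48}
    "nose":   np.arange(27, 36),    # m in {28, ..., 36}
    "umouth": np.r_[48:55, 61:64],  # m in {49..55, 62, 63, 64}
    "dmouth": np.r_[55:61, 64:68],  # m in {56..61, 65..68}
}
disparity_ranges = {name: (int(np.floor(d[idx].min())),
                           int(np.ceil(d[idx].max())))
                    for name, idx in regions.items()}
```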
4. Obtaining the initial seed point pairs:
The Canny feature points are extracted on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the face.
For each Canny feature point c(x, y) on I_L, the NCC matching cost C(x, y, d) is computed against all Canny feature points of I_R that lie within the disparity range of the face region containing c(x, y), and against the two pixels adjacent to the left and right of each such Canny feature point:

$$C(x,y,d)=\frac{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)}{\sqrt{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)^{2}\,\sum_{(i,j)\in N(x,y)}\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)^{2}}},$$

where C(x, y, d) is the matching cost of the Canny feature point at pixel (x, y) on I_L, d is the disparity of (x, y), N(x, y) is a neighborhood window centered on (x, y), I_L(i, j) is the gray value of pixel (i, j) inside N(x, y) on I_L, \bar{I}_L(x, y) is the mean gray value of all pixels inside N(x, y) on I_L, I_R(i-d, j) is the gray value of pixel (i-d, j) on I_R, and \bar{I}_R(x-d, y) is the mean gray value of all pixels in the neighborhood window centered on (x-d, y) on I_R.
For each Canny feature point c'(x', y') on I_R, the NCC matching cost C(x', y', d) is likewise computed against all Canny feature points of I_L that lie within the disparity range of the face region containing c'(x', y'), and against the two pixels adjacent to the left and right of each such Canny feature point.
The matching point of each Canny feature point on I_L and I_R is determined by the Winner-Takes-All criterion. As shown in (a) and (b) of fig. 5, if c(x, y) and c'(x', y) match each other, the disparity of the point c(x, y) is d = x - x'; D in the figure is the disparity range of the face region containing c(x, y), and c''(x, y) marks the position of the left-image point c(x, y) in the right image, the two points having the same coordinates.
A left-right consistency check is applied to the matching results of the Canny feature points, and the points that pass become the initial seed points. Fig. 6 illustrates the principle of the check: the two solid lines are a pair of corresponding epipolar lines on I_L and I_R. When matching the point p(x, y) of I_L, the candidate matching points lie on the segment AB, where AB is the disparity range of the face region containing p(x, y); let the optimal match of p(x, y) be p'(x', y). Then p'(x', y) is matched in turn, its candidates lying on the segment EF, where EF is the disparity range of the face region containing p'(x', y). If the optimal match of p'(x', y) is again p(x, y), the match of p(x, y) passes the left-right consistency check.
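A minimal NumPy sketch of this matching scheme follows. For brevity it scores every disparity in the region's range rather than only Canny points and their left/right neighbors, ignores window-boundary handling, and assumes grayscale rectified images; all function and variable names are illustrative:

```python
import numpy as np

def ncc(I_L, I_R, x, y, d, r=5):
    """NCC cost between the window around (x, y) on I_L and (x - d, y) on I_R."""
    wl = I_L[y-r:y+r+1, x-r:x+r+1].astype(np.float64)
    wr = I_R[y-r:y+r+1, x-d-r:x-d+r+1].astype(np.float64)
    a, b = wl - wl.mean(), wr - wr.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else -1.0

def match_left(I_L, I_R, x, y, d_range):
    """Winner-Takes-All: the disparity whose NCC cost is closest to 1 wins."""
    lo, hi = d_range
    return max((ncc(I_L, I_R, x, y, d), d) for d in range(lo, hi + 1))[1]

def match_right(I_L, I_R, xr, y, d_range):
    """Best disparity for a right-image point (xr, y), searching I_L at xr + d."""
    lo, hi = d_range
    return max((ncc(I_L, I_R, xr + d, y, d), d) for d in range(lo, hi + 1))[1]

def lr_consistent(I_L, I_R, x, y, d_range):
    """Left-right consistency: forward and backward matches must agree."""
    d = match_left(I_L, I_R, x, y, d_range)
    return d == match_right(I_L, I_R, x - d, y, d_range), d
```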
5. Screening out mismatched seed points:
According to the local shape characteristics of the human face, a fitting window W_n is selected centered on the n-th initial seed point s_n; all initial seed points inside W_n are collected into a set G_n, and a general quadric surface is fitted with the coordinates and disparity values of all initial seed points in G_n as the data set; the initial seed points in G_n whose disparity differs from the fitted disparity by more than a threshold T are screened out. Fig. 7 shows a fitting window centered on an initial seed point of the left face: the fitting window is the white square frame, and the non-black pixels inside the frame are initial seed points. Fig. 8 shows the fitting result for the initial seed points in the window of fig. 7; the gray points are the screened-out, mismatched initial seed points.
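The patent does not spell out the parameterization of the "general quadric surface"; the sketch below assumes the common choice of a full quadratic polynomial in the image coordinates, d = a·x² + b·y² + c·xy + e·x + g·y + h, fitted by linear least squares, and a hypothetical threshold T:

```python
import numpy as np

def fit_quadric(seeds):
    """Least-squares fit of d = a*x^2 + b*y^2 + c*x*y + e*x + g*y + h
    to seed points given as an (n, 3) array of (x, y, d) rows."""
    x, y, d = seeds[:, 0], seeds[:, 1], seeds[:, 2]
    A = np.column_stack([x*x, y*y, x*y, x, y, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, d, rcond=None)
    return coef

def eval_quadric(coef, x, y):
    return (coef[0]*x*x + coef[1]*y*y + coef[2]*x*y
            + coef[3]*x + coef[4]*y + coef[5])

def screen_seeds(seeds, T=1.0):  # T is an assumed, not published, value
    """Drop seeds whose disparity deviates from the fitted surface by more than T."""
    coef = fit_quadric(seeds)
    resid = np.abs(seeds[:, 2] - eval_quadric(coef, seeds[:, 0], seeds[:, 1]))
    return seeds[resid <= T], coef
```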
6. Adding new seed points:
When the mismatched initial seed points are screened out, a non-seed point lying inside a single fitting window obtains one fitted disparity value from the fitted quadric surface; because fitting windows centered on different initial seed points may overlap, a non-seed point lying inside several fitting windows obtains several fitted disparity values. As shown in fig. 9, let a non-seed point q_j lie inside both fitting window W_1 and fitting window W_2, and let the fitted disparities it obtains in W_1 and W_2 be d_1j and d_2j respectively. The NCC matching cost of each non-seed point is computed at all of its fitted disparity values, the optimal matching point is selected following the Winner-Takes-All criterion, the optimal matching costs of all non-seed points are sorted, and a certain proportion of the non-seed points are selected to become newly added seed points.
7. Generating the dense face disparity map:
The principle of region growing in stereo matching is to add range and smoothness constraints to the matching of pixels, using the disparities of the seed points to limit the disparity search range of the remaining pixels to be matched. Let s_L be a seed point of I_L and s_R its matching point on I_R; the matching point p_R of a point p_L to be matched adjacent to s_L must then lie in a neighborhood of s_R, so the known disparity d_L of s_L narrows the disparity search range of p_L. The region growing algorithm is implemented in two forms, growth constrained by a single seed point and growth constrained by multiple seed points. Single-seed-point constrained growth means that the four-neighborhood of p_L contains only one seed point s_L, so the disparity search range is limited by s_L alone, as shown in (a) and (b) of fig. 10. Multiple-seed-point constrained growth means that the four-neighborhood of p_L contains several seed points s_L1, s_L2, etc.; the disparity search window of p_L is then the intersection of the ranges imposed by s_L1 and s_L2, as shown in (a) and (b) of fig. 11. The intersection of the disparity search window of p_L with the epipolar line is the final disparity search range of p_L; the matching cost between p_L and every pixel in this range is computed, the optimal match is selected by the Winner-Takes-All criterion, and it becomes a new seed point appended to the seed point queue, until the whole face has been matched. For each point p_k to be matched, the optimal matching point is searched within its disparity search range. If accumulated matching errors occur during matching, a neighborhood window centered on the current point p_k is selected, a general quadric surface is fitted to the disparities of the already-matched pixels in the window, and with d_fit denoting the fitted disparity of p_k, the matching cost is computed only at d_fit and d_fit ± 1 and the optimal matching point is selected by the Winner-Takes-All criterion. Finally, the disparity values are optimized according to the disparity refinement method of "High-quality single-shot capture of facial geometry" [J]. ACM Transactions on Graphics, 2010, 29(4): Article 40, to obtain the final dense face disparity map.
8. Obtaining the three-dimensional face model:
From the camera focal length f, the baseline distance b between the cameras, the disparity value d of a pixel and the pixel coordinates (u_1, v_1), the three-dimensional coordinates of every pixel of the face disparity map that has a disparity value are computed with the stereo triangulation formula:

$$X=\frac{b\,(u_1-u_0)}{d},\qquad Y=\frac{b\,(v_1-v_0)}{d},\qquad Z=\frac{b\,f}{d},$$

where (u_0, v_0) are the coordinates of the image center.
based on texture mapping and point cloud meshing, utilizing I L And reconstructing textures of the three-dimensional face point cloud to obtain a final three-dimensional face model.
Experimental comparison: to verify the feasibility and effectiveness of the invention, its experimental results are compared with the existing ST, GF and NL stereo matching algorithms, as shown in (a) to (f) of fig. 12, where (a) is the left face image, (b) to (d) are the face disparity maps obtained by the ST, GF and NL algorithms respectively, (e) is the face disparity map obtained by the algorithm of the present invention, and (f) is the three-dimensional face model reconstructed by the algorithm of the present invention.
The invention provides a face region division algorithm based on the contour feature points extracted by a cascaded regression tree, and improves the matching accuracy of both seed point acquisition and disparity region growing by limiting the disparity range within each face region. According to the local shape characteristics of the human face, the quality and quantity of the seed points are optimized by fitting a general quadric surface, which adds local connectivity and smoothness constraints among the seed points and supplies sufficient, reliable seed points for region growing. A more accurate three-dimensional face model can thus be obtained, with a smoother reconstruction, higher precision and better robustness.
The above description is only an embodiment of the present invention, and the scope of the present invention is not limited thereto. Any modification or substitution that a person skilled in the art can readily conceive within the technical scope disclosed herein shall fall within the scope of the present invention; the protection scope of the present invention shall therefore be subject to the protection scope of the claims.

Claims (7)

1. A binocular vision-based three-dimensional face reconstruction method is characterized by comprising the following steps:
Step 1: use a left camera and a right camera to capture one face image each from different viewing angles at the same time, and perform binocular calibration on the two cameras to obtain the intrinsic and extrinsic camera parameters; perform epipolar rectification on the two captured face images using these parameters, equalize the brightness of the two rectified images, and denote the brightness-equalized image of the left camera by I_L and that of the right camera by I_R;
Step 2: extract the 68 contour feature points on I_L and I_R respectively using a cascaded regression tree algorithm; according to the extracted contour feature points, divide each of I_L and I_R into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region, mark the remaining face area outside these five regions as the non-facial-feature region, and compute the disparity range of each of the six regions;
Step 3: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face; match the Canny feature points of I_L and I_R within the limits of the per-region disparity ranges obtained in step 2, screen out mismatched Canny feature points with a left-right consistency check, mark the Canny feature points that pass the check as initial seed points, and treat the remaining unmatched face pixels as non-seed points;
Step 4: according to the local shape characteristics of the human face, select a fitting window centered on each initial seed point obtained in step 3 and fit a general quadric surface to the disparity values of all initial seed points inside the window; compare the disparity value of each initial seed point in the window with its fitted disparity value, and screen out the seed point if the difference exceeds a threshold; estimate the disparities of the non-seed points of step 3 from the fitted quadric surfaces and promote a certain proportion of the non-seed points to newly added seed points; perform a left-right consistency check on the remaining initial seed points and the newly added seed points, and the seed points that pass the check become the seed points for region growing;
Step 5: use a region growing algorithm to spread outward from the seed points obtained in step 4 and match the remaining unmatched face pixels, obtaining a disparity value for every face pixel; optimize the obtained disparities with a disparity optimization algorithm to obtain a dense face disparity map;
Step 6: compute the three-dimensional point cloud of the face pixels from the intrinsic and extrinsic camera parameters and the disparity value of each face pixel in the disparity map, and obtain the three-dimensional face model through point-cloud meshing and texture mapping.
2. The binocular vision-based three-dimensional face reconstruction method according to claim 1, wherein the step 1 specifically comprises:
Step 1.1: use the left camera and the right camera to capture one face image each from different viewing angles;
Step 1.2: perform binocular calibration of the left and right cameras and compute their intrinsic matrices K_L, K_R, extrinsic rotation matrices R_L, R_R and translation vectors t_L, t_R;
Step 1.3: perform epipolar rectification on the two face images captured in step 1.1 using the intrinsic matrices and extrinsic rotation matrices of the two cameras;
Step 1.4: equalize the brightness of the two rectified face images.
3. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 2 specifically comprises:
Step 2.1: extract on I_L and I_R respectively, with the cascaded regression tree algorithm, the 68 contour feature points around the eyebrows, eyes, nose, upper and lower lips, and the outer contour of the face;
Step 2.2: compute the disparity values of the 68 contour feature points. The disparity of the m-th contour feature point f_m on I_L is
d(f_m) = x_m - x_m',
where m ∈ {1, …, 68}, (x_m, y_m) are the pixel coordinates of f_m and (x_m', y_m') are the pixel coordinates of the m-th contour feature point f_m' on I_R;
Step 2.3: divide the face into a left-eye region, a right-eye region, a nose region, an upper-lip region and a lower-lip region using the 68 contour feature points; the part of the face outside these five regions is the non-facial-feature region; all regions are closed, and the boundary of each region is a continuous, closed contour line of single-pixel width;
Step 2.4: determine the disparity ranges of the six regions of step 2.3. The disparity range D_face of the non-facial-feature region is:
D_face = [min(d(f_m)), max(d(f_m))], m ∈ {1, …, 68},
the disparity range D_leye of the left-eye region is:
D_leye = [min(d(f_m)), max(d(f_m))], m ∈ {37, …, 42},
the disparity range D_reye of the right-eye region is:
D_reye = [min(d(f_m)), max(d(f_m))], m ∈ {43, …, 48},
the disparity range D_nose of the nose region is:
D_nose = [min(d(f_m)), max(d(f_m))], m ∈ {28, …, 36},
the disparity range D_umouth of the upper-lip region is:
D_umouth = [min(d(f_m)), max(d(f_m))], m ∈ {49, …, 55, 62, 63, 64},
and the disparity range D_dmouth of the lower-lip region is:
D_dmouth = [min(d(f_m)), max(d(f_m))], m ∈ {56, …, 61, 65, …, 68}.
4. the binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 3 specifically comprises:
Step 3.1: extract the Canny feature points on I_L and I_R respectively with the Canny edge operator; the extracted Canny feature points are distributed evenly over the whole face;
Step 3.2: when matching the Canny feature points of I_L against those of I_R, measure the matching cost with the NCC cost function:

$$C(x,y,d)=\frac{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)}{\sqrt{\sum_{(i,j)\in N(x,y)}\big(I_L(i,j)-\bar{I}_L(x,y)\big)^{2}\,\sum_{(i,j)\in N(x,y)}\big(I_R(i-d,j)-\bar{I}_R(x-d,y)\big)^{2}}},$$

where C(x, y, d) is the matching cost of the Canny feature point at pixel (x, y) on I_L, d is the disparity of (x, y), N(x, y) is a neighborhood window centered on (x, y), I_L(i, j) is the gray value of pixel (i, j) inside N(x, y) on I_L, \bar{I}_L(x, y) is the mean gray value of all pixels inside N(x, y) on I_L, I_R(i-d, j) is the gray value of pixel (i-d, j) on I_R, and \bar{I}_R(x-d, y) is the mean gray value of all pixels in the neighborhood window centered on (x-d, y) on I_R;
Step 3.3: for each Canny feature point c(x, y) on I_L, compute its NCC matching cost against all Canny feature points of I_R that lie within the disparity range of the face region containing c(x, y), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.4: for each Canny feature point c'(x', y') on I_R, compute its NCC matching cost against all Canny feature points of I_L that lie within the disparity range of the face region containing c'(x', y'), as well as against the two pixels adjacent to the left and right of each such Canny feature point;
Step 3.5: following the Winner-Takes-All criterion, select for each Canny feature point on I_L and I_R the pixel whose matching cost is closest to 1 as its matching point, which completes the matching of that point;
Step 3.6: after all Canny feature points have been matched, perform a left-right consistency check on the matching results; the Canny feature points that pass the check become the initial seed points.
5. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 4 specifically comprises:
Step 4.1: according to the local shape characteristics of the human face, select a fitting window W_n centered on the n-th initial seed point s_n obtained in step 3.6;
Step 4.2: collect all initial seed points inside W_n into a set G_n, and fit a general quadric surface with the coordinates and disparity values of all initial seed points in G_n as the data set;
Step 4.3: screen out the initial seed points in G_n whose disparity value differs from the fitted disparity value by more than a threshold T;
Step 4.4: after all fitting windows have been fitted, compute a fitted disparity value for each non-seed point from the fitted quadric surfaces;
Step 4.5: compute the NCC matching cost of each non-seed point of step 4.4 at all of its fitted disparity values, select the optimal matching point following the Winner-Takes-All criterion, and record the corresponding matching cost as its optimal matching cost;
Step 4.6: sort the optimal matching costs of all non-seed points of step 4.5 in descending order, and select the top 40% of the non-seed points as newly added seed points;
Step 4.7: perform a left-right consistency check on the matching results of the remaining screened initial seed points and of the newly added seed points; the seed points that pass the check become the region-growing seed points.
6. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 5 specifically comprises:
Step 5.1: for each face pixel p_k to be matched on I_L, determine the final disparity search range of p_k according to the region growing principle and the actual disparity range of each region from step 2;
Step 5.2: search for the optimal matching point of p_k within its disparity search range;
Step 5.3: select a neighborhood window centered on p_k, fit a general quadric surface to the disparity values of the already-matched pixels in the window, and let d_fit be the fitted disparity of p_k; compute the matching cost of the point only at d_fit and d_fit ± 1, and select the optimal matching point following the Winner-Takes-All criterion;
Step 5.4: optimize the disparity values with a disparity optimization algorithm to obtain the final dense face disparity map.
7. The binocular vision based three-dimensional face reconstruction method according to claim 1, wherein the step 6 specifically comprises:
Step 6.1: from the camera focal length f, the baseline distance b between the cameras, the disparity value d of a pixel and the pixel coordinates (u_1, v_1), compute with the stereo triangulation formula the three-dimensional coordinates (X, Y, Z) of every pixel of the face disparity map of step 5 that has a disparity value:

$$X=\frac{b\,(u_1-u_0)}{d},\qquad Y=\frac{b\,(v_1-v_0)}{d},\qquad Z=\frac{b\,f}{d},$$

where (u_0, v_0) are the coordinates of the image center;
Step 6.2: based on texture mapping and point-cloud meshing, use I_L to texture the reconstructed three-dimensional face point cloud and obtain the final three-dimensional face model.
CN202010134543.8A 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method Active CN111354077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010134543.8A CN111354077B (en) 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010134543.8A CN111354077B (en) 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method

Publications (2)

Publication Number Publication Date
CN111354077A CN111354077A (en) 2020-06-30
CN111354077B true CN111354077B (en) 2022-11-18

Family

ID=71197198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010134543.8A Active CN111354077B (en) 2020-03-02 2020-03-02 Binocular vision-based three-dimensional face reconstruction method

Country Status (1)

Country Link
CN (1) CN111354077B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633096B (en) * 2020-12-14 2024-08-23 深圳云天励飞技术股份有限公司 Passenger flow monitoring method and device, electronic equipment and storage medium
CN113965742B (en) * 2021-02-28 2022-04-19 北京中科慧眼科技有限公司 Dense disparity map extraction method and system based on multi-sensor fusion and intelligent terminal
CN113592791B (en) * 2021-07-16 2024-02-13 华中科技大学 Contour stereo matching method and system based on local energy minimization
CN113450460A (en) * 2021-07-22 2021-09-28 四川川大智胜软件股份有限公司 Phase-expansion-free three-dimensional face reconstruction method and system based on face shape space distribution
CN116309442B (en) * 2023-03-13 2023-10-24 北京百度网讯科技有限公司 Method for determining picking information and method for picking target object

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101866497A (en) * 2010-06-18 2010-10-20 北京交通大学 Binocular stereo vision based intelligent three-dimensional human face rebuilding method and system
CN106910222A (en) * 2017-02-15 2017-06-30 中国科学院半导体研究所 Face three-dimensional rebuilding method based on binocular stereo vision

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101866497A (en) * 2010-06-18 2010-10-20 北京交通大学 Binocular stereo vision based intelligent three-dimensional human face rebuilding method and system
CN106910222A (en) * 2017-02-15 2017-06-30 中国科学院半导体研究所 Face three-dimensional rebuilding method based on binocular stereo vision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Three-dimensional face reconstruction method based on binocular stereo vision; Jia Beibei et al.; CAAI Transactions on Intelligent Systems; 2009-12-15 (No. 06); full text *

Also Published As

Publication number Publication date
CN111354077A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
CN111354077B (en) Binocular vision-based three-dimensional face reconstruction method
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN108564041B (en) Face detection and restoration method based on RGBD camera
CN106780726A (en) The dynamic non-rigid three-dimensional digital method of fusion RGB D cameras and colored stereo photometry
WO2018000752A1 (en) Monocular image depth estimation method based on multi-scale cnn and continuous crf
CN104463899B (en) A kind of destination object detection, monitoring method and its device
CN102665086B (en) Method for obtaining parallax by using region-based local stereo matching
CN112733950A (en) Power equipment fault diagnosis method based on combination of image fusion and target detection
CN112884682B (en) Stereo image color correction method and system based on matching and fusion
Correal et al. Automatic expert system for 3D terrain reconstruction based on stereo vision and histogram matching
CN117036641A (en) Road scene three-dimensional reconstruction and defect detection method based on binocular vision
CN110232389A (en) A kind of stereoscopic vision air navigation aid based on green crop feature extraction invariance
CN104021548A (en) Method for acquiring scene 4D information
CN115272271A (en) Pipeline defect detecting and positioning ranging system based on binocular stereo vision
CN110246151B (en) Underwater robot target tracking method based on deep learning and monocular vision
CN108010075B (en) Local stereo matching method based on multi-feature combination
WO2018053952A1 (en) Video image depth extraction method based on scene sample library
CN112990063B (en) Banana maturity grading method based on shape and color information
KR20110014067A (en) Method and system for transformation of stereo content
CN108470178B (en) Depth map significance detection method combined with depth credibility evaluation factor
CN107800965A (en) Image processing method, device, computer-readable recording medium and computer equipment
CN109889799B (en) Monocular structure light depth perception method and device based on RGBIR camera
CN113538569A (en) Weak texture object pose estimation method and system
CN114996814A (en) Furniture design system based on deep learning and three-dimensional reconstruction
JP5561786B2 (en) Three-dimensional shape model high accuracy method and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant