CN112115953A - Optimized ORB algorithm based on RGB-D camera combined with plane detection and random sampling consistency algorithm - Google Patents
Optimized ORB algorithm based on RGB-D camera combined with plane detection and random sampling consistency algorithm
- Publication number
- CN112115953A (application CN202010985540.5A)
- Authority
- CN
- China
- Prior art keywords
- point
- points
- feature
- algorithm
- image data
- Prior art date
- 2020-09-18
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention provides an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sampling consistency algorithm, which comprises the following steps: S1: acquiring image data with an RGB-D camera, the image data comprising a color image and a depth image; S2: extracting feature points from the image data with the ORB algorithm, and judging the distribution uniformity of the feature points with a feature point uniformity evaluation method; S3: for the image data portions with uniformly distributed feature points, generating a point cloud and down-sampling it; S4: performing plane detection and extraction on the down-sampled point cloud, and eliminating mismatches with the random sampling consistency algorithm; S5: for the image data portions with non-uniformly distributed feature points, extracting feature points with a set threshold and removing overlapping feature points by non-maximum suppression. The method reduces the amount of computation, improves the accuracy of feature point extraction and reduces mismatches, thereby meeting the accuracy and real-time requirements of a mobile robot.
Description
Technical Field
The invention discloses an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sampling consistency algorithm, and belongs to the technical field of indoor mobile robot path planning and navigation.
Background
In recent years, intelligent mobile robot technology has developed rapidly and has been widely used in industry, the military, logistics, offices, home services and other fields. With the advent of RGB-D sensors, research on mobile robot positioning and SLAM using RGB-D sensors has progressed rapidly. Improvements in image processing, point cloud processing and other related technologies, together with the advantages of the RGB-D sensor (rich acquired information, non-contact measurement, easy installation and use, and low cost), have led to its wide application in fields such as target recognition and tracking. The first problem in robot navigation is how to determine a scene model; over the past decade, many solutions relied on two-dimensional sensors such as lasers and radars for map construction and robot pose estimation. With the advent of RGB-D cameras, more and more researchers have turned to building indoor environment models for robots with RGB-D cameras, and many influential research results have been produced.
At present, visual SLAM technology has gradually become a mainstream positioning scheme. However, monocular SLAM cannot obtain the depth of a pixel from a single image, and the depth must be estimated by triangulation or an inverse-depth method. Moreover, the depth information estimated by monocular SLAM has scale uncertainty, and as positioning errors accumulate the phenomenon of 'scale drift' easily occurs. Binocular SLAM obtains matched feature points by matching the images of the left and right cameras and then estimates the depth of the feature points by the parallax method. Binocular SLAM has advantages such as a large measurement range, but its computation is heavy, its precision requirements on the cameras are high, and GPU acceleration is needed to meet real-time requirements. The RGB-D camera is a new type of camera that has emerged in recent years; it can actively acquire the depth of the pixels in an image through physical hardware. Compared with monocular and binocular cameras, the RGB-D camera does not need to spend large amounts of computing resources on computing pixel depth; it can directly measure the surrounding environment and obstacles in three dimensions and, through the RGB-D SLAM technology, generate a dense point cloud map, which provides convenience for subsequent navigation planning.
The feature point extraction and matching methods based on RGB-D cameras in common use at present include the ORB algorithm and the ICP algorithm. ORB (Oriented FAST and Rotated BRIEF) is an algorithm for fast feature point extraction and description. The ORB algorithm is divided into two parts: feature point extraction and feature point description. Feature extraction is developed from the FAST (Features from Accelerated Segment Test) algorithm, and feature point description is improved from the BRIEF (Binary Robust Independent Elementary Features) feature description algorithm. The ORB feature combines the FAST feature point detection method with the BRIEF feature descriptor and improves and optimizes them on their original basis. The ORB algorithm is characterized by fast computation. This benefits first from using FAST to detect feature points, and further from using the BRIEF algorithm to compute the descriptor: its binary-string representation both saves storage space and greatly shortens matching time. The ICP algorithm was proposed by Besl and McKay in 1992 ('A Method for Registration of 3-D Shapes'). The basic principle of the ICP algorithm is: according to certain constraints, find the nearest matching points (pi, qi) in the target point cloud P and the source point cloud Q, and then compute the optimal transformation parameters R and t that minimize an error function. This method can produce an optimal solution, but the heavy computation makes it less practical for a mobile robot and increases cost.
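For background, a minimal sketch of standard ORB extraction and brute-force matching with OpenCV (this illustrates the plain ORB pipeline only, not the optimized algorithm of the invention; the image file names are placeholders):

```python
import cv2

# Load two gray images; file names are placeholders.
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)          # oFAST detector + rBRIEF descriptor
kp1, des1 = orb.detectAndCompute(img1, None)  # key points and binary descriptors
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance is appropriate for binary descriptors such as BRIEF.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(len(matches), "raw matches before outlier rejection")
```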
Disclosure of Invention
In view of this, the present invention provides an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sampling consensus algorithm, which is used to reduce the amount of computation, improve the accuracy of feature point extraction, and reduce mismatching, thereby meeting the requirements of a mobile robot on accuracy and real-time performance.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sampling consensus algorithm, the method comprising the steps of:
s1: acquiring image data by using an RGB-D camera, wherein the image data comprises a color image and a depth image;
s2: using an ORB algorithm to extract the feature points of the image data, and using a feature point uniformity evaluation method to judge the distribution uniformity of the feature points;
s3: generating point clouds and performing down-sampling on the point clouds for the image data part with uniformly distributed characteristic points;
s4: performing plane detection extraction on the point cloud subjected to down-sampling, and eliminating mismatching by using a random sampling consistency algorithm;
s5: for the image data portions with non-uniformly distributed feature points, using a set threshold to extract feature points and removing overlapping feature points by a non-maximum suppression method;
s6: projecting the point cloud obtained after eliminating mismatches in S4 and the feature points obtained after removing overlapping feature points in S5 back onto a two-dimensional image plane, and reconstructing and equalizing the gray-level image.
Preferably, in S1, the RGB-D camera includes a Kinect camera.
Preferably, S2 specifically includes the following steps:
s21: using the Oriented FAST feature point extraction algorithm, judging whether a candidate point x is a feature point; when x is judged to be a feature point, calculating its main direction and naming it a key point, so that the key point has directionality;
the method for judging whether the candidate point x is a feature point comprises: drawing a circle centered on x that passes through n pixel points, and checking whether, among the n pixel points arranged on the circumference, there are at least m consecutive pixel points whose gray values are all greater than I_x + t or all smaller than I_x - t; if this requirement is met, x is judged to be a feature point;
wherein I_x represents the gray value of the candidate point x; I represents a gray value; t represents a threshold, which is an adjustment of the acceptance range; n is 16; and 9 ≤ m ≤ 12;
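A minimal sketch of the segment test of S21, checking for at least m consecutive pixels on the circle of n = 16 whose gray values are all above I_x + t or all below I_x - t (the circle offsets are the standard FAST Bresenham circle of radius 3; the candidate point must lie at least 3 pixels from the image border):

```python
import numpy as np

# Standard FAST circle of 16 pixel offsets (radius 3) around the candidate point x.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_feature_point(img, u, v, t=20, m=12):
    """img: 2-D gray image; (u, v) = (column, row) of the candidate point x."""
    Ix = int(img[v, u])
    ring = np.array([int(img[v + dv, u + du]) for du, dv in CIRCLE])
    brighter = ring > Ix + t
    darker = ring < Ix - t
    for flags in (brighter, darker):
        doubled = np.concatenate([flags, flags])   # wrap around the circle
        run, best = 0, 0
        for f in doubled:
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= m:                              # at least m consecutive pixels pass
            return True
    return False
```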
s22: using the rBRIEF feature point description algorithm, performing Gaussian smoothing on the image data; taking each key point as the center, selecting pixel points y in its neighborhood to form n point pairs (x_i, y_i); comparing the gray values I(x_i) and I(y_i), where I denotes the gray value: the bit is 1 if I(x_i) > I(y_i) and 0 otherwise, which generates an n-dimensional binary feature descriptor; the n point pairs (x_i, y_i) are defined as a 2 x n matrix S,
and S is rotated by the angle θ:
Sθ = RθS (2)
in formula (2), Sθ represents the matrix S rotated by the angle θ, where θ is the main direction of the key point and Rθ is the corresponding rotation matrix;
the neighborhood is a circle centered on the key point and passing through k pixel points, from which the pixel points y are selected, where 0 < k < n;
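A minimal sketch of the rBRIEF steps of S22: a binary test over n random point pairs and the steering Sθ = RθS of the sampling matrix by the key point's main direction. The descriptor length, the offset range, and the stacking of both points of each pair into one 2 x 2n array are illustrative assumptions, and the image is assumed already Gaussian-smoothed:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256                                        # descriptor length (n point pairs), assumed
S = rng.integers(-15, 16, size=(2, 2 * n))     # columns: x_1..x_n then y_1..y_n offsets

def steered_pairs(theta):
    """Rotate the sampling matrix S by the key point orientation theta (radians)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return np.rint(R @ S).astype(int)          # S_theta = R_theta * S

def rbrief_descriptor(img, u, v, theta):
    """Binary test: bit = 1 if I(x_i) > I(y_i), else 0 (convention of the text)."""
    St = steered_pairs(theta)
    xs, ys = St[:, :n], St[:, n:]
    Ix = img[v + xs[1], u + xs[0]]             # gray values at the rotated x_i offsets
    Iy = img[v + ys[1], u + ys[0]]             # gray values at the rotated y_i offsets
    return (Ix > Iy).astype(np.uint8)          # n-dimensional binary descriptor
```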
s23: the feature point distribution uniformity evaluation method divides the image data in different ways and, after the division, obtains the image data portions with uniformly distributed feature points and the portions with non-uniformly distributed feature points.
Further preferably, the method for evaluating the distribution uniformity of the feature points is as follows: the image data are first divided into several sub-regions S_i; each sub-region S_i is divided again into several secondary sub-regions S_i1 to S_ij, and whether the feature points in S_i are uniformly distributed is evaluated from the numbers of feature points in the secondary sub-regions; the numbers of feature points in S_i1 to S_ij are judged to be similar as follows: compute the variance of the feature point counts over the secondary sub-regions; when the variance is less than 15, the feature points of the sub-region S_i are judged to be uniformly distributed, otherwise they are judged to be non-uniformly distributed.
Further preferably, the image data division methods include division from the center towards the periphery, division from top left to bottom right, or division from bottom left to top right.
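A minimal sketch of the variance-based uniformity check described above: each sub-region S_i is divided into a grid of secondary sub-regions and judged uniform when the variance of the per-cell feature point counts is below the threshold of 15 given in the text (the grid size and function name are illustrative):

```python
import numpy as np

def region_is_uniform(keypoints_xy, region, grid=(2, 2), var_threshold=15.0):
    """keypoints_xy: Nx2 array of (u, v); region: (u0, v0, u1, v1) for sub-region S_i."""
    u0, v0, u1, v1 = region
    counts = np.zeros(grid, dtype=float)
    du, dv = (u1 - u0) / grid[0], (v1 - v0) / grid[1]
    for u, v in keypoints_xy:
        if u0 <= u < u1 and v0 <= v < v1:
            counts[int((u - u0) // du), int((v - v0) // dv)] += 1  # secondary sub-region S_ij
    # Uniform if the variance of the per-cell counts is below the threshold (15 in the text).
    return counts.var() < var_threshold
```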
Preferably, S3 specifically includes the following steps:
s31: according to the color image and the depth image of the portions with uniformly distributed feature points, the point cloud is obtained with the following formula (3):
Z = d / s, X = (u - c_X) * Z / f_X, Y = (v - c_Y) * Z / f_Y (3)
formula (3) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are the camera intrinsic parameters; u and v are the pixel coordinates of a feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud;
s32: processing the point cloud with a grid filter and extracting planes from it.
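A minimal sketch of the back-projection of formula (3) above, turning a depth image into a point cloud (the intrinsic parameters and the depth scale s = 5000 are placeholder values in the style of common Kinect datasets, not values taken from the patent):

```python
import numpy as np

def depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5, s=5000.0):
    """depth: HxW array of raw depth values; s is the scale factor of formula (3)."""
    v, u = np.indices(depth.shape)          # pixel coordinates in the o-u-v system
    Z = depth / s                           # Z = d / s
    X = (u - cx) * Z / fx                   # X = (u - c_X) * Z / f_X
    Y = (v - cy) * Z / fy                   # Y = (v - c_Y) * Z / f_Y
    points = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]         # keep only pixels with a valid depth
```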
Preferably, S4 specifically includes the following steps:
s41: the formula for plane extraction is as follows:
aX+bY+cZ+d=0 (4)
in formula (4), a, b, c and d represent constants;
s42: extracting planes from the noisy point cloud data with the random sampling consistency method, under the following extraction conditions: as long as the number of remaining points is greater than a threshold g of the total number of points in the cloud and fewer than h planes have been extracted, a plane is extracted and its feature points are taken; after the feature points are extracted, plane extraction is performed again on the remaining points, until the number of extracted planes reaches the threshold h or the number of remaining points falls below the threshold g;
wherein 20% ≤ g ≤ 40% and 3 ≤ h ≤ 5.
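A minimal sketch of the iterative plane extraction of S4: a plain RANSAC plane fit repeated under the g / h stopping rule described above (the inlier distance threshold and iteration count are illustrative assumptions; g = 30% and h = 3 follow embodiment 1):

```python
import numpy as np

def ransac_plane(points, dist_thresh=0.01, iters=200, rng=np.random.default_rng(0)):
    """Fit one plane aX + bY + cZ + d = 0; return (plane params, inlier mask)."""
    best_inliers = np.zeros(len(points), dtype=bool)
    best_plane = None
    for _ in range(iters):
        p = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p[1] - p[0], p[2] - p[0])
        if np.linalg.norm(n) < 1e-9:
            continue                                  # degenerate (collinear) sample
        n = n / np.linalg.norm(n)
        d = -n @ p[0]
        inliers = np.abs(points @ n + d) < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, np.append(n, d)
    return best_plane, best_inliers

def extract_planes(points, g=0.30, h=3):
    """Keep extracting planes while remaining points > g of total and planes < h."""
    total, planes, remaining = len(points), [], points
    while len(remaining) > g * total and len(remaining) >= 3 and len(planes) < h:
        plane, inliers = ransac_plane(remaining)
        if plane is None or inliers.sum() < 3:
            break
        planes.append((plane, remaining[inliers]))
        remaining = remaining[~inliers]
    return planes, remaining
```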
Preferably, S5 specifically includes the following steps:
s51: changing the threshold t of the oFAST algorithm of S2 to a new value t', narrowing the range between I_x + t' and I_x - t'; t' is adjusted according to the feature point extraction result until its optimal value is determined;
s52: if several key points exist in the neighborhood of a certain key point, comparing their values J, keeping the key point with the largest J and deleting the rest, where J is defined by formula (5):
in formula (5), I_xy - I_x and I_x - I_xy represent the differences between the gray value of the key point x and the gray values of the pixel points on its surrounding circle.
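A minimal sketch of the non-maximum suppression of S52: within a given radius only the key point with the largest value J is kept (the radius is an illustrative assumption, and the score array stands in for the value J however it is computed by formula (5)):

```python
import numpy as np

def non_max_suppress(keypoints, scores, radius=8.0):
    """keypoints: Nx2 array of (u, v); scores: length-N array of J values."""
    order = np.argsort(scores)[::-1]          # strongest key points first
    kept, kept_pts = [], []
    for i in order:
        p = keypoints[i]
        # Drop this point if a stronger one was already kept within the radius.
        if all(np.linalg.norm(p - q) > radius for q in kept_pts):
            kept.append(i)
            kept_pts.append(p)
    return np.array(kept)
```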
Preferably, in S6, the projection formula is:
u = f_X * X / Z + c_X, v = f_Y * Y / Z + c_Y, d = s * Z (6)
in formula (6), s is the scale factor; after projection, the gray-level image of each plane is reconstructed, and after gray-level histogram equalization of this image the result is clearer and the noise introduced by the depth data is reduced.
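A minimal sketch of formula (6): projecting 3-D points back onto the image plane, rebuilding a gray image and equalizing its histogram (the intrinsics are placeholder values and nearest-pixel splatting is an illustrative choice):

```python
import numpy as np
import cv2

def reproject_and_equalize(points, gray_vals, shape, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """points: Nx3 array (X, Y, Z); gray_vals: length-N gray values; shape: (H, W)."""
    X, Y, Z = points[:, 0], points[:, 1], points[:, 2]
    u = np.round(fx * X / Z + cx).astype(int)   # u = f_X * X / Z + c_X
    v = np.round(fy * Y / Z + cy).astype(int)   # v = f_Y * Y / Z + c_Y
    img = np.zeros(shape, dtype=np.uint8)
    ok = (u >= 0) & (u < shape[1]) & (v >= 0) & (v < shape[0]) & (Z > 0)
    img[v[ok], u[ok]] = gray_vals[ok]
    return cv2.equalizeHist(img)                # gray-level histogram equalization
```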
The invention has the following beneficial effects: the accuracy of the algorithm is significantly higher than that of the ORB algorithm or the ICP algorithm used alone, and although parameters such as the calibration error of the camera have some influence on the result, the accuracy of the algorithm is still far higher than that of the other algorithms. In addition, because the algorithm of the invention builds on the ORB algorithm and inserts the uniformity judgment of the regional feature point distribution together with the plane extraction and random sampling consistency method, its running time is slightly longer than that of the ORB algorithm used alone, but this does not affect its practicability. The ICP algorithm, by contrast, requires too much computation and running time to be practical and raises the hardware requirements of the mobile robot. The algorithm of the invention sacrifices only a negligible amount of extra running time while greatly improving the accuracy of feature point extraction. Therefore, the invention improves the accuracy of feature point extraction while keeping the running time low, thereby improving the positioning accuracy and the real-time performance of the robot.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows the freiburg1_desk picture before (left) and after (right) feature point extraction in the present invention;
FIG. 3 shows the freiburg1_room picture before (left) and after (right) feature point extraction in the present invention;
FIG. 4 shows the freiburg1_teddy picture before (left) and after (right) feature point extraction in the present invention.
Detailed Description
The following describes in detail the optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sampling consistency algorithm according to the present invention, with reference to the accompanying drawings and specific embodiments.
Example 1
As shown in fig. 1, an optimized ORB algorithm based on RGB-D camera combined with plane detection and random sampling consensus algorithm, the method comprises the following steps:
s1: acquiring image data by using an RGB-D camera, wherein the image data comprises a color image and a depth image;
s2: using an ORB algorithm to extract the feature points of the image data, and using a feature point uniformity evaluation method to judge the distribution uniformity of the feature points;
s3: generating point clouds and performing down-sampling on the point clouds for the image data part with uniformly distributed characteristic points;
s4: performing plane detection extraction on the point cloud subjected to down-sampling, and eliminating mismatching by using a random sampling consistency algorithm;
s5: for the image data portions with non-uniformly distributed feature points, using a set threshold to extract feature points and removing overlapping feature points by a non-maximum suppression method;
s6: projecting the point cloud obtained after eliminating mismatches in S4 and the feature points obtained after removing overlapping feature points in S5 back onto a two-dimensional image plane, and reconstructing and equalizing the gray-level image.
Preferably, in S1, the RGB-D camera includes a Kinect camera.
Preferably, S2 specifically includes the following steps:
s21: using the Oriented FAST feature point extraction algorithm (the oFAST algorithm), judging whether a candidate point x is a feature point; when x is judged to be a feature point, calculating its main direction and naming it a key point (i.e., the detector), so that the key point has directionality;
the method for judging whether the candidate point x is a feature point comprises: drawing a circle centered on x that passes through 16 pixel points, and checking whether, among the 16 pixel points arranged on the circumference, there are at least 12 consecutive pixel points whose gray values are all greater than I_x + t or all smaller than I_x - t; if this requirement is met, x is judged to be a feature point;
wherein I_x represents the gray value of the candidate point x and I represents a gray value; t is a threshold, an adjustment of the acceptance range; the value of t is set to screen out pixel points whose gray values are close to that of x: if t is not set or is too small, too many pixel points satisfy the condition and too many feature points are produced, while if t is too large, too few feature points are detected;
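A minimal sketch showing the effect of the threshold t on the number of detected corners, using OpenCV's FAST detector (the image file name is a placeholder; the three t values are arbitrary):

```python
import cv2

img = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
for t in (5, 20, 60):
    fast = cv2.FastFeatureDetector_create(threshold=t, nonmaxSuppression=True)
    kps = fast.detect(img, None)
    # A small t accepts too many corners, a large t too few, as described above.
    print(f"t = {t}: {len(kps)} feature points")
```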
s22: using the rBRIEF feature point description algorithm, performing Gaussian smoothing on the image data; taking each key point as the center, selecting pixel points y in its neighborhood to form 6 point pairs (x_i, y_i), where the point pairs are chosen at random; comparing the gray values I(x_i) and I(y_i), where I denotes the gray value: the bit is 1 if I(x_i) > I(y_i) and 0 otherwise, generating an n-dimensional binary feature descriptor (n = 6 in this embodiment); the n point pairs (x_i, y_i) are defined as a 2 x 6 matrix S,
and S is rotated by the angle θ:
Sθ = RθS (2)
in formula (2), Sθ represents the matrix S rotated by the angle θ, where θ is the main direction of the key point and Rθ is the corresponding rotation matrix;
the neighborhood is a circle centered on the key point and passing through k pixel points, from which the pixel points y are selected, where 0 < k < n;
s23: the feature point distribution uniformity evaluation method divides the image data in different ways and, after the division, obtains the image data portions with uniformly distributed feature points and the portions with non-uniformly distributed feature points.
Further preferably, the method for evaluating the distribution uniformity of the feature points is as follows: the image data are first divided into several sub-regions S_i; each sub-region S_i is divided again into several secondary sub-regions S_i1 to S_ij, and whether the feature points in S_i are uniformly distributed is evaluated from the numbers of feature points in the secondary sub-regions; the numbers of feature points in S_i1 to S_ij are judged to be similar as follows: compute the variance of the feature point counts over the secondary sub-regions; when the variance is less than 15, the feature points of the sub-region S_i are judged to be uniformly distributed, otherwise they are judged to be non-uniformly distributed.
Further preferably, the image data division methods include division from the center towards the periphery, division from top left to bottom right, or division from bottom left to top right.
In the embodiment, the feature point distribution condition of each region is judged by image segmentation, so that different algorithms are applied according to different distribution conditions.
Preferably, S3 specifically includes the following steps:
s31: according to the color image and the depth image of the portions with uniformly distributed feature points, the point cloud is obtained with the following formula (3):
Z = d / s, X = (u - c_X) * Z / f_X, Y = (v - c_Y) * Z / f_Y (3)
formula (3) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are the camera intrinsic parameters; u and v are the pixel coordinates of a feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud;
s32: the point cloud is processed with a grid filter, planes are extracted from it, and the more distant points are filtered out with a pass-through (interval) filter in the z direction. The more distant points lie on planes that are too far from the other points; if they were not filtered out, an extracted plane could contain only a single such point, which would increase the number of unsuitable feature points.
This embodiment obtains color information (i.e., gray-scale values) from the color image and distance information from the depth image, so that the 3-D camera coordinates of the pixels can be calculated; in this way the point cloud is generated from the RGB image and the depth image.
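A minimal sketch of the grid (voxel) down-sampling and the z-direction pass-through filtering mentioned above, in plain NumPy (the voxel size and the z range are illustrative assumptions):

```python
import numpy as np

def voxel_downsample(points, voxel=0.02):
    """Average the points falling into each voxel-grid cell (grid filter)."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = inv.reshape(-1)
    counts = np.bincount(inv).astype(float)
    out = np.zeros((len(counts), 3))
    for dim in range(3):
        out[:, dim] = np.bincount(inv, weights=points[:, dim]) / counts
    return out

def passthrough_z(points, z_min=0.3, z_max=4.0):
    """Keep only points whose depth lies inside [z_min, z_max]."""
    mask = (points[:, 2] >= z_min) & (points[:, 2] <= z_max)
    return points[mask]
```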
Preferably, S4 specifically includes the following steps:
s41: the formula for plane extraction is as follows:
aX+bY+cZ+d=0 (4)
in formula (4), a, b, c and d represent constants;
s42: extracting planes from the noisy point cloud data with the random sampling consistency method, under the following extraction conditions: as long as the number of remaining points of the point cloud is more than 30% of the total number and fewer than 3 planes have been extracted, a plane is extracted and its feature points are taken; after the feature points are extracted, plane extraction is performed again on the remaining points, until 3 planes have been extracted or the number of remaining points falls below 30% of the total.
Preferably, S5 specifically includes the following steps:
s51: changing the threshold t of the oFAST algorithm of S2 to a new value t', narrowing the range between I_x + t' and I_x - t'; t' is adjusted according to the feature point extraction result until its optimal value is determined;
s52: if several key points exist in the neighborhood of a certain key point, comparing their values J, keeping the key point with the largest J and deleting the rest, where J is defined by formula (5):
in formula (5), I_xy - I_x and I_x - I_xy represent the differences between the gray value of the key point x and the gray values of the pixel points on its surrounding circle.
Preferably, in S6, the projection formula is:
u = f_X * X / Z + c_X, v = f_Y * Y / Z + c_Y, d = s * Z (6)
in formula (6), s is the scale factor, which can be selected according to the actual situation; after projection, the gray-level image of each plane is reconstructed, and after gray-level histogram equalization of this image the result is clearer and the noise introduced by the depth data is reduced.
The feature point extraction accuracy and the running time of the different algorithms and of the algorithm of the invention are compared using the values in Tables 1 and 2.
As shown in Figs. 2 to 4, the invention extracts feature points from the desk, the room and the teddy-bear pictures respectively; the extracted feature points are uniformly distributed, representative and accurate, so that the subsequently reconstructed images are clearer.
Table 1 compares the feature point extraction accuracy of the algorithm of the invention, the ORB algorithm and the ICP algorithm on 3 pictures

Algorithm | freiburg1_desk | freiburg1_room | freiburg1_teddy |
---|---|---|---|
ORB algorithm | 83.45% | 76.09% | 85.64% |
ICP algorithm | 83.18% | 83.56% | 85.08% |
Algorithm of the invention | 93.44% | 90.28% | 97.21% |
Table 2 compares the running time of the algorithm of the invention, the ORB algorithm and the ICP algorithm on 3 pictures

Algorithm | freiburg1_desk | freiburg1_room | freiburg1_teddy |
---|---|---|---|
ORB algorithm | 0.7523 | 0.9735 | 0.6285 |
ICP algorithm | 1.0028 | 1.5236 | 0.9658 |
Algorithm of the invention | 0.7886 | 1.0032 | 0.6422 |
From Table 1 it can be seen that, whether feature points are extracted from a single object or from multiple objects, the accuracy of the algorithm of the invention is significantly higher than that of the ORB algorithm or the ICP algorithm used alone; although parameters such as the calibration error of the camera have some influence on the result, the accuracy of the algorithm of the invention is still much higher than that of the other algorithms.
As can be seen from Table 2, because the algorithm of the invention builds on the ORB algorithm and inserts the uniformity judgment of the regional feature point distribution together with the plane extraction and random sampling consistency method, its running time is slightly longer than that of the ORB algorithm used alone, but this does not affect its practicability. The ICP algorithm, by contrast, requires too much computation and running time to be practical and raises the hardware requirements of the mobile robot. The algorithm of the invention sacrifices only a negligible amount of extra running time while greatly improving the accuracy of feature point extraction.
Example 2
This example differs from example 1 only in that: in S21, m is 9; and in S42, planes are extracted as long as the number of remaining points of the point cloud is more than 40% of the total number and fewer than 5 planes have been extracted.
Example 3
This example differs from example 1 only in that: in S21, m is 10; and in S42, planes are extracted as long as the number of remaining points of the point cloud is more than 20% of the total number and fewer than 4 planes have been extracted.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "front", "rear", "left", "right", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (8)
1. An optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sampling consensus algorithm is characterized by comprising the following steps:
s1: acquiring image data by using an RGB-D camera, wherein the image data comprises a color image and a depth image;
s2: using an ORB algorithm to extract the feature points of the image data, and using a feature point uniformity evaluation method to judge the distribution uniformity of the feature points;
s3: generating point clouds and performing down-sampling on the point clouds for the image data part with uniformly distributed characteristic points;
s4: performing plane detection extraction on the point cloud subjected to down-sampling, and eliminating mismatching by using a random sampling consistency algorithm;
s5: for the image data portions with non-uniformly distributed feature points, using a set threshold to extract feature points and removing overlapping feature points by a non-maximum suppression method;
s6: projecting the point cloud obtained after eliminating mismatches in S4 and the feature points obtained after removing overlapping feature points in S5 back onto a two-dimensional image plane, and reconstructing and equalizing the gray-level image.
2. The optimized ORB algorithm based on the combination of RGB-D camera with planar detection and random sampling consensus algorithm as claimed in claim 1, wherein in S1, the RGB-D camera comprises a Kinect camera.
3. The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 1, wherein S2 comprises the following steps:
s21: using the Oriented FAST feature point extraction algorithm, judging whether a candidate point x is a feature point; when x is judged to be a feature point, calculating its main direction and naming it a key point, so that the key point has directionality;
the method for judging whether the candidate point x is a feature point comprises: drawing a circle centered on x that passes through n pixel points, and checking whether, among the n pixel points arranged on the circumference, there are at least m consecutive pixel points whose gray values are all greater than I_x + t or all smaller than I_x - t; if this requirement is met, x is judged to be a feature point; wherein I_x represents the gray value of the candidate point x, I represents a gray value, t represents a threshold which adjusts the acceptance range, n is 16, and 9 ≤ m ≤ 12;
s22: using the rBRIEF feature point description algorithm, performing Gaussian smoothing on the image data; taking each key point as the center, selecting pixel points y in its neighborhood to form n point pairs (x_i, y_i); comparing the gray values I(x_i) and I(y_i), where I denotes the gray value: the bit is 1 if I(x_i) > I(y_i) and 0 otherwise, which generates an n-dimensional binary feature descriptor; the n point pairs (x_i, y_i) are defined as a 2 x n matrix S,
and S is rotated by the angle θ:
Sθ = RθS (2)
in formula (2), Sθ represents the matrix S rotated by the angle θ, where θ is the main direction of the key point and Rθ is the corresponding rotation matrix;
the neighborhood is a circle centered on the key point and passing through k pixel points, from which the pixel points y are selected, where 0 < k < n;
s23: the feature point distribution uniformity evaluation method divides the image data in different ways and, after the division, obtains the image data portions with uniformly distributed feature points and the portions with non-uniformly distributed feature points.
4. The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 3, wherein the method for evaluating the distribution uniformity of the feature points is as follows: the image data are first divided into several sub-regions S_i; each sub-region S_i is divided again into several secondary sub-regions S_i1 to S_ij, and whether the feature points in S_i are uniformly distributed is evaluated from the numbers of feature points in the secondary sub-regions; the numbers of feature points in S_i1 to S_ij are judged to be similar as follows: compute the variance of the feature point counts over the secondary sub-regions; when the variance is less than 15, the feature points of the sub-region S_i are judged to be uniformly distributed, otherwise they are judged to be non-uniformly distributed.
5. The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 3, wherein the image data division methods include division from the center towards the periphery, division from top left to bottom right, or division from bottom left to top right.
6. The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 5, wherein S3 comprises the following steps:
s31: according to the color image and the depth image of the portions with uniformly distributed feature points, the point cloud is obtained with the following formula (3):
Z = d / s, X = (u - c_X) * Z / f_X, Y = (v - c_Y) * Z / f_Y (3)
formula (3) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are the camera intrinsic parameters; u and v are the pixel coordinates of a feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud;
s32: processing the point cloud with a grid filter and extracting planes from it.
7. The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 6, wherein S4 comprises the following steps:
s41: the formula for plane extraction is as follows:
aX+bY+cZ+d=0 (4)
in formula (4), a, b, c and d represent constants;
s42: extracting planes from the noisy point cloud data with the random sampling consistency method, under the following extraction conditions: as long as the number of remaining points is greater than a threshold g of the total number of points in the cloud and fewer than h planes have been extracted, a plane is extracted and its feature points are taken; after the feature points are extracted, plane extraction is performed again on the remaining points, until the number of extracted planes reaches the threshold h or the number of remaining points falls below the threshold g;
wherein 20% ≤ g ≤ 40% and 3 ≤ h ≤ 5.
8. The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 7, wherein S5 comprises the following steps:
s51: changing the threshold t of the oFAST algorithm of S2 to a new value t', narrowing the range between I_x + t' and I_x - t'; t' is adjusted according to the feature point extraction result until its optimal value is determined;
s52: if several key points exist in the neighborhood of a certain key point, comparing their values J, keeping the key point with the largest J and deleting the rest, where J is defined by formula (5):
in formula (5), I_xy - I_x and I_x - I_xy represent the differences between the gray value of the key point x and the gray values of the pixel points on its surrounding circle.
The optimized ORB algorithm based on the RGB-D camera combined with the planar detection and random sampling consensus algorithm as claimed in claim 1, wherein in S6, the projection formula is:
u = f_X * X / Z + c_X, v = f_Y * Y / Z + c_Y, d = s * Z (6)
in formula (6), s is the scale factor; after projection, the gray-level image of each plane is reconstructed, and after gray-level histogram equalization of this image the result is clearer and the noise introduced by the depth data is reduced.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010985540.5A CN112115953B (en) | 2020-09-18 | 2020-09-18 | Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010985540.5A CN112115953B (en) | 2020-09-18 | 2020-09-18 | Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112115953A true CN112115953A (en) | 2020-12-22 |
CN112115953B CN112115953B (en) | 2023-07-11 |
Family
ID=73800133
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010985540.5A Active CN112115953B (en) | 2020-09-18 | 2020-09-18 | Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112115953B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107220995A (en) * | 2017-04-21 | 2017-09-29 | 西安交通大学 | A kind of improved method of the quick point cloud registration algorithms of ICP based on ORB characteristics of image |
US20190362178A1 (en) * | 2017-11-21 | 2019-11-28 | Jiangnan University | Object Symmetry Axis Detection Method Based on RGB-D Camera |
CN110414533A (en) * | 2019-06-24 | 2019-11-05 | 东南大学 | A kind of feature extracting and matching method for improving ORB |
Non-Patent Citations (1)
Title |
---|
JIANGYING QIN et al.: "Accumulative Errors Optimization for Visual Odometry of ORB-SLAM2 Based on RGB-D Cameras", ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION *
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11487288B2 (en) | 2017-03-23 | 2022-11-01 | Tesla, Inc. | Data synthesis for autonomous control systems |
US12020476B2 (en) | 2017-03-23 | 2024-06-25 | Tesla, Inc. | Data synthesis for autonomous control systems |
US11403069B2 (en) | 2017-07-24 | 2022-08-02 | Tesla, Inc. | Accelerated mathematical engine |
US11681649B2 (en) | 2017-07-24 | 2023-06-20 | Tesla, Inc. | Computational array microprocessor system using non-consecutive data formatting |
US11409692B2 (en) | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
US12086097B2 (en) | 2017-07-24 | 2024-09-10 | Tesla, Inc. | Vector computational unit |
US11893393B2 (en) | 2017-07-24 | 2024-02-06 | Tesla, Inc. | Computational array microprocessor system with hardware arbiter managing memory requests |
US11561791B2 (en) | 2018-02-01 | 2023-01-24 | Tesla, Inc. | Vector computational unit receiving data elements in parallel from a last row of a computational array |
US11797304B2 (en) | 2018-02-01 | 2023-10-24 | Tesla, Inc. | Instruction set architecture for a vector computational unit |
US11734562B2 (en) | 2018-06-20 | 2023-08-22 | Tesla, Inc. | Data pipeline and deep learning system for autonomous driving |
US11841434B2 (en) | 2018-07-20 | 2023-12-12 | Tesla, Inc. | Annotation cross-labeling for autonomous control systems |
US12079723B2 (en) | 2018-07-26 | 2024-09-03 | Tesla, Inc. | Optimizing neural network structures for embedded systems |
US11636333B2 (en) | 2018-07-26 | 2023-04-25 | Tesla, Inc. | Optimizing neural network structures for embedded systems |
US11562231B2 (en) | 2018-09-03 | 2023-01-24 | Tesla, Inc. | Neural networks for embedded devices |
US11983630B2 (en) | 2018-09-03 | 2024-05-14 | Tesla, Inc. | Neural networks for embedded devices |
US11893774B2 (en) | 2018-10-11 | 2024-02-06 | Tesla, Inc. | Systems and methods for training machine models with augmented data |
US11665108B2 (en) | 2018-10-25 | 2023-05-30 | Tesla, Inc. | QoS manager for system on a chip communications |
US11816585B2 (en) | 2018-12-03 | 2023-11-14 | Tesla, Inc. | Machine learning models operating at different frequencies for autonomous vehicles |
US11537811B2 (en) | 2018-12-04 | 2022-12-27 | Tesla, Inc. | Enhanced object detection for autonomous vehicles based on field view |
US11908171B2 (en) | 2018-12-04 | 2024-02-20 | Tesla, Inc. | Enhanced object detection for autonomous vehicles based on field view |
US11610117B2 (en) | 2018-12-27 | 2023-03-21 | Tesla, Inc. | System and method for adapting a neural network model on a hardware platform |
US12014553B2 (en) | 2019-02-01 | 2024-06-18 | Tesla, Inc. | Predicting three-dimensional features for autonomous driving |
US11748620B2 (en) | 2019-02-01 | 2023-09-05 | Tesla, Inc. | Generating ground truth for machine learning from time series elements |
US11567514B2 (en) | 2019-02-11 | 2023-01-31 | Tesla, Inc. | Autonomous and user controlled vehicle summon to a target |
US11790664B2 (en) | 2019-02-19 | 2023-10-17 | Tesla, Inc. | Estimating object properties using visual image data |
CN112783995B (en) * | 2020-12-31 | 2022-06-03 | 杭州海康机器人技术有限公司 | V-SLAM map checking method, device and equipment |
CN112783995A (en) * | 2020-12-31 | 2021-05-11 | 杭州海康机器人技术有限公司 | V-SLAM map checking method, device and equipment |
CN112752028A (en) * | 2021-01-06 | 2021-05-04 | 南方科技大学 | Pose determination method, device and equipment of mobile platform and storage medium |
US12136030B2 (en) | 2023-03-16 | 2024-11-05 | Tesla, Inc. | System and method for adapting a neural network model on a hardware platform |
Also Published As
Publication number | Publication date |
---|---|
CN112115953B (en) | 2023-07-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||