CN109559320A - Method and system for visual SLAM semantic mapping based on atrous convolution deep neural network - Google Patents
Method and system for visual SLAM semantic mapping based on atrous convolution deep neural network
- Publication number: CN109559320A (application CN201811388678.6A)
- Authority: CN (China)
- Prior art keywords: semantic, point, mapping, visual SLAM, neural network
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/11 — Image analysis: region-based segmentation (Segmentation; Edge detection)
- G06F18/22 — Pattern recognition: matching criteria, e.g. proximity measures
- G06F18/23 — Pattern recognition: clustering techniques
- G06F18/2411 — Pattern recognition: classification based on the proximity to a decision surface, e.g. support vector machines
- G06N3/045 — Neural networks: combinations of networks
- G06T2207/10016 — Image acquisition modality: video; image sequence
Abstract
The present invention relates to a method for visual SLAM semantic mapping based on an atrous convolution deep neural network, comprising: (1) an embedded development processor acquires the color and depth information of the current environment through an RGB-D camera; (2) feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained; (3) pixel-level semantic segmentation is applied to the images with deep learning, the result is mapped between the image coordinate system and the world coordinate system, and the spatial points are given semantic labels; (4) errors introduced by the semantic segmentation are eliminated through manifold clustering; (5) semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points. The invention further relates to a system for visual SLAM semantic mapping based on an atrous convolution deep neural network. With this method and system, the spatial grid map carries higher-level semantic information and better satisfies the requirements of real-time mapping.
Description
Technical field
The present invention relates to the field of real-time localization and mapping for unmanned systems, and in particular to the field of semantic segmentation in image processing; specifically, it relates to a method and system for visual SLAM semantic mapping based on an atrous convolution deep neural network.
Background art
Unmanned systems have developed rapidly in recent years; autonomous driving, robots and drones are all typical unmanned systems. Visual SLAM (Simultaneous Localization and Mapping) systems have been widely used for the localization and path planning of unmanned systems, such as ORB-SLAM proposed by Mur-Artal et al. in 2015 (Mur-Artal R, Montiel J M M, Tardós J D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System [J]. IEEE Transactions on Robotics, 2015, 31(5): 1147-1163). However, the spatial grid map established in a visual SLAM system contains only low-level information, such as color and range, which hinders the robot's understanding of the current scene. We therefore introduce a deep-learning-based semantic segmentation network into the map-building process of the visual SLAM system, giving the robot a semantic understanding of the current scene.
The purpose of semantic segmentation is to achieve accurate segmentation between all classes of targets for scene understanding; it can be used in autonomous driving or robotics to help identify targets and their relationships. The DeepLab deep neural network architecture proposed by Google is now widely used in the semantic segmentation field (L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv:1606.00915, 2016). However, general semantic segmentation networks compute too slowly to run in real time and are therefore hard to apply in embedded systems. Moreover, semantic segmentation suffers from blurred object-boundary contours, false detections, and missed detections.
We apply semantic segmentation to semantic mapping in a visual SLAM system, so that every coordinate point in the established spatial grid map carries high-level semantic information, allowing the robot to understand the targets in the current scene at the semantic level; the errors introduced by semantic segmentation are reduced with a spatial manifold clustering algorithm, making the constructed semantic map more accurate.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings of the above prior art and to provide a method and system for visual SLAM semantic mapping based on an atrous convolution deep neural network that combine deep learning with visual SLAM, give the robot a semantic-level understanding of scene targets, and reduce semantic segmentation errors.
To achieve the above goals, the method for visual SLAM semantic mapping based on an atrous convolution deep neural network of the present invention is mainly characterized in that the method comprises the following steps:
(1) The embedded development processor acquires the color and depth information of the current environment through an RGB-D camera;
(2) Feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained;
(3) Pixel-level semantic segmentation is applied to the images with deep learning, mapped between the image coordinate system and the world coordinate system, giving the spatial points semantic labels;
(4) Errors introduced by the semantic segmentation are eliminated through manifold clustering;
(5) Semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points.
Wherein, the embedded processor in step (1) includes the NVIDIA Jetson TX2 system.
Preferably, step (2) comprises the following steps:
(2.1) Extract image feature points with visual SLAM techniques and perform feature matching to obtain feature-point matching pairs;
(2.2) Solve the current camera pose from 3D point pairs;
(2.3) Refine the pose estimate with the graph-optimization method Bundle Adjustment;
(2.4) Eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data.
Preferably, the pixel-level semantic segmentation of the images in step (3) specifically comprises the following steps:
(3.1) Pass through the feature extraction layers of a GoogLeNet improved with atrous convolution;
(3.2) Pass through the multi-scale extraction layers of the GoogLeNet improved with atrous convolution;
(3.3) Classify the image pixels according to the extracted results.
Preferably, step (3.1) further includes the design of the feature extraction layers, specifically comprising the following steps:
(3.1.1) Change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1;
(3.1.2) Replace parts of Inception(4a), Inception(4b), Inception(4c), Inception(4d) and Inception(4e) in the GoogLeNet network with atrous convolution, setting the atrous convolution to 5 × 5 with a dilation-2 Pool;
(3.1.3) Change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1.
Preferably, step (3.2) further includes the design of the multi-scale extraction layers, specifically comprising the following steps:
(3.2.1) Perform multi-scale processing based on spatial pyramid pooling;
(3.2.2) Extract feature maps of different scales through 1 × 1 convolution and atrous convolutions with different sampling rates;
(3.2.3) Fuse image-pooling features into the module, pass the feature maps through 1 × 1 convolution to obtain the fused features, and feed them into a Softmax layer for per-pixel semantic classification.
Preferably, step (4) specifically comprises the following steps:
(4.1) Compute the tangent-plane normal vector of each spatial point;
(4.2) Search for a point x_i not yet assigned a class and check whether all points have been clustered; if so, continue with step (4.5); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(4.3) Compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a 0.01 radius of x_i; check whether α_ij < σ or α_ij > 175°; if so, x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q; otherwise, continue with step (4.4);
(4.4) Check whether queue q is non-empty; if so, let x_i = q_1 and continue with step (4.3); otherwise continue with step (4.1);
(4.5) Extract the k classes with the most points and assign the remaining points to the nearest class.
Preferably, the computation of the tangent-plane normal vector of a spatial point in step (4.1) is specifically:
The tangent-plane normal vector of a spatial point is computed from the eigenvalue equation
Σw = aw,
where Σ is the covariance matrix of the neighboring points, w ∈ R^(3×1) is the unit normal vector of the plane, and a is the corresponding eigenvalue.
Preferably, step (5) comprises the following steps:
(5.1) Remove point clouds whose depth value is too large or invalid, according to the precision characteristics of the RGB-D camera;
(5.2) Remove isolated spatial points with a statistical filtering method: compute, for each spatial point, the mean distance to its N nearest spatial points, and remove the points whose mean distance is too large;
(5.3) Fill all spatial point clouds into a voxel grid, so that each voxel retains only one spatial point.
The system for visual SLAM semantic mapping based on an atrous convolution deep neural network built on the above method is mainly characterized in that the system includes:
an embedded development processor for constructing the visual SLAM semantic map;
an RGB-D camera connected to the embedded development processor for acquiring color data and depth data;
a map builder which, at runtime, realizes visual SLAM semantic mapping through the embedded development processor and the RGB-D camera according to deep learning and visual SLAM, specifically performing the following processing steps:
(1) The embedded development processor acquires the color and depth information of the current environment through an RGB-D camera;
(2) Feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained;
(3) Pixel-level semantic segmentation is applied to the images with deep learning, mapped between the image coordinate system and the world coordinate system, giving the spatial points semantic labels;
(4) Errors introduced by the semantic segmentation are eliminated through manifold clustering;
(5) Semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points.
Preferably, the embedded processor in step (1) includes the NVIDIA Jetson TX2 system.
Preferably, step (2) comprises the following steps:
(2.1) Extract image feature points with visual SLAM techniques and perform feature matching to obtain feature-point matching pairs;
(2.2) Solve the current camera pose from 3D point pairs;
(2.3) Refine the pose estimate with the graph-optimization method Bundle Adjustment;
(2.4) Eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data.
Preferably, the pixel-level semantic segmentation of the images in step (3) specifically comprises the following steps:
(3.1) Pass through the feature extraction layers of a GoogLeNet improved with atrous convolution;
(3.2) Pass through the multi-scale extraction layers of the GoogLeNet improved with atrous convolution;
(3.3) Classify the image pixels according to the extracted results.
Preferably, step (3.1) further includes the design of the feature extraction layers, specifically comprising the following steps:
(3.1.1) Change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1;
(3.1.2) Replace parts of Inception(4a), Inception(4b), Inception(4c), Inception(4d) and Inception(4e) in the GoogLeNet network with atrous convolution, setting the atrous convolution to 5 × 5 with a dilation-2 Pool;
(3.1.3) Change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1.
Preferably, step (3.2) further includes the design of the multi-scale extraction layers, specifically comprising the following steps:
(3.2.1) Perform multi-scale processing based on spatial pyramid pooling;
(3.2.2) Extract feature maps of different scales through 1 × 1 convolution and atrous convolutions with different sampling rates;
(3.2.3) Fuse image-pooling features into the module, pass the feature maps through 1 × 1 convolution to obtain the fused features, and feed them into a Softmax layer for per-pixel semantic classification.
Preferably, step (4) specifically comprises the following steps:
(4.1) Compute the tangent-plane normal vector of each spatial point;
(4.2) Search for a point x_i not yet assigned a class and check whether all points have been clustered; if so, continue with step (4.5); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(4.3) Compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a 0.01 radius of x_i; check whether α_ij < σ or α_ij > 175°; if so, x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q; otherwise, continue with step (4.4);
(4.4) Check whether queue q is non-empty; if so, let x_i = q_1 and continue with step (4.3); otherwise continue with step (4.1);
(4.5) Extract the k classes with the most points and assign the remaining points to the nearest class.
Preferably, the computation of the tangent-plane normal vector of a spatial point in step (4.1) is specifically:
The tangent-plane normal vector of a spatial point is computed from the eigenvalue equation
Σw = aw,
where Σ is the covariance matrix of the neighboring points, w ∈ R^(3×1) is the unit normal vector of the plane, and a is the corresponding eigenvalue.
Preferably, step (5) comprises the following steps:
(5.1) Remove point clouds whose depth value is too large or invalid, according to the precision characteristics of the RGB-D camera;
(5.2) Remove isolated spatial points with a statistical filtering method: compute, for each spatial point, the mean distance to its N nearest spatial points, and remove the points whose mean distance is too large;
(5.3) Fill all spatial point clouds into a voxel grid, so that each voxel retains only one spatial point.
Using the method for the invention for building figure function based on empty convolution deep neural network realization vision SLAM semanteme
And system, system use embedded development processor, by the collected color data of RGB-D camera and depth data, benefit
With vision SLAM technology, image characteristic point is extracted, carries out characteristic matching, the method for recycling Bundle Adjustment obtains
More accurate robot pose estimation, the cumulative errors for eliminating interframe are detected using winding.Letter is positioned in real time obtaining robot
It is refreshing using depth is improved using a kind of empty convolution design method for GoogLeNet deep neural network while breath
Semantic segmentation result combination vision SLAM system is obtained building for semantic class by the feature extraction through the real-time semantic segmentation of network implementations
Figure.And clustered by manifold and eliminate error brought by optimization semantic segmentation, after building figure by Octree, spatial network map tool
There is more advanced semantic information, and the semantic map constructed is more accurate.The improvement of network improves the real-time place of system
Time loss of the semantic segmentation network of reason ability, this method and system on NVIDIA Jetson TX2 platform is 0.099s/
Width meets use demand during building figure in real time.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention for visual SLAM semantic mapping based on an atrous convolution deep neural network.
Fig. 2 is a flow chart of the semantic segmentation in the method of the present invention for visual SLAM semantic mapping based on an atrous convolution deep neural network.
Fig. 3 is a schematic diagram of the atrous convolution in the method of the present invention for visual SLAM semantic mapping based on an atrous convolution deep neural network.
Fig. 4 is a schematic diagram of the experimental results of the method of the present invention for visual SLAM semantic mapping based on an atrous convolution deep neural network.
Fig. 5 is a schematic diagram of the NVIDIA Jetson TX2 processor of the method and system of the present invention for visual SLAM semantic mapping based on an atrous convolution deep neural network.
Specific embodiment
In order to describe the technical content of the present invention more clearly, it is further described below in combination with specific embodiments.
The method for visual SLAM semantic mapping based on an atrous convolution deep neural network comprises the following steps:
(1) The embedded development processor acquires the color and depth information of the current environment through an RGB-D camera;
(2) Feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained:
(2.1) Extract image feature points with visual SLAM techniques and perform feature matching to obtain feature-point matching pairs;
(2.2) Solve the current camera pose from 3D point pairs;
(2.3) Refine the pose estimate with the graph-optimization method Bundle Adjustment;
(2.4) Eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data (a code sketch of steps (2.1) and (2.2) follows this list);
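As a concrete illustration of steps (2.1) and (2.2), the following is a minimal sketch assuming OpenCV ORB features for matching and a closed-form SVD (Horn/Kabsch) solution for the 3D-3D pose; the helper names, feature count and descriptor-distance threshold are illustrative assumptions, not values prescribed by the invention.

```python
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def match_features(img_prev, img_curr):
    """Step (2.1): ORB keypoints and brute-force matching between two frames."""
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    matches = [m for m in matcher.match(des1, des2) if m.distance < 50]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    return pts1, pts2

def solve_pose_3d3d(P, Q):
    """Step (2.2): closed-form R, t aligning 3D point sets P to Q ((n, 3) arrays),
    obtained by back-projecting the matched pixels with the RGB-D depth."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)              # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # correct an improper rotation
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cq - R @ cp
    return R, t
```

Steps (2.3) and (2.4) would then refine R and t with Bundle Adjustment (e.g. a graph-optimization library such as g2o) and loop-closure detection, as in standard visual SLAM pipelines.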
(3) Pixel-level semantic segmentation is applied to the images with deep learning, mapped between the image coordinate system and the world coordinate system, giving the spatial points semantic labels:
(3.1) Pass through the feature extraction layers of a GoogLeNet improved with atrous convolution (a sketch of the stride/dilation change follows this sub-list):
(3.1.1) Change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1;
(3.1.2) Replace parts of Inception(4a), Inception(4b), Inception(4c), Inception(4d) and Inception(4e) in the GoogLeNet network with atrous convolution, setting the atrous convolution to 5 × 5 with a dilation-2 Pool;
(3.1.3) Change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1;
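The effect of steps (3.1.1)–(3.1.3) can be illustrated with the following PyTorch sketch (the channel count and feature size are arbitrary examples, not the actual GoogLeNet values): setting the pooling stride to 1 keeps the resolution, and a dilation-2 convolution afterwards restores the receptive field that the removed stride would have provided.

```python
import torch
import torch.nn as nn

# Original: a stride-2 max pool halves the feature resolution.
pool_orig = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
# Modified as in steps (3.1.1)/(3.1.3): stride 1 keeps the feature size.
pool_mod = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
# A 3 x 3 convolution after the modified pool uses dilation 2 so that its
# receptive field matches what it would have seen behind a stride-2 pool.
conv_dilated = nn.Conv2d(192, 192, kernel_size=3, padding=2, dilation=2)

x = torch.randn(1, 192, 40, 40)
print(pool_orig(x).shape)               # torch.Size([1, 192, 20, 20])
print(conv_dilated(pool_mod(x)).shape)  # torch.Size([1, 192, 40, 40])
```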
(3.2) Pass through the multi-scale extraction layers of the GoogLeNet improved with atrous convolution:
(3.2.1) Perform multi-scale processing based on spatial pyramid pooling;
(3.2.2) Extract feature maps of different scales through 1 × 1 convolution and atrous convolutions with different sampling rates;
(3.2.3) Fuse image-pooling features into the module, pass the feature maps through 1 × 1 convolution to obtain the fused features, and feed them into a Softmax layer for per-pixel semantic classification;
(3.3) Classify the image pixels according to the extracted results;
(4) Errors introduced by the semantic segmentation are eliminated through manifold clustering:
(4.1) Compute the tangent-plane normal vector of each spatial point;
(4.2) Search for a point x_i not yet assigned a class and check whether all points have been clustered; if so, continue with step (4.5); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(4.3) Compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a 0.01 radius of x_i; check whether α_ij < σ or α_ij > 175°; if so, x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q; otherwise, continue with step (4.4);
(4.4) Check whether queue q is non-empty; if so, let x_i = q_1 and continue with step (4.3); otherwise continue with step (4.1);
(4.5) Extract the k classes with the most points and assign the remaining points to the nearest class;
(5) Semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points:
(5.1) Remove point clouds whose depth value is too large or invalid, according to the precision characteristics of the RGB-D camera;
(5.2) Remove isolated spatial points with a statistical filtering method: compute, for each spatial point, the mean distance to its N nearest spatial points, and remove the points whose mean distance is too large;
(5.3) Fill all spatial point clouds into a voxel grid, so that each voxel retains only one spatial point.
As a preferred embodiment of the present invention, the embedded processor in step (1) includes the NVIDIA Jetson TX2 system.
As a preferred embodiment of the present invention, the computation of the tangent-plane normal vector of a spatial point in step (4.1) is specifically:
The tangent-plane normal vector of a spatial point is computed from the eigenvalue equation
Σw = aw,
where Σ is the covariance matrix of the neighboring points, w ∈ R^(3×1) is the unit normal vector of the plane, and a is the corresponding eigenvalue.
The system for visual SLAM semantic mapping based on an atrous convolution deep neural network built on the above method includes:
an embedded development processor for constructing the visual SLAM semantic map;
an RGB-D camera connected to the embedded development processor for acquiring color data and depth data;
a map builder which, at runtime, realizes visual SLAM semantic mapping through the embedded development processor and the RGB-D camera according to deep learning and visual SLAM, specifically performing the following processing steps:
(1) The embedded development processor acquires the color and depth information of the current environment through an RGB-D camera;
(2) Feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained:
(2.1) Extract image feature points with visual SLAM techniques and perform feature matching to obtain feature-point matching pairs;
(2.2) Solve the current camera pose from 3D point pairs;
(2.3) Refine the pose estimate with the graph-optimization method Bundle Adjustment;
(2.4) Eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data;
(3) Pixel-level semantic segmentation is applied to the images with deep learning, mapped between the image coordinate system and the world coordinate system, giving the spatial points semantic labels:
(3.1) Pass through the feature extraction layers of a GoogLeNet improved with atrous convolution:
(3.1.1) Change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1;
(3.1.2) Replace parts of Inception(4a), Inception(4b), Inception(4c), Inception(4d) and Inception(4e) in the GoogLeNet network with atrous convolution, setting the atrous convolution to 5 × 5 with a dilation-2 Pool;
(3.1.3) Change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1;
(3.2) Pass through the multi-scale extraction layers of the GoogLeNet improved with atrous convolution:
(3.2.1) Perform multi-scale processing based on spatial pyramid pooling;
(3.2.2) Extract feature maps of different scales through 1 × 1 convolution and atrous convolutions with different sampling rates;
(3.2.3) Fuse image-pooling features into the module, pass the feature maps through 1 × 1 convolution to obtain the fused features, and feed them into a Softmax layer for per-pixel semantic classification;
(3.3) Classify the image pixels according to the extracted results;
(4) Errors introduced by the semantic segmentation are eliminated through manifold clustering:
(4.1) Compute the tangent-plane normal vector of each spatial point;
(4.2) Search for a point x_i not yet assigned a class and check whether all points have been clustered; if so, continue with step (4.5); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(4.3) Compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a 0.01 radius of x_i; check whether α_ij < σ or α_ij > 175°; if so, x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q; otherwise, continue with step (4.4);
(4.4) Check whether queue q is non-empty; if so, let x_i = q_1 and continue with step (4.3); otherwise continue with step (4.1);
(4.5) Extract the k classes with the most points and assign the remaining points to the nearest class;
(5) Semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points:
(5.1) Remove point clouds whose depth value is too large or invalid, according to the precision characteristics of the RGB-D camera;
(5.2) Remove isolated spatial points with a statistical filtering method: compute, for each spatial point, the mean distance to its N nearest spatial points, and remove the points whose mean distance is too large;
(5.3) Fill all spatial point clouds into a voxel grid, so that each voxel retains only one spatial point.
As a preferred embodiment of the present invention, the embedded processor in step (1) includes the NVIDIA Jetson TX2 system.
As a preferred embodiment of the present invention, the computation of the tangent-plane normal vector of a spatial point in step (4.1) is specifically:
The tangent-plane normal vector of a spatial point is computed from the eigenvalue equation
Σw = aw,
where Σ is the covariance matrix of the neighboring points, w ∈ R^(3×1) is the unit normal vector of the plane, and a is the corresponding eigenvalue.
In a specific embodiment of the invention, the present invention relates to the technical field of real-time localization and mapping for unmanned robot systems and is a visual SLAM semantic mapping method and system based on an atrous convolution deep neural network. The system uses an embedded development processor: from the color data and depth data collected by the RGB-D camera, image feature points are extracted using visual SLAM techniques and feature matching is performed; the Bundle Adjustment method then yields a more accurate robot pose estimate, and loop-closure detection eliminates inter-frame accumulated errors. While the real-time localization of the robot is obtained, an atrous-convolution design for the GoogLeNet deep neural network is used to achieve real-time semantic segmentation with the improved deep network; the segmentation results are combined with the visual SLAM system to obtain semantic-level mapping, and manifold clustering eliminates errors introduced by the semantic segmentation. After octree mapping, the spatial grid map carries higher-level semantic information, and the constructed semantic map is more accurate.
The method for visual SLAM semantic mapping based on an atrous convolution deep neural network built on the above system comprises the following steps:
(1) Using the embedded development processor, acquire the color and depth information of the current environment through the RGB-D camera;
(2) Extract image feature points from the camera images using visual SLAM techniques and perform feature matching to obtain feature-point matching pairs; solve the current camera pose from 3D point pairs; refine the pose estimate with the graph-optimization method Bundle Adjustment; eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data;
(3) Apply pixel-level semantic segmentation to the images with deep learning and map the result into space using the relationship between the image coordinate system and the world coordinate system, so that each spatial point carries a semantic label;
(4) Optimize away the errors introduced by the semantic segmentation using manifold clustering;
(5) Perform semantic mapping: stitch the spatial point clouds to finally obtain a point-cloud semantic map composed of dense discrete points.
In the above example, the embedded processor in step (1) includes the NVIDIA Jetson TX2 system and devices of the same class.
In the above example, general visual SLAM and its local improvements are used in step (2).
In the above example, the semantic segmentation network in step (3) specifically includes the following structure:
(31) feature extraction layers;
(32) multi-scale extraction layers;
(33) a classification layer.
In the above example, the feature extraction layers of step (31) specifically include the following structure:
(311) Use the GoogLeNet network as the front-end feature extraction layers of the DeepLab model;
(312) Change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1, enlarging the feature dimensions while keeping the output resolution unchanged;
(313) Replace part of Inception(4a) in the GoogLeNet network with atrous convolution, setting dilation 2 and a 5 × 5 Pool, to enlarge the feature dimensions;
(314) Replace part of Inception(4b) in the GoogLeNet network with atrous convolution, setting dilation 2 and a 5 × 5 Pool, to enlarge the feature dimensions;
(315) Replace part of Inception(4c) in the GoogLeNet network with atrous convolution, setting dilation 2 and a 5 × 5 Pool, to enlarge the feature dimensions;
(316) Replace part of Inception(4d) in the GoogLeNet network with atrous convolution, setting dilation 2 and a 5 × 5 Pool, to enlarge the feature dimensions;
(317) Replace part of Inception(4e) in the GoogLeNet network with atrous convolution, setting dilation 2 and a 5 × 5 Pool, to enlarge the feature dimensions;
(318) Change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1, enlarging the feature dimensions while keeping the output resolution unchanged.
For the original GoogLeNet, an input of size 224 produces a feature output of size 7, a 32× reduction. After the strides of the last two pooling layers are changed to 1 and the original ordinary convolutions are changed to atrous convolutions, an input of size 321 produces a feature-map output of size 41 (with output stride 8, (321 − 1)/8 + 1 = 41), only an 8× reduction, which enlarges the feature dimensions.
In the above example, the multi-scale layers of step (32) specifically include the following structure:
(321) Perform multi-scale processing based on spatial pyramid pooling;
(322) Optimize the spatial pyramid pooling model: use 1 × 1 convolution and atrous convolutions with different sampling rates (6, 12, 18) to extract receptive-field features of different scales;
(323) Fuse image-pooling features into the module, then pass all resulting feature maps through 1 × 1 convolution and fuse them (Concat) to obtain the final features, which are fed into a Softmax layer for per-pixel semantic classification. A sketch of such a module is given below.
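A minimal PyTorch sketch of such a multi-scale module follows; the channel counts and class count are illustrative assumptions, while the 1 × 1 branch, the atrous rates (6, 12, 18), the image-pooling branch, the Concat fusion and the Softmax classification follow the structure described in steps (321)–(323).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleModule(nn.Module):
    def __init__(self, in_ch=512, out_ch=256, num_classes=21):
        super().__init__()
        self.conv1x1 = nn.Conv2d(in_ch, out_ch, 1)
        # Atrous branches with the sampling rates named in step (322).
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r)
            for r in (6, 12, 18)])
        # Image-pooling branch fused into the module, step (323).
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(out_ch * 5, out_ch, 1)    # Concat then 1x1
        self.classify = nn.Conv2d(out_ch, num_classes, 1)  # before Softmax

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [self.conv1x1(x)] + [b(x) for b in self.branches]
        pooled = F.interpolate(self.image_pool(x), size=(h, w),
                               mode='bilinear', align_corners=False)
        y = self.project(torch.cat(feats + [pooled], dim=1))
        return F.softmax(self.classify(y), dim=1)  # per-pixel class scores
```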
In the above example, the manifold clustering of step (4) specifically comprises the following steps (a code sketch follows this list):
(41) Compute the tangent-plane normal vector of each spatial point, and let the current cluster class be c = 0;
(42) Search for a point x_i not yet assigned a class; if all points have been clustered, execute step (45); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(43) Compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a distance of 0.01; if α_ij < σ or α_ij > 175°, then x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q;
(44) If queue q is non-empty, let x_i = q_1 and continue with step (43); otherwise jump back to step (41);
(45) Extract the k classes with the most points and assign the remaining points to the nearest class.
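A compact sketch of steps (41)–(45) as region growing over a KD-tree neighbor search (an implementation assumption); the 0.01 neighbor radius, the 175° test and the top-k extraction come from the text, while σ = 5° is an illustrative choice since the text does not fix it.

```python
import numpy as np
from collections import deque
from scipy.spatial import cKDTree

def manifold_cluster(points, normals, sigma_deg=5.0, radius=0.01, k=5):
    """points, normals: (n, 3) arrays; normals are unit tangent-plane normals."""
    tree = cKDTree(points)
    labels = np.full(len(points), -1)
    c = -1
    for seed in range(len(points)):          # step (42): next unassigned point
        if labels[seed] != -1:
            continue
        c += 1
        labels[seed] = c
        q = deque([seed])
        while q:                             # step (44): grow while q is non-empty
            i = q.popleft()
            for j in tree.query_ball_point(points[i], radius):
                if labels[j] != -1:
                    continue
                # Step (43): abs() accepts both alpha < sigma and alpha > 175 deg
                # (anti-parallel normals) in one test.
                if abs(np.dot(normals[i], normals[j])) > np.cos(np.radians(sigma_deg)):
                    labels[j] = c
                    q.append(j)
    # Step (45): keep the k largest clusters, reassign the rest to the
    # nearest retained point (nearby principle).
    keep = np.argsort(np.bincount(labels))[::-1][:k]
    kept_idx = np.where(np.isin(labels, keep))[0]
    kept_tree = cKDTree(points[kept_idx])
    for i in np.where(~np.isin(labels, keep))[0]:
        _, nn = kept_tree.query(points[i])
        labels[i] = labels[kept_idx[nn]]
    return labels
```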
Wherein, the computation of the tangent-plane normal vector in step (41) proceeds as follows:
Let the n spatial points form the matrix X ∈ R^(3×n) with mean μ, and let the covariance matrix of X be Σ = E[(X − μ)(X − μ)^T].
Let w ∈ R^(3×1) be the unit normal vector of this plane; then Z = w^T X is the projection length of the n points on this unit normal vector. The model is
min_w w^T Σ w, s.t. w^T w = 1.
Solving with the method of Lagrange multipliers,
L(w, a) = w^T Σ w − a (w^T w − 1),
and setting the partial derivative with respect to w to zero gives
Σw = aw,
so w must be unitized, and for the corresponding eigenvalue a we have w^T Σ w = a. Since the covariance matrix is positive semidefinite, the spatial vector w is the unit eigenvector of the covariance matrix Σ corresponding to its smallest eigenvalue. A numerical check follows.
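The derivation can be checked numerically in a few lines: the normal is the eigenvector of the 3 × 3 neighborhood covariance with the smallest eigenvalue (a sketch, assuming the neighbors are passed in as an (n, 3) array).

```python
import numpy as np

def tangent_plane_normal(neighbors):
    """neighbors: (n, 3) array of spatial points around the query point."""
    X = neighbors - neighbors.mean(axis=0)   # subtract the mean mu
    cov = X.T @ X / len(neighbors)           # 3x3 covariance matrix Sigma
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, 0]                     # unit w with Sigma w = a w, a smallest
```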
In the above example, the mapping algorithm of step (5) specifically comprises the following steps (a code sketch follows this list):
(51) When each frame of point cloud information is generated, remove point clouds whose depth value is too large or invalid, according to the precision characteristics of the RGB-D camera;
(52) Remove isolated spatial points with a statistical filtering method: compute, for each spatial point, the mean distance to its N nearest spatial points, and remove the points whose mean distance is too large, eliminating isolated noise points while retaining the dense spatial points;
(53) Using the voxel-grid principle, fill all spatial point clouds into a voxel grid so that each voxel retains only one spatial point; this amounts to downsampling the spatial point cloud and saves considerable storage space.
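A sketch of steps (51)–(53) using the Open3D library; the 6 m depth limit, N = 50 neighbors and 0.01 m voxel size are illustrative assumptions, not values from the text.

```python
import numpy as np
import open3d as o3d

def filter_cloud(points):
    """points: (n, 3) array in camera coordinates, z = depth."""
    points = points[points[:, 2] < 6.0]            # (51) drop far/invalid depths
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd, _ = pcd.remove_statistical_outlier(       # (52) statistical filtering
        nb_neighbors=50, std_ratio=1.0)
    return pcd.voxel_down_sample(voxel_size=0.01)  # (53) one point per voxel
```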
Wherein, the spatial grid map is established using an octree data structure.
A spatial cube is divided into eight regions; in the same way, each subregion continues to be divided into eight regions, dynamically creating an octree map (a toy sketch follows).
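A toy sketch of the recursive eight-way subdivision; the bounds and leaf resolution are arbitrary example values, and a real system would use an octree mapping library such as OctoMap.

```python
class OctreeNode:
    def __init__(self, center, half, min_half=0.05):
        self.center, self.half, self.min_half = center, half, min_half
        self.children = {}        # child index 0-7 -> OctreeNode
        self.occupied = False

    def insert(self, p):
        if self.half <= self.min_half:   # leaf voxel reached
            self.occupied = True
            return
        # Choose one of the eight sub-cubes by comparing each coordinate.
        idx = sum((p[d] > self.center[d]) << d for d in range(3))
        if idx not in self.children:
            child_center = [self.center[d] + (self.half / 2 if p[d] > self.center[d]
                            else -self.half / 2) for d in range(3)]
            self.children[idx] = OctreeNode(child_center, self.half / 2, self.min_half)
        self.children[idx].insert(p)

root = OctreeNode(center=[0.0, 0.0, 0.0], half=5.0)
root.insert([1.2, -0.3, 2.7])   # subdivides only along the inserted point's path
```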
The visual SLAM semantic mapping method of the present invention based on an atrous convolutional neural network is discussed in detail below with reference to the accompanying drawings and specific embodiments.
The flow of the visual SLAM semantic mapping method and system based on an atrous convolutional neural network is shown in Fig. 1:
From the image data acquired by the RGB-D camera, frames with low similarity are selected as keyframes; a keyframe contains the color image, the depth image and the current pose. Semantic segmentation is applied to the color image: it first passes through the feature extraction layers and multi-scale layers of the GoogLeNet improved with atrous convolution, producing the raw semantic point cloud. The raw semantic point cloud is filtered and combined with the depth image for manifold clustering, and finally octree mapping is performed together with the pose information. The network improvements raise the real-time processing capability of the system, which runs in real time on the embedded platform based on the NVIDIA Jetson TX2.
In the flow of the visual SLAM semantic mapping method and system based on an atrous convolutional neural network, the semantic information of the image is obtained through the deep-learning semantic segmentation network. The flow is shown in Fig. 2 and is broadly divided into three parts: feature extraction, multi-scale extraction and classification.
The atrous convolution used in the flow of the visual SLAM semantic mapping method and system based on an atrous convolutional neural network is shown in Fig. 3:
Convolution and pooling are treated as the same kind of operation. Suppose the middle violet points are the input; the green part of the figure is an ordinary convolution process, producing features after convolution (or pooling) steps with strides 2, 1, 2, 1. The receptive field of the top-layer feature point is the entire input layer.
To enlarge the feature size, atrous convolution sets all strides to 1 (the pink part of the figure). After the stride of the first convolution layer is changed, let the dilation be 1; the number of features obtained doubles. When the second convolution layer is applied, let the dilation be 2; that is, during the convolution operation one point is skipped between the taps of the kernel, giving twice the features of the original ordinary convolution while keeping the receptive field of the feature points unchanged. The third convolution layer, whose stride is also changed to 1, must likewise use dilation 2 to keep the same receptive field. In the fourth convolution layer the dilation must be 4 to keep the receptive field unchanged.
The following must be observed when using atrous convolution:
S1. When the stride of a previous convolution layer is changed from stride_old to stride_new, all subsequent convolution layers must use atrous convolution with dilation rate stride_old / stride_new to keep the receptive field unchanged;
S2. The dilation rate of the current atrous convolution layer is
rate_N = ∏_(n=1..N) (stride_old^(n) / stride_new^(n)),
where N is the number of stride changes in the preceding layers and the n-th factor corresponds to the n-th stride change. A worked check follows.
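A worked check of rule S2 for the stride pattern 2, 1, 2, 1 of Fig. 3: when every stride is set to 1, each removed stride multiplies the dilation rate required by all later convolutions.

```python
def dilation_rates(original_strides):
    """Dilation rate each layer needs after all strides are changed to 1."""
    rates, accumulated = [], 1
    for s in original_strides:
        rates.append(accumulated)  # rate in effect when this layer runs
        accumulated *= s           # later layers inherit the removed stride
    return rates

print(dilation_rates([2, 1, 2, 1]))  # [1, 2, 2, 4], matching the text above
```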
The semantic mapping results of the visual SLAM based on an atrous convolutional neural network are shown in Fig. 4. The images are results tested in two scenes: the left is an office scene and the right a laboratory scene. The first row shows the semantic mapping output of this system, in which chairs, people and plants are marked in red, pink and green respectively; the second row shows the mapping result without semantic information established by a conventional visual SLAM. The experimental results show that the present invention enables the robot to understand well the main targets in the current scene. The software and the algorithm of the present invention run on the NVIDIA Jetson TX2 embedded platform, whose processor diagram is shown in Fig. 5.
Using the method for the invention for building figure function based on empty convolution deep neural network realization vision SLAM semanteme
And system, system use embedded development processor, by the collected color data of RGB-D camera and depth data, benefit
With vision SLAM technology, image characteristic point is extracted, carries out characteristic matching, the method for recycling Bundle Adjustment obtains
More accurate robot pose estimation, the cumulative errors for eliminating interframe are detected using winding.Letter is positioned in real time obtaining robot
It is refreshing using depth is improved using a kind of empty convolution design method for GoogLeNet deep neural network while breath
Semantic segmentation result combination vision SLAM system is obtained building for semantic class by the feature extraction through the real-time semantic segmentation of network implementations
Figure.And clustered by manifold and eliminate error brought by optimization semantic segmentation, after building figure by Octree, spatial network map tool
There is more advanced semantic information, and the semantic map constructed is more accurate.The improvement of network improves the real-time place of system
Time loss of the semantic segmentation network of reason ability, this method and system on NVIDIA Jetson TX2 platform is 0.099s/
Width meets use demand during building figure in real time.
In this description, the invention has been described with reference to specific embodiments. It is nevertheless clear that various modifications and alterations can be made without departing from the spirit and scope of the invention. Accordingly, the description and drawings are to be regarded as illustrative rather than restrictive.
Claims (18)
1. A method for visual SLAM semantic mapping based on an atrous convolution deep neural network, characterized in that the method comprises the following steps:
(1) an embedded development processor acquires the color and depth information of the current environment through an RGB-D camera;
(2) feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained;
(3) pixel-level semantic segmentation is applied to the images with deep learning, mapped between the image coordinate system and the world coordinate system, so that the spatial points carry semantic labels;
(4) errors introduced by the semantic segmentation are eliminated through manifold clustering;
(5) semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points.
2. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 1, characterized in that the embedded processor in step (1) includes the NVIDIA Jetson TX2 system.
3. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 1, characterized in that step (2) comprises the following steps:
(2.1) extract image feature points with visual SLAM techniques and perform feature matching to obtain feature-point matching pairs;
(2.2) solve the current camera pose from 3D point pairs;
(2.3) refine the pose estimate with the graph-optimization method Bundle Adjustment;
(2.4) eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data.
4. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 1, characterized in that the pixel-level semantic segmentation of the images in step (3) specifically comprises the following steps:
(3.1) pass through the feature extraction layers of a GoogLeNet improved with atrous convolution;
(3.2) pass through the multi-scale extraction layers of the GoogLeNet improved with atrous convolution;
(3.3) classify the image pixels according to the extracted results.
5. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 4, characterized in that step (3.1) further includes the design of the feature extraction layers, specifically comprising the following steps:
(3.1.1) change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1;
(3.1.2) replace parts of Inception(4a), Inception(4b), Inception(4c), Inception(4d) and Inception(4e) in the GoogLeNet network with atrous convolution, setting the atrous convolution to 5 × 5 with a dilation-2 Pool;
(3.1.3) change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1.
6. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 4, characterized in that step (3.2) further includes the design of the multi-scale extraction layers, specifically comprising the following steps:
(3.2.1) perform multi-scale processing based on spatial pyramid pooling;
(3.2.2) extract feature maps of different scales through 1 × 1 convolution and atrous convolutions with different sampling rates;
(3.2.3) fuse image-pooling features into the module, pass the feature maps through 1 × 1 convolution to obtain the fused features, and feed them into a Softmax layer for per-pixel semantic classification.
7. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 1, characterized in that step (4) specifically comprises the following steps:
(4.1) compute the tangent-plane normal vector of each spatial point;
(4.2) search for a point x_i not yet assigned a class and check whether all points have been clustered; if so, continue with step (4.5); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(4.3) compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a 0.01 radius of x_i; check whether α_ij < σ or α_ij > 175°; if so, x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q; otherwise, continue with step (4.4);
(4.4) check whether queue q is non-empty; if so, let x_i = q_1 and continue with step (4.3); otherwise continue with step (4.1);
(4.5) extract the k classes with the most points and assign the remaining points to the nearest class.
8. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 1, characterized in that the computation of the tangent-plane normal vector of a spatial point in step (4.1) is specifically:
the tangent-plane normal vector of a spatial point is computed from the eigenvalue equation
Σw = aw,
where Σ is the covariance matrix of the neighboring points, w ∈ R^(3×1) is the unit normal vector of the plane, and a is the corresponding eigenvalue.
9. The method for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 1, characterized in that step (5) comprises the following steps:
(5.1) remove point clouds whose depth value is too large or invalid, according to the precision characteristics of the RGB-D camera;
(5.2) remove isolated spatial points with a statistical filtering method: compute, for each spatial point, the mean distance to its N nearest spatial points, and remove the points whose mean distance is too large;
(5.3) fill all spatial point clouds into a voxel grid, so that each voxel retains only one spatial point.
10. A system for visual SLAM semantic mapping based on an atrous convolution deep neural network, characterized in that the system includes:
an embedded development processor for constructing the visual SLAM semantic map;
an RGB-D camera connected to the embedded development processor for acquiring color data and depth data;
a map builder which, at runtime, realizes visual SLAM semantic mapping through the embedded development processor and the RGB-D camera according to deep learning and visual SLAM, specifically performing the following processing steps:
(1) the embedded development processor acquires the color and depth information of the current environment through an RGB-D camera;
(2) feature-point matching pairs are obtained from the acquired images, pose estimation is performed, and scene-space point cloud data are obtained;
(3) pixel-level semantic segmentation is applied to the images with deep learning, mapped between the image coordinate system and the world coordinate system, so that the spatial points carry semantic labels;
(4) errors introduced by the semantic segmentation are eliminated through manifold clustering;
(5) semantic mapping is performed: the spatial point clouds are stitched to obtain a point-cloud semantic map composed of dense discrete points.
11. The system for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 10, characterized in that the embedded processor in step (1) includes the NVIDIA Jetson TX2 system.
12. The system for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 10, characterized in that step (2) comprises the following steps:
(2.1) extract image feature points with visual SLAM techniques and perform feature matching to obtain feature-point matching pairs;
(2.2) solve the current camera pose from 3D point pairs;
(2.3) refine the pose estimate with the graph-optimization method Bundle Adjustment;
(2.4) eliminate inter-frame accumulated errors through loop-closure detection, and obtain the scene-space point cloud data.
13. The system for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 10, characterized in that the pixel-level semantic segmentation of the images in step (3) specifically comprises the following steps:
(3.1) pass through the feature extraction layers of a GoogLeNet improved with atrous convolution;
(3.2) pass through the multi-scale extraction layers of the GoogLeNet improved with atrous convolution;
(3.3) classify the image pixels according to the extracted results.
14. The system for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 13, characterized in that step (3.1) further includes the design of the feature extraction layers, specifically comprising the following steps:
(3.1.1) change the stride of the max-pooling layer after Inception(3b) in the GoogLeNet network to 1;
(3.1.2) replace parts of Inception(4a), Inception(4b), Inception(4c), Inception(4d) and Inception(4e) in the GoogLeNet network with atrous convolution, setting the atrous convolution to 5 × 5 with a dilation-2 Pool;
(3.1.3) change the stride of the max-pooling layer after Inception(4e) in the GoogLeNet network to 1.
15. The system for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 13, characterized in that step (3.2) further includes the design of the multi-scale extraction layers, specifically comprising the following steps:
(3.2.1) perform multi-scale processing based on spatial pyramid pooling;
(3.2.2) extract feature maps of different scales through 1 × 1 convolution and atrous convolutions with different sampling rates;
(3.2.3) fuse image-pooling features into the module, pass the feature maps through 1 × 1 convolution to obtain the fused features, and feed them into a Softmax layer for per-pixel semantic classification.
16. The system for visual SLAM semantic mapping based on an atrous convolution deep neural network according to claim 10, characterized in that step (4) specifically comprises the following steps:
(4.1) compute the tangent-plane normal vector of each spatial point;
(4.2) search for a point x_i not yet assigned a class and check whether all points have been clustered; if so, continue with step (4.5); otherwise set the class of x_i to c = c + 1 and create an empty queue q;
(4.3) compute the angle α_ij between the tangent-plane normal v_i of x_i and the normals v_j of all points x_j within a 0.01 radius of x_i; check whether α_ij < σ or α_ij > 175°; if so, x_j and x_i belong to one class, the class of x_j is set to c, and each qualifying x_j is pushed into queue q; otherwise, continue with step (4.4);
(4.4) check whether queue q is non-empty; if so, let x_i = q_1 and continue with step (4.3); otherwise continue with step (4.1);
(4.5) extract the k classes with the most points and assign the remaining points to the nearest class.
17. The system for realizing the visual SLAM semantic mapping function based on a dilated convolution deep neural network according to claim 10, which is characterized in that the calculation of the tangent-plane normal vector of a spatial point in the step (4.1) is specifically:
the tangent-plane normal vector of the spatial point is calculated according to the following formula:
wherein w ∈ R^(3×1) is the unit normal vector of the plane and a is the eigenvalue.
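The formula itself does not survive in this text, but the quantities it names (a unit plane normal w ∈ R^(3×1) and an eigenvalue a) match the standard PCA-based normal estimation; under that assumption, a plausible reconstruction for a point with K neighbours x_j and neighbourhood centroid x̄ is:

```latex
% Hedged reconstruction, assuming PCA-based tangent-plane normal estimation:
A = \sum_{j=1}^{K} (x_j - \bar{x})(x_j - \bar{x})^{\top}, \qquad
A\,w = a\,w, \qquad \lVert w \rVert = 1
```

with w taken as the eigenvector of the neighbourhood scatter matrix A that corresponds to its smallest eigenvalue a.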
18. The system for realizing the visual SLAM semantic mapping function based on a dilated convolution deep neural network according to claim 10, which is characterized in that the step (5) comprises the following steps (a sketch of the three filters follows this claim):
(5.1) according to the precision characteristics of the RGB-D camera, removing point clouds whose depth values are too large or invalid;
(5.2) removing isolated spatial points by statistical filtering: calculating, for each spatial point, the mean distance to its N nearest spatial points, and removing spatial points whose mean distance is excessive;
(5.3) according to the voxel-grid principle, filling all spatial points into voxel grids so that each voxel grid retains only one spatial point.
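A NumPy sketch of this three-stage cleanup is shown below; the depth limit, the neighbour count N, the outlier threshold and the voxel size are illustrative parameters, not values taken from the patent:

```python
import numpy as np
from scipy.spatial import cKDTree


def filter_cloud(points, max_depth=4.0, n_neighbors=20,
                 dist_thresh=0.05, voxel=0.01):
    """Three-stage point cloud cleanup, following steps (5.1)-(5.3).

    points: (N, 3) array; the camera looks along +z, so column 2 is depth.
    """
    # (5.1) drop points whose depth is invalid or too large for the camera
    finite = np.isfinite(points).all(axis=1)
    z = points[:, 2]
    points = points[finite & (z > 0) & (z < max_depth)]

    # (5.2) statistical filtering: remove isolated points whose mean distance
    # to their N nearest neighbours is excessive
    tree = cKDTree(points)
    dists, _ = tree.query(points, k=n_neighbors + 1)  # first hit is the point itself
    mean_d = dists[:, 1:].mean(axis=1)
    points = points[mean_d < dist_thresh]

    # (5.3) voxel grid: keep exactly one point per occupied voxel
    keys = np.floor(points / voxel).astype(np.int64)
    _, first = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first)]
```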
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811088531 | 2018-09-18 | | |
CN2018110885315 | 2018-09-18 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109559320A (en) | 2019-04-02 |
CN109559320B CN109559320B (en) | 2022-11-18 |
Family
ID=65866933
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811388678.6A Active CN109559320B (en) | 2018-09-18 | 2018-11-21 | Method and system for realizing visual SLAM semantic mapping function based on hole convolution deep neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109559320B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102024262A (en) * | 2011-01-06 | 2011-04-20 | Xidian University | Method for performing image segmentation by using manifold spectral clustering |
CN105787510A (en) * | 2016-02-26 | 2016-07-20 | East China University of Science and Technology | System and method for realizing subway scene classification based on deep learning |
CN107358189A (en) * | 2017-07-07 | 2017-11-17 | Peking University Shenzhen Graduate School | Multi-object detection method for indoor environments based on object extraction |
CN107480603A (en) * | 2017-07-27 | 2017-12-15 | Dalian Hechuang Lanren Technology Co., Ltd. | Synchronous mapping and object segmentation method based on SLAM and a depth camera |
CN108230337A (en) * | 2017-12-31 | 2018-06-29 | Xiamen University | Method for realizing a semantic SLAM system on a mobile terminal |
CN109636905A (en) * | 2018-12-07 | 2019-04-16 | Northeastern University | Environment semantic mapping method based on deep convolutional neural networks |
WO2021018690A1 (en) * | 2019-07-31 | 2021-02-04 | Continental Automotive Gmbh | Method for determining an environmental model of a scene |
CN111462135A (en) * | 2020-03-31 | 2020-07-28 | East China University of Science and Technology | Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation |
Non-Patent Citations (7)
Title |
---|
ANESTIS ZAGANIDIS et al.: "Integrating Deep Semantic Segmentation Into 3-D Point Cloud Registration", IEEE Robotics and Automation Letters *
SUPERVAN: "Notes on research ideas and results combining deep learning with SLAM", https://www.cnblogs.com/chaofn/p/9334685.html *
YU ZHU et al.: "Real-Time Semantic Mapping of Visual SLAM Based on DCNN", Communications in Computer and Information Science *
LIN Zhipeng et al.: "Least-squares regression subspace segmentation with manifold dimensionality reduction", Information Technology and Network Security *
PAN Zhuojin et al.: "Research on semantic SLAM fused with dilated convolutional neural networks", Modern Electronics Technique *
BAI Yunhan: "Research on semantic map construction based on SLAM algorithms and deep neural networks", Computer Applications and Software *
Intel Labs China: "Column | The importance of semantic SLAM, did you know?", https://zhidx.com/p/92828.html *
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110097553A (en) * | 2019-04-10 | 2019-08-06 | Southeast University | Semantic mapping system based on simultaneous localization and mapping and three-dimensional semantic segmentation |
CN110046677A (en) * | 2019-04-26 | 2019-07-23 | Shandong University | Data preprocessing method, map construction method, loop closure detection method and system |
CN110146098A (en) * | 2019-05-06 | 2019-08-20 | Beijing Orion Star Technology Co., Ltd. | Robot map extension method and apparatus, control device and storage medium |
CN110146098B (en) * | 2019-05-06 | 2021-08-20 | Beijing Orion Star Technology Co., Ltd. | Robot map extension method and apparatus, control device and storage medium |
CN110197215A (en) * | 2019-05-22 | 2019-09-03 | Shenzhen Muyue Technology Co., Ltd. | Ground-aware point cloud semantic segmentation method for autonomous driving |
CN110146099A (en) * | 2019-05-31 | 2019-08-20 | Xi'an Polytechnic University | Simultaneous localization and mapping method based on deep learning |
CN110378345B (en) * | 2019-06-04 | 2022-10-04 | Guangdong University of Technology | Dynamic scene SLAM method based on the YOLACT instance segmentation model |
CN110378345A (en) * | 2019-06-04 | 2019-10-25 | Guangdong University of Technology | Dynamic scene SLAM method based on the YOLACT instance segmentation model |
CN110276286B (en) * | 2019-06-13 | 2022-03-04 | The 28th Research Institute of China Electronics Technology Group Corporation | Embedded panoramic video stitching system based on TX2 |
CN110276286A (en) * | 2019-06-13 | 2019-09-24 | The 28th Research Institute of China Electronics Technology Group Corporation | Embedded panoramic video stitching system based on TX2 |
CN110264572A (en) * | 2019-06-21 | 2019-09-20 | Harbin Institute of Technology | Terrain modeling method and system fusing geometric and mechanical characteristics |
CN110264572B (en) * | 2019-06-21 | 2021-07-30 | Harbin Institute of Technology | Terrain modeling method and system fusing geometric and mechanical characteristics |
CN110297491A (en) * | 2019-07-02 | 2019-10-01 | Hunan Haisen Genuo Information Technology Co., Ltd. | Semantic navigation method and system based on multiple structured-light binocular IR cameras |
CN111670417A (en) * | 2019-07-05 | 2020-09-15 | SZ DJI Technology Co., Ltd. | Semantic map construction method and system, movable platform and storage medium |
WO2021003587A1 (en) * | 2019-07-05 | 2021-01-14 | SZ DJI Technology Co., Ltd. | Semantic map building method and system, movable platforms and storage medium |
CN110363178A (en) * | 2019-07-23 | 2019-10-22 | Shanghai Heisai Intelligent Technology Co., Ltd. | Airborne laser point cloud classification method based on local and global depth feature embedding |
CN110363178B (en) * | 2019-07-23 | 2021-10-15 | Shanghai Heisai Intelligent Technology Co., Ltd. | Airborne laser point cloud classification method based on local and global depth feature embedding |
CN110533716A (en) * | 2019-08-20 | 2019-12-03 | Xidian University | Semantic SLAM system and method based on 3D constraints |
CN110533716B (en) * | 2019-08-20 | 2022-12-02 | Xidian University | Semantic SLAM system and method based on 3D constraints |
CN110544307A (en) * | 2019-08-29 | 2019-12-06 | Guangzhou Gosuncn Robot Co., Ltd. | Semantic map construction method based on convolutional neural network and computer storage medium |
CN110619299A (en) * | 2019-09-12 | 2019-12-27 | Beijing Moviebook Technology Co., Ltd. | Grid-based object recognition SLAM method and apparatus |
CN110781262A (en) * | 2019-10-21 | 2020-02-11 | Institute of Computing Technology, Chinese Academy of Sciences | Semantic map construction method based on visual SLAM |
CN110827305A (en) * | 2019-10-30 | 2020-02-21 | Sun Yat-sen University | Tightly coupled semantic segmentation and visual SLAM method for dynamic environments |
CN110910405B (en) * | 2019-11-20 | 2023-04-18 | Hunan Normal University | Brain tumor segmentation method and system based on a multi-scale dilated convolutional neural network |
CN110910405A (en) * | 2019-11-20 | 2020-03-24 | Hunan Normal University | Brain tumor segmentation method and system based on a multi-scale dilated convolutional neural network |
CN110956651A (en) * | 2019-12-16 | 2020-04-03 | Harbin Institute of Technology | Terrain semantic perception method based on fusion of vision and vibrotactile sensing |
WO2021249575A1 (en) * | 2020-06-09 | 2021-12-16 | Global Energy Interconnection Research Institute Co., Ltd. | Area semantic learning and map point identification method for power transformation operation scenes |
CN111797938B (en) * | 2020-07-15 | 2022-03-15 | Yanshan University | Method for fusing semantic information with VSLAM for sweeping robots |
CN111797938A (en) * | 2020-07-15 | 2020-10-20 | Yanshan University | Method for fusing semantic information with VSLAM for sweeping robots |
CN113191367A (en) * | 2021-05-25 | 2021-07-30 | East China Normal University | Semantic segmentation method based on a dense-scale dynamic network |
CN115240115A (en) * | 2022-07-27 | 2022-10-25 | Henan University of Technology | Visual SLAM loop closure detection method combining semantic features and a bag-of-words model |
CN116657348A (en) * | 2023-06-02 | 2023-08-29 | Zhejiang Zhengyuan Silk Technology Co., Ltd. | Silk pretreatment method and system |
CN116657348B (en) * | 2023-06-02 | 2023-11-21 | Zhejiang Zhengyuan Silk Technology Co., Ltd. | Silk pretreatment method and system |
CN118629003A (en) * | 2024-05-23 | 2024-09-10 | Norinco Intelligent Innovation Research Institute Co., Ltd. | Visual SLAM method for dynamic environments based on previous-frame memory and a DCP network layer |
Also Published As
Publication number | Publication date |
---|---|
CN109559320B (en) | 2022-11-18 |
Similar Documents
Publication | Title |
---|---|
CN109559320A (en) | Method and system for realizing visual SLAM semantic mapping function based on hole convolution deep neural network |
JP6830707B1 (en) | Person re-identification method combining random batch masking and multi-scale representation learning |
CN106127204B (en) | Multi-directional meter-reading region detection algorithm based on fully convolutional neural networks |
CN108052896B (en) | Human body behavior identification method based on convolutional neural networks and a support vector machine |
CN109740665A (en) | Ship object detection method and system for occluded images based on expert-knowledge constraints |
CN113221625B (en) | Method for re-identifying pedestrians using deep-learning local features |
CN110097553A (en) | Semantic mapping system based on simultaneous localization and mapping and three-dimensional semantic segmentation |
CN108734143A (en) | Online detection method for power transmission lines by an inspection robot based on binocular vision |
CN110852182B (en) | Depth-video human behavior recognition method based on three-dimensional spatio-temporal modeling |
CN107832672A (en) | Pedestrian re-identification method using pose information to design multiple loss functions |
CN110378997A (en) | Dynamic scene mapping and localization method based on ORB-SLAM2 |
CN114972418A (en) | Maneuvering multi-target tracking method combining kernel adaptive filtering and YOLOX detection |
CN109341703A (en) | Full-cycle visual SLAM algorithm using CNN feature detection |
CN108537121B (en) | Self-adaptive remote sensing scene classification method based on fusion of meteorological environment parameters and image information |
CN109035329A (en) | Camera pose estimation optimization method based on depth features |
US11361534B2 (en) | Method for glass detection in real scenes |
CN109816714A (en) | Point cloud object recognition method based on three-dimensional convolutional neural networks |
CN110334584A (en) | Gesture recognition method based on region-based fully convolutional networks |
CN114241226A (en) | Three-dimensional point cloud semantic segmentation method based on multi-neighborhood features of a hybrid model |
CN117197676B (en) | Target detection and identification method based on feature fusion |
CN113076891A (en) | Human posture prediction method and system based on an improved high-resolution network |
Li et al. | An aerial image segmentation approach based on an enhanced multi-scale convolutional neural network |
CN117011380A (en) | 6D pose estimation method for target objects |
CN111339967B (en) | Pedestrian detection method based on a multi-view graph convolution network |
CN114358133B (en) | Loop frame detection method based on semantics-assisted binocular visual SLAM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |