WO2024113078A1 - Local context feature extraction module for semantic segmentation in 3d point cloud scenario - Google Patents
- Publication number: WO2024113078A1 (application PCT/CN2022/134619)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
Definitions
- the present invention belongs to the field of computer vision technology, and in particular relates to a local context feature extraction module for semantic segmentation of 3D point cloud scenes.
- PointNet addresses how to apply a neural network directly to raw 3D point clouds, and can extract stable point-set features even when the point cloud is perturbed, noisy, or incomplete.
- Although PointNet++ designed an SA (set abstraction) layer to extract features of surrounding points, the extraction quality is limited and the computational cost is relatively high.
- FPS (farthest point sampling)
- the purpose of the embodiments of this specification is to provide a local context feature extraction module for 3D point cloud scene semantic segmentation.
- the present application provides a local context feature extraction module for 3D point cloud scene semantic segmentation, the local context feature extraction module comprising:
- a rotationally invariant local representation receives local spatial information, and the local spatial information includes coordinate information of a plurality of points;
- the local context feature is determined based on the local rotation invariant representation of each point and the position encoding of the relative point.
- the local spatial information is recorded as (K, 3), and the local rotation-invariant representation of each point includes rotation-invariant representations about the X-axis, Y-axis, and Z-axis;
- the rotation invariance about the Z axis is expressed as:
- where k = (1, 2, ..., K), and (x_im, y_im, z_im) are the coordinates of the centroid of the point cloud containing point i.
- calculating the relative position codes of the centroid point and the neighboring points includes:
- the coordinate difference and Euclidean distance between the centroid and the adjacent points are calculated
- the relative position coding of the centroid point and the neighboring points is determined.
- given the coordinates P_i of the centroid point and the coordinates P_i^k of the kth neighboring point, the relative position encoding r_i^k of the centroid point and the kth neighboring point is:
- determining the local context feature based on the local rotation invariant representation of each point and the position encoding of the relative point includes:
- weights are introduced into the rotation-invariant representations of the X-axis and the Z-axis, and the local context feature is determined from the Y-axis rotation-invariant representation of each point, the relative position encoding, and the weighted X-axis and Z-axis rotation-invariant representations;
- the local context feature is:
- λ is the weight of the X-axis rotation-invariant representation
- μ is the weight of the Z-axis rotation-invariant representation
- the local context feature extraction module further includes: attention pooling, which weights the local context features by learning attention weights through geometric distance and feature distance to obtain enhanced local context features.
- the local context features are weighted by learning attention weights through geometric distance and feature distance to obtain enhanced local context features, including:
- the weights of the attention pool are concatenated with the local context features, and the attention weights are obtained through a shared MLP and a normalized exponential (softmax) function;
- Enhanced local context features are obtained based on the attention weights and neighboring point features.
- the present application provides a 3D point cloud scene semantic segmentation network, which includes the local context feature extraction module and encoder architecture for 3D point cloud scene semantic segmentation of the first aspect;
- the local context feature extraction module for 3D point cloud scene semantic segmentation is embedded in the encoder architecture.
- the solution learns local features with X-Y-Z three-axis rotation invariance, while compensating for the fact that random sampling may cause the loss of many useful point features.
- FIG1 is a schematic diagram of the structure of a local context feature extraction module for 3D point cloud scene semantic segmentation provided by the present application
- FIG2 is a schematic diagram of the structure of the 3D point cloud scene semantic segmentation network provided in this application.
- PointNet has become one of the most promising methods for directly processing 3D point clouds. It uses a shared multi-layer perceptron (MLP) to learn point-by-point features and has achieved good results. Later, the optimized derivative model PointNet++ emerged, which further improved the performance of point cloud segmentation.
- MLP multi-layer perceptron
- the farthest point sampling method used is suitable only for small-scale point clouds; large-scale point clouds have large data volumes, and farthest point sampling (FPS) consumes more memory and reduces computational and network efficiency; 2)
- the point cloud itself is rotationally invariant.
- the segmentation results of point clouds input at different angles should be consistent. For example, chairs at different positions in a conference room necessarily face different directions; no matter from which angle the point cloud is input to the network, the result should be classified as a chair. This shows that the features learned from 3D point clouds are direction-sensitive, and this direction sensitivity degrades point cloud segmentation
- this application adopts random sampling to adapt to large-scale scene-level point cloud data, and proposes a local context feature aggregation module with rotation invariance to learn local features with X-Y-Z three-axis rotation invariance, while compensating for the fact that random sampling may cause the loss of many useful point features.
- FIG. 1 there is shown a schematic diagram of the structure of a local context feature extraction module for 3D point cloud scene semantic segmentation provided in an embodiment of the present application.
- the local context feature extraction module for 3D point cloud scene semantic segmentation may include:
- a rotationally invariant local representation receives local spatial information, and the local spatial information includes coordinate information of a plurality of points;
- the local context feature is determined based on the local rotation invariant representation of each point and the position encoding of the relative point.
- the local context feature extraction module (Local Context Characteristics, LCC): as a geometric object, the learned representation of a point set should remain invariant under rotation transformations. Rotating all points together should change neither the category of the global point cloud nor the segmentation of its partial structures. In many real scenes, for example common chairs, objects belonging to the same category usually have different orientations. Moreover, the same object exhibits rotation invariance not only about the Z axis but also, to a degree, about the X and Y axes. To address this, we propose to learn a new local representation with X-Y-Z axis rotation invariance, which uses polar coordinates to represent the local geometric structure of each point. The overall structure of LCC is shown in Figure 1.
- the local spatial information (K, 3) is input into the LCC block, and the output is a local representation with X-, Y-, and Z-axis rotation-invariant features, respectively.
- the rotation invariance about the Z axis is expressed as:
- where k = (1, 2, ..., K), and (x_im, y_im, z_im) are the coordinates of the centroid of the point cloud containing point i.
- calculating the relative position encoding of the centroid point and the neighboring points includes:
- the coordinate difference and Euclidean distance between the centroid and the adjacent points are calculated
- the relative position coding of the centroid point and the neighboring points is determined.
- given the coordinates P_i of the centroid point and the coordinates P_i^k of the kth neighboring point, the relative position encoding r_i^k of the centroid point and the kth neighboring point is:
- the local context feature is determined based on the local rotation invariant representation of each point and the position encoding of the relative point, including:
- Weights are introduced into the rotation invariance representation of the X-axis and the Z-axis, and the local context features are determined based on the rotation invariance representation of the Y-axis of each point, the position encoding of the relative point, and the rotation invariance representation of the X-axis and the Z-axis after the weights are introduced.
- λ is the weight of the X-axis rotation-invariant representation
- μ is the weight of the Z-axis rotation-invariant representation
- an ablation experiment on the module (introducing the rotation-invariant representations of the X axis, the Y axis, and the Z axis separately) found that the X- and Z-axis representations have a more pronounced effect on the overall segmentation, so the weights (λ, μ) are introduced here to increase the proportion of the X- and Z-axis representations, giving the local context features output by the LCC module.
- the local context feature extraction module further includes: attention pooling, which weights the local context features by learning attention weights through geometric distance and feature distance to obtain enhanced local context features, specifically including:
- the weights of the attention pool are concatenated with the local context features, and the attention weights are obtained through a shared MLP and a normalized exponential (softmax) function;
- Enhanced local context features are obtained based on the attention weights and neighboring point features.
- the weight of the attention pool is learned by computing the negative exponential of the geometric distance and the feature distance, and its instability is adjusted by adding a smoothing parameter.
- the learned weight of the attention pool is:
- the attention weights are obtained through a shared MLP and a normalized exponential (softmax) function:
- the LCC and DP modules together form the LD module to extract and enhance local context features.
- the local context feature extraction module for 3D point cloud scene semantic segmentation learns local features that are invariant to X-Y-Z three-axis rotation, while compensating for the possibility that random sampling may cause the loss of many useful point features.
- FIG. 2 there is shown a schematic diagram of the structure of a 3D point cloud scene semantic segmentation network applicable to an embodiment of the present application.
- the 3D point cloud scene semantic segmentation network includes a local context feature extraction module and an encoder architecture for 3D point cloud scene semantic segmentation;
- the local context feature extraction module for 3D point cloud scene semantic segmentation is embedded in the encoder architecture.
- LD-Net, i.e., the 3D point cloud scene semantic segmentation network
- the input of the network is a point cloud of size n ⁇ d, where n is the number of points and d is the input feature dimension.
- the point cloud is first fed to the shared MLP layer to extract the features of each point, and the feature dimension is uniformly set to 8.
- the overall network structure is shown in Figure 2.
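The "shared MLP" that lifts each input point to the uniform feature dimension of 8 is the same small per-point map applied to all n points. A minimal sketch follows; the single-layer form and the ReLU nonlinearity are illustrative assumptions, not the network's actual layer configuration:

```python
import numpy as np

def shared_mlp(points, W, b):
    """Apply one shared fully connected layer to every point independently:
    (n, d) -> (n, 8). 'Shared' means the same W, b are used for all points."""
    return np.maximum(points @ W + b, 0.0)  # per-point linear map + ReLU

# Example: n = 1000 points with d = 6 input features (e.g. xyz + rgb), lifted to dim 8.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(1000, 6))
W, b = rng.normal(size=(6, 8)) * 0.1, np.zeros(8)
features = shared_mlp(cloud, W, b)  # (1000, 8)
```

Because the map is shared, the parameter count is independent of n, which is what makes the per-point feature extraction scale to large clouds.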
Abstract
Provided in the present application is a local context feature extraction module for semantic segmentation in a 3D point cloud scenario. The local context feature extraction module comprises: a rotation-invariant local representation, which is used for: receiving local spatial information, which comprises coordinate information of several points; calculating a local rotation-invariant representation of each point according to the local spatial information; finding the centroid point of a local neighborhood, and calculating the relative position encoding between the centroid point and its neighboring points; and determining local context features according to the local rotation-invariant representation of each point and the relative position encoding. The solution learns local features having X-Y-Z three-axis rotation invariance, and also compensates for the loss of many useful point features that random sampling may cause.
Description
The present invention belongs to the field of computer vision technology, and in particular relates to a local context feature extraction module for semantic segmentation of 3D point cloud scenes.
With the rise of convolutional networks on 2D images, many researchers began applying neural networks to 3D data. However, most of this work voxelizes the 3D point cloud or converts it into 2D images from multiple viewpoints before applying conventional convolutional neural networks. PointNet addresses how to apply a neural network directly to the raw 3D point cloud, and can extract stable point-set features even when the point cloud is perturbed, noisy, or incomplete. The biggest shortcoming of PointNet and PointNet++, however, is that they extract only global features and lose much information. Although PointNet++ designed an SA (set abstraction) layer to extract features of surrounding points, the extraction quality is limited and the computational cost is high. Moreover, because FPS (farthest point sampling) is used, the features of the point cloud space can be preserved as much as possible, but computing point clouds of large scenes is slow and memory-intensive.
Summary of the invention
The purpose of the embodiments of this specification is to provide a local context feature extraction module for 3D point cloud scene semantic segmentation.
To solve the above technical problems, the embodiments of the present application are implemented in the following ways:
In a first aspect, the present application provides a local context feature extraction module for 3D point cloud scene semantic segmentation, the local context feature extraction module comprising:
a rotation-invariant local representation, which receives local spatial information, the local spatial information including coordinate information of a plurality of points;
calculating, based on the local spatial information, a local rotation-invariant representation of each point;
finding the centroid point of the local neighborhood, and calculating the relative position encoding between the centroid point and its neighboring points;
determining the local context feature based on the local rotation-invariant representation of each point and the relative position encoding.
In one embodiment, the local spatial information is recorded as (K, 3), and the local rotation-invariant representation of each point includes rotation-invariant representations about the X-axis, Y-axis, and Z-axis;
wherein, the rotation invariance about the Z axis is expressed as:
the rotation invariance about the X axis is expressed as:
the rotation invariance about the Y axis is expressed as:
where k = (1, 2, ..., K), (x_ik, y_ik, z_ik) are the coordinates of the kth neighboring point of point i, and (x_im, y_im, z_im) are the coordinates of the centroid of the point cloud containing point i.
In one embodiment, calculating the relative position encoding between the centroid point and its neighboring points includes:
determining the coordinates of the centroid point;
determining the coordinates of the neighboring points;
calculating, from the coordinates of the centroid point and of the neighboring points, the coordinate difference and the Euclidean distance between them;
determining, from the coordinates of the centroid point, the coordinates of the neighboring points, the coordinate difference, and the Euclidean distance, the relative position encoding between the centroid point and the neighboring points.
In one embodiment, given the coordinates P_i of the centroid point and the coordinates P_i^k of the kth neighboring point, the relative position encoding r_i^k of the centroid point and the kth neighboring point is:
wherein P_i - P_i^k denotes the coordinate difference between the centroid point and the kth neighboring point, and ||P_i - P_i^k|| denotes the Euclidean distance between them.
In one embodiment, determining the local context feature based on the local rotation-invariant representation of each point and the relative position encoding includes:
introducing weights into the rotation-invariant representations of the X-axis and the Z-axis, and determining the local context feature from the Y-axis rotation-invariant representation of each point, the relative position encoding, and the weighted X-axis and Z-axis rotation-invariant representations;
where λ is the weight of the X-axis rotation-invariant representation, μ is the weight of the Z-axis rotation-invariant representation, and r_i^k is the relative position encoding.
In one embodiment, the local context feature extraction module further includes attention pooling, which weights the local context features with attention weights learned from the geometric distance and the feature distance, obtaining enhanced local context features.
In one embodiment, weighting the local context features with attention weights learned from the geometric distance and the feature distance to obtain enhanced local context features includes:
calculating the geometric distance between points in the local spatial information;
calculating the feature distance;
learning the weights of the attention pool from the geometric distance and the feature distance;
concatenating the weights of the attention pool with the local context features, and obtaining the attention weights through a shared MLP and a normalized exponential (softmax) function;
obtaining enhanced local context features from the attention weights and the neighboring point features.
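The attention pooling steps above can be sketched numerically. The patent's exact formulas are given only in figures not reproduced in this text, so the negative-exponential weighting form, the smoothing parameter `delta`, and the single shared linear layer `W` (standing in for the shared MLP) are assumptions of this illustration, not the claimed implementation:

```python
import numpy as np

def softmax(x, axis=0):
    """Normalized exponential function, numerically stabilized."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pooling(neigh_feats, geo_dist, feat_dist, W, delta=1.0):
    """Pool K neighbor features (K, C) into one enhanced feature (C,),
    using attention learned from geometric and feature distances (both (K,))."""
    # Negative exponential of the two distances, adjusted by a smoothing parameter.
    w = np.exp(-(geo_dist + feat_dist) / delta)                   # (K,)
    # Concatenate the pool weight with the local context features ...
    combined = np.concatenate([neigh_feats, w[:, None]], axis=1)  # (K, C+1)
    # ... then a shared linear layer + softmax over the K neighbors
    # yields the attention weights.
    att = softmax(combined @ W, axis=0)                           # (K, C)
    # Enhanced local context feature: attention-weighted sum of neighbor features.
    return (att * neigh_feats).sum(axis=0)                        # (C,)
```

Because the softmax runs over the neighbor axis, each output channel is a convex combination of the K neighbor features, so distant or dissimilar neighbors are softly suppressed rather than hard-dropped by a max pool.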
In a second aspect, the present application provides a 3D point cloud scene semantic segmentation network, which includes the local context feature extraction module of the first aspect and an encoder architecture;
the local context feature extraction module for 3D point cloud scene semantic segmentation is embedded in the encoder architecture.
As can be seen from the technical solutions provided in the above embodiments of this specification, the solution learns local features with X-Y-Z three-axis rotation invariance, while compensating for the loss of many useful point features that random sampling may cause.
In order to more clearly illustrate the embodiments of this specification or the technical solutions in the prior art, the drawings required by the embodiments or the prior-art description are briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this specification; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of the structure of the local context feature extraction module for 3D point cloud scene semantic segmentation provided by the present application;
FIG. 2 is a schematic diagram of the structure of the 3D point cloud scene semantic segmentation network provided by the present application.
In order to enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by those of ordinary skill in the art from the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.
In the following description, specific details such as particular system structures and technologies are set forth for illustration rather than limitation, in order to provide a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application may also be practiced in other embodiments without these specific details. Elsewhere, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application.
It will be apparent to those skilled in the art that various modifications and variations may be made to the specific embodiments described herein without departing from the scope or spirit of the present application. Other embodiments derived from this description will likewise be apparent to those skilled in the art. The description and examples of the present application are merely exemplary.
The terms "comprise", "include", "have", "contain", and the like used herein are open-ended terms, meaning including but not limited to.
Unless otherwise specified, "parts" in this application are parts by mass.
In the related art, the pioneering work PointNet has become one of the most promising methods for directly processing 3D point clouds. It learns point-wise features using a shared multi-layer perceptron (MLP) and has achieved good results. The later, optimized derivative model PointNet++ further improved point cloud segmentation performance, but some problems remain: 1) the farthest point sampling method it adopts is suitable only for small-scale point clouds; large-scale point clouds have large data volumes, and farthest point sampling (FPS) consumes more memory and reduces computational and network efficiency; 2) the point cloud itself is rotation-invariant, so the segmentation results of point clouds input at different angles should be consistent. For example, chairs at different positions in a conference room necessarily face different directions; no matter from which angle the point cloud is input to the network, the result should be classified as a chair. This shows that the features learned from 3D point clouds are direction-sensitive, and this direction sensitivity degrades point cloud segmentation.
To address the above shortcomings, the present application adopts random sampling to accommodate large-scale scene-level point cloud data, and proposes a rotation-invariant local context feature aggregation module to learn local features with X-Y-Z three-axis rotation invariance, while compensating for the loss of many useful point features that random sampling may cause.
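The cost gap between the two sampling strategies can be seen in a small sketch. These are generic illustrative implementations, not the patent's code: greedy FPS evaluates a distance for every remaining point at each of the m selections, O(nm) overall, while uniform random sampling is O(m) regardless of scene size.

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set.
    Preserves spatial coverage, but costs O(n*m) distance evaluations."""
    n = points.shape[0]
    chosen = np.zeros(m, dtype=int)   # first pick: point 0
    dist = np.full(n, np.inf)         # distance of each point to the chosen set
    for i in range(1, m):
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        chosen[i] = int(np.argmax(dist))
    return points[chosen]

def random_sampling(points, m, seed=0):
    """Uniform random subset: O(m) selection, independent of cloud size."""
    idx = np.random.default_rng(seed).choice(points.shape[0], m, replace=False)
    return points[idx]
```

The trade-off motivating the module: random sampling scales to scene-level clouds but may discard informative points, which the local context aggregation is designed to compensate for.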
The present invention is further described in detail below with reference to the accompanying drawings and embodiments.
Referring to FIG. 1, there is shown a schematic diagram of the structure of a local context feature extraction module for 3D point cloud scene semantic segmentation provided in an embodiment of the present application.
As shown in FIG. 1, the local context feature extraction module for 3D point cloud scene semantic segmentation may include:
a rotation-invariant local representation, which receives local spatial information, the local spatial information including coordinate information of a plurality of points;
calculating, based on the local spatial information, a local rotation-invariant representation of each point;
finding the centroid point of the local neighborhood, and calculating the relative position encoding between the centroid point and its neighboring points;
determining the local context feature based on the local rotation-invariant representation of each point and the relative position encoding.
具体的,局部上下文特征提取模块(Local context characteristics,LCC):作为一个几何对象,点集的学习表示应该对旋转变换保持不变。一起旋转的点不应改变全局点云的类别,也不应该改变对点云部分结构的分割。在许多真实场景中,例如常见的椅子,属于同一类别的对象的方向通常不同。此外,可以清楚地理解,同一物体不仅由Z轴的旋转不变性表示,X轴和Y轴也具有一定的旋转不变性。为了解决这个问题,我们建议学习一种新的具有X-Y-Z轴旋转不变的局部表 示,它利用极坐标表示各个点局部几何结构,LCC的整体结构如图1所示。Specifically, the local context feature extraction module (LCC): As a geometric object, the learned representation of a point set should remain unchanged to rotation transformations. Points rotated together should not change the category of the global point cloud, nor should they change the segmentation of the partial structure of the point cloud. In many real scenes, such as common chairs, objects belonging to the same category usually have different orientations. In addition, it can be clearly understood that the same object is not only represented by the rotation invariance of the Z axis, but also has a certain rotation invariance of the X and Y axes. To solve this problem, we propose to learn a new local representation with X-Y-Z axis rotation invariance, which uses polar coordinates to represent the local geometric structure of each point. The overall structure of LCC is shown in Figure 1.
如图1所示,局部空间信息(K,3)被输入到LCC块中,输出是具有X、Y和Z轴旋转不变特征的局部表示,分别为
As shown in Figure 1, the local spatial information (K, 3) is input into the LCC block, and the output is a local representation with X, Y and Z axis rotation invariant features, which are respectively
其中,关于Z轴的旋转不变性表示:Among them, the rotation invariance about the Z axis means:
关于X轴的旋转不变性表示:The rotation invariance about the X axis is expressed as:
关于Y轴的旋转不变性表示:The rotation invariance about the Y axis is expressed as:
其中,k=(1,2,…,K),
为i点的第k邻近点的坐标,(x
im,y
im,z
im)为i点所在点云的质心点的坐标。
Where k = (1, 2, ..., K), are the coordinates of the kth neighboring point of point i, and (x im , y im , z im ) are the coordinates of the centroid of the point cloud where point i is located.
It will be understood that the inverse trigonometric operations above all serve to convert a Cartesian-coordinate representation into a polar-coordinate representation.
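The exact per-axis formulas are given as figures in the original filing and are not reproduced in this text, so the sketch below is an assumed reconstruction from the description (inverse trigonometric functions converting Cartesian offsets into polar form). For each axis, the in-plane radius and the elevation angle out of that plane are unchanged by any rotation about that axis:

```python
import numpy as np

def axis_invariant_repr(neighbors):
    """For each neighbor offset (dx, dy, dz) from the centroid, compute a
    polar-style pair (radius, elevation angle) that is unchanged by rotation
    about each axis. neighbors: (K, 3) array of offsets.
    NOTE: an assumed reconstruction, not the patent's exact formulas."""
    dx, dy, dz = neighbors[:, 0], neighbors[:, 1], neighbors[:, 2]
    # Invariant to rotation about Z: distance in the X-Y plane plus the
    # elevation angle out of that plane (arctan gives the polar angle).
    rho_z = np.sqrt(dx**2 + dy**2)
    inv_z = np.stack([rho_z, np.arctan2(dz, rho_z)], axis=1)
    # Invariant to rotation about X: distance in the Y-Z plane.
    rho_x = np.sqrt(dy**2 + dz**2)
    inv_x = np.stack([rho_x, np.arctan2(dx, rho_x)], axis=1)
    # Invariant to rotation about Y: distance in the X-Z plane.
    rho_y = np.sqrt(dx**2 + dz**2)
    inv_y = np.stack([rho_y, np.arctan2(dy, rho_y)], axis=1)
    return inv_x, inv_y, inv_z

def rot_z(theta):
    """Rotation matrix about the Z axis, used to check invariance."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
```

Rotating the neighborhood about the Z axis leaves `inv_z` numerically unchanged, which is the property the module relies on.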
In one embodiment, computing the relative position encoding of the centroid point and its neighboring points includes:
determining the coordinates of the centroid point;
determining the coordinates of the neighboring points;
computing, from the coordinates of the centroid point and of the neighboring points, their coordinate difference and Euclidean distance;
determining, from the coordinates of the centroid point, the coordinates of the neighboring points, the coordinate difference, and the Euclidean distance, the relative position encoding of the centroid point and the neighboring points.
Specifically, given the coordinates P_i of the centroid point and the coordinates of the k-th neighboring point, the relative position encoding of the centroid point and the k-th neighboring point is:

where the two quantities denote, respectively, the coordinate difference and the Euclidean distance between the centroid point and the k-th neighboring point.
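The position-encoding formula itself is likewise omitted from this extraction. A concatenation consistent with the steps listed above (centroid coordinates, neighbor coordinates, coordinate difference, Euclidean distance) might look like the following sketch; the resulting 10-dimensional layout is an assumption:

```python
import numpy as np

def relative_position_encoding(centroid, neighbors):
    """Per neighbor, concatenate: centroid coords (3), neighbor coords (3),
    their coordinate difference (3), and the Euclidean distance (1) -> (K, 10).
    centroid: (3,), neighbors: (K, 3). Layout assumed, not from the patent text."""
    K = neighbors.shape[0]
    diff = centroid[None, :] - neighbors                 # coordinate difference
    dist = np.linalg.norm(diff, axis=1, keepdims=True)   # Euclidean distance
    center = np.repeat(centroid[None, :], K, axis=0)
    return np.concatenate([center, neighbors, diff, dist], axis=1)
```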
In one embodiment, determining the local context feature from each point's local rotation-invariant representation and the relative position encoding includes:
introducing weights into the X-axis and Z-axis rotation-invariance representations, and determining the local context feature from each point's Y-axis rotation-invariance representation, the relative position encoding, and the weighted X-axis and Z-axis rotation-invariance representations.
where λ is the weight on the X-axis rotation-invariance representation, μ is the weight on the Z-axis rotation-invariance representation, and the remaining term is the relative position encoding.
Specifically, an ablation experiment was performed on the module: the X-axis, Y-axis, and Z-axis rotation-invariance representations were each introduced separately, and the X- and Z-axis representations were found to have a markedly stronger effect on overall segmentation. The weights (λ, μ) are therefore introduced to increase the proportion of the X- and Z-axis representations, yielding the local context feature output by the LCC module.
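Since neither the combination operator nor the values of (λ, μ) are given in this text, the following sketch assumes simple concatenation with scalar scaling of the X- and Z-axis parts:

```python
import numpy as np

def local_context_feature(inv_x, inv_y, inv_z, pos_enc, lam=1.5, mu=1.5):
    """Combine the three axis-wise invariant representations with the relative
    position encoding, scaling the X- and Z-axis parts by (lambda, mu).
    Concatenation and the default weight values are assumptions."""
    return np.concatenate([lam * inv_x, inv_y, mu * inv_z, pos_enc], axis=1)
```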
In one embodiment, the local context feature extraction module further includes attention pooling, which learns attention weights from geometric distance and feature distance and weights the local context features to obtain enhanced local context features, specifically including:
computing the geometric distance between points in the local spatial information;
computing the feature distance;
learning the weights of the attention pool from the geometric distance and the feature distance;
concatenating the weights of the attention pool with the local context features, and obtaining the attention weights via a shared MLP and a normalized exponential (softmax) function;
obtaining the enhanced local context features from the attention weights and the neighboring-point features.
Specifically, applying traditional max pooling or average pooling to the local context features produced by LCC may discard most of the information, so we design an attention-pooling-like module to process the local context features. We first assume that points closer together are more strongly correlated, and compute the geometric distance and the feature distance between points as:
The attention-pool weights are then learned by taking the negative exponential of both distances, with a parameter ζ added to temper their instability; the learned attention-pool weights are:
The learned dual-distance parameters and the local context features are merged by concatenation:
The attention weights are obtained via a shared MLP and a normalized exponential (softmax) function:
Finally, the attention weights are applied to the neighboring-point features to obtain the enhanced local context features.
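A minimal numpy sketch of this dual-distance attention pooling, assuming the shared MLP is a single linear layer (stood in for here by random weights, where a trained model would learn them) and softmax runs over the K neighbors:

```python
import numpy as np

def attention_pool(features, geo_dist, feat_dist, zeta=1.0, rng=None):
    """Dual-distance attention pooling sketch.
    features: (K, C) neighbor features; geo_dist, feat_dist: (K,) distances
    from the centroid. Negative exponentials of the two distances form the
    distance weights; concatenated with the features, a stand-in shared MLP
    plus softmax yields attention scores that weight and sum the features."""
    if rng is None:
        rng = np.random.default_rng(0)
    K, C = features.shape
    w_geo = np.exp(-geo_dist / zeta)    # zeta tempers instability (role assumed)
    w_feat = np.exp(-feat_dist / zeta)
    cat = np.concatenate([features, w_geo[:, None], w_feat[:, None]], axis=1)
    W = rng.normal(size=(C + 2, 1))     # stand-in for the learned shared MLP
    logits = cat @ W
    att = np.exp(logits - logits.max())
    att = att / att.sum()               # softmax over the K neighbors
    return (att * features).sum(axis=0) # (C,) enhanced local context feature
```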
Together, the LCC and DP modules form the LD module, which performs the extraction and enhancement of local context features.
The local context feature extraction module for semantic segmentation of 3D point cloud scenes provided by the embodiments of this application learns local features invariant to rotation about the X, Y, and Z axes, while compensating for the many useful point features that random sampling may discard.
Referring to FIG. 2, a schematic structural diagram of a 3D point cloud scene semantic segmentation network applicable to an embodiment of this application is shown.
As shown in FIG. 2, the 3D point cloud scene semantic segmentation network includes the local context feature extraction module for 3D point cloud scene semantic segmentation and an encoder architecture;
the local context feature extraction module for 3D point cloud scene semantic segmentation is embedded in the encoder architecture.
Specifically, we embed the proposed LD module into the widely used encoder architecture, forming a new network that we name LD-Net (the 3D point cloud scene semantic segmentation network), as shown in FIG. 2. The input to the network is a point cloud of size n×d, where n is the number of points and d is the input feature dimension. The point cloud is first fed to a shared MLP layer to extract per-point features, with the feature dimension uniformly set to 8. We use five encoder-decoder layers to learn per-point features, and finally three consecutive fully connected layers to predict the semantic label of each point. The overall network structure is shown in FIG. 2.
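The "shared MLP" described here applies the same weights to every point independently. A small sketch of the input stage, with n, d, and the class count chosen purely as illustrative assumptions:

```python
import numpy as np

def shared_mlp(points, W, b):
    """A shared MLP applies identical weights to every point:
    (n, d_in) @ (d_in, d_out) + bias -> (n, d_out), followed by ReLU."""
    return np.maximum(points @ W + b, 0.0)

# Dimension flow sketched from the description: input (n, d) -> shared MLP
# to 8 channels -> five encoder/decoder stages -> three FC layers -> labels.
rng = np.random.default_rng(0)
n, d, num_classes = 4096, 6, 13            # example sizes (assumed, not from the filing)
cloud = rng.normal(size=(n, d))
h = shared_mlp(cloud, rng.normal(size=(d, 8)) * 0.1, np.zeros(8))
```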
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be understood by cross-reference, and each embodiment focuses on its differences from the others. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief, and reference may be made to the corresponding parts of the method embodiments.
Claims (9)
- A local context feature extraction module for semantic segmentation of 3D point cloud scenes, characterized in that the local context feature extraction module comprises: a rotation-invariant local representation, the rotation-invariant local representation receiving local spatial information, the local spatial information comprising coordinate information of a number of points; computing a local rotation-invariant representation of each point from the local spatial information; finding the centroid point of a local neighborhood, and computing the relative position encoding of the centroid point and its neighboring points; and determining a local context feature from the local rotation-invariant representation of each point and the relative position encoding.
- The local context feature extraction module according to claim 1, characterized in that the local spatial information is denoted (K, 3), and the local rotation-invariant representation of each point includes rotation-invariant representations about the X, Y, and Z axes; where the rotation invariance about the Z axis is expressed as: the rotation invariance about the X axis is expressed as: the rotation invariance about the Y axis is expressed as:
- The local context feature extraction module according to claim 1, characterized in that computing the relative position encoding of the centroid point and the neighboring points comprises: determining the coordinates of the centroid point; determining the coordinates of the neighboring points; computing the coordinate difference and the Euclidean distance between the centroid point and the neighboring points from their coordinates; and determining the relative position encoding of the centroid point and the neighboring points from the coordinates of the centroid point, the coordinates of the neighboring points, the coordinate difference, and the Euclidean distance.
- The local context feature extraction module according to claim 3, characterized in that, given the coordinates P_i of the centroid point and the coordinates of the k-th neighboring point, the relative position encoding of the centroid point and the k-th neighboring point is:
- The local context feature extraction module according to claim 2, characterized in that determining the local context feature from the local rotation-invariant representation of each point and the relative position encoding comprises: introducing weights into the X-axis and Z-axis rotation-invariance representations, and determining the local context feature from each point's Y-axis rotation-invariance representation, the relative position encoding, and the weighted X-axis and Z-axis rotation-invariance representations.
- The local context feature extraction module according to claim 5, characterized in that the local context feature is:
- The local context feature extraction module according to claim 1, characterized in that the local context feature extraction module further comprises attention pooling, the attention pooling learning attention weights from geometric distance and feature distance and weighting the local context features to obtain enhanced local context features.
- The local context feature extraction module according to claim 7, characterized in that weighting the local context features with attention weights learned from geometric distance and feature distance to obtain enhanced local context features comprises: computing the geometric distance between points in the local spatial information; computing the feature distance; learning the weights of the attention pool from the geometric distance and the feature distance; concatenating the weights of the attention pool with the local context features, and obtaining the attention weights via a shared MLP and a normalized exponential (softmax) function; and obtaining the enhanced local context features from the attention weights and the neighboring-point features.
- A 3D point cloud scene semantic segmentation network, characterized in that the network comprises an encoder architecture and the local context feature extraction module for 3D point cloud scene semantic segmentation according to any one of claims 1-8; the local context feature extraction module for 3D point cloud scene semantic segmentation is embedded in the encoder architecture.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/134619 WO2024113078A1 (en) | 2022-11-28 | 2022-11-28 | Local context feature extraction module for semantic segmentation in 3d point cloud scenario |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024113078A1 true WO2024113078A1 (en) | 2024-06-06 |
Family
ID=91322641
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024113078A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118397192A (en) * | 2024-06-14 | 2024-07-26 | 中国科学技术大学 | Point cloud analysis method based on double-geometry learning and self-adaptive sparse attention |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020192431A1 (en) * | 2019-03-22 | 2020-10-01 | Huawei Technologies Co., Ltd. | System and method for ordered representation and feature extraction for point clouds obtained by detection and ranging sensor |
CN113011430A (en) * | 2021-03-23 | 2021-06-22 | 中国科学院自动化研究所 | Large-scale point cloud semantic segmentation method and system |
CN113807182A (en) * | 2021-08-17 | 2021-12-17 | 北京地平线信息技术有限公司 | Method, apparatus, medium, and electronic device for processing point cloud |
CN114529727A (en) * | 2022-04-25 | 2022-05-24 | 武汉图科智能科技有限公司 | Street scene semantic segmentation method based on LiDAR and image fusion |
CN115222951A (en) * | 2021-04-20 | 2022-10-21 | 上海交通大学 | Image processing method based on three-dimensional point cloud descriptor with rotation invariance |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Gao et al. | LFT-Net: Local feature transformer network for point clouds analysis | |
CN108594816B (en) | Method and system for realizing positioning and composition by improving ORB-SLAM algorithm | |
WO2024060395A1 (en) | Deep learning-based high-precision point cloud completion method and apparatus | |
Jing et al. | Self-supervised feature learning by cross-modality and cross-view correspondences | |
CN111860666A (en) | 3D target detection method based on point cloud and image self-attention mechanism fusion | |
CN112489083B (en) | Image feature point tracking matching method based on ORB-SLAM algorithm | |
WO2023015409A1 (en) | Object pose detection method and apparatus, computer device, and storage medium | |
CN111831844A (en) | Image retrieval method, image retrieval device, image retrieval apparatus, and medium | |
Xu et al. | GraspCNN: Real-time grasp detection using a new oriented diameter circle representation | |
CN108305278B (en) | Image matching correlation improvement method in ORB-SLAM algorithm | |
WO2024113078A1 (en) | Local context feature extraction module for semantic segmentation in 3d point cloud scenario | |
CN113989340A (en) | Point cloud registration method based on distribution | |
CN116188825A (en) | Efficient feature matching method based on parallel attention mechanism | |
CN115222951A (en) | Image processing method based on three-dimensional point cloud descriptor with rotation invariance | |
CN113449612A (en) | Three-dimensional target point cloud identification method based on sub-flow sparse convolution | |
Gao et al. | HDRNet: High‐Dimensional Regression Network for Point Cloud Registration | |
Yu et al. | A DenseNet feature-based loop closure method for visual SLAM system | |
Wang et al. | 6D pose estimation from point cloud using an improved point pair features method | |
WO2023109069A1 (en) | Image retrieval method and apparatus | |
CN116843753A (en) | Robust 6D pose estimation method based on bidirectional matching and global attention network | |
CN118097651A (en) | Local context feature extraction module for 3D point cloud scene semantic segmentation | |
CN114266863A (en) | Point cloud-based 3D scene graph generation method, system, equipment and readable storage medium | |
CN110135340A (en) | 3D hand gestures estimation method based on cloud | |
CN114092650B (en) | Three-dimensional point cloud generation method based on efficient graph convolution | |
Liang et al. | Dual Branch PnP Based Network for Monocular 6D Pose Estimation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22966683 Country of ref document: EP Kind code of ref document: A1 |