Lee et al., 2022 - Google Patents
Self-supervised monocular depth and motion learning in dynamic scenes: Semantic prior to rescue (Lee et al., 2022)
- Document ID
- 13021985192533269329
- Author
- Lee S
- Rameau F
- Im S
- Kweon I
- Publication year
- 2022
- Publication venue
- International Journal of Computer Vision
Snippet
We introduce an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion, and depth in a monocular camera setup without geometric supervision. Our technical contributions are three-fold. First, we highlight the …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00791—Recognising scenes perceived from the perspective of a land vehicle, e.g. recognising lanes, obstacles or traffic signs on road scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K2209/00—Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/10—Geometric effects
- G06T15/20—Perspective computation
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Lee et al. | Learning monocular depth in dynamic scenes via instance-aware projection consistency | |
Bai et al. | Exploiting semantic information and deep matching for optical flow | |
Ke et al. | Gsnet: Joint vehicle pose and shape reconstruction with geometrical and scene-aware supervision | |
Zeng et al. | Joint 3d layout and depth prediction from a single indoor panorama image | |
Sun et al. | Sc-depthv3: Robust self-supervised monocular depth estimation for dynamic scenes | |
Hoyer et al. | Improving semi-supervised and domain-adaptive semantic segmentation with self-supervised depth estimation | |
Jiao et al. | Effiscene: Efficient per-pixel rigidity inference for unsupervised joint learning of optical flow, depth, camera pose and motion segmentation | |
Zhou et al. | Self-distilled feature aggregation for self-supervised monocular depth estimation | |
Guo et al. | Context-enhanced stereo transformer | |
Chen et al. | SAANet: Spatial adaptive alignment network for object detection in automatic driving | |
Lee et al. | Self-supervised monocular depth and motion learning in dynamic scenes: Semantic prior to rescue | |
Xie et al. | Mv-map: Offboard hd-map generation with multi-view consistency | |
Yang et al. | SAM-Net: Semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications | |
Zhao et al. | Jperceiver: Joint perception network for depth, pose and layout estimation in driving scenes | |
Lin et al. | Unsupervised monocular visual odometry with decoupled camera pose estimation | |
Mehl et al. | M-fuse: Multi-frame fusion for scene flow estimation | |
Yue et al. | Self-supervised monocular depth estimation in dynamic scenes with moving instance loss | |
Wu et al. | A dynamic infrared object tracking algorithm by frame differencing | |
Wang et al. | Cbwloss: constrained bidirectional weighted loss for self-supervised learning of depth and pose | |
Han et al. | Self-supervised monocular Depth estimation with multi-scale structure similarity loss | |
Ji et al. | Stereo 3D object detection via instance depth prior guidance and adaptive spatial feature aggregation | |
Lee et al. | Instance-wise depth and motion learning from monocular videos | |
Long et al. | Detail preserving residual feature pyramid modules for optical flow | |
Yusiong et al. | AsiANet: Autoencoders in autoencoder for unsupervised monocular depth estimation | |
Zhang et al. | Unsupervised learning of monocular depth and ego-motion with space–temporal-centroid loss |