Lee et al., 2022 - Google Patents

Self-supervised monocular depth and motion learning in dynamic scenes: Semantic prior to rescue

Lee et al., 2022

Document ID
13021985192533269329
Author
Lee S
Rameau F
Im S
Kweon I
Publication year
Publication venue
International Journal of Computer Vision

External Links

Snippet

We introduce an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion, and depth in a monocular camera setup without geometric supervision. Our technical contributions are three-fold. First, we highlight the …
Continue reading at link.springer.com (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G06K9/00791Recognising scenes perceived from the perspective of a land vehicle, e.g. recognising lanes, obstacles or traffic signs on road scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6201Matching; Proximity measures
    • G06K9/6202Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20Image acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • G06F17/30799Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K2209/00Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/10Geometric effects
    • G06T15/20Perspective computation

Similar Documents

Publication Publication Date Title
Lee et al. Learning monocular depth in dynamic scenes via instance-aware projection consistency
Bai et al. Exploiting semantic information and deep matching for optical flow
Ke et al. Gsnet: Joint vehicle pose and shape reconstruction with geometrical and scene-aware supervision
Zeng et al. Joint 3d layout and depth prediction from a single indoor panorama image
Sun et al. Sc-depthv3: Robust self-supervised monocular depth estimation for dynamic scenes
Hoyer et al. Improving semi-supervised and domain-adaptive semantic segmentation with self-supervised depth estimation
Jiao et al. Effiscene: Efficient per-pixel rigidity inference for unsupervised joint learning of optical flow, depth, camera pose and motion segmentation
Zhou et al. Self-distilled feature aggregation for self-supervised monocular depth estimation
Guo et al. Context-enhanced stereo transformer
Chen et al. SAANet: Spatial adaptive alignment network for object detection in automatic driving
Lee et al. Self-supervised monocular depth and motion learning in dynamic scenes: Semantic prior to rescue
Xie et al. Mv-map: Offboard hd-map generation with multi-view consistency
Yang et al. SAM-Net: Semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications
Zhao et al. Jperceiver: Joint perception network for depth, pose and layout estimation in driving scenes
Lin et al. Unsupervised monocular visual odometry with decoupled camera pose estimation
Mehl et al. M-fuse: Multi-frame fusion for scene flow estimation
Yue et al. Self-supervised monocular depth estimation in dynamic scenes with moving instance loss
Wu et al. A dynamic infrared object tracking algorithm by frame differencing
Wang et al. Cbwloss: constrained bidirectional weighted loss for self-supervised learning of depth and pose
Han et al. Self-supervised monocular Depth estimation with multi-scale structure similarity loss
Ji et al. Stereo 3D object detection via instance depth prior guidance and adaptive spatial feature aggregation
Lee et al. Instance-wise depth and motion learning from monocular videos
Long et al. Detail preserving residual feature pyramid modules for optical flow
Yusiong et al. AsiANet: Autoencoders in autoencoder for unsupervised monocular depth estimation
Zhang et al. Unsupervised learning of monocular depth and ego-motion with space–temporal-centroid loss