WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language

Project Page

This repository contains the WildRefer dataset and the official implementation of WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language.

Dataset

Our dataset can be downloaded here.

We strongly recommend using our pre-processed HuCenLife and STCrowd data, which can be downloaded here.

How to use this code

Data Preparation

Please arrange the dataset in the following folder structure:

./
├── data/
│   ├── liferefer_test.json
│   ├── liferefer_train.json
│   ├── strefer_test.json
│   └── strefer_train.json
└── src/
    ├── LifeRefer.zip
    └── STRefer.zip

Unzip our processed data

cd src
unzip LifeRefer.zip
unzip STRefer.zip
cd ..
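If you would rather script this step, the following is a minimal Python sketch that checks the layout above and then extracts the two archives in place. The helper names `missing_files` and `extract_archives` are illustrative and not part of this repository; the file list is copied from the folder structure shown earlier.

```python
from pathlib import Path
import zipfile

# Expected files, copied from the folder structure in this README.
EXPECTED = [
    "data/liferefer_test.json",
    "data/liferefer_train.json",
    "data/strefer_test.json",
    "data/strefer_train.json",
    "src/LifeRefer.zip",
    "src/STRefer.zip",
]


def missing_files(root="."):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).is_file()]


def extract_archives(root="."):
    """Extract each expected .zip into its own folder, like `unzip` run inside src/."""
    root = Path(root)
    for rel in EXPECTED:
        archive = root / rel
        if archive.suffix == ".zip":
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(archive.parent)


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        extract_archives()
        print("Archives extracted into src/")
```

Run it from the repository root. It only reports missing files and never downloads anything, so the archives still have to be fetched manually first.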

Environment Installation

Our environment is based on Python 3.8 and CUDA 11.3. You can set it up with conda:

conda create -n wildrefer_env python=3.8 -y
conda activate wildrefer_env
conda install conda-forge::cudatoolkit-dev=11.3 -y
pip install torch==1.11.0 torchvision==0.12.0 --index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
python -m spacy download en_core_web_sm
cd pointnet2
python setup.py install
cd ..
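After installation, a quick sanity check can confirm that the pinned packages were actually picked up. The sketch below is not part of the repository; `check_pins` is a hypothetical helper, and the pinned versions are taken from the `pip install` line above.

```python
from importlib import metadata

# Versions pinned by the install commands in this README.
PINNED = {"torch": "1.11.0", "torchvision": "0.12.0"}


def check_pins(pins=PINNED):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Wheels from the cu113 index carry a local suffix such as "+cu113",
        # so compare only the part before "+".
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins().items():
        print(f"{name}: installed={installed} pinned={PINNED[name]} ok={ok}")
    try:
        import torch
        print("CUDA available:", torch.cuda.is_available())
        print("CUDA version PyTorch was built with:", torch.version.cuda)
    except ImportError:
        print("torch is not importable in this environment")
```

Run it inside the activated `wildrefer_env` environment. If CUDA is reported as unavailable, the `pointnet2` extension build is also likely to fail, so it is worth resolving that first.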

Test

Our weights can be downloaded here. Place them under the weights/ folder, using the file names expected by the test commands below:

./
└── weights/
    ├── liferefer_weights.pth
    └── strefer_weights.pth

STRefer

python test.py --dataset strefer --pretrain weights/strefer_weights.pth --max_lang_num 50 --frame_num 2 --batch_size 36

LifeRefer

python test.py --dataset liferefer --pretrain weights/liferefer_weights.pth --frame_num 2 --batch_size 32
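To evaluate both datasets in one go, the two commands above can be wrapped in a small driver. This is only a sketch: `build_command` and `weights_path` are hypothetical helpers, not part of the repository, and every flag is copied verbatim from the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Flags copied verbatim from the two test commands in this README.
RUNS = {
    "strefer": ["--pretrain", "weights/strefer_weights.pth",
                "--max_lang_num", "50", "--frame_num", "2", "--batch_size", "36"],
    "liferefer": ["--pretrain", "weights/liferefer_weights.pth",
                  "--frame_num", "2", "--batch_size", "32"],
}


def build_command(dataset):
    """Return the argv list for evaluating `dataset` with test.py."""
    return [sys.executable, "test.py", "--dataset", dataset, *RUNS[dataset]]


def weights_path(dataset):
    """Return the checkpoint path that follows --pretrain for `dataset`."""
    args = RUNS[dataset]
    return Path(args[args.index("--pretrain") + 1])


if __name__ == "__main__":
    for dataset in RUNS:
        if not weights_path(dataset).is_file():
            print(f"Skipping {dataset}: {weights_path(dataset)} not found")
            continue
        # check=True stops the loop if test.py exits with an error.
        subprocess.run(build_command(dataset), check=True)
```

Checking for the checkpoint before launching avoids waiting for the dataset to load only to fail on a missing weights file.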

Train

STRefer

python train.py --dataset strefer --max_lang_num 50

LifeRefer

python train.py --dataset liferefer --max_lang_num 100

License:

All datasets are published under the Creative Commons Attribution-NonCommercial-ShareAlike license. This means that you must attribute the work in the manner specified by the authors, you may not use this work for commercial purposes, and if you alter, transform, or build upon this work, you may distribute the resulting work only under the same license.

