
MagicPose4D: Crafting Articulated Models with Appearance and Motion Control

Hao Zhang1* · Di Chang2* · Fang Li1 · Mohammad Soleymani2 · Narendra Ahuja1
1University of Illinois Urbana-Champaign     2University of Southern California
*Equal Contribution     Equal Advising

Paper PDF Project Page

We introduce MagicPose4D, a novel framework for 4D generation providing more accurate and customizable 4D motion retargeting. We propose a dual-phase reconstruction process that initially uses accurate 2D and pseudo 3D supervision without skeleton constraints, and subsequently refines the model with skeleton constraints to ensure physical plausibility. We incorporate a novel Global-Local Chamfer loss function that aligns the overall distribution of mesh vertices with the supervision and maintains part-level alignment without additional annotations. Our method enables cross-category motion transfer using a kinematic-chain-based skeleton, ensuring smooth transitions between frames through dynamic rigidity and achieving robust generalization without the need for additional training.
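
The Global-Local Chamfer loss can be pictured as a standard Chamfer term over all vertices plus per-part Chamfer terms. The sketch below is a minimal PyTorch illustration of that idea, not the official implementation; the part labels (pred_parts, target_parts) and the weight lambda_local are assumptions made for clarity.

    # Minimal sketch of a global-local Chamfer loss (illustrative, not the official code).
    # pred: (N, 3) predicted mesh vertices; target: (M, 3) supervision points.
    # pred_parts / target_parts: integer part labels per point (hypothetical inputs here;
    # the paper maintains part-level alignment without extra annotations).
    import torch

    def chamfer(a, b):
        # Symmetric Chamfer distance between point sets a (P, 3) and b (Q, 3).
        d = torch.cdist(a, b)                      # (P, Q) pairwise distances
        return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

    def global_local_chamfer(pred, target, pred_parts, target_parts, lambda_local=1.0):
        loss = chamfer(pred, target)               # global term: whole-shape alignment
        for p in pred_parts.unique():              # local terms: per-part alignment
            pa, tb = pred[pred_parts == p], target[target_parts == p]
            if len(pa) and len(tb):
                loss = loss + lambda_local * chamfer(pa, tb)
        return loss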

For 3D reconstruction from monocular videos, please also check our previous work S3O & LIMR!

For 2D video motion retargeting and animation, please also check our previous work MagicPose!

News

  • [2024.5.22] Demo for the whole MagicPose4D pipeline is coming soon.
  • [2024.5.22] Release MagicPose4D paper and project page.
  • [2024.5.22] Release Python bindings for Automatic-Rigging.

Install

  • Please follow the repository: Automatic-Rigging to install the package.

    git clone https://github.com/haoz19/Automatic-Rigging.git
    cd Automatic-Rigging
    pip install .
    

    This package is for calculating skinning weights and aligning the mesh with template skeletons.

  • Install manifold remeshing:

    git clone --recursive git@github.com:hjwdzh/Manifold.git; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release; make -j8; cd ../../
    
  • Install threestudio for image-to-3D generation

    See installation.md for additional information, including installation via Docker.

    The following steps have been tested on Ubuntu 20.04.

    • You must have an NVIDIA graphics card with at least 6GB VRAM and have CUDA installed.
    • Install Python >= 3.8.
    • (Optional, Recommended) Create a virtual environment:
    python3 -m virtualenv venv
    . venv/bin/activate
    
    # Newer pip versions, e.g. pip-23.x, can be much faster than old versions, e.g. pip-20.x.
    # For instance, it caches the wheels of git packages to avoid unnecessarily rebuilding them later.
    python3 -m pip install --upgrade pip
    • Install PyTorch >= 1.12. We have tested on torch1.12.1+cu113 and torch2.0.0+cu118, but other versions should also work fine.
    # torch1.12.1+cu113
    pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
    # or torch2.0.0+cu118
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
    • (Optional, Recommended) Install ninja to speed up the compilation of CUDA extensions:
    pip install ninja
    • Install dependencies:
    pip install -r requirements.txt
    • (Optional) tiny-cuda-nn installation might require downgrading pip to 23.0.1

    • (Optional, Recommended) The best-performing models in threestudio use the newly released T2I model DeepFloyd IF, which currently requires signing a license agreement. If you would like to use these models, you need to accept the license on the model card of DeepFloyd IF and log in to the Hugging Face hub in the terminal via huggingface-cli login.
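
    After completing the steps above, a quick sanity check (a minimal, illustrative snippet assuming a standard PyTorch install) confirms that PyTorch sees your GPU and that CUDA is available:

    # Environment sanity check (illustrative; not part of the repository).
    import torch

    print("torch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        # threestudio recommends at least 6GB of VRAM
        print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")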

Dual-phase 4D Reconstruction Module

  • Generating 3D pseudo supervision via Zero123; you can also use Stable Zero123, Magic123, or other image-to-3D methods.

    # Single image example:
    # Training:
    python launch.py --config configs/zero123.yaml --train --gpu 0 data.image_path=load/images/<input image>
    # Mesh Extraction:
    python launch.py --config outputs/zero123/<output_folder>/configs/parsed.yaml data.image_path=load/images/<input image> system.exporter.context_type=cuda --export --gpu 0 resume=outputs/zero123/<output_folder>/ckpts/last.ckpt  system.exporter_type=mesh-exporter system.geometry.isosurface_method=mc-cpu system.geometry.isosurface_resolution=256
    
    # Batch example: coming soon
    
  • First phase: code coming soon

    For the first phase, we only need a sequence of accurate 3D meshes; you can also consider other existing 4D reconstruction methods such as LIMR/S3O, LASR, and BANMo for reconstructing articulated objects from monocular videos.

  • Second phase: run <root>/PoseTransfer/batch_run.sh

    This step ensures the skinning weights and skeleton are physically plausible.
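
    The relation between the skinning weights and the skeleton can be summarized by standard linear blend skinning (LBS); the sketch below is an illustrative PyTorch version under assumed tensor shapes, not the code executed by batch_run.sh.

    # Illustrative linear blend skinning (LBS); not the script's actual implementation.
    # verts: (V, 3) rest-pose vertices; weights: (V, J) skinning weights (rows sum to 1);
    # joint_transforms: (J, 4, 4) per-joint rigid transforms relative to the rest pose.
    import torch

    def linear_blend_skinning(verts, weights, joint_transforms):
        homo = torch.cat([verts, torch.ones(verts.shape[0], 1)], dim=1)   # (V, 4)
        blended = torch.einsum("vj,jab->vab", weights, joint_transforms)  # (V, 4, 4)
        posed = torch.einsum("vab,vb->va", blended, homo)                 # (V, 4)
        return posed[:, :3]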

Cross-Category Pose Transfer Module

  • Please refer to our demo; a simplified sketch of the underlying idea is shown below.
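
    As a rough picture of what the demo does: cross-category transfer reuses the source motion's per-joint local rotations on a target skeleton with a matching kinematic chain, then re-runs forward kinematics with the target's bone lengths. The sketch below is a simplified, hypothetical illustration of that idea (all names are assumptions), not the demo code.

    # Simplified, hypothetical sketch of kinematic-chain pose transfer (not the demo code).
    # local_rotations: (J, 3, 3) per-joint local rotations taken from the source motion;
    # offsets: (J, 3) target-skeleton bone offsets from each joint's parent (rest pose);
    # parents: list of parent indices, with parents[0] == -1 for the root joint.
    import torch

    def forward_kinematics(local_rotations, offsets, parents):
        J = len(parents)
        global_rot, joint_pos = [None] * J, [None] * J
        for j in range(J):
            if parents[j] < 0:                                  # root joint
                global_rot[j] = local_rotations[j]
                joint_pos[j] = offsets[j]
            else:
                p = parents[j]
                global_rot[j] = global_rot[p] @ local_rotations[j]
                joint_pos[j] = joint_pos[p] + global_rot[p] @ offsets[j]
        return torch.stack(joint_pos)                           # (J, 3) posed joint positions

    # Pose transfer: apply the source rotations to the target skeleton's offsets, e.g.
    # target_joints = forward_kinematics(source_rotations, target_offsets, parents)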

Citing

If you find our work useful, please consider citing:

@misc{zhang2024magicpose4d,
      title={MagicPose4D: Crafting Articulated Models with Appearance and Motion Control}, 
      author={Hao Zhang and Di Chang and Fang Li and Mohammad Soleymani and Narendra Ahuja},
      year={2024},
      eprint={2405.14017},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
