
Add multi-gpu support, model cache, fix README and deprecated containers #60

Merged
merged 7 commits on May 5, 2024

Conversation

catid
Contributor

@catid catid commented May 3, 2024

This adds several improvements:
[x] Support for consumer GPUs for hosting
[x] Local model cache for docker container to avoid re-downloading the models on restart
[x] Fixes for deprecated container images and README corrections

@bluestyle97
Member

Hi, thanks for your contribution! Could you please add a new app_2gpu.py instead of editing the original app.py directly, and then open a new pull request?

@catid
Contributor Author

catid commented May 3, 2024

No, I think it makes more sense in app.py for use with Gradio containers. I want to be able to just deploy this, not manually run a different app.

@bluestyle97
Member

I think using separate scripts for different environments better empowers users to choose between a single GPU and multiple GPUs.

@catid
Contributor Author

catid commented May 3, 2024

I don't see your point: either way, you'd want to use CUDA_VISIBLE_DEVICES=0 to do what you're describing, so we might as well have just one script.
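For context, restricting the app to one GPU this way relies on standard CUDA behavior; a minimal sketch (the surrounding script is illustrative, only the environment variable itself comes from the discussion):

```python
import os

# Must be set before any CUDA library initializes, which is why it is
# usually exported in the shell: CUDA_VISIBLE_DEVICES=0 python app.py
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only GPU 0 to this process

# Frameworks imported after this point see a single device, e.g.
# torch.cuda.device_count() would then report 1.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```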

@catid
Contributor Author

catid commented May 3, 2024

And for the docker container you can enforce single GPU usage like this:

docker run -it -p 43839:43839 --platform=linux/amd64 --gpus '"device=0"' -v $HOME/models/:/workspace/instantmesh/models instantmesh

@catid
Contributor Author

catid commented May 3, 2024

So there's no reason to have a lot of ugly code duplication for this feature. Even if you wanted to select the mode within the script, it would be better to use an argparse --argument instead.
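The flag-based selection suggested above could look like this minimal sketch (the --num-gpus flag name is hypothetical, not from this PR):

```python
import argparse

parser = argparse.ArgumentParser(description="Launch the app on one or more GPUs")
parser.add_argument("--num-gpus", type=int, default=1, choices=(1, 2),
                    help="number of GPUs to spread the pipeline across")

# parse_args([]) uses the defaults for illustration; a real script
# would call parser.parse_args() to read sys.argv.
args = parser.parse_args([])
print(args.num_gpus)  # 1
```

One script then serves both environments, with the multi-GPU path taken only when the flag asks for it.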

@catid catid changed the title Add multi-gpu support Add multi-gpu support, model cache, fix README and deprecated containers May 3, 2024
@bluestyle97
Member

Ok, I get your point and agree with you. Before merging the commits, I noticed that the CUDA version in the Dockerfile is upgraded from 12.1 to 12.4, while PyTorch 2.1.0 is compiled against CUDA 12.1 by default. Could the change of CUDA version lead to incompatibility problems? How about keeping the original CUDA version unchanged?

@catid
Contributor Author

catid commented May 4, 2024

Sometimes it can lead to problems, but I tested it and it's working great. I also regularly run 12.4.1 on all my Ubuntu GPU servers, so I have some trust in this version of CUDA. I've had issues with prior versions, so I know roughly what to look out for (usually wheel build errors from nvcc).
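The reason a 12.4 image can host wheels built for 12.1 is CUDA's minor-version compatibility guarantee (within a major release, newer toolkits run binaries built against older ones). A minimal sketch of that rule as a heuristic check (the function and version strings are illustrative, not from this PR):

```python
def cuda_compatible(toolkit: str, wheel_cuda: str) -> bool:
    """Minor-version compatibility heuristic: within one CUDA major
    release (11.x, 12.x, ...), a toolkit can run binaries built
    against the same or an earlier minor version."""
    t_major, t_minor = (int(x) for x in toolkit.split(".")[:2])
    w_major, w_minor = (int(x) for x in wheel_cuda.split(".")[:2])
    return t_major == w_major and t_minor >= w_minor

# A CUDA 12.4 container can host PyTorch wheels built for CUDA 12.1:
print(cuda_compatible("12.4", "12.1"))  # True
# The reverse direction is not guaranteed:
print(cuda_compatible("12.1", "12.4"))  # False
```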

@bluestyle97 bluestyle97 merged commit 5dcb994 into TencentARC:main May 5, 2024