CUDA out of memory - minimum gpu size #167

Closed
yscoffee opened this issue May 13, 2024 · 2 comments

@yscoffee

Thanks for the amazing work.
Could you please suggest the minimum GPU memory needed for offline inference?
I tried a 24GB 4090, but it turned out not to be enough to run the offline inference example.

@BIGBALLON

@yscoffee The model has 25.5B parameters in total, consisting of InternViT-6B-448px-V1-5 + MLP + InternLM2-Chat-20B.

  • For bf16, at least 52GB of GPU memory is required.
  • For int8, at least 26GB of GPU memory is needed; with the additional runtime overhead, 32GB is recommended. (A quick back-of-the-envelope check is sketched below.)
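These figures follow from the parameter count times the bytes per parameter, plus headroom for activations, the KV cache, and the CUDA context. A minimal sanity check (the 25.5B figure is taken from the comment above):

```python
# Rough weight-memory estimate for a 25.5B-parameter model (weights only;
# activations, KV cache, and CUDA context add extra overhead on top).
params = 25.5e9
for dtype, nbytes in {"bf16": 2, "int8": 1}.items():
    print(f"{dtype}: ~{params * nbytes / 1e9:.1f} GB for weights")

# bf16: ~51.0 GB for weights  -> hence the ~52 GB minimum quoted above
# int8: ~25.5 GB for weights  -> hence 32 GB recommended once overhead is included
```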

For concrete examples, if you can read Chinese, you can refer to this article:

  • 1 GPU for the int8 model [screenshot]
  • 2 GPUs for the bf16 model [screenshot]
  • 4 GPUs for the bf16 model [screenshot]
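For reference, a minimal loading sketch with Hugging Face Transformers covering both configurations. This is only a sketch: the OpenGVLab/InternVL-Chat-V1-5 checkpoint name is an assumption, 8-bit loading requires the bitsandbytes package, and device_map="auto" requires accelerate.

```python
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"  # assumed checkpoint name; adjust to your setup

# Option A: int8 on a single ~32GB GPU (requires bitsandbytes).
model = AutoModel.from_pretrained(
    path,
    load_in_8bit=True,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval()

# Option B: bf16 sharded across 2 or 4 GPUs (requires accelerate).
# model = AutoModel.from_pretrained(
#     path,
#     torch_dtype=torch.bfloat16,
#     low_cpu_mem_usage=True,
#     trust_remote_code=True,
#     device_map="auto",  # splits layers across all visible GPUs
# ).eval()

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
```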

@yscoffee
Author

Thank you for the detailed explanation!
