GPU memory very high and unbalanced when testing Gemma #1892
Comments
The figure shows a run with batch_size 1.
Hi! Is this using tensor parallelism? If so, I think this is sometimes expected behavior from HF. Adding …
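The suggestion in the reply above is truncated; as a minimal, illustrative sketch only, capping per-device memory might look like the command below. The model name and the `parallelize=True` / `max_memory_per_gpu` model args are assumptions here, not confirmed from the thread:

```bash
# Illustrative only: limit how much memory HF accelerate may place on each GPU
# when the model is sharded across devices (args assumed, not taken from the thread).
lm_eval --model hf \
  --model_args pretrained=google/gemma-7b,parallelize=True,max_memory_per_gpu=20GiB \
  --tasks hellaswag \
  --batch_size 1
```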
I'm running Gemma-7B on an RTX 4090 and it takes up a lot of GPU memory, but Llama-3-8B runs normally with the same prompt. When I enable the tensor parallel strategy, GPU memory usage is very unbalanced (as shown in the figure below). Is this a Gemma bug, or is there something wrong with my usage?
My platform: Linux
Python version: 3.10.14
lm-evaluation-harness version: 0.4.2
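For context, the exact command is not given in the issue; a run along the lines below would match the described setup, with the model name, task, and `dtype`/`parallelize` model args assumed rather than taken from the report:

```bash
# Hypothetical reproduction of the reported setup (exact invocation not shown in the issue).
# parallelize=True shards the model across visible GPUs via HF accelerate's device_map.
lm_eval --model hf \
  --model_args pretrained=google/gemma-7b,parallelize=True,dtype=bfloat16 \
  --tasks hellaswag \
  --batch_size 1
```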