
Colab: Balances the data distribution #1710

Merged
merged 3 commits into lllyasviel:develop on Mar 11, 2024
Conversation

Peppe289
Contributor

Peppe289 commented Jan 2, 2024

Analyses

After conducting several tests on Google Colab and reviewing the documentation, I found that launching Fooocus with specific flags enables the use of the cloud GPU's 15GB of VRAM. This significantly reduces system RAM consumption, speeds up processing, and prevents premature process termination.

What the user notices

By default, the VRAM remains unused and the program runs solely on system memory (12GB). The program crashes when attempting to use more than one image in the prompt; the process is terminated with ^C.

After this commit

Using the extra 15GB of VRAM allows you to make the most of the program on Colab. Memory use is well balanced, and the program manages to process 4 images in the prompt.

Thanks for your work, best regards.

enables the use of VRAM so as not to saturate the system RAM
@mashb1t
Collaborator

mashb1t commented Jan 2, 2024

Thank you for the contribution. While I totally agree with adding --always-high-vram, we should not set --disable-offload-from-vram by default, as switching models will cause them to also stay in VRAM, which may cause problems.
Did you test using Fooocus on Colab, incl. switching models?

@Peppe289
Contributor Author

Peppe289 commented Jan 2, 2024

Thank you for the contribution. While I totally agree with adding --always-high-vram, we should not set --disable-offload-from-vram by default, as switching models will cause them to also stay in VRAM, which may cause problems. Did you test using Fooocus on Colab, incl. switching models?

If you mean Advanced -> Model, no. I kept the default settings. As for the two flags, I kept both and didn't notice any problems.

@mashb1t
Collaborator

mashb1t commented Jan 2, 2024

FYI: if you do not offload from VRAM and switch models, every additional model will also be kept in VRAM, which is fine.
Just checked the code; this shouldn't be an issue, as in

    if not ALWAYS_VRAM_OFFLOAD:
        if get_free_memory(device) > memory_required:
            break

models are unloaded when more VRAM is required than is free.
Please nevertheless test with multiple models and provide your results to make sure everything is working as expected. Looking forward to it :)
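The quoted check can be read as part of a loop that frees models one at a time until the pending request fits. Below is a minimal, self-contained sketch of that idea; `unload_models_until_fits`, the model dicts, and the gigabyte figures are hypothetical stand-ins for Fooocus's actual model-management internals, not its real API.

```python
# Sketch of free-memory-driven model eviction, with stand-in values.
ALWAYS_VRAM_OFFLOAD = False

def unload_models_until_fits(loaded_models, free_vram, memory_required):
    """Evict loaded models until the pending request fits in VRAM."""
    unloaded = []
    for model in list(loaded_models):
        # The quoted check: stop evicting once enough VRAM is free,
        # unless offloading is forced.
        if not ALWAYS_VRAM_OFFLOAD and free_vram > memory_required:
            break
        loaded_models.remove(model)
        unloaded.append(model)
        free_vram += model["size"]  # freed VRAM in GB (illustrative)
    return unloaded, free_vram

models = [{"name": "sdxl_base", "size": 6}, {"name": "refiner", "size": 4}]
unloaded, free = unload_models_until_fits(models, free_vram=3, memory_required=8)
print([m["name"] for m in unloaded], free)  # ['sdxl_base'] 9
```

With offloading forced (ALWAYS_VRAM_OFFLOAD = True), the break never triggers and every model is evicted, which matches the intent of always offloading from VRAM.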

@Peppe289
Contributor Author

Peppe289 commented Jan 5, 2024

FYI: if you do not offload from VRAM and switch models, every additional model will also be kept in VRAM, which is fine. Just checked the code; this shouldn't be an issue, as in

    if not ALWAYS_VRAM_OFFLOAD:
        if get_free_memory(device) > memory_required:
            break

models are unloaded when more VRAM is required than is free.
Please nevertheless test with multiple models and provide your results to make sure everything is working as expected. Looking forward to it :)

Sorry for the delay. I have tested the models and everything seems to work correctly. Regarding the flags, I still have to test further, but even with both set they don't seem to cause any problems.

@mashb1t
Collaborator

mashb1t commented Feb 26, 2024

The solution in this PR has been referenced repeatedly and has already helped countless users run Fooocus on Colab.
A final test should be conducted, and a decision needs to be made on whether the changes will be merged to main.

@Peppe289
Contributor Author

The solution in this PR has been referenced repeatedly and has already helped countless users run Fooocus on Colab. A final test should be conducted, and a decision needs to be made on whether the changes will be merged to main.

okay, thanks for considering this change. best regards.

@mashb1t
Collaborator

mashb1t commented Mar 11, 2024

Here are my extensive testing results. Tests were conducted on Colab with a T4 instance (free tier) using 2 IP images (ImagePrompt) and a positive prompt at 1152×896, default model, default styles (irrelevant for the test).

default (only --share)

Process ran out of memory

[Screenshot 2024-03-11 at 18 30 47]

--attention-split

Process ran out of memory

[Screenshot 2024-03-11 at 18 33 46]

--always-high-vram

Process did NOT run out of memory

[Screenshot 2024-03-11 at 18 36 54]

--always-high-vram --disable-offload-from-vram

Process did NOT run out of memory for first generation, but DID run out of memory when using upscale or different adapters afterwards

[Screenshot 2024-03-11 at 18 39 44]

--always-high-vram --attention-split

Process did NOT run out of memory for first generation, but DID run out of memory when using upscale or different adapters afterwards

[Screenshot 2024-03-11 at 18 46 38]

--always-high-vram --disable-offload-from-vram --attention-split

Process did NOT run out of memory for first generation (but overall slower), but DID run out of memory when using upscale or different adapters afterwards

[Screenshot 2024-03-11 at 18 46 38]

--disable-offload-from-vram

Process did NOT run out of memory

[Screenshot 2024-03-11 at 18 49 40]

Learnings:

  • --always-high-vram is overall beneficial, as it shifts load from RAM to the much faster VRAM
  • --disable-offload-from-vram allows for faster loading when doing the same type of generation multiple times, but causes the instance to crash (due to not offloading) when using different adapters or functionalities.
  • --attention-split is overall beneficial and lowers both RAM and VRAM usage, but at the cost of performance

=> using --always-high-vram achieves the overall best balance between performance, flexibility, and stability.
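For reference, the test matrix above can also be expressed as data. A quick sketch (flag tuples and outcomes copied from the results listed in this comment; `True` means the session ran out of memory at some point, whether on the first generation or during a later upscale/adapter swap):

```python
# Outcomes of the Colab T4 test runs above; True = ran out of memory
# at some point in the session.
results = {
    (): True,
    ("--attention-split",): True,
    ("--always-high-vram",): False,
    ("--always-high-vram", "--disable-offload-from-vram"): True,
    ("--always-high-vram", "--attention-split"): True,
    ("--always-high-vram", "--disable-offload-from-vram", "--attention-split"): True,
    ("--disable-offload-from-vram",): False,
}

# Flag sets that never ran out of memory in these tests.
stable = [flags for flags, oom in results.items() if not oom]
print(stable)  # [('--always-high-vram',), ('--disable-offload-from-vram',)]
```

Of the two stable sets, --always-high-vram alone remains the recommendation, since --disable-offload-from-vram was flagged earlier in the thread as risky once multiple models accumulate in VRAM.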

@mashb1t mashb1t changed the base branch from main to develop March 11, 2024 18:57
@mashb1t mashb1t merged commit 532401d into lllyasviel:develop Mar 11, 2024
@poor7

poor7 commented Mar 18, 2024

@mashb1t I'll add that the flags --vae-in-fp16 --unet-in-fp16 --all-in-fp16 further increase generation speed by ~10-20% and reduce memory consumption by ~10-20%, improving overall performance and stability.
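The memory half of that claim follows from storage size alone: fp16 stores each weight in 2 bytes instead of fp32's 4. A back-of-envelope sketch (the 2.6B parameter count is an assumed round figure for illustration only, not a measured Fooocus value):

```python
# Half-precision halves per-parameter storage: 2 bytes vs 4 bytes.
params = 2_600_000_000  # assumed parameter count, for illustration only

fp32_gb = params * 4 / 1024**3  # bytes -> GiB
fp16_gb = params * 2 / 1024**3

print(f"fp32: {fp32_gb:.1f} GiB, fp16: {fp16_gb:.1f} GiB")
# fp32: 9.7 GiB, fp16: 4.8 GiB
```

The speed gain comes from lower memory bandwidth per weight and faster half-precision math on the GPU; the exact percentage will vary per card and workload.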
