Add autobatch feature for best batch-size estimation #5092

Merged · 47 commits · Oct 25, 2021

Conversation

glenn-jocher (Member) commented Oct 8, 2021

Adds a YOLOv5 autobatch feature that solves for the best batch size on CUDA devices when --batch-size -1 is passed to train.py:

$ python train.py --batch-size -1

YOLOv5 🚀 v6.0-67-g60e42e1 torch 1.9.0+cu111 CUDA:0 (A100-SXM4-40GB, 40536MiB)
...
autobatch: Computing optimal batch size for --imgsz 640
autobatch: CUDA:0 39.6G total, 0.0996G reserved, 0.0813G allocated, 39.4G free
      Params      GFLOPs  GPU_mem (GB)  forward (ms) backward (ms)                   input                  output
     7235389       16.53         0.296            22         19.57        (1, 3, 640, 640)                    list
     7235389       33.06         0.568         21.47         16.62        (2, 3, 640, 640)                    list
     7235389       66.13         1.007         21.79         16.77        (4, 3, 640, 640)                    list
     7235389       132.3         1.797         21.46         18.54        (8, 3, 640, 640)                    list
     7235389       264.5         3.280         22.64         23.45       (16, 3, 640, 640)                    list
autobatch: Using batch-size 179 for CUDA:0 35.6G/39.6G (90%)
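
The log above illustrates the approach: the model is profiled at a handful of small batch sizes (1, 2, 4, 8, 16), the measured GPU memory is fitted with a first-degree polynomial, and the fit is solved for the batch size that would fill roughly 90% of device memory. A minimal sketch of that estimation step follows; the function name estimate_batch_size, its signature, and the use of np.polyfit are illustrative assumptions, not the exact contents of utils/autobatch.py:

import numpy as np
import torch

def estimate_batch_size(measured, device=0, fraction=0.90):
    # measured: (batch_size, gpu_mem_gib) pairs from profiling runs, e.g.
    # [(1, 0.296), (2, 0.568), (4, 1.007), (8, 1.797), (16, 3.280)] as in the log above
    total = torch.cuda.get_device_properties(device).total_memory / 1024**3  # GiB
    reserved = torch.cuda.memory_reserved(device) / 1024**3
    allocated = torch.cuda.memory_allocated(device) / 1024**3
    free = total - (reserved + allocated)  # memory still available (GiB)

    batches, mems = zip(*measured)
    slope, intercept = np.polyfit(batches, mems, deg=1)  # mem ≈ slope * batch + intercept

    # Solve slope * b + intercept = fraction * free for the batch size b
    return max(int((fraction * free - intercept) / slope), 1)

Applied to the five measurements in the table above on a device with ~39.4G free, this linear fit extrapolates to a batch size of roughly 178, in line with the 179 the log reports.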


Limitations

  • Works only for single-GPU training with train.py
  • Experimental and still under development
  • The auto-computed batch size is not saved to opt.yaml, so autobatch runs again on --resume
  • NOT RECOMMENDED FOR PRODUCTION USE

@kalenmike @stefani-kovachevska @AyushExel

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Introduced automatic batch size estimation for YOLOv5 training on single GPU setups.

📊 Key Changes

  • Added a new utility function check_train_batch_size to estimate the optimal training batch size.
  • Modified train.py to enable batch-size auto-tuning when a batch size of -1 is specified for single-GPU training (a sketch of the call site follows this list).
  • Introduced a new file utils/autobatch.py which contains the autobatch function to facilitate batch size estimation based on available GPU memory.
  • Minor refinement to error handling in utils/torch_utils.py by commenting out a debug print statement.
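
A minimal sketch of how the train.py hook could look; check_train_batch_size and the new utils/autobatch.py are named in the changes above, but the exact call site, the RANK guard, and the local variable names are assumptions:

# Hypothetical call site in train.py: auto-tune only on single-GPU runs (RANK == -1)
# and only when the user explicitly asks for it with --batch-size -1.
from utils.autobatch import check_train_batch_size

if RANK == -1 and batch_size == -1:
    # Estimate the largest batch size that fits in GPU memory for this model and image size
    batch_size = check_train_batch_size(model, imgsz)

Because the estimate profiles the actual model at the requested --imgsz, it has to run after the model is built and moved to the CUDA device.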

🎯 Purpose & Impact

  • 🚀 Enhanced Usability: The new feature allows users with less technical knowledge to efficiently utilize their GPU for training without manually experimenting with batch sizes.
  • 💡 Optimized Resource Usage: Helps avoid out-of-memory errors by automating the selection of the largest possible batch size that fits in the GPU memory, ensuring better GPU utilization.
  • Streamlined Training: Simplifies the training setup process, potentially making YOLOv5 more accessible to a broader audience.
  • 🐞 Reduced Debug Noise: Adjusted the error output for smoother debugging and user experience.

glenn-jocher changed the title from Feature/auto batch to Add autobatch feature for best batch-size estimation on Oct 8, 2021
glenn-jocher self-assigned this on Oct 13, 2021
glenn-jocher added the enhancement (New feature or request) label on Oct 13, 2021
glenn-jocher force-pushed the feature/auto_batch branch 2 times, most recently from 152a1fc to f414133 on October 23, 2021 at 12:05
This was referenced May 18, 2022
BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request on Aug 26, 2022 (…#5092). The squashed commit history: Autobatch · fix mem · fix mem2 · Update (×21) · Update train.py · print result · Cleanup print result · swap fix in call · to 64 · use total · fix (×5) · Update (×7) · Cleanup printing · Update final printout · Update autobatch.py (×3)
glenn-jocher mentioned this pull request on Feb 28, 2023
Labels: enhancement (New feature or request)
Projects: none yet
Linked issues: none yet
Participants: 1