Add autobatch feature for best batch-size estimation #5092

Merged · 47 commits · Oct 25, 2021

Conversation

glenn-jocher (Member) commented Oct 8, 2021

Adds a YOLOv5 autobatch feature that solves for the best batch size on CUDA devices when --batch-size -1 is passed to train.py:

$ python train.py --batch-size -1

YOLOv5 🚀 v6.0-67-g60e42e1 torch 1.9.0+cu111 CUDA:0 (A100-SXM4-40GB, 40536MiB)
...
autobatch: Computing optimal batch size for --imgsz 640
autobatch: CUDA:0 39.6G total, 0.0996G reserved, 0.0813G allocated, 39.4G free
      Params      GFLOPs  GPU_mem (GB)  forward (ms) backward (ms)                   input                  output
     7235389       16.53         0.296            22         19.57        (1, 3, 640, 640)                    list
     7235389       33.06         0.568         21.47         16.62        (2, 3, 640, 640)                    list
     7235389       66.13         1.007         21.79         16.77        (4, 3, 640, 640)                    list
     7235389       132.3         1.797         21.46         18.54        (8, 3, 640, 640)                    list
     7235389       264.5         3.280         22.64         23.45       (16, 3, 640, 640)                    list
autobatch: Using batch-size 179 for CUDA:0 35.6G/39.6G (90%)
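
The log above illustrates the approach: the model is profiled at a handful of small batch sizes (1, 2, 4, 8, 16), the measured GPU memory is fitted with a first-degree polynomial, and the fit is solved for the batch size that would fill roughly 90% of device memory. A minimal sketch of that estimation step follows; the function name estimate_batch_size, its signature, and the use of np.polyfit are illustrative assumptions, not the exact contents of utils/autobatch.py:

import numpy as np
import torch

def estimate_batch_size(measured, device=0, fraction=0.90):
    # measured: (batch_size, gpu_mem_gib) pairs from profiling runs, e.g.
    # [(1, 0.296), (2, 0.568), (4, 1.007), (8, 1.797), (16, 3.280)] as in the log above
    total = torch.cuda.get_device_properties(device).total_memory / 1024**3  # GiB
    reserved = torch.cuda.memory_reserved(device) / 1024**3
    allocated = torch.cuda.memory_allocated(device) / 1024**3
    free = total - (reserved + allocated)  # memory still available (GiB)

    batches, mems = zip(*measured)
    slope, intercept = np.polyfit(batches, mems, deg=1)  # mem ≈ slope * batch + intercept

    # Solve slope * b + intercept = fraction * free for the batch size b
    return max(int((fraction * free - intercept) / slope), 1)

Applied to the five measurements in the table above on a device with ~39.4G free, this linear fit extrapolates to a batch size of roughly 178, in line with the 179 the log reports.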


Limitations

  • Works only for single-GPU training with train.py
  • Experimental and still under development
  • The auto-computed batch size is not saved to opt.yaml, so autobatch runs again on --resume
  • NOT RECOMMENDED FOR PRODUCTION USE

@kalenmike @stefani-kovachevska @AyushExel

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Introduced automatic batch size estimation for YOLOv5 training on single GPU setups.

📊 Key Changes

  • Added a new utility function check_train_batch_size to estimate the optimal training batch size.
  • Modified train.py to enable batch-size auto-tuning when a batch size of -1 is specified for single-GPU training (a sketch of the call site follows this list).
  • Introduced a new file utils/autobatch.py which contains the autobatch function to facilitate batch size estimation based on available GPU memory.
  • Minor refinement to error handling in utils/torch_utils.py by commenting out a debug print statement.
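
A minimal sketch of how the train.py hook could look; check_train_batch_size and the new utils/autobatch.py are named in the changes above, but the exact call site, the RANK guard, and the local variable names are assumptions:

# Hypothetical call site in train.py: auto-tune only on single-GPU runs (RANK == -1)
# and only when the user explicitly asks for it with --batch-size -1.
from utils.autobatch import check_train_batch_size

if RANK == -1 and batch_size == -1:
    # Estimate the largest batch size that fits in GPU memory for this model and image size
    batch_size = check_train_batch_size(model, imgsz)

Because the estimate profiles the actual model at the requested --imgsz, it has to run after the model is built and moved to the CUDA device.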

🎯 Purpose & Impact

  • 🚀 Enhanced Usability: The new feature allows users with less technical knowledge to efficiently utilize their GPU for training without manually experimenting with batch sizes.
  • 💡 Optimized Resource Usage: Helps avoid out-of-memory errors by automating the selection of the largest possible batch size that fits in the GPU memory, ensuring better GPU utilization.
  • Streamlined Training: Simplifies the training setup process, potentially making YOLOv5 more accessible to a broader audience.
  • 🐞 Reduced Debug Noise: Adjusted the error output for smoother debugging and user experience.

glenn-jocher changed the title from Feature/auto batch to Add autobatch feature for best batch-size estimation on Oct 8, 2021
glenn-jocher self-assigned this on Oct 13, 2021
glenn-jocher added the enhancement (New feature or request) label on Oct 13, 2021
glenn-jocher force-pushed the feature/auto_batch branch 2 times, most recently from 152a1fc to f414133 on October 23, 2021 at 12:05
This was referenced May 18, 2022
BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request on Aug 26, 2022 (…#5092). The squashed commit history: Autobatch · fix mem · fix mem2 · Update (×21) · Update train.py · print result · Cleanup print result · swap fix in call · to 64 · use total · fix (×5) · Update (×7) · Cleanup printing · Update final printout · Update autobatch.py (×3)
glenn-jocher mentioned this pull request on Feb 28, 2023
Labels: enhancement (New feature or request)
Projects: none yet
Linked issues: none yet
Participants: 1