This repo is based on release v7.0 of yolov5. Clone it with:
git clone https://github.com/cshbli/yolov5_qat.git
Download the COCO 2017 labels and images, and unzip them into a directory at the same level as this project:
Projects
├── datasets
│   └── coco                    # Directory for datasets
│       ├── annotations
│       │   └── instances_val2017.json
│       ├── images
│       │   ├── train2017
│       │   └── val2017
│       ├── labels
│       │   ├── train2017
│       │   └── val2017
│       ├── train2017.txt
│       └── val2017.txt
└── yolov5_qat                  # Quantization source code
wget https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip # Download the labels needed
wget http:https://images.cocodataset.org/zips/train2017.zip
wget http:https://images.cocodataset.org/zips/val2017.zip
wget http:https://images.cocodataset.org/annotations/annotations_trainval2017.zip
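After unzipping, the layout can be sanity-checked with a short script. This is only an illustrative helper (`check_coco_layout` is not part of the repo); point `root` at your Projects directory:

```python
from pathlib import Path

def check_coco_layout(root):
    """Return the list of expected dataset paths missing under `root`."""
    expected = [
        "datasets/coco/annotations/instances_val2017.json",
        "datasets/coco/images/train2017",
        "datasets/coco/images/val2017",
        "datasets/coco/labels/train2017",
        "datasets/coco/labels/val2017",
        "datasets/coco/train2017.txt",
        "datasets/coco/val2017.txt",
    ]
    return [p for p in expected if not (Path(root) / p).exists()]
```

An empty return value means the tree matches the layout shown above.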
$ cd /Projects/yolov5_qat
$ cd weights
$ wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5m.pt
$ cd ..
python val.py --weights weights/yolov5m.pt --data coco.yaml
Outputs:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.452
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.644
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.489
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.278
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.504
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.581
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.354
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.581
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.632
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.451
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.777
- Make sure to change the learning rate; otherwise convergence will take a long time.
- We use a new hyps yaml here, hyp.m-relu-tune.yaml. It is based on hyp.scratch-low.yaml, with the learning rates changed to smaller values:

  lr0: 0.001  # initial learning rate (SGD=1E-2, Adam=1E-3), changed from 0.01
  lrf: 0.001  # final OneCycleLR learning rate (lr0 * lrf), changed from 0.01
  ...
  warmup_bias_lr: 0.01  # warmup initial bias lr, changed from 0.1
  ...
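To see what these values imply, here is a small sketch of the one_cycle cosine schedule yolov5 uses (utils/general.py), evaluated with the tuned hyps over the 50 retraining epochs:

```python
import math

lr0, lrf, epochs = 0.001, 0.001, 50  # values from hyp.m-relu-tune.yaml

def one_cycle(y1=1.0, y2=lrf, steps=epochs):
    # cosine ramp from y1 down to y2 over `steps` epochs (yolov5's one_cycle)
    return lambda x: ((1 - math.cos(x * math.pi / steps)) / 2) * (y2 - y1) + y1

lf = one_cycle()
print(lr0 * lf(0))       # initial lr: lr0 = 0.001
print(lr0 * lf(epochs))  # final lr: lr0 * lrf = 1e-6
```

So the schedule fine-tunes from 1e-3 down to 1e-6, rather than the default 1e-2 down to 1e-4.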
- Disable Git info checking.
- Once we change the default_act to ReLU, we can't use auto batch size anymore; we need to specify the batch-size explicitly.
- We also change the default batch-size from 16 to 64.
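In yolov5 v7.0 the activation is set once as a class attribute on the Conv block (models/common.py: `default_act = nn.SiLU()`). The swap can be sketched on a minimal stand-in module (illustrative, not the repo's actual class):

```python
import torch
import torch.nn as nn

class Conv(nn.Module):
    # minimal stand-in for yolov5's Conv block (models/common.py)
    default_act = nn.SiLU()  # v7.0 default activation

    def __init__(self, c1, c2, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

Conv.default_act = nn.ReLU()   # quantization-friendly replacement
m = Conv(3, 8)
print(type(m.act).__name__)    # ReLU
```

Because `self.act` falls back to the class attribute, changing `default_act` before model construction switches every Conv block in the network.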
Retraining takes a long time to complete; please be patient.
python train.py --data coco.yaml --epochs 50 --weights weights/yolov5m.pt --hyp data/hyps/hyp.m-relu-tune.yaml --batch-size 64
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
0/49 6.16G 0.04115 0.06202 0.01698 150 640: 100%|██████████| 1849/1849 [51:50<00:00, 1.68s/it]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 40/40 [01:26<00:00, 2.17s/it]
all 5000 36335 0.701 0.557 0.609 0.416
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
13/49 10.2G 0.03954 0.05978 0.01563 198 640: 100%|██████████| 1849/1849 [51:32<00:00, 1.67s/it]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 40/40 [01:25<00:00, 2.13s/it]
all 5000 36335 0.709 0.567 0.617 0.428
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
14/49 10.2G 0.03948 0.05968 0.01557 240 640: 100%|██████████| 1849/1849 [51:30<00:00, 1.67s/it]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 40/40 [01:25<00:00, 2.13s/it]
all 5000 36335 0.708 0.568 0.618 0.429
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
19/49 10.2G 0.03922 0.05922 0.01519 162 640: 100%|██████████| 1849/1849 [51:23<00:00, 1.67s/it]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 40/40 [01:25<00:00, 2.13s/it]
all 5000 36335 0.713 0.567 0.62 0.43
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
20/49 10.2G 0.03911 0.05934 0.01513 228 640: 100%|██████████| 1849/1849 [51:33<00:00, 1.67s/it]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 40/40 [01:25<00:00, 2.13s/it]
all 5000 36335 0.707 0.569 0.619 0.431
Here is the complete retraining log file: retraining after replacing SiLU with ReLU.
Assuming the retraining result folder name is changed to relu, run validation test:
python val.py --weights runs/train/relu/weights/best.pt --data coco.yaml
We will get the following validation results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.434
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.625
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.468
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.263
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.484
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.567
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.344
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.563
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.613
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.437
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.663
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.767
Results saved to runs/val/exp
Similar to replacing SiLU with ReLU, run the retraining:
python train.py --data coco.yaml --epochs 50 --weights weights/yolov5m.pt --hyp data/hyps/hyp.m-relu-tune.yaml --batch-size 64
Assuming the retraining result folder name is changed to relu6, run validation test:
python val.py --weights runs/train/relu6/weights/best.pt --data coco.yaml
We will get the following validation results:
Fusing layers...
Model summary: 212 layers, 21172173 parameters, 0 gradients, 48.9 GFLOPs
val: Scanning /home/hongbing/Projects/datasets/coco/val2017.cache... 4952 images, 48 backgrounds, 0 corrupt: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 157/157 [01:15<00:00, 2.07it/s]
all 5000 36335 0.701 0.563 0.615 0.428
Speed: 0.1ms pre-process, 10.1ms inference, 0.9ms NMS per image at shape (32, 3, 640, 640)
Evaluating pycocotools mAP... saving runs/val/exp3/best_predictions.json...
loading annotations into memory...
Done (t=0.29s)
creating index...
index created!
Loading and preparing results...
DONE (t=3.26s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=46.18s).
Accumulating evaluation results...
DONE (t=10.45s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.431
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.621
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.467
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.260
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.484
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.559
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.344
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.563
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.612
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.427
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.667
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.763
PyTorch QAT flow:
- QAT can't use multiple GPUs, so we need to specify a single device ID.
python train.py --data coco.yaml --epochs 20 --cfg models/yolov5m.yaml \
--weights runs/train/relu/weights/best.pt --hyp data/hyps/hyp.qat.yaml \
--batch-size 32 --qat --device 1
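Under the hood this is PyTorch's eager-mode QAT cycle: insert fake-quantization observers with prepare_qat, fine-tune, then convert to int8 kernels. A toy sketch of that cycle (TinyNet is illustrative, not the repo's model; the repo's --qat path wraps the same idea around yolov5's training loop):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (QuantStub, DeQuantStub,
                                   get_default_qat_qconfig, prepare_qat, convert)

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # marks where float -> int8 happens
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.relu = nn.ReLU()         # ReLU quantizes cleanly, unlike SiLU
        self.dequant = DeQuantStub()  # marks where int8 -> float happens

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet()
model.qconfig = get_default_qat_qconfig("fbgemm")  # x86 backend
prepare_qat(model.train(), inplace=True)           # insert fake-quant observers

# short fine-tune on dummy data so the observers collect activation ranges
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
for _ in range(3):
    loss = model(torch.randn(2, 3, 32, 32)).mean()
    opt.zero_grad(); loss.backward(); opt.step()

qmodel = convert(model.eval())  # swap fake-quant modules for real int8 kernels
```

After convert, the Conv2d becomes a quantized int8 module, which is what the quantized model structure linked below shows for the full network.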
Result log:
Starting training for 20 epochs...
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
0/19 9.96G 0.03882 0.05641 0.0128 199 640: 100%|██████████| 3697/3697 [17:44<00:00, 3.47it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 79/79 [00:36<00:00, 2.15it/s]
all 5000 36335 0.707 0.559 0.612 0.415
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
1/19 9.1G 0.03878 0.05633 0.01265 168 640: 100%|██████████| 3697/3697 [17:02<00:00, 3.61it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 79/79 [00:35<00:00, 2.20it/s]
all 5000 36335 0.715 0.558 0.614 0.416
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
2/19 9.11G 0.03873 0.05622 0.0125 163 640: 100%|██████████| 3697/3697 [17:02<00:00, 3.61it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 79/79 [00:36<00:00, 2.16it/s]
all 5000 36335 0.709 0.559 0.613 0.414
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
18/19 9.11G 0.0384 0.05547 0.01208 191 640: 100%|██████████| 3697/3697 [17:02<00:00, 3.61it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 79/79 [00:36<00:00, 2.18it/s]
all 5000 36335 0.713 0.557 0.613 0.416
Epoch GPU_mem box_loss obj_loss cls_loss Instances Size
19/19 9.11G 0.03835 0.05547 0.01204 202 640: 100%|██████████| 3697/3697 [17:03<00:00, 3.61it/s]
Class Images Instances P R mAP50 mAP50-95: 100%|██████████| 79/79 [00:35<00:00, 2.21it/s]
all 5000 36335 0.711 0.561 0.614 0.418
Please see the quantized model structure here: Quantized model structure