
NVIDIA Object Detection Toolkit (ODTK)

Fast and accurate single stage object detection with end-to-end GPU optimization.

Description

ODTK is a single shot object detector with various backbones and detection heads. This allows performance/accuracy trade-offs.

It is optimized for end-to-end GPU processing using:

  • The PyTorch deep learning framework with ONNX support
  • NVIDIA Apex for mixed precision and distributed training
  • NVIDIA DALI for optimized data pre-processing
  • NVIDIA TensorRT for high-performance inference
  • NVIDIA DeepStream for optimized real-time video streams support

Rotated bounding box detections

This repo now supports rotated bounding box detections. See the rotated detections training and rotated detections inference documents for more information on how to use the --rotated-bbox command line option.

Bounding box annotations are described by [x, y, w, h, theta].
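To make the [x, y, w, h, theta] format concrete, here is a minimal sketch (not ODTK code) that expands a rotated annotation into its four corner points. It assumes (x, y) is the top-left corner of the unrotated box and theta is a rotation in radians about the box centre; check the rotated detections documents for the exact convention ODTK uses.

```python
import math

def rotated_box_corners(x, y, w, h, theta):
    """Expand a [x, y, w, h, theta] annotation into four (x, y) corners.

    Assumes (x, y) is the unrotated top-left corner and theta rotates
    the box about its centre (radians) -- verify against the ODTK docs.
    """
    cx, cy = x + w / 2.0, y + h / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2),
                   (w / 2, h / 2), (-w / 2, h / 2)):
        # Rotate each half-extent offset about the centre.
        corners.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return corners

# With theta == 0 this reduces to the ordinary axis-aligned corners.
print(rotated_box_corners(0, 0, 4, 2, 0.0))
```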

Performance

The detection pipeline allows the user to select a specific backbone depending on the latency-accuracy trade-off preferred.

ODTK RetinaNet model accuracy and inference latency & FPS (frames per second) for COCO 2017 (train/val) after the full training schedule. Inference results include bounding box post-processing for a batch size of 1. Inference measured at --resize 800 using --with-dali on a FP16 TensorRT engine.

| Backbone | mAP @[IoU=0.50:0.95] | Training Time on DGX1v | Inference latency FP16 on V100 | Inference latency INT8 on T4 | Inference latency FP16 on A100 | Inference latency INT8 on A100 |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet18FPN | 0.318 | 5 hrs | 14 ms; 71 FPS | 18 ms; 56 FPS | 9 ms; 110 FPS | 7 ms; 141 FPS |
| MobileNetV2FPN | 0.333 | | 14 ms; 74 FPS | 18 ms; 56 FPS | 9 ms; 114 FPS | 7 ms; 138 FPS |
| ResNet34FPN | 0.343 | 6 hrs | 16 ms; 64 FPS | 20 ms; 50 FPS | 10 ms; 103 FPS | 7 ms; 142 FPS |
| ResNet50FPN | 0.358 | 7 hrs | 18 ms; 56 FPS | 22 ms; 45 FPS | 11 ms; 93 FPS | 8 ms; 129 FPS |
| ResNet101FPN | 0.376 | 10 hrs | 22 ms; 46 FPS | 27 ms; 37 FPS | 13 ms; 78 FPS | 9 ms; 117 FPS |
| ResNet152FPN | 0.393 | 12 hrs | 26 ms; 38 FPS | 33 ms; 31 FPS | 15 ms; 66 FPS | 10 ms; 103 FPS |
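Since the measurements above are for batch size 1, the FPS column is roughly the reciprocal of the latency column. A quick sanity check:

```python
def fps_from_latency(latency_ms, batch_size=1):
    """Approximate throughput from per-batch latency: FPS = batch * 1000 / ms."""
    return batch_size * 1000.0 / latency_ms

# ResNet18FPN FP16 on V100: 14 ms per image gives roughly 71 FPS,
# matching the table above.
print(round(fps_from_latency(14)))
```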

Installation

For best performance, use the latest PyTorch NGC docker container. Clone this repository, build and run your own image:

git clone https://github.com/nvidia/retinanet-examples
docker build -t odtk:latest retinanet-examples/
docker run --gpus all --rm --ipc=host -it odtk:latest

Usage

Training, inference, evaluation and model export can be done through the odtk utility. For more details, including a list of parameters, please refer to the TRAINING and INFERENCE documentation.

Training

Train a detection model on COCO 2017 from pre-trained backbone:

odtk train retinanet_rn50fpn.pth --backbone ResNet50FPN \
    --images /coco/images/train2017/ --annotations /coco/annotations/instances_train2017.json \
    --val-images /coco/images/val2017/ --val-annotations /coco/annotations/instances_val2017.json

Fine Tuning

Fine-tune a pre-trained model on your dataset. In the example below we use Pascal VOC with JSON annotations:

odtk train model_mydataset.pth --backbone ResNet50FPN \
    --fine-tune retinanet_rn50fpn.pth \
    --classes 20 --iters 10000 --val-iters 1000 --lr 0.0005 \
    --resize 512 --jitter 480 640 --images /voc/JPEGImages/ \
    --annotations /voc/pascal_train2012.json --val-annotations /voc/pascal_val2012.json

Note: the shorter side of the input images will be resized to --resize as long as the longer side doesn't exceed --max-size. During training, the images are randomly resized to a new size within the --jitter range.
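The resize rule above can be sketched as follows. This is a hypothetical helper, not ODTK's implementation, and the max_size default of 1333 is an assumed example value:

```python
import random

def training_scale(width, height, jitter=(480, 640), max_size=1333):
    """Pick training dimensions: shorter side drawn from the jitter range,
    longer side capped at max_size. At inference the target would simply
    be the --resize value instead of a random draw."""
    resize = random.randint(jitter[0], jitter[1])
    scale = resize / min(width, height)
    # If scaling the shorter side would push the longer side past the
    # cap, shrink the scale so the longer side lands exactly on max_size.
    if scale * max(width, height) > max_size:
        scale = max_size / max(width, height)
    return round(width * scale), round(height * scale)
```

For a 1000x600 image with the fine-tuning flags above, the shorter side always ends up inside the 480-640 jitter range and the cap never triggers.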

Inference

Evaluate your detection model on COCO 2017:

odtk infer retinanet_rn50fpn.pth --images /coco/images/val2017/ --annotations /coco/annotations/instances_val2017.json

Run inference on your dataset:

odtk infer retinanet_rn50fpn.pth --images /dataset/val --output detections.json
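A small post-processing sketch for the output file. It assumes detections.json holds COCO-style detection results (a list of objects with image_id, category_id, bbox, and score fields); the field names and threshold are assumptions, so check the file ODTK actually writes before relying on them:

```python
import json

def filter_detections(path, min_score=0.5):
    """Load a COCO-style detections file and keep confident detections.

    Assumes a JSON list of {"image_id", "category_id", "bbox", "score"}
    entries -- an assumption about the output format, not a guarantee.
    """
    with open(path) as f:
        dets = json.load(f)
    return [d for d in dets if d.get("score", 0.0) >= min_score]
```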

Optimized Inference with TensorRT

For faster inference, export the detection model to an optimized FP16 TensorRT engine: