YOLOv3-TensorRT-INT8-KCF

Description

YOLOv3-TensorRT-INT8-KCF is a TensorRT INT8-quantization implementation of YOLOv3 (and YOLOv3-tiny) on the NVIDIA Jetson Xavier NX board. The dataset we provide targets a red ball, and we combine the detector with KCF, a traditional object-tracking method, to drive a car that chases the red ball.

Demo Video

[Demo video]

Dependencies

GPU server for training (e.g. an RTX 2080 Ti)

NVIDIA Jetson Xavier NX

  • TensorRT >= 7.0.0 (Pre-installed on NX.)
  • OpenCV and opencv_contrib == 3.4.0 (See here for installation help.)

Tutorials

1. Train

Run this on the GPU server, not on the NX.

git clone https://github.com/lingffff/YOLOv3-TensorRT-INT8-KCF.git
cd YOLOv3-TensorRT-INT8-KCF
cd yolov3
# Download official pre-trained COCO darknet weights
sh weights/download_yolov3_weights.sh

Download the redball dataset here, unzip it, and replace the redball folder. Then start training. Remove '--tiny' if you are training the full YOLOv3 model.

python train.py --device 0 --tiny

2. Transfer

Now we have the YOLOv3(-tiny) weights in weights/best.pt. Convert them to the binary file redball(-tiny).wts, the weight format used here to build the TensorRT inference engine.

python gen_wts.py --tiny

Then copy ./redball(-tiny).wts to NX Board.
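For reference, the .wts format used by tensorrtx-style engine builders is a plain-text dump: the first line is the number of weight tensors, and each following line holds the tensor name, its element count, and the values as big-endian IEEE-754 hex. Below is a minimal Python sketch of that conversion; the checkpoint layout and output filename are assumptions, and gen_wts.py remains the authoritative script.

import struct
import torch

# Load the trained checkpoint; assuming the model object sits under the "model" key
# (ultralytics-style checkpoint). Adjust if your checkpoint differs.
ckpt = torch.load("weights/best.pt", map_location="cpu")
model = ckpt["model"]
state = model.float().state_dict() if hasattr(model, "state_dict") else model

with open("redball-tiny.wts", "w") as f:
    f.write(f"{len(state)}\n")
    for name, tensor in state.items():
        values = tensor.reshape(-1).cpu().numpy()
        # Each line: <name> <count> <hex float> <hex float> ...
        f.write(f"{name} {len(values)}")
        for v in values:
            f.write(" " + struct.pack(">f", float(v)).hex())
        f.write("\n")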

3. Build Engine

The following steps run on the NX board.

git clone https://github.com/lingffff/YOLOv3-TensorRT-INT8-KCF.git
cd YOLOv3-TensorRT-INT8-KCF
# Put redball(-tiny).wts in YOLOv3-TensorRT-INT8-KCF

Build the project.

mkdir build
cd build
# YOLOv3: -DTINY=OFF, tiny: -DTINY=ON
cmake -DTINY=ON ..
make -j$(nproc)

Now we have the executable files build_engine and detect.
Run build_engine with the -s argument to select the quantization mode: int8, fp16, or fp32 (default).

./build_engine -s int8
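build_engine is a C++ program based on tensorrtx and sampleINT8; with -s int8 it feeds calibration images through an entropy calibrator so TensorRT can choose quantization scales. Purely as an illustration of that calibration idea, here is a minimal sketch using the TensorRT Python API; the class name, cache file, and batch preparation are assumptions, not the repo's actual code.

import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class RedBallCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed 416x416 images to TensorRT during INT8 calibration."""
    def __init__(self, batches, cache_file="int8.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batches            # list of np.float32 arrays, shape (1, 3, 416, 416)
        self.cache_file = cache_file
        self.index = 0
        self.device_mem = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                   # calibration data exhausted
        cuda.memcpy_htod(self.device_mem, np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_mem)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

# The builder config would then use it roughly like:
#   config.set_flag(trt.BuilderFlag.INT8)
#   config.int8_calibrator = RedBallCalibrator(batches)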

4. Inference

Run detect to run detection on pictures or on a camera video stream. You can also try the KCF tracking mode with the options below; a sketch of the KCF idea follows the options table.

./detect -d ../samples

Options:

Argument      Description
-d <folder>   Detect pictures in the folder.
-v            Detect the camera video stream.
-t            Detect video along with the KCF tracking method.
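detect implements detection and KCF tracking in C++ with OpenCV. As an illustration only of how a KCF tracker is seeded from a detection box and then updated frame by frame, here is a minimal Python/OpenCV sketch; the initial box is a placeholder, not output from the repo's detector.

import cv2

cap = cv2.VideoCapture(0)                 # camera video stream
ok, frame = cap.read()

# In the real pipeline the box comes from the YOLOv3 detection; these numbers are placeholders.
bbox = (100, 100, 80, 80)                 # (x, y, w, h)
tracker = cv2.TrackerKCF_create()         # provided by opencv_contrib
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)      # a KCF update is much cheaper than re-running the detector
    if ok:
        x, y, w, h = (int(v) for v in bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("KCF tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()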

Benchmark

Model        Device  BatchSize  Mode  Input Size  Speed
YOLOv3       NX      1          FP32  416x416     85 ms
YOLOv3       NX      1          FP16  416x416     30 ms
YOLOv3       NX      1          INT8  416x416     26 ms
YOLOv3-tiny  NX      1          FP32  416x416     26 ms
YOLOv3-tiny  NX      1          FP16  416x416     19 ms
YOLOv3-tiny  NX      1          INT8  416x416     20 ms

Wow! FP16 is amazing!!!

TODO

  • Convert weights to TensorRT via a more common route, such as ONNX (see the sketch after this list).
  • Run detection and tracking in separate threads.
  • Implement a Quantization & Inference framework myself.
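For the first TODO item, the usual route is to export the PyTorch model to ONNX and let TensorRT's ONNX parser (or trtexec) build the engine. A hypothetical sketch of that export; the checkpoint layout, file names, and opset are assumptions.

import torch

# Assuming an ultralytics-style checkpoint with the model object under the "model" key.
model = torch.load("weights/best.pt", map_location="cpu")["model"].float().eval()
dummy = torch.zeros(1, 3, 416, 416)       # matches the 416x416 input size used above
torch.onnx.export(model, dummy, "redball.onnx",
                  opset_version=11,
                  input_names=["images"], output_names=["output"])
# The resulting redball.onnx could then be parsed by TensorRT's OnnxParser or trtexec.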

Acknowledgements

YOLOv3 Pytorch implementation from ultralytics/yolov3.
YOLOv3 TensorRT implementation from wang-xinyu/tensorrtx.
TensorRT Int8 implementation from NVIDIA/TensorRT/samples/sampleINT8.

With my sincere appreciation!

About me

Just call me Al (not ai but al. LOL.) / Albert / lingff.
E-mail: [email protected]
Gitee: https://gitee.com/lingff
CSDN: https://blog.csdn.net/weixin_43214408
