add video exporter
jveitchmichaelis committed Sep 19, 2021
1 parent abda251 commit 775b16d
Showing 8 changed files with 289 additions and 10 deletions.
6 changes: 4 additions & 2 deletions DeepLabel.pro
Expand Up @@ -108,7 +108,8 @@ SOURCES += \
src/refinerangedialog.cpp \
src/cocoimporter.cpp \
src/tfrecordexporter.cpp \
src/tfrecordimporter.cpp
src/tfrecordimporter.cpp \
src/videoexporter.cpp

HEADERS += \
src/cliprogressbar.h \
Expand Down Expand Up @@ -142,7 +143,8 @@ HEADERS += \
src/refinerangedialog.h \
src/cocoimporter.h \
src/tfrecordexporter.h \
src/tfrecordimporter.h
src/tfrecordimporter.h \
src/videoexporter.h

FORMS += \
src/importdialog.ui \
Expand Down
17 changes: 10 additions & 7 deletions README.md
Expand Up @@ -2,14 +2,16 @@

[![Build OS X](https://github.com/jveitchmichaelis/deeplabel/actions/workflows/build_osx.yml/badge.svg)](https://github.com/jveitchmichaelis/deeplabel/actions/workflows/build_osx.yml) [![Build Ubuntu](https://github.com/jveitchmichaelis/deeplabel/actions/workflows/build_ubuntu.yml/badge.svg)](https://github.com/jveitchmichaelis/deeplabel/actions/workflows/build_ubuntu.yml)[![Build Windows](https://github.com/jveitchmichaelis/deeplabel/actions/workflows/build_windows.yml/badge.svg)](https://github.com/jveitchmichaelis/deeplabel/actions/workflows/build_windows.yml)

**If you use DeepLabel for research or commercial purposes, please cite here!** [![DOI](https://zenodo.org/badge/105791274.svg)](https://zenodo.org/badge/latestdoi/105791274)

Download the [latest release](https://github.com/jveitchmichaelis/deeplabel/releases/latest)! If you are an OS X user, check the `Actions` tab to download an automated and self-contained DMG build.

DeepLabel is a cross-platform tool for annotating images with labelled bounding boxes. A typical use-case for the program is labelling ground truth data for object-detection machine learning applications. DeepLabel runs as a standalone app and compiles on Windows, Linux and Mac.

DeepLabel is written in C++ and is resource-efficient, typically using less than 100MB RAM to run. There is a GUI or you can run on the command line for batch/automated processing.

DeepLabel also supports running inference using state-of-the-art object detection models like Faster-RCNN and YOLOv4. With out-of-the-box CUDA support, you can quickly label an entire dataset using an existing model.

**If you use DeepLabel for research or commercial purposes, please cite here!** [![DOI](https://zenodo.org/badge/105791274.svg)](https://zenodo.org/badge/latestdoi/105791274)

Download the [latest release](https://github.com/jveitchmichaelis/deeplabel/releases/latest)! If you are an OS X user, check the `Actions` tab to download an automated and self-contained DMG build. The Github CI runs on every push and attempts to build DeepLabel for Windows, Mac and Linux. You can check the action workflows for hints on how to compile everything if you're having trouble.

**Note: DeepLabel for Windows now packages CUDA and cuDNN for inference. This results in an enormous distributable size, as the cuDNN inference libraries alone are 600MB+. The CUDA runtime adds another 300MB. I'm looking into uploading simultaneous non-CUDA releases to save space. The alternative is to ask you to install CUDA and cuDNN yourself, but you'd need the correct version, which is a pain.**

Ready made binaries for Windows and OS X are on the release page. It is recommended that you build for Linux yourself, but it's not difficult.
Expand Down Expand Up @@ -87,22 +89,23 @@ deeplabel.exe export -i labels.lbdlb -f TFRecord -n project.names -o ./output/
Currently you can import data in the following formats:

* Darknet (provide image list and names)
* COCO (provide an annotation .json file)
* COCO (provide an annotation .json file and image folder)
* MOT
* TFRecord (parsing works, but full import is not possible yet)
* Pascal VOC Coming soon!
* Pascal VOC

## Data export

Currently you can export in:

* KITTI (e.g. for Nvidia DIGITS)
* Darknet for YOLO
* Pascal
* Pascal VOC
* COCO (experimental)
* Google Cloud Platform (e.g. for AutoML)
* TFRecord (for the Tensorflow Object Detection library)
* Note this uses protobuf directly and there is _no_ dependency on Tensorflow. I believe this is one of the few implementations of TFRecord writing in C++.
* Video (experimental, command line only)

DeepLabel treats your data as "golden" and does not make any attempt to modify it directly. This is a safe approach to avoid accidental corruption of a dataset that you spent months collating. As such, when you export labels, a copy of your data will be created with associated label files. For example, KITTI requires frames to be numerically labelled. In the future, augmentation may also be added, which is another reason to **not** modify your existing images.

Expand Down
1 change: 1 addition & 0 deletions src/baseexporter.h
Expand Up @@ -9,6 +9,7 @@
#include <opencv2/opencv.hpp>
#include <random>

#include <cliprogressbar.h>
#include <labelproject.h>
#include <boundingbox.h>

Expand Down
70 changes: 69 additions & 1 deletion src/cliparser.cpp
Expand Up @@ -7,7 +7,7 @@ CliParser::CliParser(QObject *parent) : QObject(parent)

void CliParser::SetupOptions(){

exportFormatOption = new QCommandLineOption({"f", "format"}, "export format", "[kitti, darknet, gcp, voc, coco, mot, birdsai, tfrecord]");
exportFormatOption = new QCommandLineOption({"f", "format"}, "export format", "kitti, darknet, gcp, voc, coco, mot, birdsai, tfrecord, video");
exportOutputFolder = new QCommandLineOption({"o", "output"}, "output folder", "folder path");
exportInputFile = new QCommandLineOption({"i", "input"}, "label database", "file path");
exportValidationSplit = new QCommandLineOption({"s", "split"}, "validation split percentage", "percentage", "20");
Expand All @@ -20,12 +20,21 @@ void CliParser::SetupOptions(){
exportShuffleImages = new QCommandLineOption("shuffle", "shuffle images when splitting");
exportAppendLabels = new QCommandLineOption("append-labels", "append to label files");
exportUnlabelledImages = new QCommandLineOption("export-unlabelled", "export images without labels");
exportVideoFilename = new QCommandLineOption("video-filename", "video output filename", "filename", "out.mp4");
exportVideoFourcc = new QCommandLineOption("fourcc", "video codec fourcc", "codec", "h264");
exportVideoFps = new QCommandLineOption("fps", "video framerate", "fps", "10");
exportVideoColourmap = new QCommandLineOption("colourmap", "video colourmap", "colourmap", "Inferno");
exportVideoSize = new QCommandLineOption("videosize", "video size: width, height", "w,h", "1280,720");
exportVideoDisplayBoxes = new QCommandLineOption("display-boxes", "display boxes in output", "on, off", "on");
exportVideoDisplayNames = new QCommandLineOption("display-names", "display class names", "on, off", "on");
importImages = new QCommandLineOption("images", "import image path/folder", "images");
importTFRecordMask = new QCommandLineOption("records", "mask for TF Records (* wildcard)", "images");
importAnnotations = new QCommandLineOption("annotations", "import annotation path/folder", "annotations");
importUnlabelledImages = new QCommandLineOption("import-unlabelled", "import images without labels");
importOverwrite = new QCommandLineOption("overwrite", "overwrite existing databases");

configSilence = new QCommandLineOption({"q","quiet"}, "no log messages");

parser.addHelpOption();
parser.addVersionOption();
parser.setOptionsAfterPositionalArgumentsMode(QCommandLineParser::ParseAsOptions);
Expand All @@ -43,6 +52,13 @@ void CliParser::SetupOptions(){
parser.addOption(*exportPascalVOCLabelMap);
parser.addOption(*exportShuffleImages);
parser.addOption(*exportAppendLabels);
parser.addOption(*exportVideoFourcc);
parser.addOption(*exportVideoFps);
parser.addOption(*exportVideoFilename);
parser.addOption(*exportVideoColourmap);
parser.addOption(*exportVideoSize);
parser.addOption(*exportVideoDisplayBoxes);
parser.addOption(*exportVideoDisplayNames);
parser.addOption(*exportUnlabelledImages);
parser.addOption(*importImages);
parser.addOption(*importAnnotations);
Expand All @@ -63,6 +79,10 @@ bool CliParser::Run(){
qCritical() << "Unknown option: " << option;
}

if(parser.isSet(*configSilence)){
qSetMessagePattern("");
}

if(mode == "export"){
res = handleExport();
}else if(mode == "import"){
Expand Down Expand Up @@ -231,6 +251,54 @@ bool CliParser::handleExport(){
qCritical() << "No names file specified.";
return false;
}
}else if(parser.value(*exportFormatOption) == "video"){
exporter = new VideoExporter(&project);
auto filename = parser.value(*exportVideoFilename);
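auto fourcc = parser.value(*exportVideoFourcc);
// videoConfig() reads four characters from this string, so reject anything else here
if(fourcc.size() != 4){
qCritical() << "FOURCC must be exactly four characters, you specified: " << fourcc;
return false;
}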

bool ok = false;
auto fps = parser.value(*exportVideoFps).toDouble(&ok);
if(!ok){
qCritical() << "Invalid FPS, you specified: " << parser.value(*exportVideoFps);
return false;
}

// Check video size
auto videosize = parser.value(*exportVideoSize).split(",");
if(videosize.size() != 2){
qCritical() << "Video size should be two comma separated values: w,h";
return false;
}

// Reuse the already-split list rather than re-parsing the option
auto width = videosize.at(0).toInt(&ok);
if(!ok){
qCritical() << "Invalid video width, you specified: " << videosize.at(0);
return false;
}

auto height = videosize.at(1).toInt(&ok);
if(!ok){
qCritical() << "Invalid video height, you specified: " << videosize.at(1);
return false;
}

auto colourmap = parser.value(*exportVideoColourmap);

static_cast<VideoExporter*>(exporter)->videoConfig(filename, fourcc, fps, {width, height}, colourmap);
static_cast<VideoExporter*>(exporter)->labelConfig(parser.value(*exportVideoDisplayNames) == "on",
parser.value(*exportVideoDisplayBoxes) == "on");


if(!parser.isSet(*exportOutputFolder) || parser.value(*exportOutputFolder) == ""){
qCritical() << "Please specify an output folder";
return false;
}else{
exporter->setOutputFolder(parser.value(*exportOutputFolder), parser.isSet(*exportNoSubfolders));
}

exporter->process();
return true;

}else{
qCritical() << "Invalid exporter type specified";
return false;
Expand Down
9 changes: 9 additions & 0 deletions src/cliparser.h
Expand Up @@ -34,12 +34,21 @@ class CliParser : public QObject
QCommandLineOption *exportAppendLabels;
QCommandLineOption *exportUnlabelledImages;

QCommandLineOption *exportVideoFilename;
QCommandLineOption *exportVideoFourcc;
QCommandLineOption *exportVideoFps;
QCommandLineOption *exportVideoColourmap;
QCommandLineOption *exportVideoSize;
QCommandLineOption *exportVideoDisplayNames;
QCommandLineOption *exportVideoDisplayBoxes;

QCommandLineOption *importImages;
QCommandLineOption *importAnnotations;
QCommandLineOption *importUnlabelledImages;
QCommandLineOption *importOverwrite;
QCommandLineOption *importTFRecordMask;

QCommandLineOption *configSilence;
signals:

};
Expand Down
1 change: 1 addition & 0 deletions src/exporter.h
Expand Up @@ -7,5 +7,6 @@
#include <cocoexporter.h>
#include <gcpexporter.h>
#include <tfrecordexporter.h>
#include <videoexporter.h>

#endif // EXPORTER_H
164 changes: 164 additions & 0 deletions src/videoexporter.cpp
@@ -0,0 +1,164 @@
#include "videoexporter.h"

std::unordered_map<std::string ,int> VideoExporter::colour_hashmap{
{ "Cividis", cv::COLORMAP_CIVIDIS },
{ "Inferno", cv::COLORMAP_INFERNO },
{ "Magma", cv::COLORMAP_MAGMA },
{ "Hot", cv::COLORMAP_HOT },
{ "Bone", cv::COLORMAP_BONE },
{ "Plasma", cv::COLORMAP_PLASMA },
{ "Jet", cv::COLORMAP_JET },
{ "Rainbow", cv::COLORMAP_RAINBOW },
{ "Ocean", cv::COLORMAP_OCEAN },
{ "Viridis", cv::COLORMAP_VIRIDIS }
};

void VideoExporter::process()
{

auto out_filename = QDir(output_folder).absoluteFilePath(filename);

cv::VideoWriter writer(out_filename.toStdString(), fourcc, fps, frame_size);
QList<QString> images;

if(export_unlabelled)
project->getImageList(images);
else
project->getLabelledImageList(images);

if(!writer.isOpened()){
qCritical() << "Failed to open video writer";
// Nothing can be written without an open writer
return;
}

auto pbar = cliProgressBar();
double progress = 0;
int i = 0;

qInfo() << "Writing video to file:" << out_filename;

for(auto &abs_image_path : images){
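// Read
auto image = cv::imread(abs_image_path.toStdString(), cv::IMREAD_UNCHANGED);

// cv::resize throws on an empty input, so skip files that fail to load
if(image.empty()){
qWarning() << "Failed to read image, skipping:" << abs_image_path;
continue;
}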

int w, h;
w = image.cols;
h = image.rows;

cv::Mat image_resized;
cv::resize(image, image_resized, frame_size);

if(image_resized.elemSize() == 2){
convert16(image_resized);
}

if (image_resized.channels() == 4){
// cv::imread returns BGR(A) channel order
cv::cvtColor(image_resized, image_resized, cv::COLOR_BGRA2BGR);
}else if(image_resized.channels() == 1){
cv::applyColorMap(image_resized, image_resized, colourmap);
}

if(this->display_boxes){
QList<BoundingBox> labels;
project->getLabels(abs_image_path, labels);

for(auto &label : labels){
double x_scale = static_cast<double>(w)/image_resized.cols;
double y_scale = static_cast<double>(h)/image_resized.rows;
drawBoundingBox(image_resized, label, x_scale, y_scale, this->box_thickness);

}
}

writer.write(image_resized);

// Pre-increment so the bar reaches 100% on the final frame
progress = 100*static_cast<double>(++i)/images.size();
pbar.update(progress);
pbar.print();

}

writer.release();
}

void VideoExporter::drawBoundingBox(cv::Mat &source, BoundingBox box, double x_scale, double y_scale, int thickness){


int top_x = box.rect.x()/x_scale;
int top_y = box.rect.y()/y_scale;
// Use the configured member font_scale below so the label background matches the drawn text

auto rect = cv::Rect2i(top_x,
top_y,
box.rect.width()/x_scale,
box.rect.height()/y_scale);

auto colour_list = QColor::colorNames();
QColor colour = QColor(colour_list.at(std::max(0, box.classid) % colour_list.size()) );
// OpenCV images are BGR, so pass the channels in that order
cv::Scalar color(colour.blue(), colour.green(), colour.red());

cv::rectangle(source, rect, color, thickness);

if(display_names){
auto label_string = QString("%1").arg(box.classname).toStdString();

int baseline;
auto text_size = cv::getTextSize(label_string, cv::FONT_HERSHEY_DUPLEX, font_scale, thickness, &baseline);

cv::Rect2i label_background(top_x,
top_y,
text_size.width,
text_size.height);
cv::rectangle(source, label_background, color, -1);

auto text_colour = cv::Scalar(0,0,0);
if(colour.red() == 0
&& colour.green() == 0
&& colour.blue() == 0){
text_colour = {255,255,255};
}

cv::putText(source,
label_string,
{top_x, top_y + text_size.height},
cv::FONT_HERSHEY_DUPLEX,
this->font_scale,
text_colour,
thickness);
}
}

void VideoExporter::convert16(cv::Mat &source, double minval, double maxval){

if(minval < 0 || maxval < 0){
cv::minMaxIdx(source, &minval, &maxval);
}

double range = maxval-minval;
// A constant image has zero range; avoid dividing by zero
double scale_factor = range > 0 ? 255.0/range : 0.0;

source.convertTo(source, CV_32FC1);
source -= minval;
source *= scale_factor;
source.convertTo(source, CV_8UC1);

return;
}

void VideoExporter::labelConfig(bool display_names, bool display_boxes, int box_thickness, double font_size){
this->display_names = display_names;
this->display_boxes = display_boxes;
this->box_thickness = box_thickness;
this->font_scale = font_size;
}

void VideoExporter::videoConfig(QString filename, QString fourcc_string, double fps, cv::Size frame_size, QString colourmap)
{
this->filename = filename;
this->fps = fps;
this->fourcc = cv::VideoWriter::fourcc(fourcc_string.at(0).toLatin1(),
fourcc_string.at(1).toLatin1(),
fourcc_string.at(2).toLatin1(),
fourcc_string.at(3).toLatin1());
this->frame_size = frame_size;
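// operator[] would silently insert 0 (COLORMAP_AUTUMN) for an unknown name
auto it = colour_hashmap.find(colourmap.toStdString());
if(it == colour_hashmap.end()){
qWarning() << "Unknown colourmap, falling back to Inferno:" << colourmap;
this->colourmap = cv::COLORMAP_INFERNO;
}else{
this->colourmap = it->second;
}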
}
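
The 16-bit handling in `convert16` above is a min-max rescale: subtract the minimum, multiply by `255/range`, and narrow to 8 bits. As an illustration only, the sketch below reproduces that arithmetic with the standard library and no OpenCV. The helper name `rescale_to_8bit` is hypothetical and not part of DeepLabel. It rounds to nearest by adding 0.5, which may differ slightly from the rounding `cv::Mat::convertTo` applies. It also assumes a constant image should map to all zeros.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of DeepLabel): linearly rescale 16-bit
// samples to 8-bit using the observed minimum and maximum values.
std::vector<std::uint8_t> rescale_to_8bit(const std::vector<std::uint16_t> &samples)
{
    std::vector<std::uint8_t> out(samples.size(), 0);
    if (samples.empty()) {
        return out;
    }

    const auto mm = std::minmax_element(samples.begin(), samples.end());
    const double minval = static_cast<double>(*mm.first);
    const double range = static_cast<double>(*mm.second) - minval;

    // Assumption: a constant image (zero range) maps to all zeros
    // instead of dividing by zero.
    if (range <= 0.0) {
        return out;
    }

    const double scale_factor = 255.0 / range;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        // Shift so the minimum maps to 0, scale so the maximum maps to 255,
        // then round to nearest by adding 0.5 before truncating.
        const double scaled = (static_cast<double>(samples[i]) - minval) * scale_factor;
        out[i] = static_cast<std::uint8_t>(scaled + 0.5);
    }
    return out;
}
```

Called on the samples `{0, 256, 1024}`, the minimum maps to 0 and the maximum to 255, with intermediate values spread linearly between them. This is why a single-channel 16-bit frame can be passed to `cv::applyColorMap` after conversion.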