You could check out the ROS1 version of each inference package:

In this repo, we fully restructure the code and message formats for ROS2 (Humble), and integrate multi-threaded inference for three vision tasks.

- Currently, all pretrained models are trained using the [visionfactory](https://github.com/Owen-Liuyuxuan/visionfactory) repo, so they focus on outdoor autonomous driving scenarios. It is still fine to plug in ONNX models that satisfy the [interface](#onnx-model-interface). Descriptions of the published models:

| Model | Type | Link | Description |
| ----- | ---- | ---- | ----------- |
| monodepth_res101_384_1280.onnx | MonoDepth | [link](https://github.com/Owen-Liuyuxuan/ros2_vision_inference/releases/download/v1.0/monodepth_res101_384_1280.onnx) | FSNet, res101 backbone, model input shape (384x1280), trained on KITTI/KITTI360/nuscenes |
| bisenetv1.onnx | Segmentation | [link](https://github.com/Owen-Liuyuxuan/ros2_vision_inference/releases/download/v1.0/bisenetv1.onnx) | BiSeNetV1, model input shape (512x768), trained on remapped KITTI360/ApolloScene/CityScapes/BDD100k/a2d2 |
| mono3d_yolox_576_768.onnx | Mono3D Detection | [link](https://github.com/Owen-Liuyuxuan/ros2_vision_inference/releases/download/v1.0/mono3d_yolox_576_768.onnx) | YoloX-m MonoFlex, model input shape (576x768), trained on KITTI/nuscenes/ONCE/bdd100k/cityscapes |
| dla34_deform_576_768.onnx | Mono3D Detection | [link](https://github.com/Owen-Liuyuxuan/ros2_vision_inference/releases/download/v1.0.1/dla34_deform_576_768.onnx) | DLA34 Deformable Upsample MonoFlex, model input shape (576x768), trained on KITTI/nuscenes/ONCE/bdd100k/cityscapes |


## Getting Started

This repo relies on [ROS2](https://docs.ros.org/en/humble/Installation.html) and onnxruntime:

> If you want to use ROS1, check out the ROS1 branch with `git checkout ros1`. We tested the ROS1 code with ROS Noetic on Ubuntu 20.04 (the code needs Python 3, so we suggest running on at least Ubuntu 20.04). The `ros1` branch is a standard ROS1 package built with `catkin_make` and run with `roslaunch`.
```bash
pip3 install onnxruntime-gpu
```
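
After installing, you can quickly verify that onnxruntime can see your GPU (a minimal sanity check, not part of this repo; `CUDAExecutionProvider` only appears when CUDA/cuDNN are set up correctly):

```python
import onnxruntime as ort

# Expect something like ['CUDAExecutionProvider', 'CPUExecutionProvider'];
# if only the CPU provider shows up, onnxruntime cannot see the GPU.
print(ort.get_available_providers())
```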
Segmentation ONNX: `def forward(self, normalized_images[1, 3, H, W]) -> long[1, H, W]`

Mono3D ONNX: `def forward(self, normalized_images[1, 3, H, W], P[1, 3, 4]) -> scores[N], bboxes[N,12], cls_indexes[N]`

Class definitions are from the [visionfactory](https://github.com/Owen-Liuyuxuan/visionfactory) repo.
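
As an illustration of the Mono3D interface above, here is a minimal standalone sketch using onnxruntime. The model path, image size, and intrinsics are placeholder assumptions; input tensor names are queried from the session rather than hard-coded, and real usage would feed a properly normalized camera image instead of random data:

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "mono3d_yolox_576_768.onnx",  # assumed local path to a released model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Dummy normalized image [1, 3, H, W] and camera projection matrix P [1, 3, 4].
image = np.random.rand(1, 3, 576, 768).astype(np.float32)
P = np.array([[[700.0,   0.0, 384.0, 0.0],     # placeholder intrinsics:
               [  0.0, 700.0, 288.0, 0.0],     # fx, fy, cx, cy chosen for
               [  0.0,   0.0,   1.0, 0.0]]],   # illustration only
             dtype=np.float32)

input_names = [inp.name for inp in session.get_inputs()]
scores, bboxes, cls_indexes = session.run(
    None, {input_names[0]: image, input_names[1]: P})
print(scores.shape, bboxes.shape, cls_indexes.shape)  # (N,), (N, 12), (N,)
```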

## Data and Domain

### Reshape Scheme

We resize the input image to match the model's expected length/width on one side, preserving the aspect ratio, and zero-pad the other side. We also modify the camera intrinsics accordingly before feeding them into the ONNX model, and the outputs are mapped back to the original image shape. The currently published models are all trained with input images of various shapes, so the node should work naturally with different image sources.
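
A minimal sketch of this reshape scheme (the helper name and the bottom/right padding choice are illustrative assumptions, not necessarily the node's exact implementation):

```python
import cv2
import numpy as np

def resize_and_pad(image, P, target_h, target_w):
    """Resize an HxWx3 image to fit (target_h, target_w) while preserving
    aspect ratio, zero-pad the remainder, and rescale the 3x4 camera
    projection matrix P to match the new pixel coordinates."""
    h, w = image.shape[:2]
    scale = min(target_h / h, target_w / w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(image, (new_w, new_h))

    padded = np.zeros((target_h, target_w, 3), dtype=image.dtype)
    padded[:new_h, :new_w] = resized  # zero padding on the bottom/right

    P_scaled = P.copy()
    P_scaled[:2] *= scale  # rows 0-1 of P scale with the pixel coordinates
    return padded, P_scaled, scale  # keep `scale` to map outputs back
```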

### Expected Data Domain

This is determined by the training data of the ONNX models. The published models mainly work on autonomous driving/road scenes. Most of the training data for the published segmentation models comes from front-facing cameras only (the detection and monodepth models are trained with various camera views).
