support mobile_sam in label_anything (#132)
* support mobile_sam in label_anything

* update readme

* update readme

* remove some unused files

* modify readme.md

* update mobile_sam en doc
YanxingLiu authored Jul 30, 2023
1 parent 7c9564a commit 5846726
Showing 22 changed files with 3,533 additions and 85 deletions.
6 changes: 6 additions & 0 deletions label_anything/docker/Dockerfile
@@ -0,0 +1,6 @@
```dockerfile
ARG PYTORCH="2.0.0"
ARG CUDA="11.7"
ARG CUDNN="8"

FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
```

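For reference, one typical way to build and run an image from this Dockerfile; the image tag, build context, and GPU flag here are illustrative assumptions, not part of the commit:

```shell
# Build from the repository root; "label_anything" is a hypothetical tag
docker build -t label_anything -f label_anything/docker/Dockerfile .
# Run with GPU access (requires the NVIDIA Container Toolkit)
docker run --gpus all -it label_anything /bin/bash
```
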
43 changes: 30 additions & 13 deletions label_anything/readme.md
@@ -12,7 +12,6 @@ This article introduces a semi-automatic annotation solution combining Label-Stu
<img src="https://user-images.githubusercontent.com/25839884/233969712-0d9d6f0a-70b0-4b3e-b054-13eda037fb20.gif" width="80%">
</div>


<br>

- SAM (Segment Anything) is a segmentation model launched by Meta AI, designed to segment everything.
@@ -78,12 +77,16 @@
```shell
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
# For better segmentation results, use the sam_vit_h_4b8939.pth weights
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

# download the mobile_sam pretrained model
wget https://raw.githubusercontent.com/ChaoningZhang/MobileSAM/master/weights/mobile_sam.pt
# or manually download mobile_sam.pt from https://github.com/ChaoningZhang/MobileSAM/blob/master/weights/ and put it into path/to/playground/label_anything

```

PS: If you are having trouble with the wget/curl commands, please download the target file manually (copy the URL into a browser or download tool). The same applies to the instructions below.
For example: https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
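
If wget is unavailable but curl is, an equivalent download looks like this (a sketch; substitute any of the checkpoint URLs above):

```shell
# -L follows redirects; -o writes to the filename the rest of the guide expects
curl -L -o sam_vit_b_01ec64.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```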


Install Label-Studio and label-studio-ml-backend

@@ -95,21 +98,36 @@
```shell
pip install label-studio-ml==1.0.9
```
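
A quick sanity check that both packages installed correctly (assuming the CLI entry points landed on your PATH):

```shell
# Both commands should run without import errors after a successful install
label-studio --version
label-studio-ml --help
```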

## Start the service

⚠ label_anything requires the SAM inference backend to be started first and the web service second before the model can be loaded (two startup steps in total).

1. Start the backend inference service:

label_anything currently supports two inference models, sam and mobile_sam; choose according to your needs, and note that the model must match the weights downloaded in the previous step. Compared with sam, mobile_sam offers faster inference and lower memory usage with only a slight drop in segmentation quality, so mobile_sam is recommended for CPU inference.

```shell
cd path/to/playground/label_anything

# inference on sam
label-studio-ml start sam --port 8003 --with \
model_name=sam \
sam_config=vit_b \
sam_checkpoint_file=./sam_vit_b_01ec64.pth \
out_mask=True \
out_bbox=True \
device=cuda:0
# device=cuda:0 is for using GPU inference. If you want to use CPU inference, replace cuda:0 with cpu.
# out_poly=True returns the annotation of the bounding polygon.

# inference on mobile_sam
label-studio-ml start sam --port 8003 --with \
model_name=mobile_sam \
sam_config=vit_t \
sam_checkpoint_file=./mobile_sam.pt \
out_mask=True \
out_bbox=True \
device=cpu
# device=cpu runs inference on CPU; replace cpu with cuda:0 to use the GPU.
# out_poly=True returns the annotation of the bounding polygon.

```
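
Once started, the backend can be probed from a second terminal. label-studio-ml backends generally expose a /health endpoint; treat the exact route as an assumption to verify against your version:

```shell
# Expect a small JSON status payload if the backend is up on port 8003
curl http://localhost:8003/health
```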

PS: In a Windows environment, entering the following in the Anaconda Powershell Prompt is equivalent to the input above:
@@ -131,12 +149,13 @@
```shell
sam_checkpoint_file=$env:sam_checkpoint_file `
out_mask=$env:out_mask `
out_bbox=$env:out_bbox `
device=$env:device
# mobile_sam has not been tested on Windows; if you are interested in it, adapt the script above accordingly.
```

![image](https://user-images.githubusercontent.com/25839884/233821553-0030945a-8d83-4416-8edd-373ae9203a63.png)


At this point, the SAM backend inference service has started.

⚠ The above terminal window needs to be kept open.
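
If keeping an interactive window open is inconvenient, one common workaround is to detach the backend with standard shell tools (a sketch reusing the sam arguments from above):

```shell
# nohup detaches the process from the terminal; output is captured in sam_backend.log
nohup label-studio-ml start sam --port 8003 --with \
model_name=sam \
sam_config=vit_b \
sam_checkpoint_file=./sam_vit_b_01ec64.pth \
out_mask=True \
out_bbox=True \
device=cuda:0 > sam_backend.log 2>&1 &
```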

@@ -162,6 +181,7 @@
```shell
set ML_TIMEOUT_SETUP=40
```

Start Label-Studio web service:

```shell
label-studio start
```
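
Label Studio serves on http://localhost:8080 by default; if that port is occupied, the standard --port flag selects another:

```shell
label-studio start --port 8081
```
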
@@ -196,18 +216,18 @@ wget https://download.openmmlab.com/mmyolo/data/cat_dataset.zip && unzip cat_dat
![](https://cdn.vansin.top/picgo20230330133715.png)
2. Use images stored on the server:

This is done through 'Cloud Storages'.

① Set the environment variable before launching the SAM backend:
```
export LOCAL_FILES_DOCUMENT_ROOT=path/to/playground/label_anything
```
② Set the environment variables before launching the Label Studio backend so that Label Studio can use local files:
```
export LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=path/to/playground/label_anything
```
@@ -253,7 +273,8 @@ Configure Label-Studio keypoint, Mask, and other annotations in Settings/Labeling
```xml
</BrushLabels>
</View>
```
In the above XML, we have configured the annotations, where KeyPointLabels are for keypoint annotations, BrushLabels are for Mask annotations, PolygonLabels are for bounding polygon annotations, and RectangleLabels are for rectangle annotations.
This example uses two categories, cat and person. If community users want to add more categories, they need to add the corresponding categories in KeyPointLabels, BrushLabels, PolygonLabels, and RectangleLabels respectively.
@@ -283,12 +304,10 @@ To use this feature, enable the Auto-Annotation toggle and it is recommended to
![image](https://user-images.githubusercontent.com/25839884/233833200-a44c9c5f-66a8-491a-b268-ecfb6acd5284.png)
Point2Label: As can be seen from the following gif animation, by simply clicking a point on the object, the SAM algorithm is able to segment and detect the entire object.
![SAM8](https://user-images.githubusercontent.com/25839884/233835410-29896554-963a-42c3-a523-3b1226de59b6.gif)
Bbox2Label: As can be seen from the following gif animation, by simply annotating a bounding box, the SAM algorithm is able to segment and detect the entire object.
![SAM10](https://user-images.githubusercontent.com/25839884/233969712-0d9d6f0a-70b0-4b3e-b054-13eda037fb20.gif)
@@ -305,7 +324,6 @@ You can use VS Code to open the extracted folder and see the annotated dataset,
![](https://cdn.vansin.top/picgo20230330140321.png)
### Label Studio Output Conversion to RLE Format Masks
The COCO file exported by Label Studio does not support RLE instance annotation, only polygon instances.
@@ -325,10 +343,10 @@ python tools/convert_to_rle_mask_coco.py --json_file_path path/to/LS_json --out_

--json_file_path Input JSON exported from Label Studio

--out_dir Output path

After generation, the script prints a list of the corresponding category ids in the terminal, which can be copied into the config for training.

Under the output path there are two folders: annotations contains the COCO-format JSON, and images contains the organized dataset.
```
Your dataset
├── annotations
├── ...
```
@@ -357,7 +375,6 @@ cd mmdetection; pip install -e .; cd ..
Then use this script to output the config for training on demand, where the template `mask-rcnn_r50_fpn` is provided in `label_anything/config_template`.
```shell
# Install Jinja2
pip install Jinja2
```
@@ -401,7 +418,6 @@ The following is the result of the transformation using the transformed dataset
The previous step generated a config that can be used for mmdetection training, at `data/my_set/config_name.py`, which we can now use.
```shell
python tools/train.py data/my_set/mask-rcnn_r50_fpn.py
```
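
By default mmdetection writes checkpoints and logs under work_dirs/; the train script's --work-dir flag redirects them (a sketch):

```shell
# Store checkpoints and logs in a custom directory instead of the default
python tools/train.py data/my_set/mask-rcnn_r50_fpn.py --work-dir work_dirs/my_set
```
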
@@ -413,6 +429,7 @@ After training, you can use `tools/test.py` for testing.
```shell
python tools/test.py data/my_set/mask-rcnn_r50_fpn.py path/of/your/checkpoint --show --show-dir my_show
```
The visualization images will be saved in `work_dir/{timestamp}/my_show`.
When finished, we can view the model test visualizations. On the left is the annotated image, and on the right is the model output.
50 changes: 33 additions & 17 deletions label_anything/readme_zh.md
@@ -16,7 +16,6 @@

<br>


- SAM (Segment Anything) is a segmentation model launched by Meta AI, designed to segment everything.
- [Label Studio](https://github.com/heartexlabs/label-studio) is an excellent annotation tool covering dataset labeling for image classification, object detection, segmentation, and more.

@@ -65,19 +64,24 @@
```shell
pip install torch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1
```

Install SAM and download the pretrained models (both currently supported models are shown below)

```shell
cd path/to/playground/label_anything
# On Windows, run the following command before proceeding
# conda install pycocotools -c conda-forge
pip install opencv-python pycocotools matplotlib onnxruntime onnx timm
pip install git+https://github.com/facebookresearch/segment-anything.git

# download the sam pretrained model
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
# For better segmentation results, use the sam_vit_h_4b8939.pth weights
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

# download the mobile_sam pretrained model
wget https://raw.githubusercontent.com/ChaoningZhang/MobileSAM/master/weights/mobile_sam.pt
# If the download fails, manually download mobile_sam.pt from https://github.com/ChaoningZhang/MobileSAM/blob/master/weights/ and place it under path/to/playground/label_anything
```

PS: On Windows, skip the wget commands and download the target files manually (copy the URL into a browser or download tool).
@@ -97,25 +101,40 @@ pip install label-studio-ml==1.0.9

⚠ label_anything requires starting the SAM inference backend first and then the web service before the model can be configured (two startup steps in total).

1. Start the backend inference service:

label_anything currently supports two inference models, sam and mobile_sam; choose according to your needs, and note that the model must match the weights downloaded in the previous step. Compared with sam, mobile_sam offers faster inference and lower GPU memory usage with only a slight drop in segmentation quality, so mobile_sam is recommended for CPU inference.

```shell
cd path/to/playground/label_anything

# run backend inference with SAM
label-studio-ml start sam --port 8003 --with \
model_name=sam \
sam_config=vit_b \
sam_checkpoint_file=./sam_vit_b_01ec64.pth \
out_mask=True \
out_bbox=True \
device=cuda:0
# device=cuda:0 uses GPU inference; to run on CPU, replace cuda:0 with cpu
# out_poly=True returns bounding-polygon annotations

# run backend inference with mobile_sam
label-studio-ml start sam --port 8003 --with \
model_name=mobile_sam \
sam_config=vit_t \
sam_checkpoint_file=./mobile_sam.pt \
out_mask=True \
out_bbox=True \
device=cpu
# device=cpu runs inference on CPU; replace cpu with cuda:0 to use the GPU
# out_poly=True returns bounding-polygon annotations
```
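
For CPU inference, thread count can noticeably affect latency. PyTorch honors OMP_NUM_THREADS, so pinning it is a reasonable tuning knob; the value below is an arbitrary example, not a recommendation from the original docs:

```shell
# Cap the CPU threads PyTorch uses before starting the backend
export OMP_NUM_THREADS=8
label-studio-ml start sam --port 8003 --with \
model_name=mobile_sam \
sam_config=vit_t \
sam_checkpoint_file=./mobile_sam.pt \
out_mask=True \
out_bbox=True \
device=cpu
```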

PS: In a Windows environment, entering the following in the Anaconda Powershell Prompt is equivalent to the input above:

```shell

cd path/to/playground/label_anything

$env:sam_config = "vit_b"
device=$env:device
```
@@ -136,7 +155,6 @@

![image](https://user-images.githubusercontent.com/25839884/233821553-0030945a-8d83-4416-8edd-373ae9203a63.png)


At this point, the SAM backend inference service has started.

⚠ The above terminal window needs to be kept open.
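
Before moving on to the web service, you can confirm the backend is actually listening on its port (ss is part of iproute2 on most Linux systems):

```shell
# A LISTEN entry on :8003 confirms the inference backend is up
ss -ltn | grep 8003
```
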
@@ -200,17 +218,18 @@ wget https://download.openmmlab.com/mmyolo/data/cat_dataset.zip && unzip cat_dat

![](https://cdn.vansin.top/picgo20230330133715.png)


2. Use image data stored directly on the server:

This is done through 'Cloud Storages'.

① Before starting the SAM backend, set the environment variable:

```
export LOCAL_FILES_DOCUMENT_ROOT=path/to/playground/label_anything
```

② Before starting Label Studio, set the environment variables:

```
export LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=path/to/playground/label_anything
```
### Configure XML

---

Configure Label-Studio keypoint and Mask annotations in `Settings/Labeling Interface`.

```xml
@@ -255,6 +275,7 @@
```xml
</BrushLabels>
</View>
```

In the above XML we configured the annotations: `KeyPointLabels` for keypoint annotation, `BrushLabels` for Mask annotation, `PolygonLabels` for bounding-polygon annotation, and `RectangleLabels` for rectangle annotation.

This example uses the two categories `cat` and `person`. Community users who want more categories need to add them to `KeyPointLabels`, `BrushLabels`, `PolygonLabels`, and `RectangleLabels` respectively.
@@ -283,17 +304,16 @@ export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=path/to/playground/label_anything

Turn on the `Auto-Annotation` toggle and preferably check `Auto accept annotation suggestions`. Then click the Smart tool on the right, switch it to Point, and select the label of the object to annotate below, in this case cat. If using a BBox as the prompt, switch the Smart tool to Rectangle.


![image](https://user-images.githubusercontent.com/25839884/233833200-a44c9c5f-66a8-491a-b268-ecfb6acd5284.png)

Point2Label: As the gif below shows, by simply clicking a point on the object, the SAM algorithm segments and detects the entire object.

![SAM8](https://user-images.githubusercontent.com/25839884/233835410-29896554-963a-42c3-a523-3b1226de59b6.gif)


Bbox2Label: As the gif below shows, by simply annotating a bounding box, the SAM algorithm segments and detects the entire object.

![SAM10](https://user-images.githubusercontent.com/25839884/233969712-0d9d6f0a-70b0-4b3e-b054-13eda037fb20.gif)

## Exporting the dataset in COCO format

### Export from the Label Studio web UI
@@ -307,7 +327,6 @@ Bbox2Label: As the gif below shows, by simply annotating a bounding

![](https://cdn.vansin.top/picgo20230330140321.png)


### Converting Label Studio output to RLE format masks

The COCO file exported by Label Studio does not support RLE instance annotation, only polygon instances.
@@ -322,14 +341,15 @@ The polygon instance format makes the number of points hard to control, and too many points are inconvenient to fine-tune (unlike
```shell
cd path/to/playground/label_anything
python tools/convert_to_rle_mask_coco.py --json_file_path path/to/LS_json --out_dir path/to/output/file
```

--json_file_path Input JSON exported from Label Studio

--out_dir Output path


After generation, the script prints a list of the corresponding category ids in the terminal, which can be copied into the config for training.

Under the output path there are two folders: annotations contains the COCO-format JSON, and images contains the organized dataset.

```
Your dataset
├── annotations
├── ...
```
@@ -383,7 +403,6 @@ playground


Next, we use `tools/analysis_tools/browse_dataset.py` to visualize the dataset.

```shell
python tools/analysis_tools/browse_dataset.py data/my_set/mask-rcnn_r50_fpn.py -
```
@@ -402,7 +421,6 @@

The previous step generated a config that can be used for mmdetection training, at `data/my_set/config_name.py`, which we can now use.


```shell
python tools/train.py data/my_set/mask-rcnn_r50_fpn.py
```
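
If training is interrupted, recent mmdetection versions let tools/train.py pick up from the latest checkpoint via --resume (verify the flag against your installed version):

```shell
# Resume from the newest checkpoint in the work directory
python tools/train.py data/my_set/mask-rcnn_r50_fpn.py --resume
```
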
@@ -414,6 +432,7 @@
```shell
python tools/test.py data/my_set/mask-rcnn_r50_fpn.py path/of/your/checkpoint --show --show-dir my_show
```

The visualization images will be saved in `work_dir/{timestamp}/my_show`.

When finished, we can view the model test visualizations. On the left is the annotated image, and on the right is the model output.
@@ -452,6 +471,3 @@ device=cuda:0 \
The results are shown below:

![图片](https://github.com/JimmyMa99/playground/assets/101508488/c134e579-2f1b-41ed-a82b-8211f8df8b94)


