add speed in V100 and mobile
Sibo2rr committed Dec 15, 2021
1 parent 4e6a3c8 commit a1ad2c8
Showing 16 changed files with 558 additions and 417 deletions.
438 changes: 218 additions & 220 deletions docs/zh_CN/algorithm_introduction/ImageNet_models.md

Large diffs are not rendered by default.

19 changes: 18 additions & 1 deletion docs/zh_CN/models/DLA.md
@@ -3,6 +3,7 @@
## Contents
* [1. Overview](#1)
* [2. Accuracy, FLOPS, and Parameters](#2)
* [3. Inference Speed on V100 GPU](#3)

<a name='1'></a>

@@ -25,4 +26,20 @@ DLA (Deep Layer Aggregation). Visual recognition requires rich representations, ranging
| DLA102 | 33.3 | 7.2 | 78.93 | 94.52 |
| DLA102x | 26.4 | 5.9 | 78.10 | 94.00 |
| DLA102x2 | 41.4 | 9.3 | 78.85 | 94.45 |
| DLA169 | 53.5 | 11.6 | 78.09 | 94.09 |
| DLA169 | 53.5 | 11.6 | 78.09 | 94.09 |

<a name='3'></a>

## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
| -------- | --------- | ----------------- | ------------------------------ | ------------------------------ | ------------------------------ |
| DLA102 | 224 | 256 | 4.95 | 8.08 | 12.40 |
| DLA102x2 | 224 | 256 | 19.58 | 23.97 | 31.37 |
| DLA102x | 224 | 256 | 11.12 | 15.60 | 20.37 |
| DLA169 | 224 | 256 | 7.70 | 12.25 | 18.90 |
| DLA34 | 224 | 256 | 1.83 | 3.37 | 5.98 |
| DLA46_c | 224 | 256 | 1.06 | 2.08 | 3.23 |
| DLA60 | 224 | 256 | 2.78 | 5.36 | 8.29 |
| DLA60x_c | 224 | 256 | 1.79 | 3.68 | 5.19 |
| DLA60x | 224 | 256 | 5.98 | 9.24 | 12.52 |
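
The latency columns can be turned into throughput to compare batch sizes on one scale. The sketch below assumes each figure is the time in milliseconds for one whole batch (the table does not say so explicitly); the helper name `images_per_second` is hypothetical.

```python
def images_per_second(batch_size: int, latency_ms: float) -> float:
    """Throughput implied by one batch taking latency_ms milliseconds.

    Assumption: latency_ms covers the whole batch, not a single image.
    """
    return batch_size * 1000.0 / latency_ms


# DLA34 row from the table above: batch size -> latency in ms.
dla34_ms = {1: 1.83, 4: 3.37, 8: 5.98}
dla34_throughput = {bs: images_per_second(bs, ms) for bs, ms in dla34_ms.items()}

for bs, ips in dla34_throughput.items():
    print(f"DLA34, batch size {bs}: {ips:.0f} images/s")
```

Under that assumption, larger batches raise throughput even though the per-batch latency grows.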
26 changes: 13 additions & 13 deletions docs/zh_CN/models/DPN_DenseNet.md
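@@ -12,7 +12,7 @@
## 1. Overview
DenseNet is a network architecture proposed in the CVPR 2017 best paper. It introduces a new cross-layer connection block, the dense block. Compared with the bottleneck in ResNet, the dense block uses a more aggressive dense connectivity scheme: all layers are connected to one another, and each layer takes the outputs of all preceding layers as additional input. DenseNet stacks dense blocks to form a densely connected network. The dense connections make gradient backpropagation easier in DenseNet, which makes the network easier to train.
DPN stands for Dual Path Networks. It combines DenseNet and ResNeXt. The paper shows that DenseNet can extract new features from earlier layers, whereas ResNeXt essentially reuses features already extracted in earlier layers. Further analysis by the authors found that ResNeXt has a high feature reuse rate but low redundancy, while DenseNet can create new features but with high redundancy. Combining the strengths of both structures, the authors designed the DPN network. With the same FLOPS and parameter count, DPN achieves better results than ResNeXt and DenseNet.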
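
The dense connectivity described above can be illustrated by counting input channels. This is a minimal sketch, assuming each layer emits a fixed number of new feature maps (the "growth rate", a term from the DenseNet paper that this document does not mention) and that inputs are concatenated along the channel axis. The function name and the example values are illustrative only.

```python
def dense_block_input_channels(in_channels: int, growth_rate: int, num_layers: int) -> list[int]:
    """Input channel count seen by each layer of one dense block.

    Layer i receives the block input plus the outputs of the i layers
    before it, so its input width is in_channels + i * growth_rate.
    """
    return [in_channels + i * growth_rate for i in range(num_layers)]


# Illustrative values, not taken from this document.
widths = dense_block_input_channels(in_channels=64, growth_rate=32, num_layers=6)
block_output_channels = 64 + 6 * 32

print(widths)
print(block_output_channels)
```

Because every layer's input keeps growing, later layers see all earlier features directly, which is the property the paragraph credits for easier gradient propagation.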

The FLOPS, parameter counts, and inference time on T4 GPU for this series of models are shown in the figures below.

![](../../images/models/T4_benchmark/t4.fp32.bs4.DPN.flops.png)
@@ -48,18 +48,18 @@ DPN stands for Dual Path Networks. The network is formed from Dense
<a name='3'></a>
## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|-------------|-----------|-------------------|--------------------------|
| DenseNet121 | 224 | 256 | 4.371 |
| DenseNet161 | 224 | 256 | 8.863 |
| DenseNet169 | 224 | 256 | 6.391 |
| DenseNet201 | 224 | 256 | 8.173 |
| DenseNet264 | 224 | 256 | 11.942 |
| DPN68 | 224 | 256 | 11.805 |
| DPN92 | 224 | 256 | 17.840 |
| DPN98 | 224 | 256 | 21.057 |
| DPN107 | 224 | 256 | 28.685 |
| DPN131 | 224 | 256 | 28.083 |
| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
|-------------|-----------|-------------------|-------------------|-------------------|-------------------|
| DenseNet121 | 224 | 256 | 3.40 | 6.94 | 9.17 |
| DenseNet161 | 224 | 256 | 7.06 | 14.37 | 19.55 |
| DenseNet169 | 224 | 256 | 5.00 | 10.29 | 12.84 |
| DenseNet201 | 224 | 256 | 6.38 | 13.72 | 17.17 |
| DenseNet264 | 224 | 256 | 9.34 | 20.95 | 25.41 |
| DPN68 | 224 | 256 | 8.18 | 11.40 | 14.82 |
| DPN92 | 224 | 256 | 12.48 | 20.04 | 25.10 |
| DPN98 | 224 | 256 | 14.70 | 25.55 | 35.12 |
| DPN107 | 224 | 256 | 19.46 | 35.62 | 50.22 |
| DPN131 | 224 | 256 | 19.64 | 34.60 | 47.42 |


<a name='4'></a>
32 changes: 16 additions & 16 deletions docs/zh_CN/models/EfficientNet_and_ResNeXt101_wsl.md
@@ -50,22 +50,22 @@ ResNeXt is an improved version of ResNet proposed by Facebook in 2016.

## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|-------------------------------|-----------|-------------------|--------------------------|
| ResNeXt101_<br>32x8d_wsl | 224 | 256 | 19.127 |
| ResNeXt101_<br>32x16d_wsl | 224 | 256 | 23.629 |
| ResNeXt101_<br>32x32d_wsl | 224 | 256 | 40.214 |
| ResNeXt101_<br>32x48d_wsl | 224 | 256 | 59.714 |
| Fix_ResNeXt101_<br>32x48d_wsl | 320 | 320 | 82.431 |
| EfficientNetB0 | 224 | 256 | 2.449 |
| EfficientNetB1 | 240 | 272 | 3.547 |
| EfficientNetB2 | 260 | 292 | 3.908 |
| EfficientNetB3 | 300 | 332 | 5.145 |
| EfficientNetB4 | 380 | 412 | 7.609 |
| EfficientNetB5 | 456 | 488 | 12.078 |
| EfficientNetB6 | 528 | 560 | 18.381 |
| EfficientNetB7 | 600 | 632 | 27.817 |
| EfficientNetB0_<br>small | 224 | 256 | 1.692 |
| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
|-------------------------------|-----------|-------------------|-------------------------------|-------------------------------|-------------------------------|
| ResNeXt101_<br>32x8d_wsl | 224 | 256 | 13.55 | 23.39 | 36.18 |
| ResNeXt101_<br>32x16d_wsl | 224 | 256 | 21.96 | 38.35 | 63.29 |
| ResNeXt101_<br>32x32d_wsl | 224 | 256 | 37.28 | 76.50 | 121.56 |
| ResNeXt101_<br>32x48d_wsl | 224 | 256 | 55.07 | 124.39 | 205.01 |
| Fix_ResNeXt101_<br>32x48d_wsl | 320 | 320 | 55.01 | 122.63 | 204.66 |
| EfficientNetB0 | 224 | 256 | 1.96 | 3.71 | 5.56 |
| EfficientNetB1 | 240 | 272 | 2.88 | 5.40 | 7.63 |
| EfficientNetB2 | 260 | 292 | 3.26 | 6.20 | 9.17 |
| EfficientNetB3 | 300 | 332 | 4.52 | 8.85 | 13.54 |
| EfficientNetB4 | 380 | 412 | 6.78 | 15.47 | 24.95 |
| EfficientNetB5 | 456 | 488 | 10.97 | 27.24 | 45.93 |
| EfficientNetB6 | 528 | 560 | 17.09 | 43.32 | 76.90 |
| EfficientNetB7 | 600 | 632 | 25.91 | 71.23 | 128.20 |
| EfficientNetB0_<br>small | 224 | 256 | 1.24 | 2.59 | 3.92 |
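
The "Resize Short Size" and "Crop Size" columns presumably describe the usual evaluation preprocessing: scale the image so its shorter side matches the resize value, then take a center crop. The sketch below shows only the geometry under that assumption; the function names are hypothetical and not taken from any library.

```python
def resize_short_shape(width: int, height: int, resize_short: int) -> tuple[int, int]:
    """New (width, height) after scaling the shorter side to resize_short."""
    scale = resize_short / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width: int, height: int, crop_size: int) -> tuple[int, int, int, int]:
    """(left, top, right, bottom) of a centered crop_size x crop_size window."""
    left = (width - crop_size) // 2
    top = (height - crop_size) // 2
    return left, top, left + crop_size, top + crop_size


# A 500x375 input with the 256 / 224 setting used by most rows above.
resized = resize_short_shape(500, 375, 256)
box = center_crop_box(*resized, 224)
print(resized, box)
```

Rows with larger crop sizes, such as EfficientNetB7 at 600, feed more pixels per image, which is one reason their latency is higher.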


<a name='4'></a>
22 changes: 11 additions & 11 deletions docs/zh_CN/models/HRNet.md
@@ -43,17 +43,17 @@ HRNet is a new neural network proposed by Microsoft Research Asia in 2019
<a name='3'></a>
## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|-------------|-----------|-------------------|--------------------------|
| HRNet_W18_C | 224 | 256 | 7.368 |
| HRNet_W18_C_ssld | 224 | 256 | 7.368 |
| HRNet_W30_C | 224 | 256 | 9.402 |
| HRNet_W32_C | 224 | 256 | 9.467 |
| HRNet_W40_C | 224 | 256 | 10.739 |
| HRNet_W44_C | 224 | 256 | 11.497 |
| HRNet_W48_C | 224 | 256 | 12.165 |
| HRNet_W48_C_ssld | 224 | 256 | 12.165 |
| HRNet_W64_C | 224 | 256 | 15.003 |
| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
|-------------|-----------|-------------------|-------------------|-------------------|-------------------|
| HRNet_W18_C | 224 | 256 | 6.66 | 8.94 | 11.95 |
| HRNet_W18_C_ssld | 224 | 256 | 6.66 | 8.92 | 11.93 |
| HRNet_W30_C | 224 | 256 | 8.61 | 11.40 | 15.23 |
| HRNet_W32_C | 224 | 256 | 8.54 | 11.58 | 15.57 |
| HRNet_W40_C | 224 | 256 | 9.83 | 15.02 | 20.92 |
| HRNet_W44_C | 224 | 256 | 10.62 | 16.18 | 25.92 |
| HRNet_W48_C | 224 | 256 | 11.07 | 17.06 | 27.28 |
| HRNet_W48_C_ssld | 224 | 256 | 11.09 | 17.04 | 27.28 |
| HRNet_W64_C | 224 | 256 | 13.82 | 21.15 | 35.51 |


<a name='4'></a>
13 changes: 13 additions & 0 deletions docs/zh_CN/models/HarDNet.md
@@ -4,6 +4,7 @@

* [1. Overview](#1)
* [2. Accuracy, FLOPS, and Parameters](#2)
* [3. Inference Speed on V100 GPU](#3)

<a name='1'></a>
## 1. Overview
@@ -20,3 +21,15 @@ HarDNet (Harmonic DenseNet), proposed by National Tsing Hua University in 2019, is a
| HarDNet85 | 36.7 | 9.1 | 77.44 | 93.55 |
| HarDNet39_ds | 3.5 | 0.4 | 71.33 | 89.98 |
| HarDNet68_ds | 4.2 | 0.8 | 73.62 | 91.52 |

<a name='3'></a>

## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
| ------------ | --------- | ----------------- | ------------------------------ | ------------------------------ | ------------------------------ |
| HarDNet68 | 224 | 256 | 3.58 | 8.53 | 11.58 |
| HarDNet85 | 224 | 256 | 6.24 | 14.85 | 20.57 |
| HarDNet39_ds | 224 | 256 | 1.40 | 2.30 | 3.33 |
| HarDNet68_ds | 224 | 256 | 2.26 | 3.34 | 5.06 |
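
The commit does not say how these latencies were measured. As a rough sketch of one common approach (warm-up runs followed by an averaged timed loop), the snippet below times an arbitrary callable. `mean_latency_ms` and `dummy_batch` are hypothetical names, and the dummy workload stands in for a real forward pass. Timing a GPU model would also require synchronizing the device before reading the clock, which is omitted here.

```python
import time


def mean_latency_ms(run_batch, warmup: int = 10, repeats: int = 100) -> float:
    """Average wall-clock time of run_batch() in milliseconds."""
    # Warm-up iterations are excluded so one-off setup costs do not skew the mean.
    for _ in range(warmup):
        run_batch()
    start = time.perf_counter()
    for _ in range(repeats):
        run_batch()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


def dummy_batch() -> None:
    """Placeholder workload standing in for one batch of inference."""
    sum(i * i for i in range(1000))


latency = mean_latency_ms(dummy_batch, warmup=2, repeats=5)
print(f"{latency:.3f} ms per batch")
```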

18 changes: 9 additions & 9 deletions docs/zh_CN/models/Inception.md
@@ -53,15 +53,15 @@ InceptionV4 is a new neural network designed by Google in 2016; at the time, residual

## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br>Batch Size=1<br>(ms) |
|------------------------|-----------|-------------------|--------------------------|
| GoogLeNet | 224 | 256 | 1.807 |
| Xception41 | 299 | 320 | 3.972 |
| Xception41_<br>deeplab | 299 | 320 | 4.408 |
| Xception65 | 299 | 320 | 6.174 |
| Xception65_<br>deeplab | 299 | 320 | 6.464 |
| Xception71 | 299 | 320 | 6.782 |
| InceptionV4 | 299 | 320 | 11.141 |
| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
|------------------------|-----------|-------------------|------------------------|------------------------|------------------------|
| GoogLeNet | 224 | 256 | 1.41 | 3.25 | 5.00 |
| Xception41 | 299 | 320 | 3.58 | 8.76 | 16.61 |
| Xception41_<br>deeplab | 299 | 320 | 3.81 | 9.16 | 17.20 |
| Xception65 | 299 | 320 | 5.45 | 12.78 | 24.53 |
| Xception65_<br>deeplab | 299 | 320 | 5.65 | 13.08 | 24.61 |
| Xception71 | 299 | 320 | 6.19 | 15.34 | 29.21 |
| InceptionV4 | 299 | 320 | 8.93 | 15.17 | 21.56 |


<a name='4'></a>
11 changes: 11 additions & 0 deletions docs/zh_CN/models/MixNet.md
@@ -4,6 +4,7 @@

* [1. Overview](#1)
* [2. Accuracy, FLOPS, and Parameters](#2)
* [3. Inference Speed on V100 GPU](#3)

<a name='1'></a>

@@ -26,4 +27,14 @@ MixNet is a paper from Google on lightweight networks; its main work lies in
| MixNet_M | 77.67 | 93.64 | 77.0 | 357.119 | 5.065 |
| MixNet_L | 78.60 | 94.37 | 78.9 | 579.017 | 7.384 |

<a name='3'></a>

## 3. Inference Speed on V100 GPU

| Models | Crop Size | Resize Short Size | FP32<br/>Batch Size=1<br/>(ms) | FP32<br/>Batch Size=4<br/>(ms) | FP32<br/>Batch Size=8<br/>(ms) |
| -------- | --------- | ----------------- | ------------------------------ | ------------------------------ | ------------------------------ |
| MixNet_S | 224 | 256 | 2.31 | 3.63 | 5.20 |
| MixNet_M | 224 | 256 | 2.84 | 4.60 | 6.62 |
| MixNet_L | 224 | 256 | 3.16 | 5.55 | 8.03 |

Information such as inference speed is coming soon.