[WIP] new int8 implement, better accuracy #749

Merged
merged 24 commits on Mar 5, 2019
Changes from 1 commit
Commits
24 commits
8efb684
add the armv7a conv3x3s1 implementation without overflow, remove old code
BUG1989 Jan 8, 2019
19b983e
fix the bug of conv3x3s2 packed int8
BUG1989 Jan 9, 2019
67df4e2
Merge remote-tracking branch 'upstream/master' into ncnn-pr
BUG1989 Jan 9, 2019
b5d39fe
new int8 implementation, weights quantized per-channel, better accuracy
BUG1989 Jan 10, 2019
7b53edd
fix the bug of conv3x3s1 packed int8 neon
BUG1989 Jan 11, 2019
2a8fa6c
add the naive c fp32 and int8 winograd F(2,3)
BUG1989 Jan 23, 2019
46bbf5b
merge from master
BUG1989 Jan 23, 2019
4c52ec5
add the neon intrinsic int8 winograd F(2,3)
BUG1989 Jan 25, 2019
c84fa4a
optimize the armv7a int8 winograd F(2,3) with neon assembly
BUG1989 Jan 29, 2019
bcaa2d5
optimize the armv7a int8 winograd F(2,3) input transform with assembly.
BUG1989 Jan 31, 2019
07c63b6
add the requantize layer and int8 ReLU implementation.
BUG1989 Feb 11, 2019
eff997e
add graph optimization conv1x1s2 -> conv1x1s1, begin optimizing int8 aarch64.
BUG1989 Feb 12, 2019
c223155
fix int8 bugs
BUG1989 Feb 13, 2019
732efb4
add the c naive im2col with sgemm
BUG1989 Feb 14, 2019
ecc291b
add aarch64 int8 winograd F(2,3), conv3x3s2 naive implementation
BUG1989 Feb 15, 2019
bbe0a47
add the int8 sgemm conv7x7s2 on x86/armv7a platform
BUG1989 Feb 19, 2019
22593d4
optimize the int8 sgemm by neon intrinsic and packed kernel
BUG1989 Feb 20, 2019
2c666de
optimize the int8 sgemm with packed data
BUG1989 Feb 21, 2019
a3cb7d3
optimize the int8 sgemm with armv7a neon assembly
BUG1989 Feb 22, 2019
458a959
add the int8 sgemm on arm64-v8a platform
BUG1989 Feb 27, 2019
08f505e
prepare to merge latest code from master
BUG1989 Mar 4, 2019
48c130f
merge from master and push the armv7a int8
BUG1989 Mar 4, 2019
8b22913
add the int8 param files
BUG1989 Mar 4, 2019
71b277b
In the class Net, add the fuse_network method
BUG1989 Mar 5, 2019
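Taken together, the commits above replace the earlier weight quantization with symmetric per-channel scales, which is where the accuracy gain in the PR title comes from. Below is a minimal sketch of the idea, assuming a flat float weight layout of [num_output][channels*kh*kw]; the function and names are illustrative only, not the actual ncnn kernels.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Symmetric per-channel int8 weight quantization (illustrative sketch).
// For each output channel c: scale[c] = 127 / max(|w|), and every weight in
// that channel is rounded to round(w * scale[c]), clamped to [-127, 127].
static void quantize_weights_per_channel(const std::vector<float>& weights,
                                         int num_output, int weights_per_channel,
                                         std::vector<int8_t>& qweights,
                                         std::vector<float>& scales)
{
    qweights.resize(weights.size());
    scales.resize(num_output);

    for (int c = 0; c < num_output; c++)
    {
        const float* w = &weights[c * weights_per_channel];

        float absmax = 0.f;
        for (int i = 0; i < weights_per_channel; i++)
            absmax = std::max(absmax, std::fabs(w[i]));

        // Avoid division by zero for an all-zero channel.
        float scale = absmax > 0.f ? 127.f / absmax : 1.f;
        scales[c] = scale;

        for (int i = 0; i < weights_per_channel; i++)
        {
            int q = static_cast<int>(std::round(w[i] * scale));
            q = std::min(127, std::max(-127, q));
            qweights[c * weights_per_channel + i] = static_cast<int8_t>(q);
        }
    }
}
```

A per-channel scale keeps channels with small weight magnitudes from being crushed into a few quantization levels by one large-magnitude channel, which is the main source of accuracy loss with a single per-layer scale.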
add the int8 param files
BUG1989 committed Mar 4, 2019
commit 8b2291317afcf433bf58b8b90814c38b7bc35fe1
154 changes: 154 additions & 0 deletions benchmark/googlenet_int8.param

Large diffs are not rendered by default.

114 changes: 114 additions & 0 deletions benchmark/mobilenet_int8.param
@@ -0,0 +1,114 @@
7767517
112 112
Input data 0 1 data 0=224 1=224 2=3
Convolution conv1 1 1 data conv1 0=32 1=3 2=1 3=2 4=1 5=0 6=864 8=2
BatchNorm conv1/bn 1 1 conv1 conv1_conv1/bn 0=32
Scale conv1/scale 1 1 conv1_conv1/bn conv1_conv1/scale 0=32 1=1
ReLU relu1 1 1 conv1_conv1/scale conv1_relu1
ConvolutionDepthWise conv2_1/dw 1 1 conv1_relu1 conv2_1/dw 0=32 1=3 2=1 3=1 4=1 5=0 6=288 7=32 8=1
BatchNorm conv2_1/dw/bn 1 1 conv2_1/dw conv2_1/dw_conv2_1/dw/bn 0=32
Scale conv2_1/dw/scale 1 1 conv2_1/dw_conv2_1/dw/bn conv2_1/dw_conv2_1/dw/scale 0=32 1=1
ReLU relu2_1/dw 1 1 conv2_1/dw_conv2_1/dw/scale conv2_1/dw_relu2_1/dw
Convolution conv2_1/sep 1 1 conv2_1/dw_relu2_1/dw conv2_1/sep 0=64 1=1 2=1 3=1 4=0 5=0 6=2048 8=2
BatchNorm conv2_1/sep/bn 1 1 conv2_1/sep conv2_1/sep_conv2_1/sep/bn 0=64
Scale conv2_1/sep/scale 1 1 conv2_1/sep_conv2_1/sep/bn conv2_1/sep_conv2_1/sep/scale 0=64 1=1
ReLU relu2_1/sep 1 1 conv2_1/sep_conv2_1/sep/scale conv2_1/sep_relu2_1/sep
ConvolutionDepthWise conv2_2/dw 1 1 conv2_1/sep_relu2_1/sep conv2_2/dw 0=64 1=3 2=1 3=2 4=1 5=0 6=576 7=64 8=1
BatchNorm conv2_2/dw/bn 1 1 conv2_2/dw conv2_2/dw_conv2_2/dw/bn 0=64
Scale conv2_2/dw/scale 1 1 conv2_2/dw_conv2_2/dw/bn conv2_2/dw_conv2_2/dw/scale 0=64 1=1
ReLU relu2_2/dw 1 1 conv2_2/dw_conv2_2/dw/scale conv2_2/dw_relu2_2/dw
Convolution conv2_2/sep 1 1 conv2_2/dw_relu2_2/dw conv2_2/sep 0=128 1=1 2=1 3=1 4=0 5=0 6=8192 8=2
BatchNorm conv2_2/sep/bn 1 1 conv2_2/sep conv2_2/sep_conv2_2/sep/bn 0=128
Scale conv2_2/sep/scale 1 1 conv2_2/sep_conv2_2/sep/bn conv2_2/sep_conv2_2/sep/scale 0=128 1=1
ReLU relu2_2/sep 1 1 conv2_2/sep_conv2_2/sep/scale conv2_2/sep_relu2_2/sep
ConvolutionDepthWise conv3_1/dw 1 1 conv2_2/sep_relu2_2/sep conv3_1/dw 0=128 1=3 2=1 3=1 4=1 5=0 6=1152 7=128 8=1
BatchNorm conv3_1/dw/bn 1 1 conv3_1/dw conv3_1/dw_conv3_1/dw/bn 0=128
Scale conv3_1/dw/scale 1 1 conv3_1/dw_conv3_1/dw/bn conv3_1/dw_conv3_1/dw/scale 0=128 1=1
ReLU relu3_1/dw 1 1 conv3_1/dw_conv3_1/dw/scale conv3_1/dw_relu3_1/dw
Convolution conv3_1/sep 1 1 conv3_1/dw_relu3_1/dw conv3_1/sep 0=128 1=1 2=1 3=1 4=0 5=0 6=16384 8=2
BatchNorm conv3_1/sep/bn 1 1 conv3_1/sep conv3_1/sep_conv3_1/sep/bn 0=128
Scale conv3_1/sep/scale 1 1 conv3_1/sep_conv3_1/sep/bn conv3_1/sep_conv3_1/sep/scale 0=128 1=1
ReLU relu3_1/sep 1 1 conv3_1/sep_conv3_1/sep/scale conv3_1/sep_relu3_1/sep
ConvolutionDepthWise conv3_2/dw 1 1 conv3_1/sep_relu3_1/sep conv3_2/dw 0=128 1=3 2=1 3=2 4=1 5=0 6=1152 7=128 8=1
BatchNorm conv3_2/dw/bn 1 1 conv3_2/dw conv3_2/dw_conv3_2/dw/bn 0=128
Scale conv3_2/dw/scale 1 1 conv3_2/dw_conv3_2/dw/bn conv3_2/dw_conv3_2/dw/scale 0=128 1=1
ReLU relu3_2/dw 1 1 conv3_2/dw_conv3_2/dw/scale conv3_2/dw_relu3_2/dw
Convolution conv3_2/sep 1 1 conv3_2/dw_relu3_2/dw conv3_2/sep 0=256 1=1 2=1 3=1 4=0 5=0 6=32768 8=2
BatchNorm conv3_2/sep/bn 1 1 conv3_2/sep conv3_2/sep_conv3_2/sep/bn 0=256
Scale conv3_2/sep/scale 1 1 conv3_2/sep_conv3_2/sep/bn conv3_2/sep_conv3_2/sep/scale 0=256 1=1
ReLU relu3_2/sep 1 1 conv3_2/sep_conv3_2/sep/scale conv3_2/sep_relu3_2/sep
ConvolutionDepthWise conv4_1/dw 1 1 conv3_2/sep_relu3_2/sep conv4_1/dw 0=256 1=3 2=1 3=1 4=1 5=0 6=2304 7=256 8=1
BatchNorm conv4_1/dw/bn 1 1 conv4_1/dw conv4_1/dw_conv4_1/dw/bn 0=256
Scale conv4_1/dw/scale 1 1 conv4_1/dw_conv4_1/dw/bn conv4_1/dw_conv4_1/dw/scale 0=256 1=1
ReLU relu4_1/dw 1 1 conv4_1/dw_conv4_1/dw/scale conv4_1/dw_relu4_1/dw
Convolution conv4_1/sep 1 1 conv4_1/dw_relu4_1/dw conv4_1/sep 0=256 1=1 2=1 3=1 4=0 5=0 6=65536 8=2
BatchNorm conv4_1/sep/bn 1 1 conv4_1/sep conv4_1/sep_conv4_1/sep/bn 0=256
Scale conv4_1/sep/scale 1 1 conv4_1/sep_conv4_1/sep/bn conv4_1/sep_conv4_1/sep/scale 0=256 1=1
ReLU relu4_1/sep 1 1 conv4_1/sep_conv4_1/sep/scale conv4_1/sep_relu4_1/sep
ConvolutionDepthWise conv4_2/dw 1 1 conv4_1/sep_relu4_1/sep conv4_2/dw 0=256 1=3 2=1 3=2 4=1 5=0 6=2304 7=256 8=1
BatchNorm conv4_2/dw/bn 1 1 conv4_2/dw conv4_2/dw_conv4_2/dw/bn 0=256
Scale conv4_2/dw/scale 1 1 conv4_2/dw_conv4_2/dw/bn conv4_2/dw_conv4_2/dw/scale 0=256 1=1
ReLU relu4_2/dw 1 1 conv4_2/dw_conv4_2/dw/scale conv4_2/dw_relu4_2/dw
Convolution conv4_2/sep 1 1 conv4_2/dw_relu4_2/dw conv4_2/sep 0=512 1=1 2=1 3=1 4=0 5=0 6=131072 8=2
BatchNorm conv4_2/sep/bn 1 1 conv4_2/sep conv4_2/sep_conv4_2/sep/bn 0=512
Scale conv4_2/sep/scale 1 1 conv4_2/sep_conv4_2/sep/bn conv4_2/sep_conv4_2/sep/scale 0=512 1=1
ReLU relu4_2/sep 1 1 conv4_2/sep_conv4_2/sep/scale conv4_2/sep_relu4_2/sep
ConvolutionDepthWise conv5_1/dw 1 1 conv4_2/sep_relu4_2/sep conv5_1/dw 0=512 1=3 2=1 3=1 4=1 5=0 6=4608 7=512 8=1
BatchNorm conv5_1/dw/bn 1 1 conv5_1/dw conv5_1/dw_conv5_1/dw/bn 0=512
Scale conv5_1/dw/scale 1 1 conv5_1/dw_conv5_1/dw/bn conv5_1/dw_conv5_1/dw/scale 0=512 1=1
ReLU relu5_1/dw 1 1 conv5_1/dw_conv5_1/dw/scale conv5_1/dw_relu5_1/dw
Convolution conv5_1/sep 1 1 conv5_1/dw_relu5_1/dw conv5_1/sep 0=512 1=1 2=1 3=1 4=0 5=0 6=262144 8=2
BatchNorm conv5_1/sep/bn 1 1 conv5_1/sep conv5_1/sep_conv5_1/sep/bn 0=512
Scale conv5_1/sep/scale 1 1 conv5_1/sep_conv5_1/sep/bn conv5_1/sep_conv5_1/sep/scale 0=512 1=1
ReLU relu5_1/sep 1 1 conv5_1/sep_conv5_1/sep/scale conv5_1/sep_relu5_1/sep
ConvolutionDepthWise conv5_2/dw 1 1 conv5_1/sep_relu5_1/sep conv5_2/dw 0=512 1=3 2=1 3=1 4=1 5=0 6=4608 7=512 8=1
BatchNorm conv5_2/dw/bn 1 1 conv5_2/dw conv5_2/dw_conv5_2/dw/bn 0=512
Scale conv5_2/dw/scale 1 1 conv5_2/dw_conv5_2/dw/bn conv5_2/dw_conv5_2/dw/scale 0=512 1=1
ReLU relu5_2/dw 1 1 conv5_2/dw_conv5_2/dw/scale conv5_2/dw_relu5_2/dw
Convolution conv5_2/sep 1 1 conv5_2/dw_relu5_2/dw conv5_2/sep 0=512 1=1 2=1 3=1 4=0 5=0 6=262144 8=2
BatchNorm conv5_2/sep/bn 1 1 conv5_2/sep conv5_2/sep_conv5_2/sep/bn 0=512
Scale conv5_2/sep/scale 1 1 conv5_2/sep_conv5_2/sep/bn conv5_2/sep_conv5_2/sep/scale 0=512 1=1
ReLU relu5_2/sep 1 1 conv5_2/sep_conv5_2/sep/scale conv5_2/sep_relu5_2/sep
ConvolutionDepthWise conv5_3/dw 1 1 conv5_2/sep_relu5_2/sep conv5_3/dw 0=512 1=3 2=1 3=1 4=1 5=0 6=4608 7=512 8=1
BatchNorm conv5_3/dw/bn 1 1 conv5_3/dw conv5_3/dw_conv5_3/dw/bn 0=512
Scale conv5_3/dw/scale 1 1 conv5_3/dw_conv5_3/dw/bn conv5_3/dw_conv5_3/dw/scale 0=512 1=1
ReLU relu5_3/dw 1 1 conv5_3/dw_conv5_3/dw/scale conv5_3/dw_relu5_3/dw
Convolution conv5_3/sep 1 1 conv5_3/dw_relu5_3/dw conv5_3/sep 0=512 1=1 2=1 3=1 4=0 5=0 6=262144 8=2
BatchNorm conv5_3/sep/bn 1 1 conv5_3/sep conv5_3/sep_conv5_3/sep/bn 0=512
Scale conv5_3/sep/scale 1 1 conv5_3/sep_conv5_3/sep/bn conv5_3/sep_conv5_3/sep/scale 0=512 1=1
ReLU relu5_3/sep 1 1 conv5_3/sep_conv5_3/sep/scale conv5_3/sep_relu5_3/sep
ConvolutionDepthWise conv5_4/dw 1 1 conv5_3/sep_relu5_3/sep conv5_4/dw 0=512 1=3 2=1 3=1 4=1 5=0 6=4608 7=512 8=1
BatchNorm conv5_4/dw/bn 1 1 conv5_4/dw conv5_4/dw_conv5_4/dw/bn 0=512
Scale conv5_4/dw/scale 1 1 conv5_4/dw_conv5_4/dw/bn conv5_4/dw_conv5_4/dw/scale 0=512 1=1
ReLU relu5_4/dw 1 1 conv5_4/dw_conv5_4/dw/scale conv5_4/dw_relu5_4/dw
Convolution conv5_4/sep 1 1 conv5_4/dw_relu5_4/dw conv5_4/sep 0=512 1=1 2=1 3=1 4=0 5=0 6=262144 8=2
BatchNorm conv5_4/sep/bn 1 1 conv5_4/sep conv5_4/sep_conv5_4/sep/bn 0=512
Scale conv5_4/sep/scale 1 1 conv5_4/sep_conv5_4/sep/bn conv5_4/sep_conv5_4/sep/scale 0=512 1=1
ReLU relu5_4/sep 1 1 conv5_4/sep_conv5_4/sep/scale conv5_4/sep_relu5_4/sep
ConvolutionDepthWise conv5_5/dw 1 1 conv5_4/sep_relu5_4/sep conv5_5/dw 0=512 1=3 2=1 3=1 4=1 5=0 6=4608 7=512 8=1
BatchNorm conv5_5/dw/bn 1 1 conv5_5/dw conv5_5/dw_conv5_5/dw/bn 0=512
Scale conv5_5/dw/scale 1 1 conv5_5/dw_conv5_5/dw/bn conv5_5/dw_conv5_5/dw/scale 0=512 1=1
ReLU relu5_5/dw 1 1 conv5_5/dw_conv5_5/dw/scale conv5_5/dw_relu5_5/dw
Convolution conv5_5/sep 1 1 conv5_5/dw_relu5_5/dw conv5_5/sep 0=512 1=1 2=1 3=1 4=0 5=0 6=262144 8=2
BatchNorm conv5_5/sep/bn 1 1 conv5_5/sep conv5_5/sep_conv5_5/sep/bn 0=512
Scale conv5_5/sep/scale 1 1 conv5_5/sep_conv5_5/sep/bn conv5_5/sep_conv5_5/sep/scale 0=512 1=1
ReLU relu5_5/sep 1 1 conv5_5/sep_conv5_5/sep/scale conv5_5/sep_relu5_5/sep
ConvolutionDepthWise conv5_6/dw 1 1 conv5_5/sep_relu5_5/sep conv5_6/dw 0=512 1=3 2=1 3=2 4=1 5=0 6=4608 7=512 8=1
BatchNorm conv5_6/dw/bn 1 1 conv5_6/dw conv5_6/dw_conv5_6/dw/bn 0=512
Scale conv5_6/dw/scale 1 1 conv5_6/dw_conv5_6/dw/bn conv5_6/dw_conv5_6/dw/scale 0=512 1=1
ReLU relu5_6/dw 1 1 conv5_6/dw_conv5_6/dw/scale conv5_6/dw_relu5_6/dw
Convolution conv5_6/sep 1 1 conv5_6/dw_relu5_6/dw conv5_6/sep 0=1024 1=1 2=1 3=1 4=0 5=0 6=524288 8=2
BatchNorm conv5_6/sep/bn 1 1 conv5_6/sep conv5_6/sep_conv5_6/sep/bn 0=1024
Scale conv5_6/sep/scale 1 1 conv5_6/sep_conv5_6/sep/bn conv5_6/sep_conv5_6/sep/scale 0=1024 1=1
ReLU relu5_6/sep 1 1 conv5_6/sep_conv5_6/sep/scale conv5_6/sep_relu5_6/sep
ConvolutionDepthWise conv6/dw 1 1 conv5_6/sep_relu5_6/sep conv6/dw 0=1024 1=3 2=1 3=1 4=1 5=0 6=9216 7=1024 8=1
BatchNorm conv6/dw/bn 1 1 conv6/dw conv6/dw_conv6/dw/bn 0=1024
Scale conv6/dw/scale 1 1 conv6/dw_conv6/dw/bn conv6/dw_conv6/dw/scale 0=1024 1=1
ReLU relu6/dw 1 1 conv6/dw_conv6/dw/scale conv6/dw_relu6/dw
Convolution conv6/sep 1 1 conv6/dw_relu6/dw conv6/sep 0=1024 1=1 2=1 3=1 4=0 5=0 6=1048576 8=2
BatchNorm conv6/sep/bn 1 1 conv6/sep conv6/sep_conv6/sep/bn 0=1024
Scale conv6/sep/scale 1 1 conv6/sep_conv6/sep/bn conv6/sep_conv6/sep/scale 0=1024 1=1
ReLU relu6/sep 1 1 conv6/sep_conv6/sep/scale conv6/sep_relu6/sep
Pooling pool6 1 1 conv6/sep_relu6/sep pool6 0=1 1=0 2=1 3=0 4=1
Convolution fc7 1 1 pool6 fc7 0=1000 1=1 2=1 3=1 4=0 5=1 6=1024000 8=2
Softmax prob 1 1 fc7 prob 0=0
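For reference, each line of the param file follows ncnn's text format: layer type, layer name, input blob count, output blob count, the blob names, then layer-specific key=value parameters (for Convolution, 0 is num_output, 1 the kernel size, 3 the stride, 6 weight_data_size, and 8 the int8 scale term that marks the layer for int8 inference). A minimal sketch of loading this param file through the ncnn C++ API follows; the .bin weight file name and the dummy input are placeholders, since this PR only adds the param files.

```cpp
#include "net.h"

// Usage sketch (not part of this PR): load the int8 param file and run one
// forward pass.  The .bin name is a placeholder; blob names "data" and
// "prob" come from the param listing above.
int main()
{
    ncnn::Net net;

    if (net.load_param("benchmark/mobilenet_int8.param") != 0)
        return -1;
    if (net.load_model("mobilenet_int8.bin") != 0)  // placeholder weight file
        return -1;

    ncnn::Mat in(224, 224, 3);  // matches "Input data ... 0=224 1=224 2=3"
    in.fill(0.5f);              // dummy input data

    ncnn::Extractor ex = net.create_extractor();
    ex.input("data", in);

    ncnn::Mat out;
    ex.extract("prob", out);    // 1000-way softmax output from layer "prob"

    return 0;
}
```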