
[MKL-DNN] Integrate Conv3d and Pool3d/1d #17884

Merged: 4 commits merged into apache:master from wuxun-zhang:3d_conv_pool on Apr 9, 2020

Conversation

wuxun-zhang (Contributor)

Description

This PR integrates the MKL-DNN 3D convolution and pooling primitives, covering both the FP32 path and the quantization pass.
@pengzhao-intel @TaoLv @ciyongch @rongzha1 @zixuanweeei

Checklist

Essentials

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)

Changes

  • Support the 5D data layout for MKL-DNN

 return (dtype == mshadow::kFloat32 || dtype == mshadow::kBfloat16) &&
-       (ndim == 1 || ndim == 2 || ndim == 4);
+       (ndim >= 1 && ndim <= 5);
Member

Please fix the indentation, and make sure you really intend to enable ndim == 3 here as well.

Contributor Author

Maybe it's better not to touch this function. I will enable the 5D layout in SupportPooling instead.
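
For illustration, a minimal self-contained sketch of that alternative (hypothetical code, not the actual MXNet source; SupportPoolingSketch and kFloat32Tag are made-up names):

// Hypothetical sketch only: leave the generic MKL-DNN support check
// untouched and accept 5D inputs in a pooling-specific predicate instead.
constexpr int kFloat32Tag = 0;  // stand-in for mshadow::kFloat32

bool SupportPoolingSketch(int dtype, int ndim) {
  // Pooling would accept 3D (NCW), 4D (NCHW), and 5D (NCDHW) activations.
  return dtype == kFloat32Tag && ndim >= 3 && ndim <= 5;
}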

Comment on lines 330 to 331
const int D = (ndim == 5) ? 2 : 1;
const int N = 0, C = 1, H = D + 1, W = D + 2;
Member

Let's be more descriptive here:

Suggested change
-const int D = (ndim == 5) ? 2 : 1;
-const int N = 0, C = 1, H = D + 1, W = D + 2;
+int N = 0, C = 1, H = 2, W = 3;
+int D = -1;
+if (ndim == 5) {
+  D = 2;
+  H = 3;
+  W = 4;
+}
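
For context, a small hypothetical illustration of the axis mapping behind this suggestion (the Axes struct and GetAxes helper are not from the PR): 4D activations use NCHW, so H = 2 and W = 3, while 5D activations use NCDHW, so depth sits at axis 2 and H and W shift to 3 and 4.

#include <cassert>

// Hypothetical helper for illustration only; D == -1 means no depth axis.
struct Axes { int N, C, D, H, W; };

Axes GetAxes(int ndim) {
  assert(ndim == 4 || ndim == 5);
  return (ndim == 5) ? Axes{0, 1, 2, 3, 4}    // NCDHW
                     : Axes{0, 1, -1, 2, 3};  // NCHW
}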

Contributor Author

Thanks. Done.

        param.pad[2], param.pad[2], param.kernel[2], param.stride[2]));
      case 4:
        is_symmetric = is_symmetric && (param.pad[1] == GetPaddingSizeFull(dshape[3],
            param.pad[1], param.pad[1], param.kernel[1], param.stride[1]));
Member

I see both pad[0] and pad[1] are checked in previous code.

Contributor Author

Could you please show me where you saw these checks? Thanks

Member

Okay, I didn't realize that there is no break after each case, so the switch falls through: if ndim == 4, both pad[1] and pad[0] are checked.
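
To make the fall-through explicit, here is a minimal self-contained sketch (simplified and hypothetical; IsSymmetricSketch and PadIsSymmetric stand in for the PR's is_symmetric accumulation with GetPaddingSizeFull):

// Placeholder for the real per-axis padding-symmetry test.
bool PadIsSymmetric(int pad) { return pad >= 0; }

// Each case deliberately omits break, so higher-dimensional inputs also
// run all of the lower-dimensional checks.
bool IsSymmetricSketch(int ndim, const int *pad) {
  bool is_symmetric = true;
  switch (ndim) {
    case 5:
      is_symmetric = is_symmetric && PadIsSymmetric(pad[2]);
      // fall through: a 5D input must also pass the 4D and 3D checks
    case 4:
      is_symmetric = is_symmetric && PadIsSymmetric(pad[1]);
      // fall through: a 4D input must also pass the 3D check
    case 3:
      is_symmetric = is_symmetric && PadIsSymmetric(pad[0]);
      break;
    default:
      is_symmetric = false;
  }
  return is_symmetric;
}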

@@ -127,61 +116,139 @@ mkldnn::algorithm GetMKLDNNPoolAlgo(const PoolingParam &param) {
  }
}

void InitPoolingPrimitiveParams(const PoolingParam &param,
                                const mkldnn::memory::desc &data_md,
                                mkldnn::memory::dims *new_kernel,
Member

How about pass-by-reference?
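
For illustration, a minimal self-contained sketch (hypothetical names, not the PR's code) contrasting the pointer-out style above with the suggested pass-by-reference style:

#include <vector>

using dims = std::vector<long>;

// Current style: the output parameter is a pointer, so callers pass an
// address and the callee must trust that it is non-null.
void InitParamsPtr(const dims &kernel, dims *new_kernel) {
  *new_kernel = kernel;
}

// Suggested style: a reference makes "must be a valid object" explicit
// at the call site and removes the null-pointer question entirely.
void InitParamsRef(const dims &kernel, dims &new_kernel) {
  new_kernel = kernel;
}

int main() {
  dims kernel{2, 2, 2}, out_ptr, out_ref;
  InitParamsPtr(kernel, &out_ptr);
  InitParamsRef(kernel, out_ref);
  return 0;
}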

@@ -274,12 +274,12 @@ void PoolingComputeExCPU(const nnvm::NodeAttrs &attrs, const OpContext &ctx,

  // Pooling does not currently support working with views
  if (inputs[0].IsView() || outputs[0].IsView()) {
    std::cout << "Fall back to Pooling backward pass..." << std::endl;
Member

Is this intended? We didn't have any log message for fallback execution.

Contributor Author

Sorry. Forgot to remove this line. Will fix.

@wuxun-zhang (Contributor Author)

@mxnet-bot run ci [centos-cpu, unix-gpu, windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu, unix-gpu, centos-cpu]

@TaoLv mentioned this pull request Mar 26, 2020
@pengzhao-intel added this to In progress in CPU Performance and Quantization via automation Mar 27, 2020
@pengzhao-intel (Contributor)

@wuxun-zhang please take a look at the CI.

@wuxun-zhang (Contributor Author)

The CI system seems to be experiencing failures at the moment. The errors are unrelated to this PR.

@TaoLv (Member) commented Apr 4, 2020

The CI issue should already be addressed. Please rebase your PR and resolve the comments. Thanks.

@wuxun-zhang (Contributor Author)

The DNNL version in the master branch has been wrongly changed (downgraded from v1.2.2 to v1.1.2). This PR is now waiting for DNNL v1.2 or higher to come back.

@ChaiBapchya (Contributor)

PR #17084 updated 3rdparty/mkldnn from 8e96ef to cb2cc7 [1.1.0 to 1.1.2].
@wuxun-zhang Can you confirm when it was v1.2.2, and what's the path forward?

@wuxun-zhang (Contributor Author)

Thank you @ChaiBapchya for investigating this. 8e96ef should be oneDNN v1.2.2. We are working on confirming with the author of #17084 whether the downgrade was intended or a mistake.

@wuxun-zhang (Contributor Author)

Now CI has finally passed. @pengzhao-intel @TaoLv please help review again.
@ChaiBapchya please also double check if this PR fixes #17915.

@pengzhao-intel (Contributor) left a comment

LGTM

@ciyongch @TaoLv please review too.

CPU Performance and Quantization automation moved this from In progress to Reviewer approved Apr 8, 2020
@ChaiBapchya (Contributor)

@wuxun-zhang Thanks a lot for the work.
Can you share perf numbers for this patch?
What's the time taken by 3D conv now with MKL-DNN vs. before without MKL-DNN?
Can we profile it?

return (dtype == mshadow::kFloat32 || dtype == mshadow::kBfloat16) &&
       (ndim == 1 || ndim == 2 || ndim == 4);
Contributor

What's the issue with 3D tensors?

Contributor Author

No issue. This change is not needed for this PR; we already enabled MKL-DNN conv with 3D tensors previously.

Contributor

Can you point me to where it's handled? I didn't understand the separate treatment of 3D.

@pengzhao-intel (Contributor)

Thanks for the improvement. Merging now.

We will share the performance data soon.

@pengzhao-intel merged commit 664889a into apache:master Apr 9, 2020
CPU Performance and Quantization automation moved this from Reviewer approved to Done Apr 9, 2020
wuxun-zhang added a commit to wuxun-zhang/incubator-mxnet that referenced this pull request Apr 15, 2020
* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
@wuxun-zhang deleted the 3d_conv_pool branch Apr 15, 2020 01:54
wuxun-zhang added a commit to wuxun-zhang/incubator-mxnet that referenced this pull request Apr 16, 2020
* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
wuxun-zhang added a commit to wuxun-zhang/incubator-mxnet that referenced this pull request Apr 16, 2020
* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
wuxun-zhang added a commit to wuxun-zhang/incubator-mxnet that referenced this pull request Apr 18, 2020
* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
pengzhao-intel pushed a commit that referenced this pull request Apr 18, 2020
* [MKLDNN] apply MKLDNNRun to quantized_act/transpose (#17689)

* apply MKLDNNRun to quantized_act/transpose ops

* run CI

* [MKL-DNN] Integrate Conv3d and Pool3d/1d (#17884)

* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master

* fix conflicts

* fix CI

* rebase
@ChaiBapchya (Contributor)

@pengzhao-intel @wuxun-zhang Gentle ping. Can we get the perf numbers for this PR's changes?

@wuxun-zhang (Contributor Author)

Performance numbers for the Conv3d op:

Shape                  | w/o MKL-DNN   | w/ MKL-DNN
(3, 3, 16, 224, 224)   | 3.696679 sec  | 0.046571 sec
(3, 3, 128, 128, 128)  | 11.716535 sec | 0.165749 sec

Test script:

import mxnet as mx
import time

data_shape = [(3, 3, 16, 224, 224), (3, 3, 128, 128, 128)]

for shape in data_shape:
    data = mx.random.uniform(shape=shape)
    weight_shape = (32, shape[1], 3, 3, 3)
    weight = mx.nd.ones(shape=weight_shape)

    num_iter = 10
    dry_run = 5  # warm-up iterations excluded from timing
    for i in range(num_iter):
        if i == dry_run:
            tic = time.time()  # start timing once warm-up is done
        out = mx.nd.Convolution(data, weight, kernel=(3, 3, 3), stride=(1, 1, 1),
                                num_filter=weight_shape[0], pad=(0, 0, 0),
                                dilate=(2, 2, 2), no_bias=True, cudnn_off=True,
                                name='conv3d')
        out.asnumpy()  # block until the computation finishes
    print("For shape : {}".format(shape))
    print("Average time cost is %f sec" % ((time.time() - tic) / (num_iter - dry_run)))

@bartekkuncer (Contributor)

Some more performance numbers for Conv3d:

Shape                  | Kernel shape | w/o MKL-DNN   | w/ MKL-DNN
(3, 3, 12, 128, 64)    | (2, 2, 2)    | 0.56526 sec   | 0.00332 sec
(3, 3, 12, 128, 128)   | (2, 2, 2)    | 1.06378 sec   | 0.00453 sec
(3, 3, 128, 128, 128)  | (2, 2, 2)    | 10.53919 sec  | 0.04117 sec
(5, 5, 12, 128, 128)   | (2, 2, 2)    | 2.76165 sec   | 0.00688 sec
(5, 5, 128, 128, 128)  | (2, 2, 2)    | 29.06880 sec  | 0.07062 sec
(3, 3, 12, 128, 64)    | (3, 3, 3)    | 1.48476 sec   | 0.00172 sec
(3, 3, 12, 128, 128)   | (3, 3, 3)    | 2.87235 sec   | 0.00424 sec
(3, 3, 128, 128, 128)  | (3, 3, 3)    | 34.26913 sec  | 0.04124 sec
(5, 5, 12, 128, 128)   | (3, 3, 3)    | 7.77486 sec   | 0.00697 sec
(5, 5, 128, 128, 128)  | (3, 3, 3)    | 141.04758 sec | 0.07259 sec

Test I used:

import mxnet as mx
import time

def perf_conv3d():
    shapes = [(3, 3, 12, 128, 64), (3, 3, 12, 128, 128), (3, 3, 128, 128, 128),
              (5, 5, 16, 128, 128), (5, 5, 128, 128, 128)]
    kernel_shapes = [(2, 2, 2), (3, 3, 3)]
    num_filter = 32

    warmup = 10
    runs = 40

    for shape in shapes:
        for kernel_shape in kernel_shapes:
            a = mx.random.uniform(shape=shape)
            w = mx.random.uniform(shape=(num_filter, shape[1], kernel_shape[0],
                                         kernel_shape[1], kernel_shape[2]))

            tic = 0
            for i in range(runs + warmup):
                if i == warmup:
                    tic = time.time()  # start timing once warm-up is done
                _ = mx.nd.Convolution(data=a, weight=w, kernel=kernel_shape,
                                      num_filter=num_filter, stride=(1, 1, 1),
                                      pad=(0, 0, 0), no_bias=True, cudnn_off=True)
                mx.nd.waitall()  # block until the computation finishes

            toc = time.time()
            # `runs` iterations are timed (i = warmup .. runs + warmup - 1)
            print('conv3d benchmark, shape={}, kernel_shape={} time={} ms/iter'
                  .format(shape, kernel_shape, (toc - tic) * 1000 / runs))


if __name__ == '__main__':
    perf_conv3d()

Sorry for the delay.

ChaiBapchya pushed a commit to ChaiBapchya/mxnet that referenced this pull request Apr 22, 2020
* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
ChaiBapchya pushed a commit to ChaiBapchya/mxnet that referenced this pull request Apr 30, 2020
* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
leezu pushed a commit that referenced this pull request May 6, 2020
…nd Fix Sanity pipeline in 1.6.x (#18206)

* [MKL-DNN] Integrate Conv3d and Pool3d/1d (#17884)

* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master

* pylint astroid sanity issue

* astroid and pylint versions only supported in py3

* remove kBFloat16 as its not supported int 1.6

* added missing definition GetPaddingSizeFull

* Remove dilation restriction for conv3d (#17491)

* Remove conv3d dilation restriction

* Remove comment

* fix unix-gpu test for num_outputs and inputs

Co-authored-by: Wuxun Zhang <[email protected]>
Co-authored-by: reminisce <[email protected]>