
Support projection feature for LSTM on CPU (Only Inference) #17702

Merged 2 commits into apache:master on Mar 2, 2020

Conversation

zixuanweeei (Contributor)

Description

gluon.rnn.LSTM has an argument, projection_size, which enables the projection feature of the LSTM operator. Previously, this feature was not supported on the CPU backend. This PR adds it to the CPU backend. Note that only the forward pass is ready in this PR; the backward pass and the needs_grads scenario have not been adapted to this feature, and an error is thrown when they are encountered.
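
For context, a minimal sketch of how the feature is exercised from the Gluon API (layer sizes here are hypothetical; per the description above, only the forward pass is expected to work on CPU):

import mxnet as mx
from mxnet import gluon

# LSTM with a projection layer: the 80-dim hidden state is projected
# down to 32 dimensions before being fed back as the recurrent input.
lstm = gluon.rnn.LSTM(hidden_size=80, projection_size=32, num_layers=1)
lstm.initialize(ctx=mx.cpu())

x = mx.nd.random.uniform(shape=(10, 4, 16))  # (seq_len, batch, input_size), default 'TNC' layout
out = lstm(x)                                # forward (inference) only on the CPU backend
print(out.shape)                             # (10, 4, 32): outputs take the projected size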

@ciyongch @pengzhao-intel @TaoLv

pengzhao-intel added this to In progress in CPU Performance and Quantization via automation Feb 27, 2020
pengzhao-intel (Contributor)

@eric-haibin-lin

@@ -385,7 +382,9 @@ The definition of GRU here is slightly different from paper but compatible with
})
Member

I don't think the projection support is clear in the documentation. Could you update the documentation to cover LSTMP when projection_size is set? You can refer to https://github.com/apache/incubator-mxnet/blob/62a85f365b819829fedb60116f803e0c0a3c554c/python/mxnet/gluon/contrib/rnn/rnn_cell.py#L197

Thanks!
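
(For reference: the cell enabled by projection_size is the projected LSTM, LSTMP. In the common textbook formulation, hedged here since it may not match the exact notation of this PR's doc change, each step t computes:

i_t = \sigma(W_{xi} x_t + W_{ri} r_{t-1} + b_i)
f_t = \sigma(W_{xf} x_t + W_{rf} r_{t-1} + b_f)
g_t = \tanh(W_{xg} x_t + W_{rg} r_{t-1} + b_g)
o_t = \sigma(W_{xo} x_t + W_{ro} r_{t-1} + b_o)
c_t = f_t \odot c_{t-1} + i_t \odot g_t
h_t = o_t \odot \tanh(c_t)
r_t = W_{hr} h_t

where h_t has hidden_size elements and the recurrent state r_t, which replaces h_t as the state fed to the next step, has projection_size elements.)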

zixuanweeei (Author)

Sure. Thanks for pointing that out.

zixuanweeei (Author)

Done. Please take another look. Thanks.

zixuanweeei (Author)

CI passed last time. The latest commit only adds some documentation for the projection feature, so it should have no impact on functionality. Let's wait for CI validation. Please take a look. Thanks. @ciyongch @pengzhao-intel

ciyongch (Contributor) left a comment

LGTM

TaoLv (Member) left a comment

Thank you for the contribution. Minor comments.

@@ -385,7 +399,9 @@ The definition of GRU here is slightly different from paper but compatible with
})
.set_attr<mxnet::FInferShape>("FInferShape", RNNShape)
.set_attr<nnvm::FInferType>("FInferType", RNNType)
#if MXNET_USE_MKLDNN == 1
TaoLv (Member)

Let's merge this check with the one at L407.

TaoLv (Member)

Just be aware that, if MKL-DNN is not used, FInferStorageType will not be registered.

zixuanweeei (Author)

Done. The default storage type inference function will be executed when FInferStorageType is not registered.

@@ -427,7 +443,9 @@ NNVM_REGISTER_OP(_backward_RNN)
.set_attr_parser(ParamParser<RNNParam>)
.set_attr<bool>("TIsLayerOpBackward", true)
.set_attr<nnvm::TIsBackward>("TIsBackward", true)
#if MXNET_USE_MKLDNN == 1
TaoLv (Member)

Same here, merge this check with the one at L450.

zixuanweeei (Author)

Done. Thanks.

const Tensor<cpu, 2, DType> wh(w_ptr + I * H * 4, Shape2(H * 4, (P ? P : H)));
Tensor<cpu, 2, DType> whr(w_ptr, Shape2(1, 1));
if (P > 0)
whr = Tensor<cpu, 2, DType>(wh.dptr_ + P * 4 * H, Shape2(P, H));
TaoLv (Member)

Let's put this on the same line as if (P > 0), or add { ... } around it, like what you're doing at L236.

zixuanweeei (Author)

Put them on the same line, as well as at L236.
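
(Aside: the slicing in the snippet above reflects how the fused weight blob is laid out once projection is enabled. A rough numpy sketch of that layout, with hypothetical sizes and variable names of my own, mirroring just the weight slicing in the C++ snippet; the real fused parameter blob also carries biases and possibly more layers:

import numpy as np

# Hypothetical sizes: I = input size, H = hidden size, P = projection size.
I, H, P = 16, 80, 32

# Fused weight blob: [ i2h (4H x I) | h2h (4H x P) | projection (P x H) ]
w = np.zeros(4*H*I + 4*H*P + P*H, dtype=np.float32)

wx  = w[:4*H*I].reshape(4*H, I)              # input-to-hidden weights
wh  = w[4*H*I:4*H*(I+P)].reshape(4*H, P)     # hidden-to-hidden; recurrent input is the P-dim projected state
whr = w[4*H*(I+P):].reshape(P, H)            # projection weights, applied to the H-dim hidden output
)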

CPU Performance and Quantization automation moved this from In progress to Reviewer approved Feb 29, 2020
TaoLv mentioned this pull request Feb 29, 2020
pengzhao-intel (Contributor)

Thanks for all of your great work. I will merge the PR after CI passes.

pengzhao-intel merged commit ac77974 into apache:master Mar 2, 2020
CPU Performance and Quantization automation moved this from Reviewer approved to Done Mar 2, 2020
MoisesHer pushed a commit to MoisesHer/incubator-mxnet that referenced this pull request Apr 10, 2020
Support projection feature for LSTM on CPU (Only Inference) (apache#17702)

* Support projection feature for LSTM on CPU

* test solution for -Werror=maybe-uninitialized

* Check device type when create state

* Document the projection feature of LSTM for RNN operator

* Minor fix

* Re-run CI
zixuanweeei added a commit to zixuanweeei/mxnet that referenced this pull request Apr 13, 2020
Support projection feature for LSTM on CPU (Only Inference) (apache#17702)

* Support projection feature for LSTM on CPU

* test solution for -Werror=maybe-uninitialized

* Check device type when create state

* Document the projection feature of LSTM for RNN operator

* Minor fix

* Re-run CI
pengzhao-intel pushed a commit that referenced this pull request Apr 15, 2020
* Support projection feature for LSTM on CPU (Only Inference) (#17702)

* Support projection feature for LSTM on CPU

* test solution for -Werror=maybe-uninitialized

* Check device type when create state

* Document the projection feature of LSTM for RNN operator

* Minor fix

* Re-run CI

* Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1 (#17872)

* Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1

* Use nd.copy() to initialize parameters of new operator

* Add check for output states

* Initialize i2h/h2h_weights with zeros for rnn_relu/tanh, and reduce size

* Split fused rnn layer test into tests of individual mode

* Skip lstm and gru tests on CPU context without DNNL
stu1130 pushed a commit to stu1130/incubator-mxnet that referenced this pull request Apr 15, 2020
…18038)

* Support projection feature for LSTM on CPU (Only Inference) (apache#17702)

* Support projection feature for LSTM on CPU

* test solution for -Werror=maybe-uninitialized

* Check device type when create state

* Document the projection feature of LSTM for RNN operator

* Minor fix

* Re-run CI

* Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1 (apache#17872)

* Fix issue of zeros gradients w.r.t. RNN bias when num_layers > 1

* Use nd.copy() to initialize parameters of new operator

* Add check for output states

* Initialize i2h/h2h_weights with zeros for rnn_relu/tanh, and reduce size

* Split fused rnn layer test into tests of individual mode

* Skip lstm and gru tests on CPU context without DNNL
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request May 29, 2020
Support projection feature for LSTM on CPU (Only Inference) (apache#17702)

* Support projection feature for LSTM on CPU

* test solution for -Werror=maybe-uninitialized

* Check device type when create state

* Document the projection feature of LSTM for RNN operator

* Minor fix

* Re-run CI