
[MXNET-1233] Enable dynamic shape in CachedOp #13419

Closed
wants to merge 9 commits

Conversation

@junrushao (Member) commented Nov 27, 2018

Description

This PR enables dynamic shape in CachedOp, a.k.a. the backend supporting HybridBlock in Gluon.

In the forward pass, we have to invoke operators one by one, because dynamic shapes prevent us from allocating memory ahead of time.

The backward pass is unaffected: once the forward pass has run, the shape of everything is known.
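
To make the constraint concrete, here is a small, self-contained sketch (toy types only, not MXNet classes) of what node-by-node execution looks like when one operator's output size depends on the input values: its buffer, and everything downstream of it, can only be allocated once its inputs have actually been computed.

// Toy illustration, not MXNet code: a "dynamic shape" operator is one whose
// output size can only be determined from the concrete input values.
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <iostream>
#include <vector>

struct ToyNode {
  // Returns the output size, given the concrete input values.
  std::function<size_t(const std::vector<float>&)> infer_size;
  // Fills a pre-allocated output buffer from the input.
  std::function<void(const std::vector<float>&, std::vector<float>*)> compute;
};

int main() {
  // Dynamic shape: the number of outputs depends on the data itself.
  ToyNode filter_positive{
      [](const std::vector<float>& in) {
        size_t n = 0;
        for (float v : in) n += (v > 0.0f);
        return n;
      },
      [](const std::vector<float>& in, std::vector<float>* out) {
        size_t j = 0;
        for (float v : in) if (v > 0.0f) (*out)[j++] = v;
      }};
  // Static shape: output size equals input size.
  ToyNode square{
      [](const std::vector<float>& in) { return in.size(); },
      [](const std::vector<float>& in, std::vector<float>* out) {
        for (size_t i = 0; i < in.size(); ++i) (*out)[i] = in[i] * in[i];
      }};

  std::vector<float> buf = {1.0f, -2.0f, 3.0f, -4.0f, 5.0f};

  // "Naive" execution: infer each node's output size, allocate, compute, and
  // only then move on to the next node. Pre-planning all allocations up front
  // is impossible because filter_positive's size is unknown until it runs.
  for (const ToyNode& node : {filter_positive, square}) {
    std::vector<float> out(node.infer_size(buf));
    node.compute(buf, &out);
    buf = std::move(out);
  }
  std::cout << "final output has " << buf.size() << " elements" << std::endl;  // prints 3
  return 0;
}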

CC: @zheng-da @szha @yidawang

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Add NaiveRunGraph in imperative mode, in which operators are executed synchronously, one node at a time.
  • Add NaiveForward mode in CachedOp, which calls NaiveRunGraph.
  • Add CheckDynamicShapeExists in CachedOp, which tells whether the graph contains an operator returning dynamic shape.

Comments

  • Yet another mode in the executor of CachedOp. Perhaps we need to refactor some day.

@vandanavk (Contributor):

@mxnet-label-bot add [pr-work-in-progress]

@marcoabreu added the pr-work-in-progress (PR is still work in progress) label on Nov 27, 2018
@junrushao changed the title from "[WIP] Enable dynamic shape in CachedOp" to "[MXNET-1233] Enable dynamic shape in CachedOp" on Nov 27, 2018
@junrushao (Member, Author):

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu added the pr-awaiting-review (PR is waiting for code review) label on Dec 2, 2018
@junrushao (Member, Author):

@mxnet-label-bot remove [pr-work-in-progress]

@marcoabreu removed the pr-work-in-progress (PR is still work in progress) label on Dec 2, 2018
@@ -262,6 +262,29 @@ std::vector<nnvm::NodeEntry> CachedOp::Gradient(
return ret;
}

bool CachedOp::CheckDynamicShapeExists(const Context& default_ctx,
const std::vector<NDArray*>& inputs) {
Reviewer (Contributor):

I wonder if it's better to check for dynamic-shape operators directly. Right now, this assumes that if a computation graph can't infer shapes, it contains dynamic-shape operators. It would be better to write a check that works for both CachedOp and the symbol executor. Whether a graph contains dynamic shape is a property of the computation graph, and we can easily check it by traversing all operators in the graph.
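
As an illustration of the direct check being suggested, a self-contained toy (the graph representation and the registry are stand-ins, not the nnvm data structures; _contrib_boolean_mask is used only as an example of a dynamic-shape operator):

#include <iostream>
#include <set>
#include <string>
#include <vector>

// Toy graph node: just the name of the operator it runs.
struct ToyGraphNode {
  std::string op_name;
};

// Stand-in for a registry of operators known to produce data-dependent shapes.
// In MXNet this would be an op attribute queried through nnvm, not a name list.
static const std::set<std::string> kDynamicShapeOps = {"_contrib_boolean_mask"};

// The property of the graph itself: does any node use a dynamic-shape operator?
bool ContainsDynamicShapeOp(const std::vector<ToyGraphNode>& graph) {
  for (const ToyGraphNode& node : graph) {
    if (kDynamicShapeOps.count(node.op_name)) return true;
  }
  return false;
}

int main() {
  std::vector<ToyGraphNode> graph = {{"Convolution"}, {"_contrib_boolean_mask"}, {"relu"}};
  std::cout << std::boolalpha << ContainsDynamicShapeOp(graph) << std::endl;  // prints true
  return 0;
}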

@junrushao (Member, Author):

Per our discussion last time, I think our solution should be a naive implementation first, and then graph partitioning later to speed things up.

arrays.reserve(num_entries);
for (auto& item : runtime.buff) {
arrays.push_back(&item);
}
Reviewer (Contributor):

I wonder if we should buffer the arrays from the previous run?

@junrushao (Member, Author):

Why buffer arrays from the previous run? To save memory allocation overhead?

Context ctx = GetContext(node.source->attrs, ndinputs, ndoutputs, default_ctx);
auto invoke = [&](const OpStatePtr &state) {
DispatchMode dispatch_mode = DispatchMode::kUndefined;
SetShapeType(ctx, node.source->attrs, ndinputs, ndoutputs, &dispatch_mode);
Reviewer (Contributor):

Do we still infer shape here?

@junrushao (Member, Author):

No. This function leverages the existing infer_shape to find out whether there is anything with a dynamic shape inside the graph.
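
A minimal sketch of that inference-based criterion, using a toy shape type in place of nnvm::TShape: if shape inference leaves any entry unresolved, the graph is assumed to contain dynamic shape.

#include <iostream>
#include <vector>

// Toy shape: a list of dimensions; an empty list stands for "could not be inferred".
using ToyShape = std::vector<int>;

// The criterion described above: run the existing shape inference, then look
// for entries it failed to resolve.
bool InferenceLeftUnknownShapes(const std::vector<ToyShape>& inferred_shapes) {
  for (const ToyShape& shape : inferred_shapes) {
    if (shape.empty()) return true;  // unresolved shape => assume dynamic shape
  }
  return false;
}

int main() {
  // Shapes produced by a hypothetical inference pass: the second one is unknown.
  std::vector<ToyShape> shapes = {{32, 128}, {}, {32, 10}};
  std::cout << std::boolalpha << InferenceLeftUnknownShapes(shapes) << std::endl;  // prints true
  return 0;
}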

auto fwd_node_id = idx.node_id(fwd_node);
cached_op->Backward(retain_graph, states[fwd_node_id], ndinputs, req, ndoutputs);
} else if (createop.count(node.source->op())) {
// case 2: node is in createop
Reviewer (Contributor):

I think this is to handle stateful operators.

@junrushao (Member, Author):

Yep, you are right. Should I change the comments?
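
A self-contained toy of the stateful-operator handling discussed in this thread (all names are stand-ins; in MXNet, createop is the registry of operators that register an FCreateOpState factory): stateful operators get a state object created and retained for the backward pass, while stateless ones are simply invoked.

#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for an operator's persistent state (e.g. an RNN workspace)
// that must outlive the forward call so backward can reuse it.
struct ToyOpState {
  std::string debug_name;
};

using ToyStateFactory = std::function<std::shared_ptr<ToyOpState>()>;

// Toy stand-in for the `createop` registry: only stateful operators appear here.
static const std::map<std::string, ToyStateFactory> kCreateOp = {
    {"RNN", [] { return std::make_shared<ToyOpState>(ToyOpState{"rnn-workspace"}); }},
};

void RunNode(const std::string& op_name,
             std::vector<std::shared_ptr<ToyOpState>>* retained_states) {
  auto it = kCreateOp.find(op_name);
  if (it != kCreateOp.end()) {
    // Case 2 in the hunk above: stateful operator. Create its state, run with
    // it, and retain it for the backward pass.
    std::shared_ptr<ToyOpState> state = it->second();
    retained_states->push_back(state);
    std::cout << op_name << ": ran with state " << state->debug_name << std::endl;
  } else {
    // Stateless operator: nothing to keep around.
    std::cout << op_name << ": ran statelessly" << std::endl;
  }
}

int main() {
  std::vector<std::shared_ptr<ToyOpState>> states;
  RunNode("relu", &states);
  RunNode("RNN", &states);
  std::cout << "retained states: " << states.size() << std::endl;  // prints 1
  return 0;
}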

@@ -1002,6 +1009,18 @@ void RunGraph(const bool retain_graph,
const DispatchModeVector &dispatch_modes,
bool recording);


void NaiveRunGraph(const bool retain_graph,
Reviewer (Member):

I think these new functions deserve documentation of their inputs/outputs and what they do, to keep the code readable.
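
A sketch of the kind of documentation being asked for; the wording is inferred from the PR description and the neighbouring RunGraph declaration, not taken from the actual code:

/*!
 * \brief Naive counterpart of RunGraph for graphs that contain dynamic-shape
 *        operators: nodes are executed one at a time, and each operator's
 *        output shapes are resolved and its memory allocated only after all of
 *        its inputs have been computed.
 *
 *        The parameter list mirrors RunGraph; in particular, retain_graph
 *        controls whether intermediate results are kept so backward can be run
 *        again, dispatch_modes carries the per-node dispatch decision, and
 *        recording indicates whether autograd is recording the imperative calls.
 */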

@anirudhacharya (Member):

@eric-haibin-lin another round of review?

@sandeep-krishnamurthy (Contributor):

@zheng-da - Can you please take a look at this PR again?

@junrushao (Member, Author):

I am going to close this PR, split it into small pieces, and PR again.

Labels: Operator, pr-awaiting-review (PR is waiting for code review)
7 participants