Revert the change on broadcast_to param shape #14998

reminisce · 2019-05-20T03:52:06Z

Description

This PR reverts the mistaken change on broadcast_to's param shape introduced by #14661. There was no unit test for the param shape with 0 as dimension size so this error was not caught. Such case has been added in unit tests to ensure correctness.

Thank @DickJC123 for reporting this in #14954.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at http:https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

reminisce · 2019-05-20T21:43:16Z

@DickJC123 Could you please give a review? Thanks.

DickJC123 · 2019-05-20T23:30:29Z

Trying to pop my to-do stack. Will look at this by EOD tomorrow.

DickJC123 · 2019-05-24T02:11:48Z

Thank you for your patience as I dealt with other MXNet fires. I still feel I want to think about the Shape serialization to files. But in the meantime, let me ask about the issue that started it all, and that you're addressing with this PR:

Isn't the real problem that the Python output shape parameter of broadcast_to() isn't being converted correctly by the parameter parser? In the default mode (so not numpy compatibility), a shape dimension size of 0 should be converted to -1 in the backend, which is what the broadcast_to() infer shape routine is already (i.e. correctly) looking for? In numpy compatibility mode, the user should use -1 for the output shape in Python to denote unknown.

reminisce · 2019-05-24T03:28:52Z

@DickJC123 Thanks for the review. For this specific problem, it actually has nothing to do with numpy compatibility mode. The document says that 0 just means the output dimension size is the same as the corresponding input dimension size; it's not related to any judgement on whether the shape dimension size should be 0 or unknown. So per document, 0 should never be changed to -1 in whatever situation, but it was changed by a mistake. This is the same as MXNet reshape operator where the shape parameter has 0, -1, -2, -3, and -4 for different shape inference maneuvers.

omarroberts40 · 2019-05-24T03:52:39Z

Stop On Thursday, May 23, 2019, 8:29:55 PM PDT, reminisce <[email protected]> wrote: @DickJC123 Thanks for the review. For this specific problem, it actually has nothing to do with numpy compatibility mode. The document says that 0 just means the output dimension size is the same as the corresponding input dimension size; it's not related to any judgement on whether the shape dimension size should be 0 or unknown. So per document, 0 should never be changed to -1 in whatever situation, but it was changed by a mistake. This is the same as MXNet reshape operator where the shape parameter has 0, -1, -2, -3, and -4 for different shape inference maneuvers. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

fhieber · 2019-05-24T10:07:48Z

I would appreciate a timely merge of this fix. Thank you!

DickJC123 · 2019-05-24T20:24:40Z

I'm close to giving this the thumbs up (see the end of my comments). I think a PR relating to Shape serialization (as we have discussed) should be separate from this one because of the possible extended discussion, given the 1.5 release deadline and @fhieber's eagerness for this fix.

On this PR, let me try one more time with my argument. Consider numpy broadcast_to() behavior:

Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> x = np.array([1,2,3])
>>> np.broadcast_to(x, (3,3))
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
>>> np.broadcast_to(x, (0,3))
array([], shape=(0, 3), dtype=int64)
>>>

If we say that MXNet's broadcast_to() uses 0 to mean 'match input', regardless of compatibility mode, then we could never duplicate numpy's behavior. However, if we think of 0 to mean 'unknown' in non-compatibility mode, and we say that 'unknown' really means 'unknown until inputs are bound, then match input size', we'd retain numpy functionality:

In numpy compatibility mode, MXNet could broadcast to a shape with a zero size as in:
mx.sym.broadcast_to(x, (0,3)) and boadcast_to() an input-matching shape as in mx.sym.broadcast_to(x, (-1,3)) [this being an extension to numpy capability].

And speaking of which, is there a reason you chose -1 for unknown size in numpy compatibility mode? Why not None as in (None, 3) for the example above? If you chose None, then even MXNet reshape functionality could be considered a compatible superset of Numpy's (in compatibility mode):

Numpy reshape:
0 = 0 size
-1 = calculate dim so total elements match

MXNet not in numpy-compatible mode:
0 = output size matches input [in contrast to numpy definition]
-1 = calculate dim so total elements match
-2 -> -4 other MXNet-specific definitions

MXNet numpy-compatible mode:
None = unknown size (until binding), then output size matches input
0 = 0 size [like numpy]
-1 = calculate dim so total elements match [like numpy]
-2 -> -4 other MXNet-specific definitions

Having typed this up, I see one problem- what value holds 'None' size in the backend, if -1 might have a meaning as it does in reshape? It could be defined to have a non-conflicting value as in enum ShapeSizeConstants { kUnknownSize = MIN_INT } and tested via:

inline bool dim_size_is_known(const dim_t dim_size) {  return dim_size != kUnknownSize; }

Anyway thanks for reading this far. Unless this really resonates with you, I'll leave it by saying I support this PR as it stands- this corrects many of the Sockeye test failures I was seeing (the rest related to Shape serialization, not broadcast_to()).

reminisce · 2019-05-24T20:45:31Z

@DickJC123 Thanks for the deep analysis. I can see the confusion of this numpy compatibility flag. We decided to change it to another name called np_shape in this PR: #15063

Please know that this flag is designed for only affecting the incoming MXNet-NumPy operators which are not in the master branch for now. Legacy operators are NOT expected to comply with NumPy when this flag is set True. We are not going to announce any numpy related design/implementation in 1.5 release notes.

DickJC123 · 2019-05-25T01:11:44Z

Thanks for your explanation. Again, this PR LGTM.

Revert the change broadcast_to param shape

10f3c80

reminisce added the Bug label May 20, 2019

reminisce changed the title ~~Revert the change broadcast_to param shape~~ Revert the change on broadcast_to param shape May 20, 2019

reminisce requested a review from eric-haibin-lin May 20, 2019 21:24

eric-haibin-lin approved these changes May 20, 2019

View reviewed changes

eric-haibin-lin merged commit 9250a73 into apache:master May 25, 2019

reminisce mentioned this pull request Jun 6, 2019

broadcast_to() behavior change after new 0D and uninitialized shape support added? #14954

Closed

haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019

Revert the change broadcast_to param shape (apache#14998)

4e09157

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert the change on broadcast_to param shape #14998

Revert the change on broadcast_to param shape #14998

reminisce commented May 20, 2019

reminisce commented May 20, 2019

DickJC123 commented May 20, 2019

DickJC123 commented May 24, 2019

reminisce commented May 24, 2019

omarroberts40 commented May 24, 2019 via email

fhieber commented May 24, 2019

DickJC123 commented May 24, 2019

reminisce commented May 24, 2019

DickJC123 commented May 25, 2019

Revert the change on broadcast_to param shape #14998

Revert the change on broadcast_to param shape #14998

Conversation

reminisce commented May 20, 2019

Description

Checklist

Essentials

reminisce commented May 20, 2019

DickJC123 commented May 20, 2019

DickJC123 commented May 24, 2019

reminisce commented May 24, 2019

omarroberts40 commented May 24, 2019 via email

fhieber commented May 24, 2019

DickJC123 commented May 24, 2019

reminisce commented May 24, 2019

DickJC123 commented May 25, 2019