This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Mixed data type binary ops #16699

Merged
merged 2 commits into from
Nov 5, 2019

Conversation

haojin2
Contributor

@haojin2 haojin2 commented Nov 1, 2019

Description

Coverage for true_divide between floating types and integer types (including boolean).
Coverage for multiply between floating types and boolean type. (mainly for )
Also a side fix for cumsum with boolean inputs.
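For readers who want the intended semantics in concrete form, here is a minimal C++ sketch of the three behaviours listed above. The function names (`mixed_true_divide`, `mixed_multiply`, `cumsum_bool`) are hypothetical and are not taken from this PR, and accumulating the boolean cumulative sum into a 64-bit integer is an assumption modelled on NumPy's default on 64-bit Linux rather than something the PR states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helpers for illustration only; these are not MXNet kernels.

// Divide a floating-point value by an integer (or bool) value.
// The result keeps the floating-point operand's type.
template <typename FType, typename IType>
FType mixed_true_divide(FType lhs, IType rhs) {
  static_assert(std::is_floating_point<FType>::value,
                "lhs must be a floating-point type");
  static_assert(std::is_integral<IType>::value,
                "rhs must be an integer type or bool");
  return lhs / static_cast<FType>(rhs);
}

// Multiply a floating-point value by a bool (true -> 1, false -> 0).
template <typename FType>
FType mixed_multiply(FType lhs, bool rhs) {
  static_assert(std::is_floating_point<FType>::value,
                "lhs must be a floating-point type");
  return lhs * static_cast<FType>(rhs);
}

// Cumulative sum over boolean input. Accumulating into int64_t is an
// assumption made for this sketch, not a statement about MXNet's output type.
inline std::vector<std::int64_t> cumsum_bool(const std::vector<bool>& in) {
  std::vector<std::int64_t> out;
  out.reserve(in.size());
  std::int64_t running = 0;
  for (bool v : in) {
    running += v ? 1 : 0;
    out.push_back(running);
  }
  return out;
}
```

Under these assumptions, `mixed_true_divide(3.0f, 2)` yields a `float`, and `cumsum_bool` counts the `true` values seen so far.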

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, the API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, the expected performance on the test set, and a reference to the original paper if applicable
      • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • true_divide mixed-precision support
  • multiply mixed-precision support
  • cumsum boolean input support
  • unit tests for all of above

Comments

@haojin2 haojin2 added the Numpy label Nov 1, 2019
@haojin2 haojin2 self-assigned this Nov 1, 2019
@haojin2 haojin2 added this to In progress in numpy via automation Nov 1, 2019
@haojin2 haojin2 force-pushed the mixed_binary_ops branch 4 times, most recently from f6aa9e9 to 28f07a7 Compare November 3, 2019 10:34
src/common/utils.h (outdated review thread, resolved)
@haojin2 haojin2 changed the title from "[DO NOT MERGE] [DO NOT REVIEW] [WIP]Mixed binary ops" to "Mixed data type binary ops" Nov 4, 2019
numpy automation moved this from In progress to Reviewer approved Nov 4, 2019
numpy automation moved this from Reviewer approved to Needs review Nov 4, 2019
@@ -194,6 +194,100 @@ MXNET_BINARY_MATH_OP_NC(right, b);

MXNET_BINARY_MATH_OP_NC(mul, a * b);

#ifndef _WIN32
Contributor

Could you elaborate why Windows is not supported?

Contributor Author

It's supported with a different implementation due to limitations of the Visual Studio compiler on Windows.
The corresponding unit tests do not exclude Windows machines, and they passed both the Windows CPU and GPU checks, which means this feature is also supported on Windows.
Please do make sure you have some grasp of the big picture of a PR before you block one.

Contributor

@marcoabreu marcoabreu Nov 4, 2019

I'm well aware of the unit tests passing on Windows, thanks for the helpful hint.

Still, can you elaborate on which part exactly is not supported by the Windows compiler? Basically the whole PR is excluding Windows, and that seems off. Having an entirely different implementation for a different OS is not something I see regularly, so I'm eager to learn.

Contributor Author

@haojin2 haojin2 Nov 4, 2019

It was due to the C1002: out of heap space error that we've encountered many times.
We avoid generating additional kernels (code) on Windows to prevent hitting that error on Windows machines.
If you still think Windows is excluded, I can only conclude that you have not given the code changes a complete look:

  1. There are also some parts where we have #ifdef _WIN32, such as: https://github.com/apache/incubator-mxnet/pull/16699/files#diff-c383124e9cb87f51ac456a96b799615aR73
  2. We also have parts that have #else blocks, such as: https://github.com/apache/incubator-mxnet/pull/16699/files#diff-c383124e9cb87f51ac456a96b799615aR73

This is indeed a workaround for an issue that we could not solve on our own. I've also tried upgrading the VS compiler locally, and that did not resolve the issue, which is why we have different implementations of the same feature; otherwise we would have to drop this new feature for Windows users.
It's good to be eager to learn, but IMHO blocking a PR without a complete look and a very solid reason is not a good (nor polite) way of demonstrating your eagerness.
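The platform split described in this comment can be pictured with a small sketch. This is not the PR's code: the function names and the choice of a cast-then-reuse path on Windows are assumptions made purely to illustrate how one entry point can have two implementations selected by `_WIN32`. In a toy this small the amount of generated code barely differs between the two branches; the sketch only shows the shape of the split.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same-type kernel: instantiated once per floating-point type.
template <typename DType>
void divide_same_type(const std::vector<DType>& a,
                      const std::vector<DType>& b,
                      std::vector<DType>* out) {
  out->resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    (*out)[i] = a[i] / b[i];
  }
}

#ifndef _WIN32
// Non-Windows path (hypothetical): one fused kernel per (float, int) pair.
template <typename FType, typename IType>
void true_divide(const std::vector<FType>& a,
                 const std::vector<IType>& b,
                 std::vector<FType>* out) {
  out->resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    (*out)[i] = a[i] / static_cast<FType>(b[i]);
  }
}
#else
// Windows path (hypothetical): cast the integer input to the floating-point
// type first, then reuse the same-type kernel instead of a fused one.
template <typename FType, typename IType>
void true_divide(const std::vector<FType>& a,
                 const std::vector<IType>& b,
                 std::vector<FType>* out) {
  std::vector<FType> b_cast(b.begin(), b.end());
  divide_same_type(a, b_cast, out);
}
#endif
```

Both branches expose the same signature, so a test that calls `true_divide` does not need to know which one was compiled, which matches the point made above about the unit tests passing on every platform.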

@reminisce
Contributor

@marcoabreu Appreciate your review. I can assure you that Windows is absolutely not excluded: it supports mixed precision just as Unix does. @haojin2 has gone through thorough trial and error to make it work with the Windows compilation toolchain, which I believe very few of us would be willing to get our hands dirty with, as compilation on Windows platforms is outside our domain knowledge. This is an extremely non-trivial task that took @haojin2 many days and nights to accomplish. So kudos to @haojin2.

We are trying to merge this to meet a deadline. If you feel your concerns/questions have not been addressed after @haojin2's explanation, could you raise them so that we can help close the gap? Thanks.

Member

@cjolivier01 cjolivier01 left a comment

I’ve seen that out of heap space error a lot. Really annoying. You may know already, but just FYI: I was sometimes able to track it down to too much template nesting, which was killing the compiler. Nesting too many MXNET_TYPE_SWITCH and other similar macros within templates ends up generating lots of permutations of code.
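To make the "lots of permutations" point concrete, here is a small self-contained sketch. `TOY_TYPE_SWITCH` is a hypothetical stand-in written for this example, not the macro from the code base, and the four-entry `TypeFlag` enum is likewise an assumption. Nesting the switch twice forces the compiler to instantiate the kernel template for every (left, right) pair, so four types produce sixteen instantiations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

enum TypeFlag { kFloat32, kFloat64, kInt32, kInt64 };

// Hypothetical type-switch macro: binds DType to a concrete type per case.
#define TOY_TYPE_SWITCH(flag, DType, ...)                               \
  switch (flag) {                                                       \
    case kFloat32: { typedef float DType; { __VA_ARGS__ } } break;      \
    case kFloat64: { typedef double DType; { __VA_ARGS__ } } break;     \
    case kInt32: { typedef std::int32_t DType; { __VA_ARGS__ } } break; \
    case kInt64: { typedef std::int64_t DType; { __VA_ARGS__ } } break; \
  }

// Each distinct (L, R) instantiation owns its own static, so the returned
// address identifies the instantiation.
template <typename L, typename R>
const void* kernel_tag() {
  static char tag;
  return &tag;
}

// Two nested switches: the compiler must generate all 4 x 4 combinations.
inline const void* dispatch(TypeFlag lhs, TypeFlag rhs) {
  const void* tag = nullptr;
  TOY_TYPE_SWITCH(lhs, LType, {
    TOY_TYPE_SWITCH(rhs, RType, {
      tag = kernel_tag<LType, RType>();
    });
  });
  return tag;
}

// Count how many distinct kernel instantiations the dispatcher can reach.
inline std::size_t count_instantiations() {
  const TypeFlag flags[] = {kFloat32, kFloat64, kInt32, kInt64};
  std::set<const void*> tags;
  for (TypeFlag l : flags) {
    for (TypeFlag r : flags) {
      tags.insert(dispatch(l, r));
    }
  }
  return tags.size();
}
```

A third nested switch over the same four types would raise the count to 64, which is the kind of growth that can exhaust a compiler's memory when each kernel is itself a heavy template.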

@reminisce reminisce merged commit 3c404a5 into apache:master Nov 5, 2019
numpy automation moved this from Needs review to Done Nov 5, 2019
@haojin2
Contributor Author

haojin2 commented Nov 5, 2019

@cjolivier01 Yeah, that's exactly part of what I did in #16711. In this PR you can also see that in some places I gave up using the type switches so that kernels are generated only for the combinations I really need. I think we probably need a big revisit of our operator implementations to optimize the macro usage. Thanks for your insight.
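One way to read "only generate kernels for what I really need" is a hand-written dispatcher that names each supported type pair explicitly instead of nesting generic switches. The sketch below is an illustration under that reading, not the PR's implementation: the `TypeFlag` enum, `divide_kernel`, and `true_divide_dispatch` are all hypothetical names, and the set of supported pairs is chosen arbitrarily.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

enum TypeFlag { kFloat32, kFloat64, kInt32, kBool };

// Toy kernel: instantiated only for the (FType, IType) pairs named below.
template <typename FType, typename IType>
double divide_kernel(const void* a, const void* b) {
  const FType lhs = *static_cast<const FType*>(a);
  const IType rhs = *static_cast<const IType*>(b);
  return static_cast<double>(lhs / static_cast<FType>(rhs));
}

// Explicit dispatch: four instantiations instead of the sixteen that two
// nested four-way type switches would produce.
inline double true_divide_dispatch(TypeFlag lt, const void* a,
                                   TypeFlag rt, const void* b) {
  if (lt == kFloat32 && rt == kInt32) return divide_kernel<float, std::int32_t>(a, b);
  if (lt == kFloat32 && rt == kBool) return divide_kernel<float, bool>(a, b);
  if (lt == kFloat64 && rt == kInt32) return divide_kernel<double, std::int32_t>(a, b);
  if (lt == kFloat64 && rt == kBool) return divide_kernel<double, bool>(a, b);
  throw std::invalid_argument("unsupported type combination");
}
```

The trade-off is that unsupported pairs fail at run time with an error rather than being covered automatically, so the list of branches has to be kept in step with what the operator promises to support.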

yajiedesign pushed a commit to yajiedesign/mxnet that referenced this pull request Nov 6, 2019
* support mixed-precision binary operations

* improvement for documentations and error messages
@haojin2 haojin2 added the R1.6.0 label Nov 7, 2019
ptrendx pushed a commit to ptrendx/mxnet that referenced this pull request Nov 15, 2019
* support mixed-precision binary operations

* improvement for documentations and error messages
ptrendx added a commit that referenced this pull request Nov 16, 2019
…, #16792) (#16832)

* Fix nightly build (#16773)

* Remove dependency on tvmop.conf

* Fix binaries dependencies for ni nightly

* Add comments

* Update tvmop.py

* Fix rebase

* Fix (#16781)

* Speed fused_op compilation by caching ptx and jit-compiled functions (#16783)

* [Numpy] Fix collect_params().zero_grad() in gluon numpy interface (#16716)

* fix zero_grad

* Update parameter.py

* add test

* fix

* Mixed data type binary ops (#16699)

* support mixed-precision binary operations

* improvement for documentations and error messages

* Support boolean elemwise/broadcast binary add, multiply and true_divide (#16728)

* support pure boolean elemwise/broadcast binary op

* switch to unique_tpr

* fix the test error

* Fix rtrue_divide grad (#16769)

* Fix rtrue_divide_scalar

* More tests

* Fix numpy-compatible mean output type for integer inputs (#16792)

* fix mean output type for integer inputs

* enable for windows

4 participants