This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Safe LayerNorm #15002

Merged: 3 commits into apache:master on May 22, 2019

Conversation

sxjscience
Member

Description

Continues #14699. Requires #14935 to be merged.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, the API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Support safe reduction in LayerNorm.

Comments

When the input data type is float16, we recommend turning on the MXNET_SAFE_ACCUMULATION flag, which not only improves precision but also speeds up the kernel.
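To illustrate why a wider accumulator matters for float16 input, here is a minimal sketch in plain Python. It is not the kernel from this PR: the helper names (`round_to_f16`, `round_to_f32`, `running_mean`) are hypothetical, the sum is sequential rather than a parallel reduction, and half and single precision are emulated by rounding through the `struct` module. It only covers the precision side of the comment, not the speed claim.

```python
import struct


def round_to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def round_to_f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def running_mean(values, round_acc):
    """Mean of `values`, rounding the running sum with `round_acc` after every add."""
    total = 0.0
    for v in values:
        total = round_acc(total + v)
    return total / len(values)


# 4096 float16 inputs, all equal to 1.0, so the exact mean is 1.0.
row = [round_to_f16(1.0)] * 4096

mean_f16_acc = running_mean(row, round_to_f16)  # accumulator kept in float16
mean_f32_acc = running_mean(row, round_to_f32)  # accumulator widened to float32

print(mean_f16_acc, mean_f32_acc)  # prints "0.5 1.0"
```

The float16 accumulator stalls at 2048, because 2049 is not representable in half precision and rounds back to 2048, so the mean comes out as 0.5 instead of 1.0. Assuming the flag is read as an environment variable, like MXNet's other MXNET_* settings, it would be enabled by setting MXNET_SAFE_ACCUMULATION=1 before starting the process.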

@pinaraws

@mxnet-label-bot add[Operator, pr-work-in-progress]

@marcoabreu added the Operator and pr-work-in-progress (PR is still work in progress) labels on May 20, 2019
@eric-haibin-lin
Member

#14935 is merged. Can you resolve the conflict?

enable safe accumulation

fix bug

fix
@sxjscience
Member Author

@marcoabreu There's a weird sanity check error. The pylint check seems to have passed, but it shows "Makefile:600: recipe for target 'pylint' failed". See http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fsanity/detail/PR-15002/5/pipeline/

Member

@eric-haibin-lin left a comment

The sanity check is fixed on master. Can you sync with the master branch?

Review comment on src/operator/nn/layer_norm-inl.h (outdated, resolved)
@eric-haibin-lin merged commit b3c91bf into apache:master on May 22, 2019
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request May 22, 2019
* use float32 to store the reduction result of float16

enable safe accumulation

fix bug

fix

* update test for safe_accumulate

* fix
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* use float32 to store the reduction result of float16

enable safe accumulation

fix bug

fix

* update test for safe_accumulate

* fix
Labels
Operator pr-work-in-progress PR is still work in progress

4 participants