This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[PERFORMANCE] Improve shuffle implementation #20928

Merged (3 commits) on Mar 15, 2022

@anko-intel (Contributor) commented Mar 2, 2022

Description

Improve performance of random.shuffle implementation.

Results of the following script on a c6i.8xlarge instance (Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz):

import mxnet as mx
from benchmark.opperf.utils.benchmark_utils import run_performance_test

shapes = [
    (10, 1),
    (32, 9999),
    (4096, 1),
    (4096, 500),
    (4096, 4096),
    (4096, 1024, 32),
]

print("\nnot-in-place (mx.nd.random.shuffle):")
for shape in shapes:
    s_res = run_performance_test(mx.nd.random.shuffle, run_backward=False, inputs=[{"data": (shape)}])
    print(s_res)

print("\nin-place (mx.np.random.shuffle):")
for shape in shapes:
    s_res = run_performance_test(mx.np.random.shuffle, run_backward=False, inputs=[{"x": (shape)}])
    print(s_res)

shows the following improvements:

  • not-in-place (mx.nd.random.shuffle):

    Shape              Before [ms]   After [ms]   Improvement
    (10, 1)                 0.0038       0.0024           37%
    (32, 9999)              0.168        0.0795           53%
    (4096, 1)               0.0463       0.0328           29%
    (4096, 500)             1.1972       0.6511           46%
    (4096, 4096)           15.5678       8.8016           43%
    (4096, 1024, 32)      139.1404      74.0476           47%

  • in-place (mx.np.random.shuffle):

    Shape              Before [ms]   After [ms]   Improvement
    (10, 1)                 0.0023       0.0022            4%
    (32, 9999)              0.0852       0.0406           52%
    (4096, 1)               0.0436       0.0404            7%
    (4096, 500)             0.6643       0.4317           35%
    (4096, 4096)            6.778        3.7478           45%
    (4096, 1024, 32)       89.1249      41.931            53%
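
The Improvement column appears to be the relative reduction in run time, (before - after) / before, rounded to a whole percent. That formula is inferred from the numbers and is not stated in the PR. A minimal sketch that recomputes the column from the timings in the tables above (the function name `improvement_pct` is made up for illustration):

```python
def improvement_pct(before_ms: float, after_ms: float) -> int:
    """Relative reduction in run time, rounded to a whole percent."""
    return round((before_ms - after_ms) / before_ms * 100)


# (before [ms], after [ms]) pairs copied from the tables above.
not_in_place = {
    (10, 1): (0.0038, 0.0024),
    (32, 9999): (0.168, 0.0795),
    (4096, 1): (0.0463, 0.0328),
    (4096, 500): (1.1972, 0.6511),
    (4096, 4096): (15.5678, 8.8016),
    (4096, 1024, 32): (139.1404, 74.0476),
}
in_place = {
    (10, 1): (0.0023, 0.0022),
    (32, 9999): (0.0852, 0.0406),
    (4096, 1): (0.0436, 0.0404),
    (4096, 500): (0.6643, 0.4317),
    (4096, 4096): (6.778, 3.7478),
    (4096, 1024, 32): (89.1249, 41.931),
}

for name, table in (("not-in-place", not_in_place), ("in-place", in_place)):
    print(name)
    for shape, (before, after) in table.items():
        print(f"  {shape}: {improvement_pct(before, after)}%")
```

Read this way, 53% for the largest in-place shape means the run time roughly halved; it is not a 53% increase in throughput.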

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

@mxnet-bot

Hey @anko-intel, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [edge, centos-gpu, sanity, unix-cpu, windows-gpu, unix-gpu, clang, website, miscellaneous, centos-cpu, windows-cpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 added labels: pr-awaiting-testing, pr-work-in-progress; removed labels: pr-awaiting-testing, pr-work-in-progress (Mar 2, 2022)
@mseth10 added labels: pr-awaiting-testing; removed labels: pr-work-in-progress (Mar 3, 2022)
@@ -78,28 +78,37 @@ void ShuffleND(DType* const out,
  std::shuffle(index.begin(), index.end(), *prnd);
  if (reqT != kWriteInplace) {
    for (index_t i = 0; i < first_axis_len; ++i) {
      auto j = index[i];
      std::memcpy(out + stride * j, in + stride * i, stride_bytes);
Review comment from the PR author (@anko-intel) on the line above:

The following error occurs after the first commit:
/work/mxnet/src/operator/random/shuffle_op.cc:82:18: error: 'void* memcpy(void*, const void*, size_t)' writing to an object of type 'class mshadow::half::half_t' with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Werror=class-memaccess]
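
For readers unfamiliar with the copy pattern in the diff excerpt above: the not-in-place branch shuffles an index vector and then copies row i of the input into row index[i] of the output, one whole row per copy. The sketch below mirrors only that pattern in plain Python. It is not the MXNet implementation, the name `shuffle_rows` is hypothetical, and it says nothing about how the PR eventually resolved the `-Werror=class-memaccess` failure, which this page does not show.

```python
import random


def shuffle_rows(rows, rng):
    """Not-in-place shuffle along the first axis.

    Row i of the input is copied to row index[i] of the output,
    matching the loop shape in the diff excerpt (illustrative only).
    """
    index = list(range(len(rows)))
    rng.shuffle(index)              # plays the role of std::shuffle on the index vector
    out = [None] * len(rows)
    for i, j in enumerate(index):
        out[j] = list(rows[i])      # element-wise copy of one whole row
    return out


x = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
y = shuffle_rows(x, random.Random(0))
print(y)
```

Because `index` is a permutation, every output row is written exactly once, so the output holds the same rows as the input in a new order and each row's contents stay intact.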

@mseth10 added labels: pr-work-in-progress; removed labels: pr-awaiting-testing (Mar 3, 2022)
@anko-intel changed the title from "Improve shuffle implementation" to "[PERFORMANCE] Improve shuffle implementation" (Mar 3, 2022)
@anko-intel
Contributor Author

@mxnet-bot run ci [windows-cpu, windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu, windows-cpu]

@mseth10 added labels: pr-awaiting-testing, pr-work-in-progress; removed labels: pr-work-in-progress, pr-awaiting-testing (Mar 3, 2022)
@anko-intel
Contributor Author

@mxnet-bot run ci [windows-cpu, windows-gpu, unix-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu, windows-gpu, unix-cpu]

@mseth10 added labels: pr-awaiting-testing, pr-work-in-progress; removed labels: pr-work-in-progress, pr-awaiting-testing (Mar 4, 2022)
@mseth10 added labels: pr-awaiting-testing, pr-work-in-progress; removed labels: pr-awaiting-testing (Mar 7, 2022)
@anko-intel
Contributor Author

@mxnet-bot run ci [unix-cpu, centos-gpu, windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-cpu, centos-gpu, windows-gpu]

@mseth10 added labels: pr-awaiting-testing, pr-work-in-progress; removed labels: pr-work-in-progress, pr-awaiting-testing (Mar 8, 2022)
@anko-intel
Contributor Author

@mxnet-bot run ci [centos-gpu, windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu, centos-gpu]

@mseth10 added labels: pr-awaiting-testing, pr-awaiting-review; removed labels: pr-work-in-progress, pr-awaiting-testing (Mar 8, 2022)
@mozga-intel (Contributor) left a comment

LGTM!

@mseth10 added labels: pr-awaiting-testing, pr-awaiting-merge, pr-work-in-progress; removed labels: pr-awaiting-review, pr-awaiting-testing, pr-awaiting-merge, pr-work-in-progress (Mar 14, 2022)
@bgawrych merged commit 927d68b into apache:master on Mar 15, 2022
Labels: pr-awaiting-merge (Review and CI is complete. Ready to Merge)