[GraphBolt] Add gb.numpy_save_aligned. #7524

Merged
frozenbugs merged 8 commits into dmlc:master from gb_numpy_save_aligned on Jul 16, 2024

Conversation

mfbalin
Collaborator

@mfbalin mfbalin commented Jul 15, 2024

Description

If we store our features in the numpy format using this new function, our io_uring and mmap disk read operations will speed up significantly (2.1 GiB/s vs. 1.7 GiB/s) for embedding sizes that are close to a power of 2, such as 768 floats or 512 int8s.
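To illustrate why alignment can matter for these row sizes, here is a rough sketch that counts how many 4 KiB pages a single row read touches. It is an illustration, not a measurement from this PR: the 4096-byte page size and the 128-byte unaligned data offset (typical of a default `np.save` header) are assumptions, and `pages_touched` and `mean_pages` are hypothetical helper names.

```python
PAGE = 4096  # assumed page / filesystem block size in bytes


def pages_touched(data_offset, row_bytes, row_index, page=PAGE):
    """Number of pages spanned by the row at `row_index`.

    Rows are stored back to back, starting `data_offset` bytes into the file.
    """
    start = data_offset + row_index * row_bytes
    end = start + row_bytes - 1  # last byte of the row
    return end // page - start // page + 1


def mean_pages(data_offset, row_bytes, num_rows=4096):
    """Average number of pages touched per row over the first `num_rows` rows."""
    total = sum(pages_touched(data_offset, row_bytes, i) for i in range(num_rows))
    return total / num_rows


if __name__ == "__main__":
    for name, row_bytes in [("768 float32", 768 * 4), ("512 int8", 512)]:
        unaligned = mean_pages(128, row_bytes)   # assumed default header size
        aligned = mean_pages(4096, row_bytes)    # data starts on a page boundary
        print(f"{name}: {unaligned} pages/row unaligned, {aligned} aligned")
```

Under these assumptions, a row of 768 float32 values touches 1.75 pages on average when the data starts at byte 128 and 1.5 pages when it starts on a 4 KiB boundary; for 512 int8 values the averages are 1.125 and 1.0. Fewer pages per row is one plausible contributor to the throughput difference reported above.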

This change is risk-free because the files we handle are much larger than 4 KiB.
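For readers unfamiliar with the `.npy` layout, the following is a minimal sketch of what saving with 4 KiB alignment could look like, assuming the goal is for the raw array bytes to begin at a multiple of 4096 in the file. It is not the implementation added by this PR: `save_npy_aligned` is a hypothetical name, and the sketch writes a version 1.0 `.npy` header by hand, padded with spaces, and handles only plain (non-object, non-structured) dtypes.

```python
import os
import struct
import tempfile

import numpy as np

MAGIC = b"\x93NUMPY"  # magic string that starts every .npy file


def save_npy_aligned(path, arr, alignment=4096):
    """Hypothetical helper: write `arr` as a version 1.0 .npy file whose raw
    data begins at a multiple of `alignment` bytes. Returns the data offset."""
    arr = np.ascontiguousarray(arr)
    if arr.dtype.hasobject or arr.dtype.names is not None:
        raise TypeError("this sketch only supports plain numeric dtypes")
    # The header is a Python dict literal describing dtype, order, and shape.
    header = repr(
        {"descr": arr.dtype.str, "fortran_order": False, "shape": arr.shape}
    ).encode("latin1")
    prefix_len = len(MAGIC) + 2 + 2  # magic, version bytes, uint16 header length
    unpadded = prefix_len + len(header) + 1  # +1 for the terminating newline
    total = -(-unpadded // alignment) * alignment  # round up to the alignment
    header += b" " * (total - unpadded) + b"\n"
    if len(header) > 65535:
        raise ValueError("header too large for .npy format version 1.0")
    with open(path, "wb") as f:
        f.write(MAGIC + bytes([1, 0]) + struct.pack("<H", len(header)))
        f.write(header)
        f.write(arr.tobytes())  # copies the data; fine for a small example
    return total


if __name__ == "__main__":
    feats = np.random.rand(8, 768).astype(np.float32)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "feats.npy")
        offset = save_npy_aligned(path, feats)
        # The file remains a regular .npy file that np.load can read.
        assert np.array_equal(np.load(path), feats)
        print("data offset:", offset, "file size:", os.path.getsize(path))
```

In this sketch the padding adds less than one 4 KiB block per file, which is consistent with the note above that the overhead is negligible for files much larger than 4 KiB.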

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small; read the Google eng practices (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation may be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@mfbalin mfbalin requested a review from Rhett-Ying July 15, 2024 08:51
@dgl-bot
Collaborator

dgl-bot commented Jul 15, 2024

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Jul 15, 2024

Commit ID: 24804be

Build ID: 1

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 15, 2024

Commit ID: 74d8d9d

Build ID: 2

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 15, 2024

Commit ID: 66c4663

Build ID: 3

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@mfbalin
Collaborator Author

mfbalin commented Jul 15, 2024

Finished working on the PR.

@dgl-bot
Collaborator

dgl-bot commented Jul 15, 2024

Commit ID: e139b9d

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 15, 2024

Commit ID: 3052e6c

Build ID: 5

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 16, 2024

Commit ID: d35bb20fa82f3f67435c4de3ea9152fb45eeba1b

Build ID: 6

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 16, 2024

Commit ID: 872b7a2d5a9b5a9e4d851f697135399d848ceedf

Build ID: 7

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 16, 2024

Commit ID: 7f19f91c22b20f2a0d54c69aa5f93bc6efa8b61e

Build ID: 8

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@frozenbugs frozenbugs merged commit 8368f62 into dmlc:master Jul 16, 2024
2 checks passed
@mfbalin mfbalin deleted the gb_numpy_save_aligned branch July 16, 2024 07:40