Add Unified Memory #3494
base: docs/develop
94f2948 to a610e91
Send some comments via direct message. The short summary:
@matyas-streamhpc Please rebase your branch on docs/develop
05a8d2a to 4a9a4ac
f230126 to 1f7967f
@matyas-streamhpc rebased the branch.
docs/how-to/unified_memory.rst
Outdated
high-performance low latency operations, while the GPU is optimized for
high-throughput (data processed by unit time).

How-to use?
Recommended
Using Unified Memory Management (UMM)
Updated.
docs/how-to/unified_memory.rst
Outdated
and a GPU which would require large memory transfers. Here are some areas where
UMM can be beneficial:

- **Simplification of Memory Management**:
Recommended
Simplification of memory management
UMM can help to simplify the complexities of memory management. This can make it easier for developers to write code without worrying about memory allocation and deallocation details.
Updated.
docs/how-to/unified_memory.rst
Outdated
As a positive side effect, the use of UMM can reduce the lines of code,
thereby improving programming productivity.

In HIP, pinned memory allocations are coherent by default. Pinned memory is
Recommended
In HIP, pinned memory allocations are coherent by default. Pinned memory is host memory mapped into the address space of all GPUs, meaning that the pointer can be used on both host and device. Using pinned memory instead of pageable memory on the host can improve bandwidth.
Updated.
docs/how-to/unified_memory.rst
Outdated
pointer can be used on both host and device. Using pinned memory instead of
pageable memory on the host can lead an improvement in bandwidth.

While UMM can provide numerous benefits, it is also important
Recommended
While UMM can provide numerous benefits, it is important to be aware of the potential performance overhead associated with UMM. You must thoroughly test and profile your code to ensure it is the most suitable choice for your use case.
Updated.
docs/how-to/unified_memory.rst
Outdated
The following example shows how to use unified memory management with
``hipMallocManaged()``, function, with ``__managed__`` attribute for static
allocation and standard ``malloc()`` allocation. The Explicit Memory
Where is EMM presented for comparison? Do you mean in the figure? Consider adding a link to the section, so users don't have to scroll back and forth.
Sorry, I meant in the last tab of the tabset. I updated this section.
docs/how-to/unified_memory.rst
Outdated
}

Compiler Hints for the Better Performance
Could you help establish context with UMM? Do we mean C++ compiler hints?
These are UMM compiler hints. I updated the header.
@matyas-streamhpc @yhuiYH added some comments from a user standpoint. Please feel free to use or ignore them. Let me know if they are not technically applicable or relevant.
@Rmalavally @yhuiYH Thank you very much for your feedback. I updated the page accordingly.
bdd527d to dd845c5
❌: **Unsupported**

:sup:`1` Works only with ``XNACK=1``. First GPU access causes recoverable
Add a link to XNACK documentation:
https://rocm.docs.amd.com/en/latest/conceptual/gpu-memory.html#xnack
Nice catch!
Please fix the following rst issues:
docs/how-to/unified_memory.rst
Outdated
Concept
=======

In a conventional architectures, both CPUs a GPUs have a dedicated memory,
Thanks, Matyas. Appreciate the prompt fixes.
Some more minor fixes.
Thank you! I applied the recommended fixes.
docs/how-to/unified_memory.rst
Outdated
Concept
=======

In a conventional architectures, both CPUs a GPUs have a dedicated memory,
Remove 'a' in "In a conventional architectures".
It was removed in the last commit, but the GitHub UI does not show the change because there was a comment on that line.
You can see the latest version here:
https://github.com/ROCm/HIP/pull/3494/files#diff-9ffae9738c016086230e9386a7d0aa1b26bd7a8f4845494ba9952cf0de47c1db
No description provided.