Add Unified Memory #3494

Open. Wants to merge 6 commits into base: docs/develop

matyas-streamhpc

No description provided.

@matyas-streamhpc matyas-streamhpc self-assigned this May 22, 2024
@matyas-streamhpc matyas-streamhpc force-pushed the unified-memory branch 2 times, most recently from 94f2948 to a610e91 Compare May 22, 2024 19:36
@neon60 neon60 changed the base branch from develop to docs/develop May 23, 2024 08:49
neon60 (Contributor) commented May 23, 2024

Sent some comments via direct message. The short summary:

  • Add the list of missing CUDA features
  • Add system requirements for Unified Memory
  • Add performance improvement tips
  • Allocation APIs for System-Allocated Memory? Is this supported?
  • Tutorial is not needed, example is enough.

neon60 (Contributor) commented May 28, 2024

@matyas-streamhpc Please rebase your branch on docs/develop

@matyas-streamhpc matyas-streamhpc force-pushed the unified-memory branch 11 times, most recently from 05a8d2a to 4a9a4ac Compare May 31, 2024 21:43
docs/reference/unified_memory.rst: 5 review threads (outdated, resolved)
neon60 (Contributor) commented Jun 10, 2024

@matyas-streamhpc rebased the branch.

high-performance low latency operations, while the GPU is optimized for
high-throughput (data processed by unit time).

How-to use?

Recommended

Using Unified Memory Management (UMM)

Author

Updated.

and a GPU which would require large memory transfers. Here are some areas where
UMM can be beneficial:

- **Simplification of Memory Management**:

Recommended

Simplification of memory management

UMM can help to simplify the complexities of memory management. This can make it easier for developers to write code without worrying about memory allocation and deallocation details.

Author

Updated.

As a positive side effect, the use of UMM can reduce the lines of code,
thereby improving programming productivity.

In HIP, pinned memory allocations are coherent by default. Pinned memory is

Recommended

In HIP, pinned memory allocations are coherent by default. Pinned memory is host memory mapped into the address space of all GPUs, meaning that the pointer can be used on both host and device. Using pinned memory instead of pageable memory on the host can improve bandwidth.

Author

Updated.

pointer can be used on both host and device. Using pinned memory instead of
pageable memory on the host can lead an improvement in bandwidth.

While UMM can provide numerous benefits, it is also important

Recommended

While UMM can provide numerous benefits, it is important to be aware of the potential performance overhead associated with UMM. You must thoroughly test and profile your code to ensure it is the most suitable choice for your use case.

Author

Updated.


The following example shows how to use unified memory management with
``hipMallocManaged()``, function, with ``__managed__`` attribute for static
allocation and standard ``malloc()`` allocation. The Explicit Memory

Where is EMM presented for comparison? Do you mean in the figure? Consider adding a link to the section, so users don't have to scroll back and forth.

Author

Sorry, I meant in the last tab of the tabset. I updated this section.

}


Compiler Hints for the Better Performance

Could you help establish context with UMM? Do we mean C++ compiler hints?

Author

These are UMM compiler hints. I updated the header.

@Rmalavally

@matyas-streamhpc @yhuiYH added some comments from a user standpoint. Please feel free to use or ignore them. Let me know if they are not technically applicable or relevant.

@Rmalavally Rmalavally closed this Jun 13, 2024
@Rmalavally Rmalavally reopened this Jun 13, 2024
@matyas-streamhpc (Author)

> @matyas-streamhpc @yhuiYH added some comments from a user standpoint. Please feel free to use or ignore them. Let me know if they are not technically applicable or relevant.

@Rmalavally @yhuiYH Thank you very much for your feedback. I updated the page accordingly.


❌: **Unsupported**

:sup:`1` Works only with ``XNACK=1``. First GPU access causes recoverable
Contributor

Author

Nice catch!

neon60 (Contributor) commented Jun 14, 2024

Please fix the following rst issues:

HIP\docs\reference\unified_memory_reference.rst:12: WARNING: Duplicate C++ declaration, also defined at reference/unified_memory_reference:10.
Declaration is '.. cpp:function:: hipError_t hipMallocManaged (void **dev_ptr, size_t size, unsigned int flags)'.
HIP\docs\reference/unified_memory_reference.rst:8: CRITICAL: Duplicate ID: "group___memory_m_1gaadf4780d920bb6f5cc755880740ef7dc".
HIP\docs\reference/unified_memory_reference.rst:8: WARNING: Duplicate explicit target name: "group___memory_m_1gaadf4780d920bb6f5cc755880740ef7dc".
HIP\docs\reference/unified_memory_reference.rst:15: WARNING: doxygenfunction: Cannot find function "hipMemoryAdvise" in doxygen xml output for project "HIP 6.2.0 Documentation" from directory: HIP\docs\doxygen\xml
HIP\docs\reference/unified_memory_reference.rst:17: WARNING: doxygenfunction: Cannot find function "hipMemRangeAttribute" in doxygen xml output for project "HIP 6.2.0 Documentation" from directory: HIP\docs\doxygen\xm

Concept
=======

In a conventional architectures, both CPUs a GPUs have a dedicated memory,

Thanks, Matyas. Appreciate the prompt fixes.

Some more minor fixes.

Author

Thank you! I applied the recommended fixes.

Concept
=======

In a conventional architectures, both CPUs a GPUs have a dedicated memory,

Remove 'a' in

In a conventional architectures

Author

It was removed in the last commit, but the GitHub UI does not show it because there was a comment on it.
You can see the latest version here:
https://github.com/ROCm/HIP/pull/3494/files#diff-9ffae9738c016086230e9386a7d0aa1b26bd7a8f4845494ba9952cf0de47c1db

4 participants