Rest of the changes from docs/6.1.1
- Fix precision support link
- Fix data paths

Correct file paths
neon60 committed May 17, 2024
1 parent aa1a41d commit 749c6c3
Showing 21 changed files with 12 additions and 12 deletions.
10 changes: 5 additions & 5 deletions docs/understand/hardware_implementation.rst
@@ -46,7 +46,7 @@ The amount of warps that can reside concurrently on a CU, known
as occupancy, is determined by the warp's resource usage of registers and
shared memory.

-.. figure:: ../data/hardware_implementation/compute_unit.svg
+.. figure:: ../data/understand/hardware_implementation/compute_unit.svg
:alt: Diagram depicting the general structure of a compute unit of an AMD
GPU.

@@ -110,9 +110,9 @@ The general structure of CUs stays mostly as it is in GCN
architectures. The most prominent change is the addition of matrix ALUs, which
can greatly improve the performance of algorithms involving matrix
multiply-accumulate operations for
-:doc:`int8, float16, bfloat16 or float32<rocm:about/compatibility/precision-support>`.
+:doc:`int8, float16, bfloat16 or float32<rocm:compatibility/precision-support>`.

-.. figure:: ../data/hardware_implementation/cdna3_cu.png
+.. figure:: ../data/understand/hardware_implementation/cdna3_cu.png
:alt: Block diagram showing the structure of a CDNA3 compute unit. It includes
Shader Cores, the Matrix Core Unit, a Local Data Share used for sharing
memory between threads in a block, an L1 Cache and a Scheduler. The
@@ -136,7 +136,7 @@ It also adds an extra layer of cache to the WGP, shared by the CUs
within it. This cache is referred to as L1 cache, promoting the per-CU cache to
an L0 cache.

-.. figure:: ../data/hardware_implementation/rdna3_cu.png
+.. figure:: ../data/understand/hardware_implementation/rdna3_cu.png
:alt: Block diagram showing the structure of an RDNA3 Compute Unit. It
consists of four SIMD units, each including a vector and scalar register
file, with the corresponding scalar and vector ALUs. All four SIMDs
@@ -152,7 +152,7 @@ For hardware implementation's sake, multiple CUs are grouped
together into a Shader Engine or Compute Engine, typically sharing some fixed
function units or memory subsystem resources.

-.. figure:: ../data/hardware_implementation/cdna2_gcd.png
+.. figure:: ../data/understand/hardware_implementation/cdna2_gcd.png
:alt: Block diagram showing four Compute Engines each with 28 Compute Units
inside. These four Compute Engines share one block of L2 Cache. Around
them are four Memory Controllers. To the top and bottom of all these are
8 changes: 4 additions & 4 deletions docs/understand/programming_model.rst
@@ -30,7 +30,7 @@ AMD block diagrams, or as streaming multiprocessor (SM).

.. _rdna3_cu:

-.. figure:: ../data/programming_model/understand/rdna3_cu.png
+.. figure:: ../data/understand/programming_model/rdna3_cu.png
:alt: Block diagram showing the structure of an RDNA3 Compute Unit. It
consists of four SIMD units, each including a vector and scalar register
file, with the corresponding scalar and vector ALUs. All four SIMDs
@@ -41,7 +41,7 @@ AMD block diagrams, or as streaming multiprocessor (SM).

.. _cdna3_cu:

-.. figure:: ../data/programming_model/understand/cdna3_cu.png
+.. figure:: ../data/understand/programming_model/cdna3_cu.png
:alt: Block diagram showing the structure of a CDNA3 compute unit. It includes
Shader Cores, the Matrix Core Unit, a Local Data Share used for sharing
memory between threads in a block, an L1 Cache and a Scheduler. The
@@ -56,7 +56,7 @@ memory subsystem resources.

.. _cdna2_gcd:

-.. figure:: ../data/programming_model/understand/cdna2_gcd.png
+.. figure:: ../data/understand/programming_model/cdna2_gcd.png
:alt: Block diagram showing four Compute Engines each with 28 Compute Units
inside. These four Compute Engines share one block of L2 Cache. Around
them are four Memory Controllers. To the top and bottom of all these are
@@ -103,7 +103,7 @@ typically look the following:

.. _simt:

-.. figure:: ../data/programming_model/understand/simt.svg
+.. figure:: ../data/understand/programming_model/simt.svg
:alt: Image representing the instruction flow of a SIMT program. Two identical
arrows pointing downward with blocks representing the instructions
inside and ellipsis between the arrows. The instructions represented in
6 changes: 3 additions & 3 deletions docs/understand/programming_model_reference.rst
@@ -34,7 +34,7 @@ The thread hierarchy inherent to how AMD GPUs operate is depicted in

.. _inherent_thread_hierarchy:

-.. figure:: ../data/programming_model/reference/thread_hierarchy.svg
+.. figure:: ../data/understand/programming_model_reference/thread_hierarchy.svg
:alt: Diagram depicting nested rectangles of varying color. The outermost one
titled "Grid", inside sets of uniform rectangles layered on one another
titled "Block". Each "Block" containing sets of uniform rectangles
@@ -93,7 +93,7 @@ The thread hierarchy abstraction of Cooperative Groups manifest as depicted in

.. _coop_thread_hierarchy:

-.. figure:: ../data/programming_model/reference/thread_hierarchy_coop.svg
+.. figure:: ../data/understand/programming_model_reference/thread_hierarchy_coop.svg
:alt: Diagram depicting nested rectangles of varying color. The outermost one
titled "Grid", inside sets of different sized rectangles layered on
one another titled "Block". Each "Block" containing sets of uniform
@@ -134,7 +134,7 @@ how they relate to the various levels of the threading model.

.. _memory_hierarchy:

-.. figure:: ../data/programming_model/reference/memory_hierarchy.svg
+.. figure:: ../data/understand/programming_model_reference/memory_hierarchy.svg
:alt: Diagram depicting nested rectangles of varying color. The outermost one
titled "Grid", inside on the upper half a rectangle titled "Cluster".
Inside it are two identical rectangles titled "Block", inside them are