
Cube/doc/readme #1904

Merged
merged 8 commits into from
Jul 12, 2024

nathanielsimard

First draft of the cubecl readme.


@louisfd louisfd left a comment

There are a few fixes to make, the end is incomplete, and we should add a section on element types.


## TL;DR

With CubeCL, you can use Rust to program your GPU, any GPU!

Sentence is not syntactically correct.

The goal of CubeCL is to ease the pain of writing highly optimized compute kernels that are portable across hardware.
There is currently no adequate solution when you want optimal performance while still being multi-platform.
You either have to write custom kernels for different hardware, often with different languages such as CUDA, Metal, or ROCm.
To make it possible, we created a Just-in-Time compiler with three core features: **automatic vectorization**, **comptime**, and **autotune**!

"To make it possible": it is not clear what "it" refers to at this point.

These features are extremely useful for anyone writing high-performance kernels, even when portability is not a concern.
They improve code composability, reusability, and maintainability, all while staying optimal.

Maybe testability


Also put more emphasis on "Write kernel code like software, with good software engineering practices".


## Design

CubeCL is designed around - you guessed it - Cubes! More precisely, cuboids since not all axes are forced to be the same size.

More specifically, it's based on cuboids, because not all axes are the same size.

Using "because" will help differentiate this sentence from the next paragraph, which starts with "since".

_A cube is composed of units, so a 3x3x3 cube has 27 units that can be accessed by their positions along the x, y, and z axes.
Similarly, an hyper-cube is composed of cubes, just as a cube is composed of units.
Each cube in the hyper-cube can be accessed by its position relative to the hyper-cube along the x, y, and z axes.
Hence, an hyper-cube of 3x3x3 will have 27 cubes.

a hyper-cube

In this example, the total number of working units would be 27 x 27 = 729._
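
To make the arithmetic in the paragraph above concrete, here is a minimal plain-Rust sketch. It does not use the CubeCL API: `Dim3`, `linear_index`, and `total_units` are hypothetical names chosen for illustration, and the x-fastest flattening order is an assumption rather than something the README states.

```rust
/// Sizes (or a position) along the x, y, and z axes.
/// Hypothetical helper type for illustration; not part of CubeCL.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Dim3 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
}

impl Dim3 {
    pub fn new(x: u32, y: u32, z: u32) -> Self {
        Dim3 { x, y, z }
    }

    /// Number of elements in a cuboid of this shape.
    pub fn count(self) -> u32 {
        self.x * self.y * self.z
    }
}

/// Flatten an (x, y, z) position inside `shape` into a single index.
/// Assumption: x varies fastest, then y, then z.
pub fn linear_index(pos: Dim3, shape: Dim3) -> u32 {
    pos.x + pos.y * shape.x + pos.z * shape.x * shape.y
}

/// Total working units: units per cube times cubes per hyper-cube.
pub fn total_units(cube: Dim3, hyper_cube: Dim3) -> u32 {
    cube.count() * hyper_cube.count()
}
```

With a 3x3x3 cube inside a 3x3x3 hyper-cube, `total_units(Dim3::new(3, 3, 3), Dim3::new(3, 3, 3))` gives 27 x 27 = 729, matching the example in the text. The axes do not need to be equal, so a `Dim3::new(4, 2, 1)` cuboid works the same way.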

<details>
<summary>Topology Equivalent 👇</summary>

Equivalence

There are some limitations right now, some that could be addressed later on, but some that will stick around by design.

* Using functions with generic requires the generics to be specify at all time.
Since we don't have access to symbols during the procedure macro, we don't have the type information and aren't able to properly do type inference

We now have an idea of how to fix this.

## Resources

Check out our matmul example, which autotunes between a simple vectorized version, a tiled algorithm and one based cooperative matrix.
Clone the project and run the example locally to see how autotune fares and your own device.

Still todo

If you have any questions or want to contribute, don't hesitate to join the Discord.

Which discord lol

@nathanielsimard nathanielsimard merged commit a4123f6 into main Jul 12, 2024
@nathanielsimard nathanielsimard deleted the cube/doc/readme branch July 12, 2024 14:15