Michael Abrahms
GPUs are now of increasing interest because CPU clock speeds have stalled, mostly due to thermal limits. A GPU has higher latency than a CPU but can do far more work on data: it runs the same program over many data elements at once, while a CPU runs different instructions over different data.
The GPU approach to the problem is to build simpler processors with less control and debug hardware. The reduced complexity leaves room for many more processors on the chip, each individually less powerful and less flexible, and without debug capabilities.
- Latency: time to complete one task (measured in, e.g., seconds)
- Throughput: stuff done per unit of time (measured in, e.g., jobs/hour)
For example, a truck that delivers 100 packages in 2 hours has a latency of 2 hours and a throughput of 50 packages/hour.
- CPU allocates storage on the GPU (cudaMalloc)
- CPU copies input data from CPU to GPU (cudaMemcpy)
- CPU launches kernel(s) on the GPU to process the data (kernel launch)
- CPU copies results back from GPU to CPU (cudaMemcpy)
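A minimal sketch of these four steps, using a hypothetical `square` kernel (not from the original notes) that squares each element of an array:

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread squares one element.
__global__ void square(float *d_out, const float *d_in) {
    int i = threadIdx.x;
    d_out[i] = d_in[i] * d_in[i];
}

int main(void) {
    const int N = 64;
    const size_t bytes = N * sizeof(float);
    float h_in[N], h_out[N];
    for (int i = 0; i < N; i++) h_in[i] = (float)i;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);                                 // 1. CPU allocates storage on the GPU
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);    // 2. copy input CPU -> GPU
    square<<<1, N>>>(d_out, d_in);                            // 3. launch kernel on the GPU
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);  // 4. copy results GPU -> CPU

    printf("%f\n", h_out[10]);                                // expect 100.0
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```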
The main idea of GPU computation is to write the program as if it ran on a single thread; the GPU then runs that same program on many threads at once (see the kernel sketch below).
Try to maximize both the number of threads and the amount of work done per thread.
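As a sketch of this idea (the kernel name and launch size are illustrative), the body below reads like serial code for a single index `i`; the launch configuration then runs that same body once per thread:

```cuda
// Each thread handles exactly one element; the body is written
// as if it were a single-threaded loop iteration.
__global__ void add_one(float *data) {
    int i = threadIdx.x;    // this thread's index within the block
    data[i] = data[i] + 1.0f;
}

// Launching 128 threads runs the body 128 times in parallel,
// once per value of threadIdx.x:
// add_one<<<1, 128>>>(d_data);
```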
- Lots of simple compute units. Trade simple control for more compute.
- Explicitly parallel programming model
- Optimize for throughput, not latency
- Data CPU -> GPU: cudaMemcpy
- Data GPU -> CPU: cudaMemcpy
- Allocate GPU memory: cudaMalloc
- Launch kernel on the GPU
Thread: One independent path of execution through the code.
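One common pattern (a sketch, assuming a 1-D grid of 1-D blocks) is that each thread derives a unique global index from its block and thread coordinates, so every thread follows its own independent path over its own data element:

```cuda
// Illustrative kernel: each thread scales one element by a.
__global__ void scale(float *d_out, const float *d_in, int n, float a) {
    // Compute this thread's global index from built-in coordinates.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                  // guard threads that fall past the end of the array
        d_out[i] = a * d_in[i];
}
```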