GPGPU microprocessor architecture
Updated Apr 26, 2024 - C
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples.
qCUDA: GPGPU Virtualization at a New API Remoting Method with Para-virtualization
CUDA bindings for Ruby
How fast can we brute force a 64-bit comparison?
PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to the cores, GPU accelerators, overlaps communications and computations and uses a dynamic, fully-distributed scheduler based on architectural fe…
C implementation for CPU and GPU of OpenSimplex 2
Fast Fourier Transform using the Vulkan API
🔭 Cross-platform general-purpose GPU library - optimized for rendering
This serves as a repository for reproducibility of the SC21 paper "In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated Computing," as well as several components of the IPDPS21 paper "Demystifying GPU UVM Cost with Deep Runtime and Workload Analysis."
best CPU/GPU sparse solver for large sparse matrices
DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.
Programming for Numerical Computation using C-OpenMP (Parallel Programming)
A versatile multifluid HD/MHD code that runs on clusters of CPUs or GPUs, with special emphasis on protoplanetary disks.
Implementation of an image processing library for time-consuming operations such as image blurring, negation, edge detection, and contrast stretching.
Real-time ray-tracing renderer that lets the user move freely through the scene with the keyboard and mouse while streaming the results live from the GPU to the screen. Implemented in C++ with CUDA.
A configurable OpenCL memory benchmark for assessing access stride impact
Just a bunch of methods and scripts to test performance within containers