CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
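As a minimal sketch of the programming model: a kernel (marked `__global__`) runs once per thread across a grid of thread blocks, and the host launches it with the `<<<blocks, threads>>>` syntax. The example below is an illustrative element-wise vector add, not taken from any repository listed here; it uses unified memory (`cudaMallocManaged`) to keep host/device data movement implicit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes one output element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];   // guard: grid may overshoot n
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory keeps the sketch short; explicit cudaMemcpy also works.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();  // kernel launches are asynchronous

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compile with `nvcc vecadd.cu -o vecadd`; a CUDA-capable GPU and driver are required to run it.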
The open-source serverless GPU container runtime.
A high-throughput and memory-efficient inference and serving engine for LLMs
Main repository for QMCPACK, an open-source, production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids, with fully performance-portable GPU support
CUDA C++ Core Libraries
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Sandbox for graphics paper implementations
OneDiff: An out-of-the-box acceleration library for diffusion models.
High-Performance Cross-Platform Monte Carlo Renderer Based on LuisaCompute
Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan
High-Performance Rendering Framework on Stream Architectures
High-performance monodomain program for cardiac electrophysiology simulations.
Containers for machine learning
Convolutional Neural Network inference library running on CUDA
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
An animal can do training and inference every day of its existence until the day of its death. A forward pass is all you need.
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Some general algorithms implemented in CUDA.
FlashInfer: Kernel Library for LLM Serving
Graphics Processing Units Molecular Dynamics
Created by NVIDIA. Released June 23, 2007.