- Stanford
- https://twitter.com/karpathy
Stars
NanoGPT (124M) quality in 7.8 8xH100-minutes
A native PyTorch Library for large model training
Efficient Triton Kernels for LLM Training
An MLX port of FLUX based on the Huggingface Diffusers implementation.
Official inference repo for FLUX.1 models
The Scott CPU from "But How Do It Know?" by J. Clark Scott
Run PyTorch LLMs locally on servers, desktop and mobile
A lightweight library for portable low-level GPU computation using WebGPU.
A simple byte pair encoding (BPE) mechanism for tokenization, written purely in C.
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Implementation of Diffusion Transformer (DiT) in JAX
Schedule-Free Optimization in PyTorch
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
A minimal GPU design in Verilog to learn how GPUs work from the ground up
Lightweight, standalone C++ inference engine for Google's Gemma models.
Distribute and run LLMs with a single file.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Fast bare-bones BPE for modern tokenizer training
The official PyTorch implementation of Google's Gemma models
A benchmark to evaluate language models on questions I've previously asked them to solve.
Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering"