DAMO Academy · Stars
Efficient Triton Kernels for LLM Training
CUDA accelerated rasterization of gaussian splatting
Official inference repo for FLUX.1 models
Run PyTorch LLMs locally on servers, desktop and mobile
OpenBot leverages smartphones as brains for low-cost robots. We have designed a small electric vehicle that costs about $50 and serves as a robot body. Our software stack for Android smartphones su…
Machine Learning Engineering Open Book
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Dynamic Memory Management for Serving LLMs without PagedAttention
flash attention tutorial written in python, triton, cuda, cutlass
An extremely fast Python package and project manager, written in Rust.
A fast communication-overlapping library for tensor parallelism on GPUs.
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation
[ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design
Beautifully designed components that you can copy and paste into your apps. Accessible. Customizable. Open Source.
DSPy: The framework for programming—not prompting—foundation models
Universal LLM Deployment Engine with ML Compilation
A side-by-side Chinese-English translation of "Software Engineering at Google"
Doing simple retrieval from LLM models at various context lengths to measure accuracy
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
A generative speech model for daily dialogue.
A native PyTorch Library for large model training
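The flash attention tutorial listed above rests on one compact numerical idea: online (streaming) softmax, which lets attention be computed tile by tile without materializing the full score matrix. A minimal single-query, scalar-value sketch of that trick (the function name and the scalar simplification are illustrative, not taken from the repository):

```python
import math

def online_softmax_weighted_sum(scores, values):
    # Streaming softmax-weighted sum: maintain a running maximum `m`,
    # a running normalizer `l`, and a running accumulator `acc`.
    # When a larger score arrives, rescale the old partial sums by
    # exp(m - m_new) so the result matches the naive two-pass softmax.
    m = float("-inf")
    l = 0.0
    acc = 0.0
    for s, v in zip(scores, values):
        m_new = max(m, s)
        scale = math.exp(m - m_new)  # exp(-inf) == 0.0 on the first step
        w = math.exp(s - m_new)
        l = l * scale + w
        acc = acc * scale + w * v
        m = m_new
    return acc / l
```

Because each step only rescales previously accumulated state, the same recurrence applies block-wise to tiles of keys/values, which is what the CUDA and Triton versions exploit.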
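The context-length retrieval entry above ("needle in a haystack" style evaluation) reduces to a simple harness: hide one fact at a controlled depth inside filler text, then ask the model to recall it. A minimal sketch of the context construction only (the needle, filler text, and function name are illustrative assumptions, not code from the repository):

```python
def build_haystack(needle, filler, depth, n_sentences):
    # Insert `needle` at a relative depth (0.0 = start, 1.0 = end)
    # of a context built from `n_sentences` copies of `filler`.
    idx = int(depth * n_sentences)
    sentences = [filler] * n_sentences
    sentences[idx:idx] = [needle]
    return " ".join(sentences)

# Sweep insertion depths; querying the LLM and scoring its answer
# against the needle would happen per (depth, context_length) cell.
needle = "The best thing to do in San Francisco is eat a sandwich."
filler = "The grass was green and the sky was blue."
contexts = {d: build_haystack(needle, filler, d, 200) for d in (0.0, 0.5, 1.0)}
```

Accuracy is then reported as a grid over context length and needle depth, which is where the characteristic "lost in the middle" pattern shows up.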