Stars
Fast and memory-efficient exact attention
Transformer-related optimization, including BERT and GPT
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capa…
📰 Must-read papers and blogs on Speculative Decoding ⚡️
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
Agentic components of the Llama Stack APIs
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Portable Text is a JSON-based rich text specification for modern content editing platforms.
Utilities intended for use with Llama models.
A comprehensive repository of reasoning tasks for LLMs (and beyond)
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Turn expensive prompts into cheap fine-tuned models
Matlab Algorithms for Randomized Linear Algebra
A massively parallel, high-level programming language
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
A massively parallel, optimal functional runtime in Rust
Temporary repository for Kind2's refactor based on HVM2
RuLES: a benchmark for evaluating rule-following in language models
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"