Starred repositories
AISystem mainly refers to AI systems, covering full-stack low-level technologies such as AI chips, AI compilers, and AI inference and training frameworks.
Sakana widget for Web: a web-widget version of the Sakana! simulator.
AthenaOS is a next generation AI-native operating system managed by Swarms of AI Agents
Code associated with the paper "Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees".
[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 2…
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Sequence Parallel Attention for Long Context LLM Model Training and Inference
Code for the paper "Language Models are Unsupervised Multitask Learners"
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to large…
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
awesome llm plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodality, etc.
📰 Must-read papers and blogs on Speculative Decoding ⚡️
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Daily updated LLM papers. Updated every day with LLM-related papers; subscriptions welcome 👏 give it a star 🌟 if you like it.
SGLang is yet another fast serving framework for large language models and vision language models.
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
A "large" language model running on a microcontroller
An unnecessarily tiny implementation of GPT-2 in NumPy.
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
Vendor-independent TinyML deep learning library, compiler, and inference framework for microcomputers and microcontrollers.
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
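The last entry covers probabilistic set sketches such as MinHash. As a minimal, self-contained sketch of the core idea only (hypothetical helper names, not that library's API): each of k seeded hash functions keeps the minimum hash over a set's items, and the fraction of agreeing signature positions estimates Jaccard similarity.

```python
import hashlib

def minhash_signature(items, num_hashes=128):
    """One signature entry per seeded hash function: the minimum
    hash value over all items in the set."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items
        ))
    return sig

def estimate_jaccard(sig_a, sig_b):
    """The probability that two sets share the minimum under a random
    hash equals their Jaccard similarity, so agreeing positions
    give an unbiased estimate."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

a = set("the quick brown fox jumps over the lazy dog".split())
b = set("the quick brown fox leaps over a sleepy dog".split())
true_j = len(a & b) / len(a | b)
est_j = estimate_jaccard(minhash_signature(a), minhash_signature(b))
```

With 128 hash functions the estimator's standard error is roughly sqrt(J(1-J)/128), so the estimate lands within a few percent of the true Jaccard similarity; real implementations use cheaper hash families and vectorized updates rather than per-seed MD5.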