Starred repositories
AISystem mainly refers to AI systems, covering full-stack low-level technologies such as AI chips, AI compilers, and AI inference and training frameworks.
Sakana widget for Web: a web-widget version of the Sakana! simulator.
AthenaOS is a next generation AI-native operating system managed by Swarms of AI Agents
Code associated with the paper "Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees".
[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 2…
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Sequence Parallel Attention for Long Context LLM Model Training and Inference
Code for the paper "Language Models are Unsupervised Multitask Learners"
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to large…
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
awesome llm plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodality, etc.
📰 Must-read papers and blogs on Speculative Decoding ⚡️
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Daily updated LLM papers. Updated every day with LLM-related papers; subscriptions welcome 👏 give it a star 🌟 if you like it.
SGLang is yet another fast serving framework for large language models and vision language models.
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
A "large" language model running on a microcontroller
An unnecessarily tiny implementation of GPT-2 in NumPy.
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
Vendor-independent TinyML deep learning library, compiler, and inference framework for microcomputers and microcontrollers.
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
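The last entry covers probabilistic set sketches such as MinHash. As a minimal, self-contained sketch of the core idea only (hypothetical helper names, not that library's API): each of k seeded hash functions keeps the minimum hash over a set's items, and the fraction of agreeing signature positions estimates Jaccard similarity.

```python
import hashlib

def minhash_signature(items, num_hashes=128):
    """One signature entry per seeded hash function: the minimum
    hash value over all items in the set."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items
        ))
    return sig

def estimate_jaccard(sig_a, sig_b):
    """The probability that two sets share the minimum under a random
    hash equals their Jaccard similarity, so agreeing positions
    give an unbiased estimate."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

a = set("the quick brown fox jumps over the lazy dog".split())
b = set("the quick brown fox leaps over a sleepy dog".split())
true_j = len(a & b) / len(a | b)
est_j = estimate_jaccard(minhash_signature(a), minhash_signature(b))
```

With 128 hash functions the estimator's standard error is roughly sqrt(J(1-J)/128), so the estimate lands within a few percent of the true Jaccard similarity; real implementations use cheaper hash families and vectorized updates rather than per-seed MD5.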