This is the repo for our new project, Highly Accurate Dichotomous Image Segmentation.
High-resolution models for human tasks.
Distribute and run LLMs with a single file.
alibaba / Megatron-LLaMA
Forked from NVIDIA/Megatron-LM. Best practices for training LLaMA models in Megatron-LM.
Minimalistic large language model 3D-parallelism training
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[NeurIPS'24 Spotlight] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while…
The most powerful and modular diffusion model GUI, API, and backend, with a graph/nodes interface.
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
Ongoing research training transformer models at scale
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
CoreNet: A library for training deep neural networks
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Code and Data for "Long-context LLMs Struggle with Long In-context Learning"
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference,…
Code examples and resources for DBRX, a large language model developed by Databricks
[EMNLP'23, ACL'24] Speeds up LLM inference and enhances LLMs' perception of key information by compressing the prompt and KV-cache, achieving up to 20x compression with minimal performance loss.
ZXing ("Zebra Crossing") barcode scanning library for Java, Android
Fast and memory-efficient exact attention