NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 10,594 2,107 Updated Sep 24, 2024

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI

Python 577 22 Updated Sep 26, 2024

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 6,943 440 Updated Sep 26, 2024

Helpful tools and examples for working with flex-attention

Python 363 15 Updated Aug 17, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,284 920 Updated Sep 26, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,291 386 Updated Sep 27, 2024

Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it with Adam.

Python 66 3 Updated Jul 28, 2024

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

2,554 174 Updated Sep 27, 2024

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

Python 284 47 Updated Sep 26, 2024

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 17,571 1,682 Updated Sep 26, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,554 254 Updated Sep 23, 2024

maximal update parametrization (µP)

Jupyter Notebook 1,353 93 Updated Jul 17, 2024

LLM inference in C/C++

C++ 65,515 9,400 Updated Sep 27, 2024

LLM101n: Let's build a Storyteller

28,930 1,582 Updated Aug 1, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,225 152 Updated Jun 25, 2024

Structured Text Generation

Python 8,335 425 Updated Sep 26, 2024

User-friendly WebUI for AI (Formerly Ollama WebUI)

Svelte 40,438 4,745 Updated Sep 26, 2024

Zhejiang University Graduation Thesis LaTeX Template

TeX 2,554 601 Updated Sep 6, 2024

Code for Zero-Shot Tokenizer Transfer

Python 111 8 Updated Jul 4, 2024

A collection of AWESOME things about mixture-of-experts

929 70 Updated Jul 31, 2024

Ongoing research training transformer models at scale

Python 10,116 2,278 Updated Sep 27, 2024

Minimal implementation of scalable rectified flow transformers, based on SD3's approach

Jupyter Notebook 421 29 Updated Jul 1, 2024

A playbook for systematically maximizing the performance of deep learning models.

26,581 2,209 Updated Jun 18, 2024

A byte-level decoder architecture that matches the performance of tokenized Transformers.

Jupyter Notebook 57 6 Updated Apr 24, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 37,523 3,942 Updated Jul 28, 2024

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 1,224 66 Updated Sep 26, 2024

A Native-PyTorch Library for LLM Fine-tuning

Python 4,028 368 Updated Sep 27, 2024

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Python 31,512 3,880 Updated Sep 26, 2024

PyContinual (An Easy and Extendible Framework for Continual Learning)

Python 299 64 Updated Jan 29, 2024

An Extensible Continual Learning Framework Focused on Language Models (LMs)

Python 240 18 Updated Jan 28, 2024