Stars
Sort by: Recently starred
Firefly: a training toolkit for large language models, supporting Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other models
🎉 CUDA notes / hand-written CUDA kernels for LLMs / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
The road to hacking SysML and becoming a systems expert
Systems design is the process of defining the architecture, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems …
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
FlashInfer: Kernel Library for LLM Serving
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization.
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Transformer-related optimizations, including BERT and GPT
[TMLR 2024] Efficient Large Language Models: A Survey
Fast and memory-efficient exact attention
GPT4All: Chat with Local LLMs on Any Device
Awesome LLM compression research papers and tools.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Learning material for CMU10-714: Deep Learning System
Welcome to the "LLM-travel" repository! Explore the inner workings of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the techniques, principles, and applications of large models.
Development repository for the Triton language and compiler
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Chinese translation, slides, and labs for Professor Eijkhout's Introduction to HPC.
Ongoing research training transformer models at scale
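A recurring theme across the starred repositories above (the CUDA kernel notes, FlashAttention, and the various inference engines) is the softmax, whose naive form overflows for large logits. As a minimal illustration, not taken from any of the listed repos, here is a numerically stable softmax sketch in NumPy:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating so exp() cannot
    # overflow for large logits; the result is mathematically unchanged
    # because the shift cancels in the numerator and denominator.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

# Second row would overflow float64 exp() without the max-subtraction trick.
logits = np.array([[1.0, 2.0, 3.0],
                   [1001.0, 1002.0, 1003.0]])
probs = softmax(logits)
# Each row sums to 1, and both rows are identical (shift invariance).
```

The same max-subtraction trick underlies the online softmax used in fused attention kernels such as FlashAttention, where the running max and sum are updated tile by tile instead of over the whole row.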