Firefly: a large-model training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,148 468 Updated Jun 7, 2024

🎉CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Cuda 791 80 Updated Jun 23, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 27,959 3,229 Updated Jun 23, 2024

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 5,780 658 Updated Jun 20, 2024

The road to hack SysML and become a system expert

Emacs Lisp 369 46 Updated Jun 18, 2024

Systems design is the process of defining the architecture, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems …

Shell 4,701 1,343 Updated Mar 6, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 31,551 3,762 Updated Jun 15, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 741 59 Updated Jun 24, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,598 80 Updated Jan 21, 2024

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 1,575 250 Updated Jun 25, 2024

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Python 2,707 174 Updated Jun 25, 2024

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,746 162 Updated Jun 12, 2024

Transformer related optimization, including BERT, GPT

C++ 5,610 874 Updated Mar 27, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

803 66 Updated Jun 24, 2024

Fast and memory-efficient exact attention

Python 11,705 1,036 Updated Jun 13, 2024

GPT4All: Chat with Local LLMs on Any Device

C++ 66,049 7,267 Updated Jun 24, 2024

Official Implementation of EAGLE

Python 565 53 Updated May 26, 2024

Awesome LLM compression research papers and tools.

839 48 Updated Jun 25, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 7,289 788 Updated Jun 25, 2024

📖A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

1,808 128 Updated Jun 24, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,039 2,431 Updated Jun 24, 2024

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Jupyter Notebook 23,287 3,088 Updated Jun 4, 2024

Learning material for CMU10-714: Deep Learning System

Jupyter Notebook 169 28 Updated May 12, 2024

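Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the techniques, principles, and applications related to large models.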

Jupyter Notebook 218 27 Updated Apr 10, 2024

Development repository for the Triton language and compiler

C++ 11,797 1,389 Updated Jun 25, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 1,977 128 Updated Jun 25, 2024

Chinese translation, slides (PPT), and labs for Professor Eijkhout's Introduction to HPC.

C 252 37 Updated Apr 11, 2022

Building blocks for digital commerce

TypeScript 23,614 2,246 Updated Jun 25, 2024

Ongoing research training transformer models at scale

Python 9,185 2,073 Updated Jun 24, 2024