A multi-level tensor algebra superoptimizer

C++ 407 21 Updated Oct 1, 2024

Offline ID document watermark assistant

HTML 55 23 Updated Jun 7, 2024

Example fasthtml applications demonstrating a range of web programming techniques

CSS 607 93 Updated Oct 3, 2024

The fastest way to create an HTML app

Jupyter Notebook 5,232 218 Updated Oct 7, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,634 264 Updated Oct 1, 2024

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 3,811 303 Updated Sep 29, 2024

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 323 20 Updated Sep 19, 2024

[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Parallelism

Python 42 1 Updated Jul 31, 2024

A collection of memory efficient attention operators implemented in the Triton language.

Python 211 14 Updated Jun 5, 2024

Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Python 188 9 Updated Aug 19, 2024

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 243 27 Updated Sep 27, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 5,438 403 Updated Oct 7, 2024

FlagGems is an operator library for large language models implemented in Triton Language.

Python 282 25 Updated Oct 6, 2024

Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"

Python 788 46 Updated Aug 21, 2024

Awesome Papers related to Mamba.

1,109 61 Updated Sep 10, 2024

The Fast Cross-Platform Package Manager

C++ 6,801 351 Updated Oct 7, 2024

A generative speech model for daily dialogue.

Python 31,246 3,386 Updated Sep 21, 2024

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda 290 55 Updated Aug 12, 2024

Mamba SSM architecture

Python 12,755 1,077 Updated Oct 7, 2024

Mora: More like Sora for Generalist Video Generation

Python 1,491 94 Updated Jun 21, 2024

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,343 215 Updated Sep 27, 2024

JAX bindings for Flash Attention v2

C++ 77 3 Updated Jul 15, 2024

Grok open release

Python 49,461 8,325 Updated Aug 30, 2024

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Python 31,946 3,917 Updated Oct 7, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python 1,382 143 Updated Sep 27, 2024

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"

Python 1,053 99 Updated Mar 10, 2024

Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"

122 3 Updated Apr 28, 2024