Stars
Fast Hadamard transform in CUDA, with a PyTorch interface
A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL
FlagGems is an operator library for large language models implemented in Triton Language.
FP8 flash attention implemented with the CUTLASS library on the Ada architecture
Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
SGLang is a fast serving framework for large language models and vision language models.
LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step
MambaOut: Do We Really Need Mamba for Vision?
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Generate high-definition short videos with one click using AI large language models
Code examples and resources for DBRX, a large language model developed by Databricks
This project aims to reproduce Sora (OpenAI's text-to-video model); we hope the open-source community will contribute to it.
Fast Inference of MoE Models with CPU-GPU Orchestration
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Implementation of popular deep learning networks with TensorRT network definition API
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A series of GPU optimization topics explaining in detail how to optimize CUDA kernels, covering several basic kernel optimizations, including: elementwise, reduce, s…
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models