Starred repositories
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
Auditing and relabeling cross-distribution Linux wheels.
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
FlashInfer: Kernel Library for LLM Serving
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
A project aimed at measuring the real-world performance of Large Language Model (LLM) inference frameworks, inspired by the concepts in deepspeed-fastgen.
Python packaging and dependency management made easy
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
A unified evaluation framework for large language models
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Robust Speech Recognition via Large-Scale Weak Supervision
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Ongoing research training transformer models at scale
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and…
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Sparsity-aware deep learning inference runtime for CPUs
Implementation of Nougat: Neural Optical Understanding for Academic Documents
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
A natural language interface for computers