sjuxax

Sponsoring

@jd
@amark
@webrecorder

Organizations

@deseret-tech

Starred repositories

Official inference repo for FLUX.1 models

Python 5,010 284 Updated Aug 6, 2024

A family of lightweight multimodal models.

Python 841 64 Updated Aug 2, 2024

The Triton TensorRT-LLM Backend

Python 626 88 Updated Aug 7, 2024

This repository contains integer operators on GPUs for PyTorch.

Python 162 48 Updated Sep 29, 2023

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python 1,136 130 Updated Jul 12, 2024

A PyTorch quantization backend for Optimum

Python 707 43 Updated Aug 2, 2024

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets

Python 3,713 349 Updated Aug 6, 2024

ComfyUI nodes to use segment-anything-2

Python 388 20 Updated Aug 5, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 8,634 508 Updated Aug 6, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 611 38 Updated Aug 6, 2024

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Python 2,042 370 Updated Aug 7, 2024

Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHX…

Python 2,567 201 Updated Mar 31, 2024

A general 2-8 bit quantization toolbox with GPTQ/AWQ/HQQ and easy export to ONNX/ONNX Runtime.

Python 130 9 Updated Jul 24, 2024

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 1,984 129 Updated Aug 6, 2024

Easy and Efficient Quantization for Transformers

C++ 165 13 Updated Jul 15, 2024

Use PEFT or full-parameter training to fine-tune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 2,745 245 Updated Aug 7, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 36,034 4,430 Updated Aug 6, 2024

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Python 1,531 177 Updated Aug 7, 2024

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.

Rust 7,463 303 Updated Aug 6, 2024

Hybrid search engine, combining the best features of text and semantic search

Scala 56 3 Updated Aug 5, 2024

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

Python 758 40 Updated Aug 2, 2024

Lightning Fast: Faiss CPU + ONNX Quantized Multilingual Embedding Model

Python 21 Updated Jul 5, 2024

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2,106 249 Updated Aug 7, 2024

Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"

Python 141 19 Updated Aug 7, 2024

💭 Retrieval augmented generation (RAG) and language model powered search applications

Python 259 14 Updated Jan 16, 2024

DSPy: The framework for programming—not prompting—foundation models

Python 15,259 1,181 Updated Aug 6, 2024

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 9,124 851 Updated Aug 6, 2024

🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your d…

Python 15,029 1,744 Updated Aug 6, 2024

All-in-one infrastructure for search, recommendations, RAG, and analytics offered via API

Rust 1,231 112 Updated Aug 7, 2024