Stars
Library for fast text representation and classification.
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024).
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).
👨‍💻 An awesome, curated list of the best code LLMs for research.
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Open-source vector similarity search for Postgres
A complement to pgvector for high performance, cost efficient vector search on large workloads.
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Use multiple proxies with Scrapy
Fast and memory-efficient exact attention
Build and run containers leveraging NVIDIA GPUs
An extremely fast Python package installer and resolver, written in Rust.
Robust recipes to align language models with human and AI preferences
The easiest way to use Agentic RAG in any enterprise
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
Multilingual Sentence & Image Embeddings with BERT
Python dependency injection framework, inspired by Guice
Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.