Stars
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Fast and customizable framework for automatic ML model creation (AutoML)
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Source code for all Elastic connectors, developed by the Search team at Elastic, and home of our Python connector development framework
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Sparsity-aware deep learning inference runtime for CPUs
LLM Workshop by Sourab Mangrulkar
A Native-PyTorch Library for LLM Fine-tuning
Open-Sora: Democratizing Efficient Video Production for All
PyTorch code and models for V-JEPA self-supervised learning from video.
Describes the IndicNLP corpus and associated datasets
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Custom data types and layouts for training and inference
Faster PyTorch bitsandbytes 4-bit FP4 nn.Linear ops
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Official repository for 'GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation'
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.