shreyansh26
👨‍🎓 Always Learning

Organizations

@COPS-IITBHU

Starred repositories

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 353 17 Updated Oct 19, 2024

Structured Text Generation

Python 8,804 437 Updated Oct 18, 2024

Materials for learning SGLang

38 3 Updated Oct 18, 2024

A Bulletproof Way to Generate Structured JSON from Language Models

Jupyter Notebook 4,423 155 Updated Feb 24, 2024
TeX 356 11 Updated Oct 18, 2024
TypeScript 62 1 Updated Oct 16, 2024

AI powered one-click comprehensive docs from transcripts and text.

TypeScript 1,549 97 Updated Aug 30, 2024

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 675 42 Updated Oct 18, 2024

Material for gpu-mode lectures

Jupyter Notebook 2,736 268 Updated Oct 17, 2024

Entropy Based Sampling and Parallel CoT Decoding

TypeScript 2,675 275 Updated Oct 16, 2024

extensible collectives library in triton

Python 56 2 Updated Sep 23, 2024

LLM training code for Databricks foundation models

Python 4,018 524 Updated Oct 19, 2024

A native PyTorch Library for large model training

Python 2,458 182 Updated Oct 18, 2024

Because it's there.

Python 14 1 Updated Sep 22, 2024

Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding

JavaScript 64 4 Updated Oct 2, 2024
Python 86 2 Updated Sep 24, 2024

16-fold memory access reduction with nearly no loss

Python 46 1 Updated Aug 18, 2024

Simple and fast low-bit matmul kernels in CUDA / Triton

Python 109 8 Updated Oct 18, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

4,711 259 Updated Oct 18, 2024

An implementation of the Llama architecture, to instruct and delight

Python 21 Updated Aug 16, 2024

Long context evaluation for large language models

Python 183 15 Updated Oct 14, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 598 24 Updated Sep 21, 2024

Code for Palu: Compressing KV-Cache with Low-Rank Projection

Python 48 2 Updated Oct 3, 2024

Since the emergence of ChatGPT in 2022, accelerating large language models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…

157 6 Updated Oct 17, 2024

Applied AI experiments and examples for PyTorch

Python 146 12 Updated Oct 18, 2024

Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

Cuda 279 65 Updated Sep 8, 2024

Efficient Triton Kernels for LLM Training

Python 3,238 173 Updated Oct 17, 2024

A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning materials.

159 10 Updated Aug 28, 2024