shreyansh26

👨‍🎓

Always Learning

Shreyansh Singh shreyansh26

Lead ML Engineer at Level AI. Ex- AI at @Mastercard, @Samsung Research. CS Grad, IIT (BHU) Varanasi.

277 followers · 232 following

Achievements

Highlights

Developer Program Member

Organizations

Lists (4)

AI Tools

Aligning LLMs

LLM Reasoning

MLSys

Starred repositories

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 353 17 Updated Oct 19, 2024

dottxt-ai / outlines

Structured Text Generation

Python 8,804 437 Updated Oct 18, 2024

sgl-project / sgl-learning-materials

Materials for learning SGLang

38 3 Updated Oct 18, 2024

1rgs / jsonformer

A Bulletproof Way to Generate Structured JSON from Language Models

Jupyter Notebook 4,423 155 Updated Feb 24, 2024

srush / awesome-o1

TeX 356 11 Updated Oct 18, 2024

SouthBridgeAI / diagen

TypeScript 62 1 Updated Oct 16, 2024

hrishioa / lumentis

AI powered one-click comprehensive docs from transcripts and text.

TypeScript 1,549 97 Updated Aug 30, 2024

openreasoner / openr

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 675 42 Updated Oct 18, 2024

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 2,736 268 Updated Oct 17, 2024

xjdr-alt / entropix

Entropy Based Sampling and Parallel CoT Decoding

TypeScript 2,675 275 Updated Oct 16, 2024

cchan / tccl

extensible collectives library in triton

Python 56 2 Updated Sep 23, 2024

mosaicml / llm-foundry

LLM training code for Databricks foundation models

Python 4,018 524 Updated Oct 19, 2024

pytorch / torchtitan

A native PyTorch Library for large model training

Python 2,458 182 Updated Oct 18, 2024

charlesfrye / cuda-substrings

Because it's there.

Python 14 1 Updated Sep 22, 2024

ColfaxResearch / cutlass-kernels

Cuda 150 28 Updated Jul 11, 2024

Infini-AI-Lab / MagicDec

Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding

JavaScript 64 4 Updated Oct 2, 2024

FasterDecoding / TEAL

Python 86 2 Updated Sep 24, 2024

andy-yang-1 / DoubleSparse

16-fold memory access reduction with nearly no loss

Python 46 1 Updated Aug 18, 2024

mobiusml / gemlite

Simple and fast low-bit matmul kernels in CUDA / Triton

Python 109 8 Updated Oct 18, 2024

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

4,711 259 Updated Oct 18, 2024

triton-inference-server / vllm_backend

Python 179 19 Updated Oct 9, 2024

thecharlieblake / lovely-llama

An implementation of the Llama architecture, to instruct and delight

Python 21 Updated Aug 16, 2024

magicproduct / hash-hop

Long context evaluation for large language models

Python 183 15 Updated Oct 14, 2024

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Cuda 598 24 Updated Sep 21, 2024

shadowpa0327 / Palu

Code for Palu: Compressing KV-Cache with Low-Rank Projection

Python 48 2 Updated Oct 3, 2024

galeselee / Awesome_LLM_System-PaperList

Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…

157 6 Updated Oct 17, 2024

pytorch-labs / applied-ai

Applied AI experiments and examples for PyTorch

Python 146 12 Updated Oct 18, 2024

Bruce-Lee-LY / cuda_hgemm

Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

Cuda 279 65 Updated Sep 8, 2024

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python 3,238 173 Updated Oct 17, 2024

samkhur006 / awesome-llm-planning-reasoning

A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning materials.

159 10 Updated Aug 28, 2024

Starred topics

Natural language processing

embeddings

Twitter

Tensorflow

Machine learning

Deep learning