koulanurag

🏠

Working from home

Anurag Koul koulanurag

PostDoc @ Microsoft Research | Ph.D. - Oregon State University | Deep Reinforcement Learning

86 followers · 31 following

Microsoft
New York, New York
https://koulanurag.dev
@koulanurag
in/koulanurag

Achievements

x3 x2

Stars

eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Python 1,887 146 Updated May 23, 2024

sarthakrastogi / quality-prompts

Python 603 37 Updated Jul 20, 2024

vwxyzjn / gym-microrts-paper

The source code for the gym-microrts paper.

Python 39 3 Updated Aug 5, 2022

openai / lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Python 1,173 163 Updated Jul 25, 2023

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

3,076 194 Updated Jun 24, 2024

JonasGeiping / cramming

Cramming the training of a (BERT-type) language model into limited compute.

Python 1,263 101 Updated Jun 13, 2024

facebookresearch / searchformer

Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".

Jupyter Notebook 280 13 Updated Jun 11, 2024

KindXiaoming / pykan

Kolmogorov Arnold Networks

Jupyter Notebook 13,848 1,239 Updated Jul 20, 2024

forhaoliu / chain-of-hindsight

Chain-of-Hindsight, A Scalable RLHF Method

Python 212 17 Updated Sep 30, 2023

pytorch / torchtitan

A native PyTorch Library for large model training

Python 1,361 117 Updated Jul 20, 2024

cohere-ai / cohere-toolkit

Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

TypeScript 2,535 285 Updated Jul 20, 2024

understanding-search / maze-dataset

maze datasets for investigating OOD behavior of ML systems

Jupyter Notebook 14 3 Updated Jul 19, 2024

openai / openai-cookbook

Examples and guides for using the OpenAI API

MDX 57,740 9,118 Updated Jul 20, 2024

pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,372 488 Updated Jul 13, 2024

maitrix-org / llm-reasoners

A library for advanced large language model reasoning

Python 1,006 81 Updated Jul 19, 2024

microsoft / LLF-Bench

A benchmark for evaluating learning agents based on just language feedback

Python 46 10 Updated Jul 8, 2024

gautierdag / bpeasy

Fast bare-bones BPE for modern tokenizer training

Python 129 2 Updated Dec 19, 2023

google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Python 5,176 491 Updated Jul 11, 2024

karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 8,771 799 Updated Jul 1, 2024

rasbt / LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 23,161 2,387 Updated Jul 19, 2024

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ 15,842 903 Updated Jul 19, 2024

alexmolas / microsearch

Python 375 30 Updated Jul 16, 2024

VIRL-Platform / VIRL

(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life

Python 295 9 Updated Jul 10, 2024

maybe-finance / maybe

The OS for your personal finances

Ruby 28,805 2,185 Updated Jul 20, 2024

marc-rigter / waker

Official code for "Reward-Free Curricula for Training Robust World Models", ICLR 2024.

Python 24 2 Updated Jan 24, 2024

mlabonne / llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 34,377 3,591 Updated Jul 16, 2024

google-deepmind / dqn_zoo

DQN Zoo is a collection of reference implementations of reinforcement learning agents developed at DeepMind based on the Deep Q-Network (DQN) agent.

Python 442 76 Updated Apr 6, 2024

koulanurag / opcc

Benchmark for "Offline Policy Comparison with Confidence"

Python 3 Updated Oct 25, 2023

danijar / director

Deep Hierarchical Planning from Pixels

Python 85 22 Updated Dec 21, 2022

tldraw / tldraw

SDK for creating whiteboards and canvas experiences on the web.

TypeScript 34,520 2,077 Updated Jul 20, 2024
