Reference implementation for DPO (Direct Preference Optimization)

Python 1,887 146 Updated May 23, 2024

The source code for the gym-microrts paper.

Python 39 3 Updated Aug 5, 2022

Code for the paper Fine-Tuning Language Models from Human Preferences

Python 1,173 163 Updated Jul 25, 2023

A curated list of reinforcement learning with human feedback resources (continually updated)

3,076 194 Updated Jun 24, 2024

Cramming the training of a (BERT-type) language model into limited compute.

Python 1,263 101 Updated Jun 13, 2024

Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".

Jupyter Notebook 280 13 Updated Jun 11, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 13,848 1,239 Updated Jul 20, 2024

Chain-of-Hindsight, A Scalable RLHF Method

Python 212 17 Updated Sep 30, 2023

A native PyTorch Library for large model training

Python 1,361 117 Updated Jul 20, 2024

Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

TypeScript 2,535 285 Updated Jul 20, 2024

Maze datasets for investigating out-of-distribution (OOD) behavior of ML systems

Jupyter Notebook 14 3 Updated Jul 19, 2024

Examples and guides for using the OpenAI API

MDX 57,740 9,118 Updated Jul 20, 2024

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,372 488 Updated Jul 13, 2024

A library for advanced large language model reasoning

Python 1,006 81 Updated Jul 19, 2024

A benchmark for evaluating learning agents based on just language feedback

Python 46 10 Updated Jul 8, 2024

Fast bare-bones BPE for modern tokenizer training

Python 129 2 Updated Dec 19, 2023

The official PyTorch implementation of Google's Gemma models

Python 5,176 491 Updated Jul 11, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 8,771 799 Updated Jul 1, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 23,161 2,387 Updated Jul 19, 2024

MLX: An array framework for Apple silicon

C++ 15,842 903 Updated Jul 19, 2024
Python 375 30 Updated Jul 16, 2024

(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life

Python 295 9 Updated Jul 10, 2024

The OS for your personal finances

Ruby 28,805 2,185 Updated Jul 20, 2024

Official code for "Reward-Free Curricula for Training Robust World Models", ICLR 2024.

Python 24 2 Updated Jan 24, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 34,377 3,591 Updated Jul 16, 2024

DQN Zoo is a collection of reference implementations of reinforcement learning agents developed at DeepMind based on the Deep Q-Network (DQN) agent.

Python 442 76 Updated Apr 6, 2024

Benchmark for "Offline Policy Comparison with Confidence"

Python 3 Updated Oct 25, 2023

Deep Hierarchical Planning from Pixels

Python 85 22 Updated Dec 21, 2022

SDK for creating whiteboards and canvas experiences on the web.

TypeScript 34,520 2,077 Updated Jul 20, 2024