Amanda Dsouza amy12xx

MS CS @ Georgia Tech. Staff Applied Research Scientist. Interests in RL, NLP, Open Source, Python.

27 followers · 15 following

Stars

eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Python 2,062 167 Updated Aug 11, 2024

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

3,301 203 Updated Aug 30, 2024

alirezadir / Machine-Learning-Interviews

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 4,555 805 Updated Mar 5, 2024

helblazer811 / ManimML

ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.

Python 2,355 140 Updated Jun 22, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 9,627 1,210 Updated Oct 7, 2024

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 16,029 1,573 Updated Oct 7, 2024

leehanchung / lora-instruct

Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA

Python 100 12 Updated Jul 29, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,954 4,128 Updated Oct 8, 2024

artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 9,962 820 Updated Jun 10, 2024

GaryYufei / AlignLLMHumanSurvey

Aligning Large Language Models with Human: A Survey

679 30 Updated Sep 11, 2023

google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

26,643 2,213 Updated Jun 18, 2024

orhonovich / q-squared

Python 28 8 Updated Sep 5, 2021

lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,681 666 Updated Jan 14, 2024

dair-ai / Prompt-Engineering-Guide

🐙 Guides, papers, lectures, notebooks and resources for prompt engineering

MDX 49,159 4,767 Updated Sep 19, 2024

Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

Python 6,965 775 Updated Oct 6, 2024

osigaud / bbrl

A lightweight RL library inspired by salina

Python 15 11 Updated Sep 5, 2024

MrSyee / pg-is-all-you-need

Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods.

Jupyter Notebook 857 119 Updated Jul 25, 2024

RobertTLange / gymnax

RL Environments in JAX 🌍

Python 616 61 Updated Jul 4, 2024

Denys88 / rl_games

RL implementations

Jupyter Notebook 864 146 Updated Oct 1, 2024

iamlab-cmu / isaacgym-utils

Wrappers and utilities for Nvidia IsaacGym

Python 91 19 Updated Apr 16, 2022

halcy / LearningJAX

Jupyter Notebook 2 Updated Jul 28, 2021

ShangtongZhang / reinforcement-learning-an-introduction

Python Implementation of Reinforcement Learning: An Introduction

Python 13,522 4,817 Updated Aug 9, 2024

shagunsodhani / powerlaw

Power-Law Distribution Analysis

Python 24 5 Updated Jul 1, 2019

ikostrikov / jaxrl

JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.

Jupyter Notebook 614 65 Updated Oct 26, 2022

google-research / rlds

Jupyter Notebook 280 23 Updated Sep 26, 2024

ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…

Python 3,573 829 Updated May 29, 2022

khangich / machine-learning-interview

Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.

9,354 1,531 Updated Aug 31, 2023

clvrai / awesome-rl-envs

1,053 80 Updated May 27, 2024

sail-sg / envpool

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

C++ 1,081 99 Updated Aug 12, 2024

google-deepmind / dm_robotics

Libraries, tools and tasks created and used at DeepMind Robotics.

Python 339 35 Updated Sep 23, 2024