Reference implementation for DPO (Direct Preference Optimization)

Python 2,062 167 Updated Aug 11, 2024

A curated list of reinforcement learning with human feedback resources (continually updated)

3,301 203 Updated Aug 30, 2024

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 4,555 805 Updated Mar 5, 2024

ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.

Python 2,355 140 Updated Jun 22, 2024

Train transformer language models with reinforcement learning.

Python 9,627 1,210 Updated Oct 7, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 16,029 1,573 Updated Oct 7, 2024

Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA

Python 100 12 Updated Jul 29, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,954 4,128 Updated Oct 8, 2024

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 9,962 820 Updated Jun 10, 2024

Aligning Large Language Models with Human: A Survey

679 30 Updated Sep 11, 2023

A playbook for systematically maximizing the performance of deep learning models.

26,643 2,213 Updated Jun 18, 2024
Python 28 8 Updated Sep 5, 2021

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,681 666 Updated Jan 14, 2024

🐙 Guides, papers, lectures, notebooks and resources for prompt engineering

MDX 49,159 4,767 Updated Sep 19, 2024

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

Python 6,965 775 Updated Oct 6, 2024

A lightweight RL library inspired by salina

Python 15 11 Updated Sep 5, 2024

Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods.

Jupyter Notebook 857 119 Updated Jul 25, 2024

RL Environments in JAX 🌍

Python 616 61 Updated Jul 4, 2024

RL implementations

Jupyter Notebook 864 146 Updated Oct 1, 2024

Wrappers and utilities for Nvidia IsaacGym

Python 91 19 Updated Apr 16, 2022
Jupyter Notebook 2 Updated Jul 28, 2021

Python Implementation of Reinforcement Learning: An Introduction

Python 13,522 4,817 Updated Aug 9, 2024

Power-Law Distribution Analysis

Python 24 5 Updated Jul 1, 2019

JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.

Jupyter Notebook 614 65 Updated Oct 26, 2022
Jupyter Notebook 280 23 Updated Sep 26, 2024

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…

Python 3,573 829 Updated May 29, 2022

Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.

9,354 1,531 Updated Aug 31, 2023

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

C++ 1,081 99 Updated Aug 12, 2024

Libraries, tools and tasks created and used at DeepMind Robotics.

Python 339 35 Updated Sep 23, 2024