Stars
Reference implementation for DPO (Direct Preference Optimization)
A curated list of reinforcement learning with human feedback resources (continually updated)
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.
Train transformer language models with reinforcement learning.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA
A high-throughput and memory-efficient inference and serving engine for LLMs
QLoRA: Efficient Finetuning of Quantized LLMs
Aligning Large Language Models with Human: A Survey
A playbook for systematically maximizing the performance of deep learning models.
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods.
Wrappers and utilities for Nvidia IsaacGym
Python Implementation of Reinforcement Learning: An Introduction
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix, etc. Blog: mlengineer.io.
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
Libraries, tools and tasks created and used at DeepMind Robotics.