- Seattle, WA
- https://rasley.io
- @jeffra45
Stars
Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines (see the Trainer sketch after this list)
Machine Learning Engineering Open Book
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Pretrained language model with 100B parameters
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
Code release for SLIP: Self-supervision meets Language-Image Pre-training
Library for 8-bit optimizers and quantization routines.
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Distribution-transparent machine learning experiments on Apache Spark
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Guide: Fine-tune GPT-2 XL (1.5 billion parameters) and GPT-Neo (2.7B) on a single GPU with Hugging Face Transformers using DeepSpeed
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
RDMA and SHARP plugins for the NCCL library
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective (see the initialization sketch after this list)
A minimal & modern LaTeX template for your (bachelor's | master's | doctoral) thesis
Find the smallest number of switches necessary to build topologies of a given number of hosts and bisection bandwidth for the EGFT, HyperX, and Jellyfish topologies.
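The Trainer-focused repository above keeps its scripts short because the Hugging Face Trainer API owns the training loop. A minimal sketch of that pattern, assuming an illustrative DistilBERT/IMDB text-classification setup; the model name, dataset, and hyperparameters below are assumptions for illustration, not values taken from the starred repository.

```python
# Minimal Hugging Face Trainer sketch; model, dataset, and hyperparameters
# are illustrative assumptions, not values from the starred repository.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

dataset = load_dataset("imdb")

def tokenize(batch):
    # Tokenize raw text into fixed-length inputs for the model
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(output_dir="out",
                         per_device_train_batch_size=8,
                         num_train_epochs=1)

# Trainer handles the loop, batching, device placement, and checkpointing
trainer = Trainer(model=model, args=args,
                  train_dataset=encoded["train"].shuffle(seed=0).select(range(1000)))
trainer.train()
```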
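DeepSpeed, listed above, wraps a PyTorch model in an engine that manages the optimizer, mixed precision, and ZeRO partitioning. A minimal initialization sketch, with a placeholder model and config values that are illustrative assumptions rather than settings from the repository; in practice such a script is launched with the `deepspeed` launcher.

```python
# Minimal DeepSpeed initialization sketch; the model and config values are
# illustrative assumptions. Normally launched via: deepspeed this_script.py
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # placeholder model for illustration

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # partition optimizer state and gradients
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler)
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

inputs = torch.randn(8, 512).to(engine.device)
loss = engine(inputs).pow(2).mean()  # forward pass through the engine
engine.backward(loss)                # engine-managed backward (handles loss scaling)
engine.step()                        # optimizer step + gradient zeroing
```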