- Abu Dhabi, United Arab Emirates
- https://jaygala24.github.io
- @jaygala24
Stars
Automatically split your PyTorch models across multiple GPUs for training & inference
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective…
A lightweight library for generating synthetic instruction tuning datasets from your data without GPT.
Annotated version of the Mamba paper
AI4Bharat / IndicInstruct
Forked from allenai/open-instruct
Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
Machine Learning Engineering Open Book
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
A Unified Library for Parameter-Efficient and Modular Transfer Learning
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Additional resources from our AACL tutorial
Pytorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
An open collection of implementation tips, tricks and resources for training large language models
An open collection of methodologies to help with successful training of large language models.
prompt2model - Generate Deployable Models from Natural Language Instructions
Python programs, usually short, of considerable difficulty, to perfect particular skills.
Tutorial on neural theorem proving
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning PaLM with only five examples per language. We use the synthet…
LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath
MTEB: Massive Text Embedding Benchmark
A Multilingual Replicable Instruction-Following Model
Reading list for instruction tuning. The trend starts from Natural-Instruction (ACL 2022), FLAN (ICLR 2022), and T0 (ICLR 2022).
Fast & Simple repository for pre-training and fine-tuning T5-style models