Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep lear…

Python · 23,570 stars · 4,550 forks · Updated Oct 15, 2023

karpathy / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python · 19,409 stars · 2,402 forks · Updated Apr 28, 2024

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python · 19,122 stars · 2,438 forks · Updated Jul 7, 2024

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python · 14,949 stars · 1,424 forks · Updated Jul 8, 2024

facebookresearch / detr

End-to-End Object Detection with Transformers

Python · 13,097 stars · 2,374 forks · Updated Mar 12, 2024

openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python · 11,061 stars · 750 forks · Updated Jul 2, 2024

rtqichen / torchdiffeq

Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.

Python · 5,347 stars · 907 forks · Updated Oct 19, 2023

open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox

Python · 4,184 stars · 735 forks · Updated Jun 2, 2024

mlfoundations / open_flamingo

An open-source framework for training large multimodal models.

Python · 3,562 stars · 270 forks · Updated May 25, 2024

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python · 2,589 stars · 236 forks · Updated Jun 4, 2024

THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Python · 1,466 stars · 79 forks · Updated Jul 8, 2024

qiuqiangkong / audioset_tagging_cnn

Python · 1,279 stars · 247 forks · Updated Jul 13, 2021

OFA-Sys / ONE-PEACE

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Python · 884 stars · 54 forks · Updated Jun 27, 2024

THUDM / SwissArmyTransformer

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

Python · 867 stars · 83 forks · Updated Jun 12, 2024

mbzuai-oryx / groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python · 667 stars · 31 forks · Updated Jun 2, 2024