Stars
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep lear…
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
End-to-End Object Detection with Transformers
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.
OpenMMLab Text Detection, Recognition and Understanding Toolbox
An open-source framework for training large multimodal models.
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
GPT4V-level open-source multi-modal model based on Llama3-8B
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Audio Dataset for training CLAP and other models
Official PyTorch implementation of "A Comprehensive Overhaul of Feature Distillation" (ICCV 2019)
Code for "Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning"
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
Papers and resources related to the security and privacy of LLMs 🤖
A New Tamil Large Language Model (LLM) Based on Llama 2
MU-LLaMA: Music Understanding Large Language Model
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
Cross-modal background suppression for audio-visual event localization
A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)