Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…

Jupyter Notebook 14,801 2,147 Updated Nov 1, 2024

facebookresearch / audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Python 2,702 254 Updated Sep 15, 2024

allenai / allennlp

An open-source NLP research library, built on PyTorch.

Python 11,756 2,251 Updated Nov 22, 2022

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,076 2,209 Updated Aug 12, 2024

Chuhanxx / helping_hand_for_egocentric_videos

Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'

Python 31 2 Updated Nov 7, 2023

facebookresearch / DomainBed

DomainBed is a suite to test domain generalization algorithms

Python 1,403 299 Updated Jul 26, 2024

sallymmx / ActionCLIP

This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"

Python 511 59 Updated Dec 6, 2023

voxel51 / fiftyone-examples

Examples of using FiftyOne

Python 200 37 Updated Mar 5, 2024

ninatu / learning_by_sorting

Official implementation of "Learning by Sorting: Self-supervised Learning with Group Ordering Constraints". ICCV 2023

Python 14 1 Updated Nov 12, 2023

ninatu / in_style

Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval". ICCV 2023

11 Updated Oct 5, 2023

ninatu / howtocaption

Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024

Python 45 Updated Oct 2, 2024

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,399 85 Updated Sep 23, 2024

microsoft / XPretrain

Multi-modality pre-training

Python 470 36 Updated May 8, 2024

OpenGVLab / unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Python 293 15 Updated May 27, 2024

muzairkhattak / multimodal-prompt-learning

[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".

Python 657 51 Updated Jul 24, 2023

explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Python 30,104 4,398 Updated Oct 23, 2024

Anna Kukleva Annusha

Lists (22)

awsome

backbones

captions

clustering

contrastve learning

diffusion_models

ego4d

few-shot

germany

learn

LLMs

long-tail

NCD

nlp

openset

resources

tech

time_transformers

transformers

video memory efficient

videos

work-in-progress

Stars