Stars
The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
Using Claude Opus to reverse engineer code from MegaPortraits: One-shot Megapixel Neural Head Avatars
A latent text-to-image diffusion model
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
DeepFake Face Datasets. Code accompanying the paper "Robustness and Generalizability of Deepfake Detection: A Study with Diffusion Models".
[VISAPP2024] Towards the Detection of Diffusion Model Deepfakes
FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, Comfy…
Image-to-Image Translation in PyTorch
Non-local Neural Networks for Video Classification
Convolutional neural network model for video classification trained on the Kinetics dataset.
Tutorial for video classification/action recognition using 3D CNN and CNN+RNN on UCF101
Video classification tools using 3D ResNet
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
Open-Sora: Democratizing Efficient Video Production for All
Unofficial implementation of FSD50k baselines for Sound Event Recognition
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Sequence modeling benchmarks and temporal convolutional networks
RetinaFace reaches 80.99% on the WIDER FACE hard validation set using MobileNet0.25.
🧠 A PyTorch implementation of "Deep CORAL: Correlation Alignment for Deep Domain Adaptation" (ECCV 2016)
This is the implementation for the NeurIPS 2022 paper: ZIN: When and How to Learn Invariance Without Environment Partition?
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
[TPAMI 2024 & CVPR 2023] PyTorch code for DGM4: Detecting and Grounding Multi-Modal Media Manipulation and beyond
[ICCV 2021] Released code for Causal Attention for Unbiased Visual Recognition