Stars
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
[MM24 Oral] Identity-Driven Multimedia Forgery Detection via Reference Assistance
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
Compact Bilinear Pooling for PyTorch
[ICCV 2021] Continual Learning for Image-Based Camera Localization.
This is the PyTorch implementation of LwF
Official PyTorch implementation of "Multi-Head Distillation for Continual Unsupervised Domain Adaptation in Semantic Segmentation"
Multi-level Online Sequential Experts (MOSE) for the online continual learning problem (CVPR 2024)
Continual Learning Method RAWM for ICML 2023
Relevant papers in Continual Learning
[ACM MM'23] UMMAFormer: A Universal Multimodal-adaptive Transformer Framework For Temporal Forgery Localization
Continual Learning Method RWM for AAAI 2024
The official repository for the ICASSP paper "Hearing and Seeing Abnormality: Self-Supervised Audio-Visual Mutual Learning for Deepfake Detection".
Paper list and datasets for industrial image anomaly/defect detection (updating).
Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
A curated list of awesome anomaly detection resources
NOMAD is a fully unsupervised non-matching reference audio quality metric
Official code for Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection (CVPR 2022)
[ACM MM] AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset
Code for Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection
Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV 2022)
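Recurrent Video Restoration Transformer with Guided Deformable Attention (NeurIPS 2022, official repository)
Provides a practical interaction interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-translation for Python, C++, and other projects, PDF/LaTeX paper translation and summarization, parallel querying of multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…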
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks