- SUSTech
- Shenzhen, China
Starred repositories
A list of tools, papers and code related to Deepfake Detection.
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
Awesome work on word sense disambiguation in general
✨✨Latest Advances on Multimodal Large Language Models
PyTorch implementation of convolutional neural network adversarial attack techniques
A curated list of papers & resources on backdoor attacks and defenses in deep learning.
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsens…
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
PyTorch implementation of adversarial attacks [torchattacks]
Reading list on instruction tuning, a trend that started with Natural-Instruction (ACL 2022), FLAN (ICLR 2022), and T0 (ICLR 2022).
An open-source framework for training large multimodal models.
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize various modes of interaction, such as text, speech, images, and videos, to provide a sea…
Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Official implementation and data release of the paper "Visual Prompting via Image Inpainting".
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
LAVIS - A One-stop Library for Language-Vision Intelligence
ImageNet-R(endition) and DeepAugment (ICCV 2021)
EasyRobust: an Easy-to-use library for state-of-the-art Robust Computer Vision Research with PyTorch.