- Sungkyunkwan Univ
- South Korea
Lists (28)
CoIR
Contrastive
CoVR
Cross_modal_retrieval
Det
detr
diffusion
DVC
GENIR
i3d
long video
Math
medical diffusion
MICCAI
ML
PRVR
RefVOS
Reranking
Semes
SGG
T2P_retrieval
Uncertainty estimation
vcmr
VL_models
VLCL
VR
WrefS
WS_ML
Stars
The official implementation of DenoiseLoc: Boundary Denoising for Video Activity Localization, ICLR 2024
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]
[SIGIR'2024 Best Paper Honorable Mention] Official repository for "LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval"
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[ECCV 2024] Official code for "Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation"
[ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Collection of Composed Image Retrieval (CIR) papers.
Official PyTorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
Utilities intended for use with Llama models.
Official PyTorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024)
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
Noisy-Correspondence Learning for Text-to-Image Person Re-identification (CVPR 2024 PyTorch Code)
[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"
Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024
[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion