Starred repositories
[MICCAI 2024] Official code repository of the paper titled "BAPLe: Backdoor Attacks on Medical Foundation Models using Prompt Learning", accepted at the MICCAI 2024 conference.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Official implementation of paper titled "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model"
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
CoreNet: A library for training deep neural networks
MobiLlama: Small Language Model tailored for edge devices
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Efficient Video Object Segmentation via Modulated Cross-Attention Memory
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
[MICCAI 2023][Early Accept] Official code repository of paper titled "Cross-modulated Few-shot Image Generation for Colorectal Tissue Classification"
How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges
[MICCAI 2023] Official code repository of the paper titled "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation", accepted at the MICCAI 2023 conference.
[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.
[EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabic languages.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
[ICCV'23] Official repository of paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Implementation for ECCV 2022 paper Language-Grounded Indoor 3D Semantic Segmentation in the Wild
[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
[NeurIPS 2022] Official repository of paper titled "Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection".
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
This repo includes a curated collection of ChatGPT prompts for using ChatGPT more effectively.