Danmarks Tekniske Universitet
Kongens Lyngby, Danmark
@holidaypiggy233
Stars
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
[ECCV 2024] Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention
Outlier detection challenge 2024 - a DTU Compute summer school challenge
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Humengge / act-plus-plus
Forked from MarkFzp/act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
LLMs interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
Replaces the Mask2Former backbone with a ViT model pretrained with DINOv2
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
philferriere / cocoapi
Forked from cocodataset/cocoapi
Clone of COCO API - Dataset @ https://cocodataset.org/ - with changes to support Windows build and python3
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
A state-of-the-art open visual language model | multimodal pretrained model
Official implementation of "Controllable Prompt Tuning For Balancing Group Distributional Robustness" (ICML 2024), coming soon.
Official Implementation of Avoiding spurious correlations via logit correction
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning (ICML 2023)
[ECCV2024 Oral🔥] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
Official PyTorch implementation of ChAda-ViT [CVPR 2024]
[MICCAI'2024] EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera
[IPCAI'2024 (IJCARS special issue)] Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery