- Germany
- https://elizazhou96.github.io/
- @Yutong96Sweet
Stars
Official Implementation of 'Inserting Anybody in Diffusion Models via Celeb Basis'
Contextual Object Detection with Multimodal Large Language Models
[ECCV 2024 Workshop🎈] The first agriculture benchmark to evaluate MM-LLMs.
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
✨✨Latest Advances on Multimodal Large Language Models
Accelerating the development of large multimodal models (LMMs) with lmms-eval
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
A list of Text-to-Video and Image-to-Video works
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey [Miyai+, arXiv2024]
Interactive Tools for Machine Learning, Deep Learning and Math
This is the official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine"
Official Implementation for "HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach"
Integrated Image-based Deep Learning and Language Models for Primary Diabetes Care
Medical SAM 2: Segment Medical Images As Video Via Segment Anything Model 2
(ICML 2024) Spider: A Unified Framework for Context-dependent Concept Segmentation
Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"
Official inference repo for FLUX.1 models
A curated list of awesome resources for camouflaged/concealed object detection (COD).
(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
A complete alternative for Overleaf with VSCode + Web + Git Integration + Copilot + Grammar & Spell Checker + Live Collaboration Support. Based on GitHub Codespace and Dev container.
[CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 30+ benchmarks
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…