Official inference repo for FLUX.1 models

Python 4,147 234 Updated Aug 5, 2024

[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Python 115 2 Updated Jul 23, 2024

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Python 113 Updated Jul 28, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 8,390 476 Updated Aug 5, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 390 17 Updated Jul 31, 2024

This is the official implementation for ControlVAR.

Python 14 1 Updated Jul 29, 2024

Implementation of Autoregressive Diffusion in Pytorch

Python 224 2 Updated Jul 30, 2024

GoldFinch and other hybrid transformer components

Python 35 3 Updated Jul 20, 2024

EVE: Encoder-Free Vision-Language Models

Python 181 3 Updated Jul 20, 2024
Python 24 Updated Aug 5, 2024

Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks

Python 385 14 Updated Jul 28, 2024

LLM101n: Let's build a Storyteller

26,543 1,432 Updated Aug 1, 2024

Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" (ICML 2024)

Python 11 Updated Jul 15, 2024

[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model

Python 93 1 Updated Aug 5, 2024

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Python 81 1 Updated Jul 31, 2024

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 886 50 Updated Jul 14, 2024

This is the official implementation of "GvSeg: General and Task-Oriented Video Segmentation" (Accepted at ECCV 2024).

5 1 Updated Jul 15, 2024

Streamlit — A faster way to build and share data apps.

Python 33,800 2,946 Updated Aug 5, 2024

Code release for "Segment Anything without Supervision"

Jupyter Notebook 261 17 Updated Jul 9, 2024

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python 577 29 Updated Aug 5, 2024

From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"

Python 1,851 73 Updated Aug 5, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 24,090 2,528 Updated Aug 5, 2024

[ECCV 2024] ControlCap: Controllable Region-level Captioning

Python 42 Updated Jul 2, 2024

Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"

Python 128 3 Updated Aug 5, 2024

This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"

Python 79 4 Updated Jul 8, 2024

Point-SAM: This is the official repository of "Point-SAM: Promptable 3D Segmentation Model for Point Clouds". We provide code for running our demo and links to download checkpoints.

Python 79 5 Updated Jul 1, 2024

Image Prompter for Gradio

JavaScript 61 9 Updated Dec 14, 2023

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 31,393 2,339 Updated Aug 5, 2024

The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"

Python 79 7 Updated Jul 31, 2024

Official repository for the paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning (https://arxiv.org/abs/2406.17770).

Python 126 2 Updated Jul 19, 2024