- UIUC
- Champaign, Illinois
- https://mikewangwzhl.github.io/
- @zhenhailongW
Stars
Efficient vision foundation models for high-resolution generation and perception.
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Paper list accompanying the 86-page survey "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Implementation of Autoregressive Diffusion in Pytorch
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
📖 This is a repository for organizing papers, code, and other resources related to unified multimodal models.
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-V…
Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
This repo contains the code for our paper "An Image is Worth 32 Tokens for Reconstruction and Generation"
Paper list about multimodal and large language models, used only to record papers I read from the daily arXiv for personal reference.
Repo for paper: https://arxiv.org/abs/2404.06479
Code and data for the ACL 2024 Findings paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning"
Maze datasets for investigating OOD behavior of ML systems
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
Official repo for CVPR 2022 (Oral) paper: Revisiting the "Video" in Video-Language Understanding. Contains code for the Atemporal Probe (ATP).
Repo for paper "Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration"
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
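To give a sense of the idea behind the Tree of Thoughts entry above: the method searches over a tree of partial "thoughts", using the model itself to propose candidate next steps and to value partial reasoning chains, keeping only the most promising branches at each depth. The sketch below is a minimal, breadth-first outline in plain Python under stated assumptions; `generate_thoughts` and `score_thought` are hypothetical placeholders standing in for LLM calls, not the repository's API.

```python
# Minimal breadth-first Tree-of-Thoughts sketch (hypothetical helpers, not the repo's API).
from typing import Callable, List

def tree_of_thoughts(
    question: str,
    generate_thoughts: Callable[[str, List[str]], List[str]],  # proposes candidate next thoughts
    score_thought: Callable[[str, List[str]], float],          # values a partial chain of thoughts
    depth: int = 3,        # number of reasoning steps to explore
    beam_width: int = 5,   # branches kept after each step
) -> List[str]:
    frontier: List[List[str]] = [[]]  # each entry is a partial chain of thoughts
    for _ in range(depth):
        candidates: List[List[str]] = []
        for chain in frontier:
            for thought in generate_thoughts(question, chain):
                candidates.append(chain + [thought])
        # Keep the highest-valued partial chains (beam search over the thought tree).
        candidates.sort(key=lambda c: score_thought(question, c), reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0] if frontier else []
```

In the paper, an LLM fills both roles (proposing and evaluating thoughts), and depth-first search and other evaluation strategies are explored alongside this breadth-first variant.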