Starred repositories
Code and data for the paper "Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation"
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
[CVPR 2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs such as miniGPT4, StableLM, and MOSS.
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Code for AUTOPLAN (Acta Automatica Sinica): using large language models for task planning and task execution on complex tasks.
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
DAMO-ConvAI: the official repository containing the codebase for Alibaba DAMO Conversational AI.
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
A repository maintained by the MLNLP community to help authors avoid small mistakes in paper submissions. Paper Writing Tips
[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Awesome papers & datasets specifically focused on long-term videos.
Code repository supporting the paper "Atlas: Few-shot Learning with Retrieval Augmented Language Models" (https://arxiv.org/abs/2208.03299)
Materials for the Hugging Face Diffusion Models Course
《动手做科研》(Hands-On Research) is aimed at research beginners, showing step by step how to get started in AI research.
Long Context Transfer from Language to Vision
This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
(CVPR 2024) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Public release for "Explore until Confident: Efficient Exploration for Embodied Question Answering"
Language Repository for Long Video Understanding
A collection of awesome text-to-image generation studies.