Davidup1
🎯
Focusing
  • BUPT
  • Beijing


[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Python 744 41 Updated Jul 21, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,568 118 Updated Aug 22, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 31,496 3,619 Updated Aug 23, 2024

Collection of AWESOME vision-language models for vision tasks

2,126 191 Updated Jul 24, 2024

Visualizing the attention of vision-language models

Jupyter Notebook 13 1 Updated Aug 7, 2024

A simple and elegant Jekyll theme for an academic personal homepage

CSS 578 476 Updated Aug 15, 2024

Personal homepage, my personal homepage, personal homepage source code, homepage template, homepage

Vue 3,075 1,859 Updated Aug 19, 2024

Evaluation code for various unsupervised automated metrics for Natural Language Generation.

Python 1,333 223 Updated Aug 20, 2024

RS5M: a large-scale vision language dataset for remote sensing

Python 189 7 Updated Jul 31, 2024

A professionally curated list of awesome resources (paper, code, data, etc.) on transformers in time series.

2,336 234 Updated Aug 8, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 9,897 725 Updated Aug 21, 2024

[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

Python 19 4 Updated Jul 28, 2024

A simple, easy-to-understand guide to fine-tuning LLaMA.

Python 332 33 Updated Jul 5, 2023

PyTorch implementation of 'A Decoupling Paradigm With Prompt Learning for Remote Sensing Image Change Captioning'

Python 29 1 Updated May 21, 2024

Typora plugin. Feature enhancement tool

JavaScript 1,535 77 Updated Aug 24, 2024

6 Updated Apr 24, 2023

Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"

135 8 Updated Jun 17, 2024

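😆 Which books have the leading internet technology experts in China and abroad written: computer fundamentals, networking, frontend, backend, databases, architecture, big data, deep learning...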

HTML 6,009 997 Updated Jun 19, 2024

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 2,804 200 Updated Jul 27, 2024

A generative speech model for daily dialogue.

Python 29,615 3,229 Updated Aug 24, 2024

UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization

84 7 Updated May 22, 2024

A massively parallel, high-level programming language

Rust 17,099 420 Updated Aug 23, 2024

Jupyter Notebook 1,112 545 Updated May 13, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,882 2,070 Updated Aug 12, 2024

Evaluate your LLM's response with Prometheus and GPT4 💯

Python 731 42 Updated Aug 13, 2024

When do we not need larger vision models?

Python 299 9 Updated Aug 19, 2024

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …

Python 3,944 298 Updated Jul 16, 2024

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Jupyter Notebook 624 36 Updated Jul 30, 2024

Painter & SegGPT Series: Vision Foundation Models from BAAI

Python 2,486 168 Updated Oct 31, 2023