Stars

Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.

Python 4 Updated Jul 15, 2024

Retrieval and Retrieval-augmented LLMs

Python 6,256 450 Updated Jul 30, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,613 99 Updated Jul 26, 2024

Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595)

Python 52 2 Updated Jul 12, 2024

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 3,468 281 Updated Jul 30, 2024

Boosting Driving Scene Understanding with Advanced Vision-Language Models

Python 31 2 Updated May 19, 2023

Evaluation code for various unsupervised automated metrics for Natural Language Generation.

Python 1,322 222 Updated Jun 24, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 129,979 25,830 Updated Jul 31, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,438 2,023 Updated Jul 14, 2024
Python 65 3 Updated May 8, 2023
Python 713 45 Updated Jul 8, 2024

Awesome Incremental Learning

3,629 558 Updated Jul 17, 2024

[AAAI2024] Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

Python 157 22 Updated Apr 2, 2024

[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

Python 836 127 Updated Oct 11, 2023

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Python 484 25 Updated Jun 11, 2024

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 9,319 921 Updated Jul 17, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,625 238 Updated Jun 4, 2024

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…

Jupyter Notebook 6,774 1,041 Updated Mar 15, 2024

Multimodal-GPT

Python 1,450 120 Updated Jun 4, 2023

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 2,904 237 Updated Jul 5, 2024

GIT: A Generative Image-to-text Transformer for Vision and Language

Python 539 67 Updated Dec 2, 2023

A central hub for gathering and showcasing amazing projects that extend OpenMMLab with SAM and other exciting features.

Python 1,066 122 Updated Mar 29, 2024

Google Brain AutoML

Jupyter Notebook 6,190 1,445 Updated Apr 2, 2024

A PyTorch re-implementation of the official EfficientDet with SOTA real-time performance and pretrained weights.

Jupyter Notebook 5,199 1,268 Updated Oct 24, 2021

OpenMMLab Detection Toolbox and Benchmark

Python 28,730 9,323 Updated Jul 27, 2024