
Empowering Unified MLLM with Multi-granular Visual Generation

105 1 Updated Oct 18, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 6,015 465 Updated Nov 15, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 3,099 188 Updated Oct 4, 2024

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,825 131 Updated Nov 13, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,046 385 Updated Aug 7, 2024

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,649 2,224 Updated Jul 29, 2024

A curated list of awesome LLM for Autonomous Driving resources (continually updated)

1,015 52 Updated Sep 25, 2024

[CVPR2024] The code for "MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction"

Python 88 5 Updated Apr 13, 2024

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

1,194 58 Updated Mar 14, 2024

PytorchAutoDrive: Segmentation models (ERFNet, ENet, DeepLab, FCN...) and Lane detection models (SCNN, RESA, LSTR, LaneATT, BézierLaneNet...) based on PyTorch with fast training, visualization, ben…

Python 851 138 Updated Oct 4, 2023

[ICLR'23 Spotlight & IJCV'24] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction

Python 1,131 172 Updated Oct 28, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Python 2,521 154 Updated Oct 10, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,653 614 Updated Nov 14, 2024

✨✨Latest Advances on Multimodal Large Language Models

12,662 809 Updated Nov 10, 2024

LlamaIndex is a data framework for your LLM applications

Python 36,747 5,272 Updated Nov 15, 2024

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Python 723 52 Updated Mar 25, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,239 2,236 Updated Aug 12, 2024

Aligning pretrained language models with instruction data generated by themselves.

Python 4,154 486 Updated Mar 27, 2023

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 32,269 4,758 Updated Nov 14, 2024

A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.

241 10 Updated Jul 1, 2024

T2I-Adapter

Python 3,473 207 Updated Jun 21, 2024

Official code for "DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics" (NeurIPS 2023)

Jupyter Notebook 100 5 Updated Mar 25, 2024
Python 1 Updated Jan 11, 2024

An Open-source Toolkit for LLM Development

Python 2,720 176 Updated May 24, 2024

Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"

Python 1,373 86 Updated May 31, 2023

Using Low-rank adaptation to quickly fine-tune diffusion models.

Jupyter Notebook 7,063 481 Updated Mar 22, 2024

Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch

Python 1,253 129 Updated May 3, 2024

Official PyTorch implementation for a conditional diffusion probability model in BEV perception

Python 247 12 Updated Apr 4, 2023

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

Python 950 42 Updated Jan 17, 2024

Layout-Guided multi-view driving scene video generation with latent diffusion model

Python 557 15 Updated Dec 15, 2023