Starred repositories

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1,966 128 Updated Sep 3, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 1,844 103 Updated Sep 13, 2024

The official Python library for the OpenAI API

Python 21,992 3,029 Updated Sep 14, 2024

Implementation of paper - Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation.

Python 35 3 Updated Aug 12, 2024

Implementation of paper - Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation.

Python 14 1 Updated Aug 12, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,165 376 Updated Sep 13, 2024

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 536 56 Updated Jun 7, 2024

Use PEFT or Full-parameter to finetune 300+ LLMs or 80+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 3,401 292 Updated Sep 14, 2024

Kolors Team

Python 3,510 224 Updated Sep 4, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 25,131 5,191 Updated Sep 14, 2024

ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型

Python 13,325 1,546 Updated Jul 10, 2024

Dense Connector for MLLMs

Python 98 3 Updated Aug 19, 2024

AI2-THOR Data Collection Tool Based On Keyboard Interaction

Python 54 10 Updated Jun 21, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,782 140 Updated Sep 10, 2024

[IJCV 2024] The official implementation of "Pattern-Expandable Image Copy Detection"

Python 5 Updated Jul 13, 2024
Python 3 Updated Jul 16, 2024

This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".

Python 202 11 Updated Jul 25, 2024

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Python 103 1 Updated Aug 23, 2024

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Python 1,047 54 Updated Jul 17, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型

Python 4,594 363 Updated Sep 11, 2024

a state-of-the-art-level open visual language model | multimodal pre-trained model

Python 5,861 401 Updated May 29, 2024

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)

Python 252 10 Updated Aug 28, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,350 419 Updated Jul 30, 2024

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Python 3,278 281 Updated Aug 15, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance

Python 5,479 425 Updated Sep 10, 2024
Python 2,386 165 Updated Sep 14, 2024

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,025 82 Updated Aug 8, 2024

Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.

Python 191 12 Updated Jul 3, 2024

Open-source and strong foundation image recognition models.

Jupyter Notebook 2,742 265 Updated Aug 1, 2024

A Survey on Text-to-Video Generation/Synthesis.

565 74 Updated Jul 24, 2024