ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 2,394 221 Updated Jul 16, 2024

THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1,580 84 Updated Jul 16, 2024

THUDM / GLM-4

GLM-4 series: Open-source multilingual multimodal chat LMs

Python 3,738 271 Updated Jul 16, 2024

poloclub / unitable

UniTable: Towards a Unified Table Foundation Model

Jupyter Notebook 286 15 Updated Jun 4, 2024

yannqi / COMBO-AVS

[CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation

Python 26 3 Updated Jul 5, 2024

facebookresearch / detr

End-to-End Object Detection with Transformers

Python 13,124 2,379 Updated Mar 12, 2024

open-compass / VLMEvalKit

Open-source evaluation toolkit of large vision-language models (LVLMs), support GPT-4v, Gemini, QwenVLPlus, 50+ HF models, 20+ benchmarks

Python 700 82 Updated Jul 16, 2024

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model with performance approaching GPT-4V.

Python 4,166 312 Updated Jul 12, 2024

Kedreamix / Linly-Talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…

Python 1,406 237 Updated Jul 9, 2024

VamosC / CLIP4STR

An implementation of "CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model".

Python 90 12 Updated May 3, 2024

OpenBMB / MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.

Python 4,439 322 Updated Jul 16, 2024

AILab-CVC / YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 3,969 387 Updated Jul 8, 2024

fujingling

Stars

Yuliang-Liu / MultimodalOCR

QwenLM / Qwen-VL

yuzhimanhua / Awesome-Scientific-Language-Models

clovaai / donut

xinke-wang / OCRDatasets

cv-small-snails / Text-Recognition-Material

WenmuZhou / OCR_DataSet

microsoft / Megatron-DeepSpeed

Tencent / HunyuanDiT

Tencent / MimicMotion

TMElyralab / MusePose

huggingface / text-generation-inference

vllm-project / vllm

huggingface / tokenizers

microsoft / DeepSpeed

huggingface / diffusion-models-class

nftblackmagic / Diffusion-Tryon-Trainer

huggingface / diffusers

modelscope / swift