The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 31,070 4,667 Updated Aug 8, 2024

czczup / FAST

Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation

Python 144 19 Updated Mar 11, 2024

OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 10,094 706 Updated Aug 12, 2024

Lyken17 / pytorch-OpCounter

Count the MACs / FLOPs of your PyTorch model.

Python 4,806 523 Updated Jul 8, 2024

roatienza / deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Jupyter Notebook 286 58 Updated Apr 9, 2024

csxmli2016 / MARCONet

Learning Generative Structure Prior for Blind Text Image Super-resolution [CVPR 2023]

Python 188 14 Updated Jan 2, 2024

FangShancheng / ABINet

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Jupyter Notebook 420 72 Updated Oct 14, 2022

ThunderVVV / RCLSTR

Official PyTorch implementation of `[ACMMM 2023]Relational Contrastive Learning for Scene Text Recognition`

Python 16 1 Updated Sep 22, 2023

THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1,744 97 Updated Jul 31, 2024

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 300+ LLMs or 60+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 2,829 245 Updated Aug 12, 2024

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model with performance approaching GPT-4o

Python 4,941 383 Updated Aug 9, 2024

X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 1,188 72 Updated Jul 16, 2024

wenyu1009 / RTSRN

Python 16 2 Updated Sep 19, 2023

baudm / parseq

Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)

Python 539 123 Updated May 29, 2024

Yuliang-Liu / Monkey

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,632 113 Updated Aug 12, 2024

csguoh / LEMMA

[IJCAI2023] An official implementation of the paper "Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement"

Python 49 5 Updated Sep 1, 2023

01-ai / Yi

A series of large language models trained from scratch by developers @01-ai

Jupyter Notebook 7,550 460 Updated Aug 9, 2024

deepseek-ai / DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 1,941 184 Updated Apr 24, 2024

Ucas-HaoranWei / Vary

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,685 151 Updated Jul 21, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

11,080 732 Updated Aug 12, 2024

D641593 / MixNet

Python 56 7 Updated Oct 12, 2023

yeungchenwa / Recommendations-Diffusion-Text-Image

A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super resolution, text editing, handwritten ge…

172 4 Updated Aug 2, 2024

mjq11302010044 / TPGSR

Code for Text Prior Guided Scene Text Image Super-Resolution (TIP 2023)

Python 135 17 Updated Dec 12, 2023

FudanVI / FudanOCR

A toolbox of scene text super-resolution and recognition

Python 341 61 Updated Jul 25, 2024

mjq11302010044 / TATT

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

Python 166 17 Updated Jun 30, 2022

HumanAIGC / EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

7,307 886 Updated Aug 6, 2024

Dod-o / Statistical-Learning-Method_Code

Hand-written implementations of all the algorithms in Li Hang's book "Statistical Learning Methods"

Python 10,928 2,860 Updated Nov 25, 2023

OneYearIsEnough
