Starred repositories
A General Framework for Jersey Number Recognition in Sports Video
An implementation of "CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model".
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
Code for ALBEF: a new vision-language pre-training method
LAVIS - A One-stop Library for Language-Vision Intelligence
Tool for robust segmentation of >100 important anatomical structures in CT and MR images
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
The largest pre-trained medical image segmentation model (1.4B parameters) based on the largest public dataset (>100k annotations), up until April 2023.
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
✨✨Latest Advances on Multimodal Large Language Models
Chinese and English multimodal conversational language model
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
Nightly release of ControlNet 1.1
Curated list of project-based tutorials
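A collection of public Chinese medical NLP resources: terminologies, corpora, word embeddings, pre-trained models, knowledge graphs, named entity recognition, QA, information extraction, models, papers, etc.
Llama Chinese community. Online demos and fine-tuned models for Llama3 are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3. Aims to build the best Chinese Llama large model, fully open source and commercially usable.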
Supervised Multimodal Bitransformers for Classifying Images and Text
State-of-the-Art Text Embeddings
Download images from Google, Bing, Baidu.
This may be the simplest implementation of DDPM. You can directly run Main.py to train the UNet on the CIFAR-10 dataset and see the amazing process of denoising.
A simple Jekyll theme for words and pictures.