- Shanghai
Stars
fault-tolerant Python3 package for searching, navigating, and modifying LaTeX documents
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
Implementation of Nougat Neural Optical Understanding for Academic Documents
Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024)
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
EfficientViT is a new family of vision models for efficient high-resolution vision.
[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
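A digital media source-tracing application based on visible watermark detection and recognition. It is my course project and includes the system together with an open-source large-scale dataset of common watermark images (Large-scale Common Watermark Dataset, LCWD). Given an image or video with a visible watermark, the system detects and localizes the watermark region, extracts it, and then uses the Baidu AI Open Platform's OCR and logo recognition together with the Bing search engine to trace the image or video back to its source.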
A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".
(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
Open-Sora: Democratizing Efficient Video Production for All
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
✨✨Latest Advances on Multimodal Large Language Models
ZZZHANG-jx / Awesome-Document-Image-Rectification
Forked from fh2019ustc/Awesome-Document-Image-Rectification
A comprehensive list of awesome document image rectification papers.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
Deep Learning for Camera Calibration and Beyond: A Survey
A comprehensive list of awesome document image rectification papers.
The official implementation of SPTS v2: Single-Point Text Spotting
Official implementation of SPTS: Single-Point Text Spotting (ACM MM 2022 Oral)
Google Research