Stars
A summary of diffusion-based image processing, including restoration, enhancement, coding, and quality assessment
This repository contains a collection of papers on document image processing methods, including appearance enhancement, shadow removal, dewarping, deblurring, and binarization.
Few Shot Semantic Segmentation Papers
Code for the paper "UVDoc: Neural Grid-based Document Unwarping" - Dataset capture and creation
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
LPIPS metric. pip install lpips
High-Resolution Image Synthesis with Latent Diffusion Models
Unofficial implementation of Palette: Image-to-Image Diffusion Models in PyTorch
Transformer: PyTorch Implementation of "Attention Is All You Need"
Document Rectification and Illumination Correction using a Patch-based CNN
A hybrid dataset for document unwarping (Paper: https://www3.cs.stonybrook.edu/~cvl/projects/dewarpnet/storage/paper.pdf)
HanziJS is a Chinese character and NLP module for Chinese language processing for Node.js
Synthesize distorted document images and control points.
Text page dewarping using a "cubic sheet" model
Turning a CLIP Model into a Scene Text Detector (CVPR2023) | Turning a CLIP Model into a Scene Text Spotter (TPAMI)
Official Code for DragGAN (SIGGRAPH 2023)
A playbook for systematically maximizing the performance of deep learning models.
Official repo for consistency models.
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
End-to-End Object Detection with Transformers
NanoDet-Plus⚡ Super fast and lightweight anchor-free object detection model. 🔥 Only 980 KB (int8) / 1.8 MB (fp16) and runs at 97 FPS on a cellphone 🔥
A brief computer graphics / rendering course
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
⛹️ Pytorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
This is an official PyTorch implementation of Lite-HRNet: A Lightweight High-Resolution Network.