- DocParser: Hierarchical Document Structure Parsing from Renderings
- A Multi-layered Approach To Information Extraction From Tables In Biomedical Documents
- PDFFigures 2.0: Mining Figures from Research Papers
- User-Guided Information Extraction from Print-Oriented Documents
- High precision text extraction from PDF documents
- Layout analysis and content classification in digitized books
- LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis
- FigureSeer: Parsing Result-Figures in Research Papers
- PubLayNet: largest dataset ever for document layout analysis
- Detect2Rank: Combining Object Detectors Using Learning to Rank
- New Methods for Metadata Extraction from Scientific Literature
- Fast Visual Object Tracking with Rotated Bounding Boxes
- Document Structure and Layout Analysis
- DocBank: A Benchmark Dataset for Document Layout Analysis
- Recognition of Multi-Oriented, Multi-Sized, and Curved Text
- Unsupervised document structure analysis of digital scientific articles
- Document image zone classification: A simple high-performance approach
- Two Geometric Algorithms for Layout Analysis
- TableBank: Table Benchmark for Image-based Table Detection and Recognition
- Building Non-overlapping Polygons For Image Document Layout Analysis Results
- Design of an end-to-end method to extract information from tables
- Chargrid: Towards Understanding 2D Documents
- A Retrieval Framework and Implementation for Electronic Documents with Similar Layouts
- Dehyphenation: Some empirical methods
- Improved Dehyphenation of Line Breaks for PDF Text Extraction
- Handwritten Arabic Digits Recognition Using Bézier Curves
- Recognition of Tables and Forms
- LayoutLM: Pre-training of Text and Layout for Document Image Understanding
- Looking Beyond Text: Extracting Figures, Tables and Captions from Computer Science Papers
- Integrating and querying similar tables from PDF documents using deep learning
- Object-Level Document Analysis of PDF Files
- Voronoi++: A Dynamic Page Segmentation approach based on Voronoi and Docstrum features
- A Font Setting Based Bayesian Model to Extract Mathematical Expression in PDF Files
- Ensure Non-Overlapping in Document Layout Analysis
- Multi-Task Handwritten Document Layout Analysis
- Algorithms For The Reduction Of The Number Of Points Required To Represent A Digitized Line Or Its Caricature
- Page Segmentation and Zone Classification: The State of the Art
- TAO: System for Table Detection and Extraction from PDF Documents
- Chargrid-OCR: End-to-end Trainable Optical Character Recognition through Semantic Segmentation and Object Detection
- Document Image Segmentation as a Spectral Partitioning Problem
- Identifying Table Boundaries in Digital Documents via Sparse Line Detection
- Extracting Tables from Documents using Conditional Generative Adversarial Networks and Genetic Algorithms
- Configurable Table Structure Recognition in Untagged PDF Documents
- Document understanding for a broad class of documents
- Complicated Table Structure Recognition
- Automatic Table Ground Truth Generation and A Background-analysis-based Table Structure Extraction Method
- Mathematical Formula Identification in PDF Documents
- Table Header Detection and Classification
- Detecting Table Region in PDF Documents Using Distant Supervision
- Combining Linguistic and Spatial Information for Document Analysis
- A System for Converting PDF Documents into Structured XML Format
- Dehyphenation of Words and Guessing Ligatures
- The Zonemap Metric For Page Segmentation And Area Classification In Scanned Documents
- BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding
- Hybrid Page Layout Analysis via Tab-Stop Detection
- A Table Detection Method for Multipage PDF Documents via Visual Seperators and Tabular Structures
- Header and Footer Extraction by Page-Association
- Improving the Table Boundary Detection in PDFs by Fixing the Sequence Error of the Sparse Lines
- A Study on the Document Zone Content Classification Problem
- PDF-TREX: An Approach for Recognizing and Extracting Tables from PDF Documents
- A Data Mining Approach to Reading Order Detection
- A mixed approach to auto-detection of page body
- Edge Detection Based Shape Identification
- pdf2table: A Method to Extract Table Information from PDF Files
- Extraction, layout analysis and classification of diagrams in PDF documents
- A Rectangle Mining Method for Understanding the Semantics of Financial Tables
- Beta-Shape Using Delaunay-Based Triangle Erosion
- A survey of table recognition
- Graphics Recognition in PDF documents
- Analysing layout information: searching PDF documents for pictures
- Kd-Trees for Document Layout Analysis
- Benchmarking Page Segmentation Algorithms
- Polygon Detection from a Set of Lines
- How Document Pre-processing affects Keyphrase Extraction Performance
- Improving typography and minimising computation for documents with scalable layouts
- Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers
- ICDAR 2021 Competition on Historical Map Segmentation
- A Large Dataset of Historical Japanese Documents with Complex Layouts
- DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis
"Religion is an insult to human dignity. Without it you would have good people doing good things and evil people doing evil things. But for good people to do evil things, that takes religion."― Steven Weinberg
manjunath5496/Document-Layout-Analysis-Papers