
Document Visual Question Answering (DocVQA)

This repo hosts the basic functional code for our approach, HyperDQA, in the Document Visual Question Answering competition hosted as part of the Workshop on Text and Documents in the Deep Learning Era at CVPR 2020. Our approach stands at position 4 on the leaderboard.

Read more about our approach in this blogpost!

Installation

A Python 3 virtual environment is recommended.
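
One way to create and activate it, assuming python3 with the standard venv module is available (the environment name venv-docvqa is just an example):

python3 -m venv venv-docvqa
source venv-docvqa/bin/activate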

  1. Clone the repository
git clone https://github.com/anisha2102/docvqa.git
  2. Install libraries
pip install -r requirements.txt

Downloads

  1. Download the dataset: the dataset for Task 1 can be downloaded from the Downloads section of the Competition Website. It consists of document images and their corresponding OCR transcriptions (a quick way to peek at one OCR file is shown after this list).

  2. Download the pretrained model: the pretrained LayoutLM-Base, Uncased model can be downloaded from here.
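
To get a feel for the OCR format before building the dataset, you can pretty-print one transcription file with Python's built-in json.tool; the path is a placeholder and the exact JSON schema depends on the release you download:

python -m json.tool <path-to-one-ocr-json-file> | head -n 40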

Prepare dataset

python create_dataset.py \
         <data-ocr-folder> \
         <data-documents-folder> \
         <path-to-train_v1.0.json> \
         <train-output-json-path> \
         <validation-output-json-path>
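
For reference, a filled-in invocation might look like the sketch below; the folder and file names are illustrative and should point to wherever you extracted the Task 1 download:

python create_dataset.py \
         ./data/ocr_results \
         ./data/documents \
         ./data/train_v1.0.json \
         ./data/train_extended.json \
         ./data/val_extended.json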

Train the model

<pretrained-model-path> is the path to the downloaded LayoutLM-Base, Uncased model, e.g. ./models/layoutlm-base-uncased.

CUDA_VISIBLE_DEVICES=0 python run_docvqa.py \
    --data_dir <data-folder> \
    --model_type layoutlm \
    --model_name_or_path <pretrained-model-path> \
    --do_lower_case \
    --max_seq_length 512 \
    --do_train \
    --num_train_epochs 15 \
    --logging_steps 500 \
    --evaluate_during_training \
    --save_steps 500 \
    --do_eval \
    --output_dir <data-folder>/<exp-folder> \
    --per_gpu_train_batch_size 8 \
    --overwrite_output_dir \
    --cache_dir <data-folder>/models \
    --skip_match_answers \
    --val_json <validation-output-json-path> \
    --train_json <train-output-json-path>

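After training, the same script can be run in evaluation-only mode against a saved checkpoint. This is a sketch that reuses only the flags shown above and assumes run_docvqa.py, like the HuggingFace example scripts it is based on, accepts --do_eval without --do_train and loads the checkpoint from --model_name_or_path:

CUDA_VISIBLE_DEVICES=0 python run_docvqa.py \
    --data_dir <data-folder> \
    --model_type layoutlm \
    --model_name_or_path <data-folder>/<exp-folder> \
    --do_lower_case \
    --max_seq_length 512 \
    --do_eval \
    --output_dir <data-folder>/<exp-folder> \
    --overwrite_output_dir \
    --cache_dir <data-folder>/models \
    --val_json <validation-output-json-path>
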
Model Checkpoints

Download the pytorch_model.bin file from the link below and copy it to the models folder.

Google Drive Link
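
Assuming the file has been downloaded to the current directory, copying it into place is a single command; adjust the destination if your models folder lives elsewhere:

cp pytorch_model.bin ./models/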

Demo

Try out the demo on a sample datapoint with demo.ipynb
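
One way to open the notebook locally, assuming Jupyter is installed in your environment (it is not necessarily listed in requirements.txt):

jupyter notebook demo.ipynb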

Acknowledgements

The code and pretrained models are based on LayoutLM and HuggingFace Transformers. Many thanks for their amazing open source contributions.