
Source Code for GRAPE:grapes:


Hi all, this is the official repository for the EMNLP 2022 paper GRAPE: Knowledge Graph Enhanced Passage Reader for Open-domain Question Answering. Our paper can be found at [arXiv link]. We sincerely appreciate your interest in our project!

Instructions

To run our experiments, please first place the data folder in the current directory as follows:

GRAPE
|   README.md
|   train_reader.py
|
|----src
|    |   data.py
|    |   evaluation.py
|    |   model.py
|    |   ...
|
|----bash
|    |   script_multi_gpu.sh  # Multi-GPU training and evaluation
|
└----data  # download from the data source
     |
     |----graphs
     |    |   bin_files
     |    |   ...
     |
     └----json_input
          |   jsonl_input_data
          |   relation_json
          |   test_relation_json  # to test the EM on the subset where examples are fact-enhanced

With the directory structured as above, you can simply run:

cd bash
bash script_multi_gpu.sh # Multi-GPU
  • For the multi-GPU version please change CUDA_VISIBLE_DEVICES and --nproc_per_node accordingly.
  • Changing --n_context requires different graph files. (In this demo we only support --n_context 100)
  • Ablation experiments can be tested by setting --gnn_mode to "NO_RELATION" or "NO_ATTENTION".
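
For reference, here is a minimal sketch of what an edited launch might look like, assuming script_multi_gpu.sh (run from inside bash/) wraps torch.distributed.launch and forwards these flags to train_reader.py; the shipped script is the source of truth:

# Hypothetical 4-GPU invocation. The flag names are the ones documented
# above; the launcher and paths are assumptions -- check bash/script_multi_gpu.sh.
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m torch.distributed.launch --nproc_per_node=4 ../train_reader.py \
    --n_context 100 \
    --per_gpu_batch_size 1 \
    --gnn_mode NO_RELATION
# --n_context 100 is the only value supported by the graphs in this demo;
# omit --gnn_mode for the full model, or use NO_ATTENTION for the other ablation.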

Detailed GPU consumption can be found in our paper (roughly 30GB for the large config with --per_gpu_batch_size set to 1, and 29GB for the base config with --per_gpu_batch_size set to 3).

Dependencies

The main libraries we utilize are:

  • torch==1.11.0
  • transformers==4.18.0
  • dgl==0.8.2
  • sentencepiece
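
A minimal install sketch under these pins (assuming pip; note that for DGL 0.8.x the plain dgl package is the CPU build, and CUDA builds are published as separate dgl-cu* packages matching your CUDA toolkit):

# Pinned versions from the list above; swap dgl for the matching
# dgl-cu* wheel if you need GPU-enabled graph operations.
pip install torch==1.11.0 transformers==4.18.0 dgl==0.8.2 sentencepiece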

Data

To reproduce our results, please first download the data folder from this Google Drive: [drive link].
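
Assuming the download arrives as a zip archive (the name data.zip below is hypothetical), unpack it so that data/ sits in the repository root as in the tree above:

unzip data.zip -d GRAPE/
ls GRAPE/data  # expect: graphs/  json_input/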

Inside the data folder are the JSON inputs as well as the DGL graphs. We are still cleaning the data pre-processing code, so it is not yet included in this repository; please contact the authors if you need it.

We have also uploaded the model checkpoints (both large and base configurations) for NQ and TQA, as well as the intermediate outputs, to Google Drive [drive link].

Cite

If you find this repository useful in your research, please cite our paper:

@inproceedings{ju2022grape,
  title={GRAPE: Knowledge Graph Enhanced Passage Reader for Open-domain Question Answering},
  author={Ju, Mingxuan and Yu, Wenhao and Zhao, Tong and Zhang, Chuxu and Ye, Yanfang},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2022},
  year={2022}
}

Credit

Our code is adapted from the Fusion-in-Decoder (FiD) repository [repo].

Contact

Mingxuan Ju ([email protected]), Wenhao Yu ([email protected])
