hupidong/Match-Ignition

Match-Ignition

PyTorch implementation of [CIKM 2021] Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching.

Python 3.6 | License: Apache-2.0

Usage

Environment Preparation

pip install -r requirements.txt
cd transformers-v4.30.0.dev0
pip install -e .

Data Preparation

cd data/dataset/cnse
tar xzvf orig.tar.gz

Note: the original dataset can be downloaded from here.

Sentence-level Noise Filtering

# CNSE dataset
python generate_data.py --data_dir=data/dataset/cnse/orig --save_dir=data/dataset/cnse/model --from_raw_text=0 --append_keyword=1

# general raw-text dataset
python generate_data.py --data_dir=data/dataset/yuqing_news/example/orig --save_dir=data/dataset/yuqing_news/example/model --from_raw_text=1 --append_keyword=0
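The idea behind this step can be illustrated with a small, self-contained sketch. It is not the code in generate_data.py. It only shows how PageRank over a sentence-similarity graph can rank sentences so that low-scoring ones are dropped. The Jaccard word-overlap similarity, the function names (jaccard, pagerank, filter_sentences), and the damping and iteration defaults are all assumptions made for illustration.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two whitespace-tokenized sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def pagerank(weights: List[List[float]], damping: float = 0.85,
             iters: int = 50) -> List[float]:
    """Power iteration on a weighted graph given as an adjacency matrix."""
    n = len(weights)
    if n == 0:
        return []
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for j in range(n):
            rank = 0.0
            for i in range(n):
                if out_sum[i] > 0:
                    # Node i passes its score along its weighted edges.
                    rank += scores[i] * weights[i][j] / out_sum[i]
                else:
                    # Dangling node (no edges): spread its score evenly.
                    rank += scores[i] / n
            new_scores.append((1.0 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def filter_sentences(sentences: List[str], keep: int) -> List[str]:
    """Keep the `keep` highest-ranked sentences, in their original order."""
    n = len(sentences)
    weights = [
        [jaccard(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "a cat sat on a mat",
        "stock prices fell sharply today",
    ]
    for sentence in filter_sentences(doc, keep=3):
        print(sentence)
```

In this toy input the last sentence shares no words with the others, so it receives only the teleport share of the score and is the one filtered out.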

Word-level Noise Filtering

python run.py 

Citation

If you use Match-Ignition in your research, please cite it using the following BibTeX entry.

@inproceedings{pang2021matchignition,
    title = {Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching},
    author = {Liang Pang and Yanyan Lan and Xueqi Cheng},
    booktitle = {Proceedings of the 30th ACM International Conference on Information and Knowledge Management},
    series = {CIKM'21},
    year = {2021}
}

License

Apache-2.0

Copyright (c) 2019-present, Liang Pang (pl8787)
