This repository contains the implementation of the model presented in the paper "Modeling Selective Feature Attention for Lightweight Text Matching".
To use the model defined in this repository, you will first need to install PyTorch on your machine by following the steps
described on the package's official page (this step is only necessary if you use
Then, to install the dependencies necessary to run the model, simply execute the command pip install --upgrade . from within the cloned repository (at the root, and preferably inside a virtual environment).
The script located in the scripts/ folder of this repository can be used to download an NLI dataset and pretrained word embeddings. By default, it fetches the SNLI corpus and the GloVe 840B 300d embeddings.
The script's usage is the following:
where the downloaded data must be saved (defaults to ../data/).
Before the downloaded corpus and embeddings can be used in the base model, they need to be preprocessed. This can be done with the scripts in the scripts/preprocessing folder of this repository.
The scripts' usage is the following:
where config is the path to a configuration file defining the parameters to be used for preprocessing. Default configuration files can be found in the config/preprocessing folder of this repository.
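To make the role of the configuration file more concrete, here is a minimal sketch of how such a file could be read. It assumes the configuration is stored as JSON and that relative paths in it are written relative to the configuration file's own folder; both are assumptions, not something this README states. The helper names (parse_config, load_config) and the key "data_dir" are hypothetical and only serve the illustration.

```python
import json
import os
from pathlib import Path


def parse_config(text, base_dir, path_keys=()):
    """Parse a JSON configuration string (assumed format).

    Entries listed in `path_keys` are treated as paths relative to
    `base_dir` and are turned into normalized paths.
    """
    config = json.loads(text)
    for key in path_keys:
        if key in config:
            config[key] = os.path.normpath(os.path.join(base_dir, config[key]))
    return config


def load_config(config_path, path_keys=()):
    """Read a configuration file and resolve its paths against its own folder."""
    config_path = Path(config_path)
    text = config_path.read_text(encoding="utf-8")
    return parse_config(text, config_path.parent, path_keys)
```

Resolving paths against the configuration file's folder, rather than the current working directory, keeps the result the same no matter where the script is launched from. If the real configuration files use another format or convention, the sketch would need to be adapted.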
The scripts in the scripts/training folder can be used to train the model on some training data and validate it on some validation data.
The script's usage is the following:
python [-h] [--config CONFIG] [--checkpoint CHECKPOINT]
where config is a configuration file (default ones are located in the config/training folder), and checkpoint is an optional checkpoint from which training can be resumed. Checkpoints are created by the script after each training epoch, with the name esim_*.pth.tar, where '*' indicates the epoch's number.
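The sketch below illustrates how the two options and the checkpoint naming scheme fit together. It is not the repository's training script: build_parser and latest_checkpoint are hypothetical helpers written for this example, and the default of None for --config is an assumption (the actual script points to a file in config/training). The second helper picks the checkpoint with the highest epoch number from a list of file names, which is one possible way to decide what to pass to --checkpoint when resuming.

```python
import argparse
import re

# Matches names such as "esim_3.pth.tar" and captures the epoch number.
_CHECKPOINT_RE = re.compile(r"^esim_(\d+)\.pth\.tar$")


def build_parser():
    """Build a parser exposing the two options shown in the usage line."""
    parser = argparse.ArgumentParser(description="Train the model (illustrative sketch).")
    parser.add_argument("--config", default=None,
                        help="Path to a training configuration file.")
    parser.add_argument("--checkpoint", default=None,
                        help="Optional checkpoint from which to resume training.")
    return parser


def latest_checkpoint(file_names):
    """Return the checkpoint name with the highest epoch number, or None."""
    best_epoch, best_name = -1, None
    for name in file_names:
        match = _CHECKPOINT_RE.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_name = int(match.group(1)), name
    return best_name
```

Comparing epochs as integers matters here: a plain alphabetical sort would place esim_10.pth.tar before esim_2.pth.tar. In practice the list of names could come from os.listdir on the folder where checkpoints are saved.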
If you find this code helpful, please cite:
title={Modeling Selective Feature Attention for Lightweight Text Matching},
author={Zang, Jianxiang and Liu, Hui}