The classifier performs a task of the following form:
Input
a question (e.g., "When was the first Wall Street Journal published ?")
Output
one of N predefined classes (e.g., NUM:date, the class label for questions whose answer is a date)
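To make the shape of the output concrete, here is a minimal sketch of how a label such as NUM:date could be handled in code. It assumes that the part before the colon names a broad category and the part after it a finer one; `QuestionLabel` and `split_label` are hypothetical names used for illustration and are not taken from the source code.

```python
from typing import NamedTuple


class QuestionLabel(NamedTuple):
    coarse: str  # assumed broad category, e.g. "NUM"
    fine: str    # assumed finer category within it, e.g. "date"


def split_label(label: str) -> QuestionLabel:
    """Split a 'COARSE:fine' class label into its two parts."""
    coarse, _, fine = label.partition(":")
    return QuestionLabel(coarse, fine)


example_question = "When was the first Wall Street Journal published ?"
example_label = split_label("NUM:date")
print(example_question, "->", example_label)
```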
(Make sure a basic python3 environment is installed.)
# Install the PyTorch library if you do not already have it
pip3 install --user torch
# Install the scikit-learn library if you do not already have it
pip3 install --user scikit-learn
document: Function_Description.md, which describes each function; a README file with instructions on how to use the code; and Requirements.txt, which lists the python3 libraries used.
data: training, dev, test, and configuration files, plus some extra files needed by the models (e.g., vocabulary). One configuration file is provided for each model (e.g., bow.config, bilstm.config).
src: source code.
Download all three folders and move to the src folder.
cd src
Below are the training steps; each training run takes a few minutes to complete.
# If you want to train the BiLSTM model
python3 question_classifier.py --train --config '../data/bilstm.config'
# If you want to train the Bag of Words model
python3 question_classifier.py --train --config '../data/bow.config'
Below are the testing steps. Remember to run the corresponding training step above before testing a model. The ensemble model requires both classifiers, so train the BiLSTM and Bag of Words models first; only then can the ensemble be built and tested.
# If you want to test the BiLSTM model
python3 question_classifier.py --test --config '../data/bilstm.config'
# If you want to test the Bag of Words model
python3 question_classifier.py --test --config '../data/bow.config'
# If you want to test the ensemble model
python3 question_classifier.py --test --config '../data/ensemble.config'
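The README does not say how the ensemble combines the two trained models. One common approach, shown here purely as an assumption, is to average the class probabilities produced by each model and pick the highest-scoring class. The function `ensemble_predict`, the probability values, and the labels other than NUM:date are hypothetical.

```python
from typing import Dict, Sequence


def ensemble_predict(model_probs: Sequence[Dict[str, float]]) -> str:
    """Average per-class probabilities from several models and return the top class."""
    if not model_probs:
        raise ValueError("need at least one model's probabilities")
    classes = model_probs[0].keys()
    averaged = {
        label: sum(probs[label] for probs in model_probs) / len(model_probs)
        for label in classes
    }
    return max(averaged, key=averaged.get)


# Hypothetical outputs of the two trained models for a single question.
bilstm_probs = {"NUM:date": 0.6, "LOC:city": 0.3, "HUM:ind": 0.1}
bow_probs = {"NUM:date": 0.3, "LOC:city": 0.5, "HUM:ind": 0.2}

print(ensemble_predict([bilstm_probs, bow_probs]))
```

Under this scheme both models must already be trained, because each contributes its own probabilities for every question.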
Move to the data folder.
cd data
Open the configuration file for the BiLSTM or Bag of Words model (bilstm.config or bow.config). The following parameters may be changed:
## If pretrained is False, word embeddings are randomly initialised;
## If pretrained is True, pre-trained word embeddings are used.
pretrained : True
## If freeze is False, the model fine-tunes the pre-trained word embeddings.
## If freeze is True, the model freezes the pre-trained word embeddings.
freeze : True
## The number of training epochs for the model
epoch: 10
## The learning rate used for training the models.
lr_param: 0.02
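The parameters above follow a simple `key : value` layout with `#` comment lines. As a rough sketch of how such a file could be read into typed values, the snippet below parses that layout from a string. The function names `parse_value` and `parse_config` are hypothetical, and the parsing done in question_classifier.py may differ.

```python
from typing import Dict, Union

ConfigValue = Union[bool, int, float, str]


def parse_value(raw: str) -> ConfigValue:
    """Convert a raw config string to bool, int, or float, else keep it as str."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def parse_config(text: str) -> Dict[str, ConfigValue]:
    """Parse 'key : value' lines, skipping blank lines and '#' comments."""
    config: Dict[str, ConfigValue] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'key : value' line
        config[key.strip()] = parse_value(value.strip())
    return config


sample = """
## Comment lines like this one are ignored
pretrained : True
freeze : True
epoch: 10
lr_param: 0.02
"""
print(parse_config(sample))
```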