Here is my PyTorch implementation of the model described in the paper Very Deep Convolutional Networks for Text Classification.
Below are the statistics of the datasets I used for the experiments. These datasets can be downloaded from link
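The core building block in the paper is a "convolutional block": two temporal (1D) convolutions with kernel size 3, each followed by batch normalization and ReLU. Here is a minimal PyTorch sketch of such a block, purely as an illustration of the idea (not necessarily the exact module used in this repository):

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    """One VDCNN-style convolutional block: two [conv3 -> BatchNorm -> ReLU] layers."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv1d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):  # x: (batch, channels, sequence_length)
        return self.layers(x)
```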
Dataset | Classes | Train samples | Test samples |
---|---|---|---|
AG’s News | 4 | 120 000 | 7 600 |
Sogou News | 5 | 450 000 | 60 000 |
DBPedia | 14 | 560 000 | 70 000 |
Yelp Review Polarity | 2 | 560 000 | 38 000 |
Yelp Review Full | 5 | 650 000 | 50 000 |
Yahoo! Answers | 10 | 1 400 000 | 60 000 |
Amazon Review Full | 5 | 3 000 000 | 650 000 |
Amazon Review Polarity | 2 | 3 600 000 | 400 000 |
I mostly keep the default settings described in the paper. For the optimizer and learning rate, I use two settings, sketched in code after this list:
- SGD optimizer with different learning rates (0.01 in most cases).
- Adam optimizer with different learning rates (0.001 in most cases).
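As a concrete illustration, the two settings correspond roughly to the following, where model is a placeholder for the network instance (the momentum value of 0.9 follows the original paper):

```python
import torch

# Setting 1: SGD with lr = 0.01 in most cases (momentum 0.9 as in the original paper)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Setting 2: Adam with lr = 0.001 in most cases
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```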
Additionally, in the original paper one epoch is defined as a loop over batch_size x num_batches records (128x5000, 128x10000, or 128x30000), which means some records are used more than once within one epoch. In my implementation, one epoch is a complete pass over the whole dataset, where each record is used exactly once.
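The difference in epoch semantics can be sketched as follows (illustrative only; train_set is a placeholder for the training dataset, and the actual training loop lives in train.py):

```python
from torch.utils.data import DataLoader, RandomSampler

# Paper-style epoch: a fixed number of batches drawn with replacement,
# so some records may appear more than once per "epoch".
paper_style_loader = DataLoader(
    train_set, batch_size=128,
    sampler=RandomSampler(train_set, replacement=True, num_samples=128 * 5000))

# This repository's epoch: one full shuffled pass over the dataset,
# so each record is used exactly once per epoch.
full_pass_loader = DataLoader(train_set, batch_size=128, shuffle=True)
```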
If you want to train a model on one of the common datasets with default parameters, you can run:
- python train.py -d dataset_name: For example, python train.py -d dbpedia
If you want to train a model on one of the common datasets with your preferred parameters, such as the depth of the network, you can run:
- python train.py -d dataset_name -t depth: For example, python train.py -d dbpedia -t 9
If you want to train a model on your own dataset, you need to specify the paths to the input and output folders:
- python train.py -i path/to/input/folder -o path/to/output/folder
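If your own dataset follows the same CSV layout as the common datasets (train.csv and test.csv with a class index in the first column and text in the remaining columns), a minimal reader might look like the sketch below. Note that this layout is an assumption for illustration, not a guarantee of what train.py expects:

```python
import csv

def load_split(csv_path):
    """Read one split, assuming an AG News-style layout: 1-based label index first,
    followed by one or more text columns (an assumed format, not verified)."""
    labels, texts = [], []
    with open(csv_path, encoding="utf-8") as f:
        for row in csv.reader(f):
            labels.append(int(row[0]) - 1)           # assume labels are 1-based in the csv
            texts.append(" ".join(row[1:]).lower())  # concatenate title + description
    return texts, labels

train_texts, train_labels = load_split("path/to/input/folder/train.csv")
```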
You can find all the models I have trained at link
I ran the experiments on two machines, one with an NVIDIA TITAN X 12GB GPU and the other with an NVIDIA Quadro 6000 24GB GPU.
Results on the test sets are presented in the form A(B), where:
- A is the accuracy reproduced here.
- B is the accuracy reported in the paper.
Note that the paper does not report accuracy for the 49-layer model, so for depth 49 I only show the results obtained from my experiments.
Depth | 9 | 17 | 29 | 49 |
---|---|---|---|---|
ag_news | 87.67(90.17) | 88.09(90.61) | 88.01(91.33) | 84.71 |
sogou_news | 95.67(96.42) | 95.89(96.49) | 95.73(96.82) | 95.35 |
db_pedia | 98.33(98.44) | 98.28(98.39) | 98.07(98.59) | 97.38 |
yelp_polarity | 94.57(94.73) | 95.20(94.95) | 94.95(95.37) | 95.08 |
yelp_review | 62.44(61.96) | 63.44(62.59) | 62.70(63.00) | 62.83 |
yahoo_answer | 69.57(71.76) | 70.03(71.75) | 70.34(72.84) | 69.16 |
amazon_review | 60.34(60.81) | 60.98(61.19) | 60.67(61.61) | 59.80 |
amazon_polarity | 94.30(94.31) | 94.60(94.57) | 94.53(95.06) | 94.10 |
Below are the training/test loss and accuracy curves for each dataset's experiments (figures for the 9-, 17-, and 29-layer models are shown from left to right):
- ag_news
- sogou_news
- db_pedia
- yelp_polarity
- yelp_review
- yahoo_answers
- amazon_review
- amazon_polarity
You can find a detailed log of each experiment, containing the loss, accuracy, and confusion matrix at the end of each epoch, in output/datasetname_depth_number/logs.txt, for example output/ag_news_depth_29/logs.txt.