We pretrain our models on the Microsoft COCO Dataset and then train them on the SentiCap Dataset.
Requirements:
- python 3.7.4
- numpy 1.18.1
- hickle 3.4.6
- scikit-image 0.16.2
- tensorflow 1.14 or tensorflow-gpu 1.14
- tqdm 4.44.1
- torch 1.4.0
- matplotlib 3.1.3
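One way to confirm that an environment matches the versions listed above is to compare them against what is installed. The following is a minimal sketch and not part of the repository: the `PINNED` table is copied from the list, while `installed_version`, `matches`, and `check` are hypothetical helper names, and treating `1.14.0` as satisfying `1.14` is an assumption.

```python
# Pinned versions copied from the list above. "tensorflow-gpu 1.14" is an
# accepted alternative to "tensorflow 1.14"; this sketch only looks up
# "tensorflow", so a GPU-only install is reported as a mismatch.
PINNED = {
    "numpy": "1.18.1",
    "hickle": "3.4.6",
    "scikit-image": "0.16.2",
    "tensorflow": "1.14",
    "tqdm": "4.44.1",
    "torch": "1.4.0",
    "matplotlib": "3.1.3",
}


def installed_version(name):
    """Return the installed version string of a package, or None if absent."""
    try:
        from importlib import metadata  # available from Python 3.8
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for Python 3.7 (provided by setuptools)
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def matches(installed, pinned):
    """True if `installed` equals `pinned` or extends it (1.14.0 matches 1.14)."""
    if installed is None:
        return False
    return installed == pinned or installed.startswith(pinned + ".")


def check(pinned=PINNED, lookup=installed_version):
    """Return {package: (pinned, installed)} for every package that differs."""
    report = {}
    for name, want in pinned.items():
        have = lookup(name)
        if not matches(have, want):
            report[name] = (want, have)
    return report


if __name__ == "__main__":
    for name, (want, have) in check().items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

Run as a script, it prints one line per package whose installed version differs from the pinned one and prints nothing when everything matches.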
Contents:
- COCO Dataset loader and pre-processing engine
- Build LSTM Generator
- Incorporate emotions into the Generator
- Generator Logger
- Build Conventional Discriminator
- Discriminator Logger
- GAN train engine
- Validation engines
- Record examples of generated captions in GAN structure
- SentiCap Dataset loader and pre-processing engine
- Build CapsNet Discriminator
- Inference engine
- Train and evaluate
- Plots
- Run
./download.sh
and go to step 4; otherwise, go to step 2.
- Download the Microsoft COCO Dataset, which includes the neutral image caption data: images: 2014 Train images [83K/13GB] (download), 2014 Val images [41K/6GB] (download), 2014 Test images [41K/6GB] (download); captions: 2014 Train/Val annotations [241MB] (download). Extract them to the folder data/images.
- Download the SentiCap Dataset, which includes the sentiment-bearing image caption data: captions (download). Extract only the file data/senticap_dataset.json to data/annotations.
- Download the VGG network used for feature extraction (download) and move it to the folder data/.
- Run
python resize.py --input_folder_dir ./data/images/train2014/ --output_folder_dir ./data/images/train2014_resized/ && python resize.py --input_folder_dir ./data/images/val2014/ --output_folder_dir ./data/images/val2014_resized/
(resizes the downloaded images to [224, 224] and puts them in data/images).
- Run
python prepro.py --coco_dataset_portions 1. 0.8 0.2 --senticap_dataset_portions 0.8 0.19 0.01
, where the first, second, and third entries are the split portions of the original dataset.
- Run
python train.py --gen_train --gen_save_model_dir ./model/generator/ --gen_dataset coco --batchsize 8 --gen_epochs 10
to pretrain the generator.
- Run
python train.py --disc_train --disc_network capsnet --gen_load_model_dir ./model/generator/ --disc_save_model_dir ./model/discriminator/ --disc_dataset coco --batchsize 8 --disc_epochs 10
to pretrain the discriminator.
- Run
python train.py --gan_train --disc_network capsnet --gen_load_model_dir ./model/generator/ --disc_load_model_dir ./model/discriminator/ --gan_save_model_dir ./model/gan/ --gan_dataset senticap --batchsize 8 --gan_epochs 10
to train the GAN. You can pass the arguments --gen_load_model_dir and/or --disc_load_model_dir, as in the command above, to initialize your model with a pretrained generator and/or discriminator.
- Run
python inference.py --word_to_idx_dir data/word_to_idx.pkl --image "test.jpg" --load_model_dir model/gan/
to describe an image.
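The three training stages above (generator pretraining, discriminator pretraining, GAN training) can be chained from one script. The following is a minimal sketch that only reuses the `train.py` flags shown in the commands above. The names `build_stages` and `run_all` are hypothetical, and launching the script with `sys.executable` instead of a bare `python` is an assumption made so that the same interpreter is used throughout.

```python
import subprocess
import sys

# Model directories as used in the commands above.
GEN_DIR = "./model/generator/"
DISC_DIR = "./model/discriminator/"
GAN_DIR = "./model/gan/"


def build_stages(batchsize=8, epochs=10, disc_network="capsnet"):
    """Return the three train.py command lines in the order they must run."""
    py = [sys.executable, "train.py"]
    batch = ["--batchsize", str(batchsize)]
    n = str(epochs)
    return [
        # 1. Pretrain the generator on COCO.
        py + ["--gen_train", "--gen_save_model_dir", GEN_DIR,
              "--gen_dataset", "coco"] + batch + ["--gen_epochs", n],
        # 2. Pretrain the discriminator on COCO, using the saved generator.
        py + ["--disc_train", "--disc_network", disc_network,
              "--gen_load_model_dir", GEN_DIR,
              "--disc_save_model_dir", DISC_DIR,
              "--disc_dataset", "coco"] + batch + ["--disc_epochs", n],
        # 3. Train the GAN on SentiCap, starting from both pretrained models.
        py + ["--gan_train", "--disc_network", disc_network,
              "--gen_load_model_dir", GEN_DIR,
              "--disc_load_model_dir", DISC_DIR,
              "--gan_save_model_dir", GAN_DIR,
              "--gan_dataset", "senticap"] + batch + ["--gan_epochs", n],
    ]


def run_all(stages, runner=subprocess.run):
    """Run each stage in order; check=True stops at the first failing stage."""
    for cmd in stages:
        runner(cmd, check=True)
```

Calling `run_all(build_stages())` from the repository root runs the stages back to back. Because each stage loads the models saved by the previous one, stopping at the first failure avoids training on missing or stale checkpoints.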