Data Augmentation with GAN

An A.I. used to train another A.I. so the second A.I. can get more powerful

How does it work

You first train a GAN on a dataset, then use the trained GAN to generate a second, synthetic dataset that is fed to the CNN. It works like conventional data augmentation: you create training data that doesn't exist in the original set.
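
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes one trained generator per class (the README does not say how generated images get their labels), and `generate_labeled`, `build_training_set`, `latent_dim` and the generator callables are hypothetical names introduced here.

```python
import random


def generate_labeled(generators, n_per_class, latent_dim=100, seed=0):
    """Draw synthetic samples from trained generators.

    `generators` is a hypothetical dict {class_label: callable}; each
    callable maps a latent vector (list of floats) to one synthetic image.
    One generator per class is an assumption, not the notebook's design.
    """
    rng = random.Random(seed)
    images, labels = [], []
    for label, generator in generators.items():
        for _ in range(n_per_class):
            # Sample a latent vector from a standard normal distribution.
            z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
            images.append(generator(z))
            labels.append(label)
    return images, labels


def build_training_set(real, synthetic, include_real=True):
    """Return the (images, labels) lists that the CNN would be trained on.

    include_real=True  -> real dataset plus generated samples
    include_real=False -> generated samples only
    """
    real_x, real_y = real
    syn_x, syn_y = synthetic
    if include_real:
        return real_x + syn_x, real_y + syn_y
    return list(syn_x), list(syn_y)
```

In practice the generator callables would wrap the trained GAN's prediction call, and the returned lists would be converted to arrays before training the CNN.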

Tests

In tests on Fashion MNIST, this method scored ~2% higher on F1 than training on the normal dataset alone. The tests also showed that it doesn't work with some datasets (such as regular MNIST). All tests ran for 200 epochs with the dataset limited to 2,000 images. Details:

200 Epochs on normal dataset: 0.8122 F1 score.
200 Epochs on generated dataset: 0.8305 F1 score.
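
For reference, here is a small sketch of how an F1 score and the gain between the two runs can be computed. The README does not say which averaging was used, so macro-averaging is an assumption here, and `macro_f1` is a hypothetical helper written for illustration; sklearn's `f1_score` with `average="macro"` computes the same quantity.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Gain between the two reported runs.
baseline, augmented = 0.8122, 0.8305
absolute_gain = augmented - baseline      # about 0.018, i.e. 1.8 points
relative_gain = absolute_gain / baseline  # about 0.023, i.e. 2.3 %
```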

Usage

To use it, install Jupyter Notebook on your computer. The code could be written in a .py file, but a notebook lets you check the output of individual pieces of code and keep control over the models.

Make sure you have keras, sklearn, numpy, cv2 and matplotlib installed.
Extract dataset.zip into the folder.
Make a folder called Samples.
Run jupyter notebook in the directory and open the SelfTrainer.ipynb file.
Configure the code if anything needs tweaking.
Run the code cells and check their output as needed.
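
The setup steps above can be partly automated. The sketch below is an optional helper, not part of the repository: `missing_modules` and `prepare_workspace` are hypothetical names, and it assumes dataset.zip sits in the directory you pass in.

```python
import importlib.util
import os
import zipfile

# Import names listed in the steps above.
REQUIRED_MODULES = ["keras", "sklearn", "numpy", "cv2", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def prepare_workspace(root="."):
    """Extract dataset.zip (if present) and create the Samples folder."""
    archive = os.path.join(root, "dataset.zip")
    if os.path.isfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(root)
    # exist_ok avoids an error when the folder was already created.
    os.makedirs(os.path.join(root, "Samples"), exist_ok=True)
```

Calling `missing_modules(REQUIRED_MODULES)` before opening the notebook lists anything that still needs installing, and `prepare_workspace()` covers the extraction and folder steps.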

Could it help fix an unbalanced dataset?

Yes. If you've got an unbalanced dataset, you can use a GAN to help you balance the classes. The results showed an improvement of ~4% on F1 score.

Example

Samples per class: 134, 59, 18, 479, 31, 526, 510, 125, 33, 81
Total samples: 1996

Type	F1 score
Dataset only	74.59%
With augmented data from GAN	78.73%
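
To make the balancing idea concrete, the sketch below computes how many generated samples each class in the example would need. Topping every class up to the size of the largest one is an assumption made for illustration (the README does not state the target used), and `synthetic_needed` is a hypothetical helper.

```python
def synthetic_needed(class_counts, target=None):
    """Number of generated samples each class needs to reach `target`.

    By default the target is the size of the largest class (an assumption).
    """
    if target is None:
        target = max(class_counts)
    return [max(target - count, 0) for count in class_counts]


# Class counts from the example above (1996 samples in total).
counts = [134, 59, 18, 479, 31, 526, 510, 125, 33, 81]
deficits = synthetic_needed(counts)
# deficits -> [392, 467, 508, 47, 495, 0, 16, 401, 493, 445]
```

Under this assumption the GAN would contribute 3264 images in total, with the rarest class (18 samples) receiving the most.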

Notes

This method works better with limited data. Here, 2,000 images were used.
This method doesn't work with all datasets; regular MNIST is one example.
If you want to test it, be sure to try both the GAN and GAN + DATASET options, since performance may vary.
While the GAN trains, samples of its output are saved to the Samples folder so you can check how well it is performing.

Repository: ivanch/self-trainer
