Image caption generator using a CNN as the encoder and an RNN as the decoder.

mmilunovic/a-picture-is-a-thousand-words

A picture is worth a thousand (coherent) words

Implementation of the CNN-RNN architecture for image caption generation proposed in this paper.

Google AI Blog about this problem.
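
To make the encoder-decoder idea concrete, here is a minimal sketch of how a caption could be produced word by word once the CNN has turned the image into a feature vector. It is not the code from this repository: `greedy_caption`, `toy_decoder`, and the `<start>`/`<end>` tokens are hypothetical names, the decoder is a stand-in for a trained RNN, and greedy selection is used only because it is the simplest decoding strategy.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical special tokens


def greedy_caption(
    image_features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the best-scoring word."""
    words: List[str] = [START]
    for _ in range(max_len):
        # The decoder sees the image features and the words generated so far.
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the START token


def toy_decoder(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained RNN: emits a fixed sentence, then END."""
    sentence = ["a", "dog", "on", "grass", END]
    position = len(words) - 1  # words generated so far, excluding START
    target = sentence[min(position, len(sentence) - 1)]
    return {word: (1.0 if word == target else 0.0) for word in sentence}
```

In a real model, `image_features` would come from the CNN encoder and `next_word_scores` would wrap one step of the RNN decoder.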

Getting started

git clone https://github.com/mmilunovic/a-picture-is-a-thousand-words.git
cd a-picture-is-a-thousand-words
pip install -r requirements.txt

Usage

apply_model_to_image_raw_bytes(open("test-image.jpg", "rb").read())
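
If you caption several files, a small wrapper keeps the file handling in one place. This is a sketch under assumptions: `caption_image_file` is a hypothetical helper, not part of this repository, and it takes the captioning function as an argument because this README does not say which module defines `apply_model_to_image_raw_bytes` or what it returns.

```python
from pathlib import Path
from typing import Any, Callable


def caption_image_file(path: str, captioner: Callable[[bytes], Any]) -> Any:
    """Read an image file as raw bytes and hand them to a captioning function."""
    image_path = Path(path)
    if not image_path.is_file():
        raise FileNotFoundError(f"No image found at {image_path}")
    # read_bytes() opens the file, reads it fully, and closes it again.
    return captioner(image_path.read_bytes())
```

With the model loaded, a call might look like `caption_image_file("test-image.jpg", apply_model_to_image_raw_bytes)`.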

Training the model yourself

If you want to train the model yourself, you'll need to download the training and validation datasets and place them in the train_data and test_data directories.
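
Before starting a training run, it can help to confirm that both directories exist and actually contain files. The check below is a sketch, not part of this repository: `count_dataset_files` is a hypothetical helper, and it assumes train_data and test_data sit directly under the repository root.

```python
from pathlib import Path
from typing import Dict


def count_dataset_files(root: str = ".") -> Dict[str, int]:
    """Return the number of files found under train_data and test_data."""
    counts: Dict[str, int] = {}
    for name in ("train_data", "test_data"):
        directory = Path(root) / name
        if not directory.is_dir():
            raise FileNotFoundError(f"Expected dataset directory: {directory}")
        # Count files recursively so nested archive layouts are included.
        counts[name] = sum(1 for p in directory.rglob("*") if p.is_file())
    return counts
```

Running `count_dataset_files()` from the repository root returns a dictionary with one count per directory; a count of zero suggests the download or extraction step was missed.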
