This is an experimental TensorFlow implementation of synthesizing images from captions using Skip Thought Vectors. The images are synthesized using the GAN-CLS algorithm from the paper Generative Adversarial Text-to-Image Synthesis. This implementation is built on top of the excellent DCGAN in Tensorflow. The following is the model architecture; the blue bars represent the text encoding using Skip Thought Vectors.
Image Source : Generative Adversarial Text-to-Image Synthesis Paper
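For readers unfamiliar with GAN-CLS, its matching-aware objective can be sketched in a few lines. The discriminator scores three kinds of pairs: a real image with its matching caption, a real image with a mismatched caption, and a generated image with its matching caption. The function below is an illustrative sketch and not code from this repository: the name gan_cls_losses, the eps guard, and the plain Python lists standing in for tensors are all assumptions. The 0.5 weighting follows the algorithm in the paper, written here as losses to minimize; this implementation may weight the terms differently.

```python
import math


def gan_cls_losses(s_real, s_wrong, s_fake, eps=1e-8):
    """Return (d_loss, g_loss) averaged over a batch.

    s_real  : D(real image, matching caption) probabilities
    s_wrong : D(real image, mismatched caption) probabilities
    s_fake  : D(generated image, matching caption) probabilities
    All inputs are hypothetical lists of floats in (0, 1).
    """
    n = len(s_real)
    d_loss = 0.0
    g_loss = 0.0
    for sr, sw, sf in zip(s_real, s_wrong, s_fake):
        # Discriminator: accept the real/matching pair, reject the other two.
        d_loss -= math.log(sr + eps)
        d_loss -= 0.5 * (math.log(1.0 - sw + eps) + math.log(1.0 - sf + eps))
        # Generator: make the discriminator accept the generated image.
        g_loss -= math.log(sf + eps)
    return d_loss / n, g_loss / n
```

The mismatched-caption term is what distinguishes GAN-CLS from a plain conditional GAN: the discriminator is penalized for accepting a realistic image paired with the wrong text, so it has to judge image-text agreement as well as realism.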
- Python 2.7.6
- Tensorflow
- h5py
- Theano : for skip thought vectors
- scikit-learn : for skip thought vectors
- NLTK : for skip thought vectors
- The model is currently trained on the flowers dataset. Download the images from this link and save them in Data/flowers/jpg. Also download the captions from this link. Extract the archive, copy the text_c_10 folder, and paste it in Data/flowers.
- Download the pretrained models and vocabulary for skip thought vectors as per the instructions given here. Save the downloaded files in Data/skipthoughts.
- Make empty directories in Data: Data/samples, Data/val_samples and Data/Models. They will be used for sampling the generated images while training.
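The empty directories can also be created with a short script rather than by hand. This is a minimal sketch: the helper name make_output_dirs is hypothetical, and only the three folder names come from the instructions above. It avoids the exist_ok argument so that it also runs on Python 2.7.

```python
import os


def make_output_dirs(base_dir):
    """Create the samples, val_samples and Models folders under base_dir."""
    created = []
    for name in ("samples", "val_samples", "Models"):
        path = os.path.join(base_dir, name)
        # Skip folders that already exist so the helper is safe to re-run.
        if not os.path.isdir(path):
            os.makedirs(path)
        created.append(path)
    return created
```

Calling make_output_dirs("Data") from the repository root should leave Data/samples, Data/val_samples and Data/Models in place.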
- Data Processing : Extract the skip thought vectors for the flowers data set using :
python data_loader.py --data_set="flowers"
- Training
- Basic usage
python train.py --data_set="flowers"
- Options
  - z_dim: Noise dimension. Default is 100.
  - t_dim: Text feature dimension. Default is 256.
  - batch_size: Batch size. Default is 64.
  - image_size: Image dimension. Default is 64.
  - gf_dim: Number of conv filters in the first layer of the generator. Default is 64.
  - df_dim: Number of conv filters in the first layer of the discriminator. Default is 64.
  - gfc_dim: Dimension of generator units for the fully connected layer. Default is 1024.
  - caption_vector_length: Length of the caption vector. Default is 1024.
  - data_dir: Data directory. Default is Data/.
  - learning_rate: Learning rate. Default is 0.0002.
  - beta1: Momentum for Adam update. Default is 0.5.
  - epochs: Max number of epochs. Default is 600.
  - resume_model: Resume training from a pretrained model path.
  - data_set: Data set to train on. Default is flowers.
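As a quick reference, the options above could be declared with argparse as in the sketch below. This is an assumption about how such flags are typically wired up and not a copy of train.py: the function name build_parser and the help strings are hypothetical, while the flag names and defaults are the ones listed above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented training options."""
    p = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    p.add_argument("--z_dim", type=int, default=100, help="Noise dimension")
    p.add_argument("--t_dim", type=int, default=256, help="Text feature dimension")
    p.add_argument("--batch_size", type=int, default=64, help="Batch size")
    p.add_argument("--image_size", type=int, default=64, help="Image dimension")
    p.add_argument("--gf_dim", type=int, default=64, help="Generator first-layer filters")
    p.add_argument("--df_dim", type=int, default=64, help="Discriminator first-layer filters")
    p.add_argument("--gfc_dim", type=int, default=1024, help="Generator fully connected units")
    p.add_argument("--caption_vector_length", type=int, default=1024, help="Caption vector length")
    p.add_argument("--data_dir", type=str, default="Data/", help="Data directory")
    p.add_argument("--learning_rate", type=float, default=0.0002, help="Learning rate")
    p.add_argument("--beta1", type=float, default=0.5, help="Momentum for Adam update")
    p.add_argument("--epochs", type=int, default=600, help="Max number of epochs")
    p.add_argument("--resume_model", type=str, default=None, help="Path of a model to resume from")
    p.add_argument("--data_set", type=str, default="flowers", help="Data set to train on")
    return p
```

With a parser like this, build_parser().parse_args(["--data_set=flowers", "--epochs", "10"]) would override the epoch count and keep every other default.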
- Generating Images from Captions
- Write the captions in a text file and save it as Data/sample_captions.txt. Generate the skip thought vectors for these captions using:
python generate_thought_vectors.py --caption_file="Data/sample_captions.txt"
- Generate the Images for the thought vectors using:
python generate_images.py --model_path=<path to the trained model>
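The two generation steps can be prepared from Python as well. The sketch below writes a captions file and builds the two command lines shown above. It assumes one caption per line, which the instructions do not state explicitly, and the helper names write_captions and generation_commands are hypothetical. The model path is left as a parameter because it depends on where training saved the model.

```python
import os


def write_captions(captions, path):
    """Write captions to path, one per line (assumed file format)."""
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with open(path, "w") as f:
        for caption in captions:
            f.write(caption.strip() + "\n")


def generation_commands(caption_file, model_path):
    """Return the two commands from the steps above as argument lists."""
    return [
        ["python", "generate_thought_vectors.py", "--caption_file=" + caption_file],
        ["python", "generate_images.py", "--model_path=" + model_path],
    ]
```

Each returned list can be passed to subprocess.check_call in order, from the repository root, once the skip thought models and a trained model are available.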
Caption | Actual Image | Generated Images |
---|---|---|
- Train the model on the MS-COCO data set. The dataset is huge, and with the resources I have it will take several days to train the model.
- Try out different caption embeddings. Also try to train the caption embedding RNN along with the model.