A faster, smaller GAN leveraging the power of pre-trained CLIP for efficient, high-quality image synthesis from text.

VinayHajare/EfficientCLIP-GAN

License: GNU GPL v2.0 · Python 3.9

EfficientCLIP-GAN: High-Speed Image Generation with Compact CLIP-GAN Architecture

A high-quality, fast, and efficient text-to-image synthesis model

Generated Images

Requirements

  • Python 3.9
  • PyTorch 1.9
  • At least one Tesla V100 32GB GPU (for training)
  • CPU only (for inference; no GPU required)
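
A quick way to compare an environment against the list above is a small script like the following. It is a minimal sketch: the helper name `check_environment` is hypothetical and not part of the repository, and it only reports what is installed rather than enforcing anything.

```python
import importlib.util
import sys


def check_environment(required_python=(3, 9)):
    """Report whether the interpreter and PyTorch match the requirements list.

    Hypothetical helper, not part of the EfficientCLIP-GAN repository.
    """
    report = {
        # Compare only major.minor, since the list names Python 3.9.
        "python_version": "%d.%d" % sys.version_info[:2],
        "python_matches": tuple(sys.version_info[:2]) == tuple(required_python),
        # find_spec returns None when the package is not installed.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": None,
    }
    if report["torch_installed"]:
        import torch

        report["torch_version"] = torch.__version__
        # A GPU matters for training only; inference runs on the CPU.
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False, inference should still work, but training on that machine is unlikely to be practical.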

EfficientCLIP-GAN is a small, fast, and efficient generative model that, in contrast to diffusion models, can generate multiple images per second even on a CPU.

Installation

Clone this repo and install its dependencies.

git clone https://github.com/VinayHajare/EfficientCLIP-GAN
pip install -r EfficientCLIP-GAN/requirements.txt

Install CLIP

Preparation

Datasets

  1. Download the preprocessed metadata for birds and extract it to data/
  2. Download the birds image data and extract it to data/birds/

OR

  1. Download the preprocessed metadata and the CUB dataset as a single zip file and extract it to data/
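
After extraction, the two locations named above (`data/` and `data/birds/`) should exist and be non-empty. The sketch below checks only that. The function name `check_dataset_layout` is hypothetical, and because the files inside each directory are not listed here, it does not look for any particular file names.

```python
from pathlib import Path


def check_dataset_layout(root="data"):
    """Return a list of problems found under the dataset root (empty if none).

    Hypothetical helper: it checks only the two directories the steps above
    mention and makes no assumption about the files inside them.
    """
    root = Path(root)
    problems = []
    for directory in (root, root / "birds"):
        if not directory.is_dir():
            problems.append(f"missing directory: {directory}")
        elif not any(directory.iterdir()):
            problems.append(f"empty directory: {directory}")
    return problems
```

An empty list means both directories are present and contain something; it does not confirm that the download itself is complete.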

Training

cd EfficientCLIP-GAN/code/

Train the EfficientCLIP-GAN model

  • For bird dataset: bash scripts/train.sh ./cfg/bird.yml

Resume training process

If your training process is interrupted unexpectedly, set state_epoch, log_dir, and pretrained_model_path in train.sh with appropriate values to resume training.
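
When resuming, `state_epoch` and `pretrained_model_path` usually refer to the most recent checkpoint. The sketch below finds it in a directory, assuming checkpoints are saved with names like `state_epoch_010.pth`. That naming pattern and the helper name `latest_checkpoint` are assumptions for illustration, so check them against the files your run actually wrote.

```python
import re
from pathlib import Path

# Assumed file-name pattern, e.g. "state_epoch_010.pth"; adjust if yours differs.
CHECKPOINT_PATTERN = re.compile(r"state_epoch_(\d+)\.pth$")


def latest_checkpoint(checkpoint_dir):
    """Return (epoch, path) for the highest-numbered checkpoint, or None."""
    best = None
    for path in Path(checkpoint_dir).iterdir():
        match = CHECKPOINT_PATTERN.search(path.name)
        if match is None:
            continue  # Skip files that are not checkpoints.
        epoch = int(match.group(1))
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return best
```

The returned epoch and path are candidate values for `state_epoch` and `pretrained_model_path` in train.sh.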

TensorBoard

Our code supports automated FID evaluation during training; the results are stored in TensorBoard files under ./logs. You can change how often the evaluation runs by setting test_interval in the YAML file.

  • For bird dataset: tensorboard --logdir=./code/logs/bird/train --port 8166
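
For reference, FID (Fréchet Inception Distance) compares the feature statistics of real and generated images, and lower is better. Writing the feature mean and covariance as $(\mu_r, \Sigma_r)$ for real images and $(\mu_g, \Sigma_g)$ for generated ones (symbols introduced here for illustration), the standard definition is:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

This is the general definition only. The feature extractor and the number of samples used by this repository's evaluation are not stated here.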

Evaluation

Download Pretrained Model

Evaluate EfficientCLIP-GAN model

cd EfficientCLIP-GAN/code/

Set pretrained_model in test.sh to the path of the downloaded model.

  • For bird dataset: bash scripts/test.sh ./cfg/bird.yml

Performance

The released model achieves better performance than Latent Diffusion. Its scores on the birds dataset are:

Model              Birds-FID↓  Birds-CS↑
EfficientCLIP-GAN  11.806      31.70
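
The CS column presumably denotes a CLIP-based text-image similarity score, where higher means the image agrees better with its caption. A common formulation is the cosine similarity between the CLIP image and text embeddings, scaled by 100. The sketch below computes that for two plain Python vectors. The scaling factor, the function name `clip_score`, and the use of plain lists are assumptions for illustration, and the repository's own evaluation code may differ in detail.

```python
import math


def clip_score(image_embedding, text_embedding, scale=100.0):
    """Scaled cosine similarity between two embedding vectors.

    Illustrative only: real scores use embeddings produced by CLIP's image
    and text encoders, averaged over a test set.
    """
    if len(image_embedding) != len(text_embedding):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(image_embedding, text_embedding))
    norm_image = math.sqrt(sum(a * a for a in image_embedding))
    norm_text = math.sqrt(sum(b * b for b in text_embedding))
    if norm_image == 0.0 or norm_text == 0.0:
        raise ValueError("embeddings must be non-zero")
    return scale * dot / (norm_image * norm_text)
```

Under this formulation, identical directions score 100 and orthogonal ones score 0.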

Try Now

The Gradio demo is available as a hosted Hugging Face Space.
You can also run the app locally:

cd "EfficientCLIP-GAN/gradio app"
pip install -r requirements.txt

then

python app.py

Note: Weights are available on the Hugging Face Hub.

Inference (Sampling)

Synthesize images from your own text prompts.

  • Use the inference.ipynb notebook to sample images.
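
To check generation speed on your own machine, you can time whatever sampling call you use in the notebook. The helper below is a minimal sketch: `measure_throughput` and the `generate` callable are hypothetical names, and `generate` stands in for your own function that produces one batch of images.

```python
import time


def measure_throughput(generate, batch_size, n_batches=5, warmup=1):
    """Return images generated per second by a user-supplied callable.

    `generate` is a placeholder for your own sampling function; it is
    called with `batch_size` and its return value is ignored.
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    # Warm-up calls are excluded so one-time setup cost is not timed.
    for _ in range(warmup):
        generate(batch_size)
    start = time.perf_counter()
    for _ in range(n_batches):
        generate(batch_size)
    elapsed = time.perf_counter() - start
    if elapsed == 0:
        return float("inf")  # Timer resolution too coarse to measure.
    return (batch_size * n_batches) / elapsed
```

Results depend heavily on the CPU, batch size, and image resolution, so treat a single measurement as a rough indication only.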

Support EfficientCLIP-GAN

If you find this useful in your research, please consider starring the repository.

The code is released for academic research use only. For commercial use, please contact Vinay Hajare.
