This repo contains the code for my independent study research project, "Analysis of visual text generation".
The open-source Imagen code is provided via https://github.com/lucidrains/imagen-pytorch; the local copy in this repo contains the modifications to that project needed for this research.
This dataset provides a simple baseline for text generation in the image space, using binary (black-and-white) colors. It is generated programmatically with the image manipulation library PIL. The goal is to generate text inside images with two varying factors:
- Position of the text
- Style of the text
In addition, the style and position are captured in an associated caption/label for each image. Ideally these captions help models learn the keywords, mapping features such as position and style to the text rendered in the output image.
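As a rough illustration, a caption can be composed from the style, word, and position phrase. The template below is a minimal sketch inferred from the example captions at the end of this README; the exact wording lives in generate.py and may differ:

```python
# Hypothetical caption template; the exact phrasing in generate.py may differ.
def make_caption(style: str, word: str, position_phrase: str) -> str:
    return f"the {style} text '{word}' {position_phrase}"

print(make_caption("cursive", "forewish", "at the bottom"))
# -> the cursive text 'forewish' at the bottom
```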
Positions are currently limited to the following:
- Top Left
- Top Right
- Bottom
- Center
Styles are limited to the following:
- Serif
- Sans Serif
- Cursive
- Hand Painted
Words are sourced from the english-words package. By default, the position and style are chosen at random to promote diversity in the dataset.
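One generation step can be sketched with PIL's ImageDraw and ImageFont. This is a minimal sketch, not the actual implementation: the font files, text size, and pixel anchors below are placeholders, not the values generate.py uses.

```python
import random

from PIL import Image, ImageDraw, ImageFont

# Placeholder font files; generate.py maps each style to a real font on disk.
FONTS = {
    "Serif": "fonts/serif.ttf",
    "Sans Serif": "fonts/sans.ttf",
    "Cursive": "fonts/cursive.ttf",
    "Hand Painted": "fonts/hand_painted.ttf",
}

# Placeholder pixel anchors for each position phrase.
POSITIONS = {
    "in the top left": (16, 16),
    "in the top right": (160, 16),
    "at the bottom": (96, 224),
    "in the middle": (96, 120),
}

def generate_sample(word: str, size=(256, 256)):
    """Draw one word on a binary image and return it with its caption."""
    style = random.choice(list(FONTS))
    position = random.choice(list(POSITIONS))
    image = Image.new("1", size, color=1)  # mode "1" = binary (1-bit) pixels
    draw = ImageDraw.Draw(image)
    font = ImageFont.truetype(FONTS[style], 24)
    draw.text(POSITIONS[position], word, font=font, fill=0)
    caption = f"the {style} text '{word}' {position}"
    return image, caption
```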
## Generating the dataset

```bash
python generate.py
```
This code includes a PyTorch-compatible wrapper class for loading the images:
```python
from dataset import TextInImageDataset

dataset = TextInImageDataset(csv_file='data.csv', imgs_dir='imgs/')
```
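For training, the wrapper can be plugged into a standard PyTorch DataLoader. The sketch below assumes each item is an (image tensor, caption string) pair, which the snippet above does not confirm:

```python
from torch.utils.data import DataLoader

# Assumes `dataset` from the snippet above yields (image tensor, caption) pairs.
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for images, captions in loader:
    # images: (32, C, H, W) tensor batch; captions: batch of 32 caption strings
    ...
```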
Some examples of the dataset are shown below:
- Caption: the cursive text 'forewish' at the bottom
- Caption: the Hand-Painted text 'chasubled' in the top right
- Caption: the Serif text 'triobolon' in the middle