Images are very distorted #36

Open
khawar-islam opened this issue Oct 20, 2022 · 6 comments

Comments

@khawar-islam

Hello @moonbings, during synthetic dataset generation, some images are very distorted and I don't have any idea how to fix it. I played with some parameters, but it didn't work for me. Is there any solution?

The sample images are not clear:
[Attached images: 23, 33, 83]

@moonbings
Collaborator

moonbings commented Oct 23, 2022

Hi,

This engine often generates poor images. There are three problems in this engine.
Here are some ways to address them.


1. Color difference threshold

Currently, this engine converts the image to grayscale and checks the color difference at the border between text and background. If the color difference does not exceed a given threshold, the image is skipped.
Adjusting the threshold may produce better images.
Change the loDiff and upDiff arguments of the floodFill call from 16, 16 to 0, 16 or 0, 32 in this code.

cv2.floodFill(gray, visit, (x, y), 1, 16, 16, flag)
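
To make the effect of loDiff and upDiff concrete, here is a minimal pure-Python sketch of a flood fill in fixed-range mode. It is not the engine's code and not OpenCV's implementation: `flood_fill_fixed_range` is a hypothetical helper that only mimics the documented fixed-range rule (a pixel joins the region when seed - loDiff <= pixel <= seed + upDiff), and it assumes that mode is the relevant one, since the value of `flag` is not shown here.

```python
from collections import deque


def flood_fill_fixed_range(gray, seed, lo_diff, up_diff):
    """Return the set of (x, y) pixels 4-connected to `seed` whose value v
    satisfies seed_value - lo_diff <= v <= seed_value + up_diff."""
    height, width = len(gray), len(gray[0])
    sx, sy = seed
    seed_value = gray[sy][sx]
    region = {(sx, sy)}
    queue = deque([(sx, sy)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and (nx, ny) not in region:
                value = gray[ny][nx]
                if seed_value - lo_diff <= value <= seed_value + up_diff:
                    region.add((nx, ny))
                    queue.append((nx, ny))
    return region


# Hypothetical one-row image: a text pixel (100), a slightly darker
# neighborhood (90, 90), and a much brighter pixel (200).
row = [[100, 90, 90, 200]]

wide = flood_fill_fixed_range(row, (0, 0), 16, 16)   # accepts 84..116
narrow = flood_fill_fixed_range(row, (0, 0), 0, 16)  # accepts 100..116
```

With 16, 16 the fill spreads from the seed into the two darker pixels; with 0, 16 it stays on the seed pixel, because darker neighbors no longer count as similar. How that changes which images get skipped depends on how the engine uses the filled region.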

2. Color difference between text and text effect

The first solution has a limitation.
Currently, this engine does not consider the color difference between the text and its text effect (see ex2 "그릇").
You can address this by adding logic that checks the color difference between the text and the text effect.
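
One possible shape for such a check, as an untested sketch: compare the average gray level of the text pixels with that of the effect pixels (border, shadow, or extrusion) and reject the sample when they are too close. The function name `effect_is_distinct`, the flat pixel lists, and the threshold of 16 are all assumptions for illustration; none of them come from the engine.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def effect_is_distinct(text_pixels, effect_pixels, threshold=16):
    """Return True when the mean gray level of the text differs from the
    mean gray level of the text effect by more than `threshold`.

    Both arguments are flat lists of gray values (0..255). If either list
    is empty there is nothing to compare, so the sample is accepted.
    """
    if not text_pixels or not effect_pixels:
        return True
    return abs(mean(text_pixels) - mean(effect_pixels)) > threshold


# Hypothetical samples: a light border around light text is rejected,
# a dark border around light text is accepted.
too_similar = effect_is_distinct([200, 210], [205, 195])
clearly_different = effect_is_distinct([200, 210], [40, 50])
```

A mean-based comparison is the simplest choice; a real check would probably need to look only at pixels near the boundary between text and effect.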

3. Post-processing

Currently, this engine does not take post-processing into account (see ex1 "가?" and ex3 "긍정적").
Post-processing such as blur can make the text invisible even when the color difference between text and background is large.
Adding a color difference check after post-processing may generate better images.
Add the following code in this code.

image, mask = self._postprocess_image(image, mask)

if not _check_visibility(image, fg_image[..., 3]):
    raise RuntimeError("Text is not visible")
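
To illustrate why the check has to run after post-processing, here is a small self-contained sketch with made-up numbers. It blurs a one-pixel-wide stroke with a simple moving average (a stand-in for the engine's blur, not its actual filter) and compares the contrast before and after against a threshold of 16. `box_blur` and all values are hypothetical.

```python
def box_blur(row, radius):
    """Blur a 1-D row of gray values with a moving average.
    Indices outside the row are clamped to the nearest edge pixel."""
    n = len(row)
    blurred = []
    for i in range(n):
        window = [row[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


THRESHOLD = 16
background = 120

# A thin stroke (gray level 160) at index 4 on a uniform background.
row = [background] * 4 + [160] + [background] * 4
blurred = box_blur(row, radius=2)

contrast_before = abs(row[4] - background)
contrast_after = abs(blurred[4] - background)

visible_before = contrast_before > THRESHOLD
visible_after = contrast_after > THRESHOLD
```

Here the stroke passes the threshold before the blur (contrast 40) but fails it afterwards (contrast 8), which is the situation a check placed only before post-processing cannot catch.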

I have suggested some solutions, but they have not been tested.
The actual results may be worse than expected.

Thanks.

@aadit2697

aadit2697 commented Mar 2, 2023

@moonbings I was facing a similar issue. Solution 1 that you provided helped to some extent.
I also tried solution 3, but to no avail. About 1 in 100 images is distorted, and this will affect the training data for my OCR application.
Can you suggest a workaround?
Edit: Here is an example [ https://user-images.githubusercontent.com/39117677/222642021-e3d40153-ce93-4455-9422-881d91c478ea.jpg ]

The word is: દીધું!

@khawar-islam Were you able to solve this issue?

@khawar-islam
Author

@aadit2697 No, I am still facing the same issue, and of course it affects training.

@aadit2697

aadit2697 commented Mar 8, 2023

@khawar-islam
You may want to play around with the style and postprocess settings in config.yaml. These numbers worked for me.
Hope this helps! Do let us know if this works out for you :)

style:
  prob: 0.25
  args:
    weights: [1, 2, 2]
    args:
      # text border
      - size: [1, 2]
        alpha: [1, 1]
        grayscale: 0
      # text shadow
      - distance: [1, 2]
        angle: [0, 0]
        alpha: [0.3, 0.7]
        grayscale: 0
      # text extrusion
      - length: [1, 2]
        angle: [0, 360]
        alpha: [1, 1]
        grayscale: 0
postprocess:
  args:
    # gaussian noise
    - prob: 0.0
      args:
        scale: [4, 8]
        per_channel: 0
    # gaussian blur
    - prob: 0.0
      args:
        sigma: [0, 2]
    # resample
    - prob: 0.0
      args:
        size: [0.4, 0.4]
    # median blur
    - prob: 0.0
      args:
        k: [1, 1]

@khawar-islam
Author

khawar-islam commented Mar 8, 2023

@aadit2697 Thank you!

What do you think about the two parameters min_length: 6 and max_length: 25? I have already modified them, but the images are still sometimes distorted. Would reducing max_length help generate a cleaner dataset? What is your opinion?

@aadit2697
You could try it out. Personally, for my use case, I used the default values for the max and min length. That said, if your words are shorter than 25 characters, you could give it a shot.
