Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kolors] Add PAG #8934

Merged
merged 16 commits into from
Aug 7, 2024
Merged

[Kolors] Add PAG #8934

merged 16 commits into from
Aug 7, 2024

Conversation

asomoza
Copy link
Member

@asomoza asomoza commented Jul 22, 2024

What does this PR do?

Adds PAG to kolors

Note: Kolors with PAG behaves different than with SDXL, I had to use a lower scale and different layers to get good results.

How to test

Txt2Img

import torch

from diffusers import AutoPipelineForText2Image, DPMSolverMultistepScheduler


pipe = AutoPipelineForText2Image.from_pretrained(
    "Kwai-Kolors/Kolors-diffusers",
    torch_dtype=torch.float16,
    variant="fp16",
    enable_pag=True,
    pag_applied_layers=["down.blocks.2.attentions.1", "up.blocks.0.attentions.1"],
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)

prompt = "cinematic film still of a pirate ship is navigating the milk inside of a steaming cup of coffee. The ship’s wooden hull glistens with droplets of coffee, and tiny pirates can be seen on deck, adjusting sails and peering through miniature telescopes. The tilt-shift effect adds a magical touch, high budget hollywood movie, cinemascope, epic, film grain"

image = pipe(
    prompt=prompt,
    negative_prompt="",
    guidance_scale=5.5,
    num_inference_steps=25,
    pag_scale=1.5,
).images[0]
W/O PAG
20240721003413_227659598 20240722134235_227659598

After the PAG refactor, the image changed a bit, not sure why:

before after
20240722134235_227659598 20240806211616_227659598

Who can review?

@yiyixuxu @a-r-r-o-w @sunovivid @sayakpaul

@asomoza asomoza changed the title Kolors pag [Kolors] Add PAG Jul 22, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Collaborator

@yiyixuxu yiyixuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!
add a test?

@asomoza
Copy link
Member Author

asomoza commented Jul 26, 2024

Added the fast tests and I fixed the old ones because I needed to get the original Kolors Pipeline to test the AutoPipeline. In short, I had to add all the missing tests I had as TODO for kolors.

@yiyixuxu @sayakpaul should I add the slow tests for PAG? this one has a really big text encoder which is not in transformers yet.

@sayakpaul
Copy link
Member

@yiyixuxu @sayakpaul should I add the slow tests for PAG? this one has a really big text encoder which is not in transformers yet.

I would say, it'd be okay skipping the SLOW tests for now.

@a-r-r-o-w
Copy link
Member

We should try and merge this before next release, no?

Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's add the pipeline to the list of pipelines we support in the docs?

Comment on lines +146 to +148
@unk_token.setter
def unk_token(self, value: str):
self._unk_token = value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like these should have been done in a separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I don't do this, the tests fails (throws AttributeError), I'm ok with doing this in a separate PR but then I'll have to skip the tests in this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay. Then no worries.

Comment on lines -136 to +137
# throws AttributeError: property 'eos_token' of 'ChatGLMTokenizer' object has no setter
# not sure if it is worth to fix it before integrating it to transformers
def test_save_load_optional_components(self):
# TODO (Alvaro) need to fix later
pass
super().test_save_load_optional_components(expected_max_difference=2e-4)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. Probably should have been done in a separate PR.

@sayakpaul sayakpaul merged commit 39e1f7e into huggingface:main Aug 7, 2024
15 checks passed
@asomoza asomoza deleted the kolors-pag branch August 7, 2024 04:01
@yiyixuxu yiyixuxu added the PAG label Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants