Support PuLID #2838

Merged
merged 7 commits into from
May 4, 2024

huchenlei
Collaborator

@huchenlei huchenlei commented May 3, 2024

Closes #2835.

Overview

PuLID is an IP-Adapter-like method for restoring facial identity. Like the IP-Adapter FaceID Plus model, it uses both an insightface embedding and a CLIP embedding. Unlike FaceID Plus, it has an extra step that uses facexlib to mask the face out from the background before the image is passed to CLIP, and it uses EVA CLIP instead of the standard CLIP. In the attention overrides, PuLID also does more than IP-Adapter: it zero-pads the tensor and adds an orthogonal projection to the hidden states. If you are interested, you can read their paper to see how this is handled.
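The two attention-side steps mentioned above (zero padding and the orthogonal projection) can be illustrated with a small NumPy sketch. This is only an illustration under assumptions, not the code from this PR: the function names, the tensor layout `(batch, tokens, channels)`, the choice to project along the token axis, and the `eps` guard are all hypothetical.

```python
import numpy as np


def zero_pad_tokens(id_embedding, num_zero):
    """Append `num_zero` all-zero tokens along the token axis (axis 1).

    Assumed layout: (batch, tokens, channels).
    """
    batch, _, channels = id_embedding.shape
    zeros = np.zeros((batch, num_zero, channels), dtype=id_embedding.dtype)
    return np.concatenate([id_embedding, zeros], axis=1)


def orthogonal_component(hidden_states, id_hidden_states, eps=1e-12):
    """Return the part of `id_hidden_states` orthogonal to `hidden_states`.

    The projection is computed per channel over the token axis (axis -2);
    this axis choice is an assumption made for the sketch.
    """
    dot = np.sum(hidden_states * id_hidden_states, axis=-2, keepdims=True)
    norm_sq = np.sum(hidden_states * hidden_states, axis=-2, keepdims=True)
    projection = dot / (norm_sq + eps) * hidden_states
    return id_hidden_states - projection


def add_identity(hidden_states, id_hidden_states, weight=1.0):
    """Add the weighted orthogonal component to the original hidden states."""
    return hidden_states + weight * orthogonal_component(
        hidden_states, id_hidden_states
    )
```

Because only the orthogonal component is added, the identity signal cannot directly rescale what the hidden states already encode, which is one plausible reading of why this step is used.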

image

How to use

Install sd-webui-controlnet-evaclip extension

PuLID uses both the EVA CLIP embedding and the insightface embedding as input to the proj module, so to use this feature you need to install the https://github.com/huchenlei/sd-webui-controlnet-evaclip extension.

Download Model

You can download the fp16 model here: https://huggingface.co/huchenlei/ipadapter_pulid/resolve/main/ip-adapter_pulid_sdxl_fp16.safetensors
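If you prefer to script the download, a minimal sketch using only the Python standard library could look like this. The `models_dir` argument is an assumption: pass whichever folder your ControlNet installation reads models from. The helper names are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

MODEL_URL = (
    "https://huggingface.co/huchenlei/ipadapter_pulid/resolve/main/"
    "ip-adapter_pulid_sdxl_fp16.safetensors"
)


def model_destination(url, models_dir):
    """Build the local path: <models_dir>/<file name taken from the URL>."""
    filename = Path(urlparse(url).path).name
    return Path(models_dir) / filename


def download_model(url, models_dir):
    """Download the file unless it already exists, and return its path."""
    dest = model_destination(url, models_dir)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urlretrieve(url, str(dest))
    return dest
```

For example, `download_model(MODEL_URL, "path/to/your/controlnet/models")` fetches the file once and reuses it on later calls.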

ControlNet Unit Setting

image
There is an extra radio group that lets you select the PuLID mode. In my testing, the modes make only a slight difference, but it is still a nice option to have.

Example

Input
liuyifei

Output [Fidelity]:
image

Output [Style]:
image

Generation params

masterpiece painting, buildings in the backdrop, kaleidoscope, lilac orange blue cream fuchsia bright vivid gradient colors, the scene is cinematic, (1girl:1.2), portrait,, emotional realism, double exposure, watercolor ink pencil, graded wash, color layering, magic realism, figurative painting, intricate motifs, organic tracery, pol
Negative prompt: flaws in the eyes, flaws in the face, flaws, lowres, non-HDRi, low quality, worst quality,artifacts noise, text, watermark, glitch, deformed, mutated, ugly, disfigured, hands, low resolution, partially rendered objects, deformed or partially rendered eyes, deformed, deformed eyeballs, cross-eyed,blurry
Steps: 20, Sampler: Euler a, Schedule type: Automatic, CFG scale: 7, Seed: 3390099943, Size: 1024x1024, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Clip skip: 2, ControlNet 0: "Module: ip-adapter-auto, Model: ip-adapter_pulid_sdxl_fp16 [d86d05ea], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Processor Res: 512, Threshold A: 0.5, Threshold B: 0.5, Guidance Start: 0, Guidance End: 1, Pixel Perfect: False, Control Mode: Balanced, Hr Option: Both, Save Detected Map: True", Version: v1.9.3-2-g3d3fc81f
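To compare unit settings between runs, the quoted `ControlNet 0` string above can be split into key/value pairs. The helper below is a naive sketch written for this example, not the extension's own infotext parser: it assumes fields are separated by `", "`, that keys and values are separated by `": "`, and that no value contains either separator.

```python
def parse_unit_infotext(text):
    """Split 'Key: value, Key: value, ...' into a dict of strings."""
    fields = {}
    for item in text.strip('"').split(", "):
        key, _, value = item.partition(": ")
        fields[key.strip()] = value.strip()
    return fields


# The unit string copied from the generation params above.
unit_text = (
    '"Module: ip-adapter-auto, Model: ip-adapter_pulid_sdxl_fp16 [d86d05ea], '
    "Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, "
    "Processor Res: 512, Threshold A: 0.5, Threshold B: 0.5, "
    "Guidance Start: 0, Guidance End: 1, Pixel Perfect: False, "
    'Control Mode: Balanced, Hr Option: Both, Save Detected Map: True"'
)
unit = parse_unit_infotext(unit_text)
```

All values stay as strings here; converting `Weight` or `Processor Res` to numbers is left to the caller.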

Note

Compared to InstantID, you need a stronger prompt to make the output stylized. The following is the same neon style generated by InstantID:
300220459-2093167a-6d95-4855-a8cf-4c036f2aefad

Known issue

Together, facexlib and EVA CLIP consume about 2~3 GB of VRAM, which can be an issue if you have low VRAM. About 500 MB of VRAM also remains allocated after the preprocessors are unloaded. I need to investigate further why this VRAM is not properly freed.

@huchenlei huchenlei merged commit 784b6d0 into Mikubill:main May 4, 2024
2 checks passed
@huchenlei huchenlei deleted the pulid branch May 4, 2024 19:58
@George0726
Contributor

Hi author, thanks for the amazing support! I am trying to follow the PuLID mode and the padding/orthogonal logic in your code; however, the authors do not seem to describe this part in the paper?

huchenlei added a commit that referenced this pull request May 6, 2024
* Replace mask upload with effective region mask (#2830)

* 📝 Update to version v1.1.446 (#2832)

* 📝 Update to version v1.1.446

* update readme

* Fix blur_gaussian slider param (#2834)

* Support PuLID (#2838)

* Add preprocessors

* Fix resolution param

* Fix various issues

* Add PuLID attn

* remove unused import

* Resize img before passing to facexlib

* safe unload

* 📝 Update to version v1.1.447 (#2842)

* 📝 Update to version v1.1.447

* update readme

* Allow pulid accept multiple inputs (#2843)

* Quick fix enum issue (#2844)

* Move enum (#2845)

* Move enums to enums.py

* Add missing import

* Remove unused import

* Remove legacy test (#2846)

* Quickfix Enum Issue 2 (#2849)

* Drop external_code. prefix (#2850)

* Drop external_code. prefix

* Remove unused imports

* fix test (#2851)

* Adjust test template's threshold a/b value (#2852)

* Validate ControlNetUnit using pydantic (#2847)

* Add Pydantic ControlNetUnit

Add test config

Add images field

Adjust image field handling

fix various ui

Fix most UI issues

accept greyscale image/mask

Fix infotext

Fix preset

Fix infotext

nit

Move infotext parsing test

Remove preset

Remove unused js code

Adjust test payload

By default disable unit

refresh enum usage

Align resize mode

change test func name

remove unused import

nit

Change default handling

Skip bound check when not enabled

Fix batch

Various batch fix

Disable batch hijack test

adjust test

fix test expectations

Fix unit copy

nit

Fix test failures

* Change script args back to ControlNetUnit for compatibility

* import enum for compatibility

* Fix unit test

* simplify unfold

* Add test coverage

* handle directly set np image

* re-enable batch test

* Add back canvas scribble support

* nit

* Fix batch hijack test

* 📝 Update to version v1.1.448 (#2853)

* Update UI image (#2855)

* Fix duplicated version logging (#2856)
@yanghongquan521

Which folder should EVA02_CLIP_L_336_psz14_s6B be placed in?

@maqianzhao

Where can EVA02_CLIP_L_336_psz14_s6B be downloaded?

@dan4ik94

@huchenlei Forge support please

1 similar comment
@prog-ape

@huchenlei Forge support please

@eurotaku

@huchenlei
Any idea why the previews first show a great representation of the source image and then it falls apart so badly at the end? I played around a bit with the weights, and blocks 5 and 6 in particular seem to be responsible for the overburned end result. But they also seem to have the main effect on reproducing the person's likeness.

00084-020-3043948951-masterpiece painting buildings in the backdrop kaleidoscope lilac
ken
https://github.com/Mikubill/sd-webui-controlnet/assets/129174287/d5572282-16f7-4df5-ac4d-ef82bd5e27bb

Development

Successfully merging this pull request may close these issues.

[DevTask] Port PuLID
7 participants