
LoRA Adapter #871 (Draft)

slyalin wants to merge 35 commits into master

Conversation

@slyalin (Collaborator) commented Sep 16, 2024

TODO:

  • Eliminate the additional LLMPipeline constructors that take explicit AdapterConfig arguments; use properties instead, as generate already allows (see the sketch after this list).
  • Add a constructor to LLMPipeline that accepts a variadic list of properties. This is more convenient than an AnyMap of properties and is aligned with generate. A straightforward attempt failed because the Property::Forward helper class confuses the compiler when multiple such arguments are passed; it requires a more precise definition of the AdapterConfig constructors.
  • Avoid two-stage weight preparation in MODE_FUSE when multiple LoRAs are combined. Currently there are two separately compiled CPU helper models: one for LoRA concatenation and another for fusing into the main weights. These two helper models can be merged into one.
  • Python API
  • User-friendly comments for publicly exposed classes
  • Functional tests
  • Benchmark
  • Demonstrate one selected variant in the lora_greedy_causal_lm sample instead of a series of different modes.
  • Address the FIXMEs that can be handled without waiting for fixes on the plugin side.
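For reference, a minimal sketch of the property-based usage the first two items aim at, assuming an Adapter object passed through an ov::genai::adapters property both at construction and per generate call. The exact names (ov::genai::Adapter, ov::genai::adapters, the lora_adapter.hpp header, the model path) are assumptions about the intended direction, not necessarily what this draft already exposes:

```cpp
#include <iostream>

#include "openvino/genai/llm_pipeline.hpp"
#include "openvino/genai/lora_adapter.hpp"  // assumed header name for the Adapter class

int main() {
    // Load a LoRA adapter from a safetensors file (path is a placeholder).
    ov::genai::Adapter adapter("adapter_model.safetensors");

    // Register the adapter at pipeline construction via a property,
    // instead of a dedicated LLMPipeline constructor taking AdapterConfig.
    ov::genai::LLMPipeline pipe("TinyLlama", "CPU", ov::genai::adapters(adapter));

    // Select adapters per call, the same way generate() already accepts properties.
    std::cout << pipe.generate("What is LoRA?",
                               ov::genai::adapters(adapter),
                               ov::genai::max_new_tokens(64))
              << std::endl;
}
```

With adapter selection available per generate call, the dedicated AdapterConfig constructors on LLMPipeline become unnecessary.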

slyalin and others added 30 commits August 4, 2024 13:35
… for ReadValue. It works around a bug in CPU with state initialization.
…nd separate base LoRA classes reuse outside (e.g. in SD). Code duplication, not compiling.
…elines with new API (+ several WAs for CPU bugs exposed during testing with various adapters).
…aptersConfig methods. Compilation is broken.
…tensors and non-empty tensors of minimal size.
…e transformation activated by a new config flag.
… defaults, less need of explicit use of AdapterConfig, introduced mode instead of bool flags.
…DE_STATIC explicitly in SD and lcm. Sorted examples in lora sample from static to dynamic. Passed MODE_STATIC_RANK to internal rank calculation.
…ependency. Switch lora sample to CPU. Disable DEBUG_PRINT in Release configuration.
… behaviour in the scenario when adapters are switched without setting alpha.
…ed cached alpha constants -- creating them each time, because they are small.
…a couple of TODOs: migrated to InferRequestCache. Reshuffled the order of functions for a more fluent review.
…ngle class because there is no need for multiple classes in the hierarchy.

Device-dependent LoRA mode deduction in AdapterController. Passed device type in all places where applicable.
…nd compile time as well as LoRA preparation time separately.
@slyalin slyalin marked this pull request as draft September 16, 2024 22:07
@ilya-lavrenov ilya-lavrenov self-assigned this Sep 17, 2024
@ilya-lavrenov ilya-lavrenov added this to the 2024.5 milestone Sep 17, 2024