
LoRA Adapter #871 (Draft)

slyalin wants to merge 35 commits into master

Conversation

@slyalin (Collaborator) commented Sep 16, 2024

TODO:

  • Eliminate the additional LLMPipeline constructors that take explicit AdapterConfig arguments; use properties instead, as generate already allows (see the sketch after this list).
  • Add a constructor to LLMPipeline that accepts a variadic list of properties. This is more convenient than an AnyMap of properties and is aligned with generate. A straightforward attempt failed because the Property::Forward helper class confuses the compiler when multiple such arguments are passed; it requires a more precise definition of the AdapterConfig constructors.
  • Avoid two-stage weight preparation in MODE_FUSE when multiple LoRAs are combined. Currently there are two separately compiled CPU helper models: one for LoRA concatenation and another for fusing into the main weights. These two helper models can be merged into one.
  • Python API
  • User-friendly comments for publicly exposed classes
  • Functional tests
  • Benchmark
  • Demonstrate one selected variant in the lora_greedy_causal_lm sample instead of a series of different modes.
  • Address the FIXMEs that can be handled without waiting for fixes on the plugin side.
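For reference, a minimal sketch of the property-based usage the first two items aim at, assuming an Adapter object passed through an ov::genai::adapters property both at construction and per generate call. The exact names (ov::genai::Adapter, ov::genai::adapters, the lora_adapter.hpp header, the model path) are assumptions about the intended direction, not necessarily what this draft already exposes:

```cpp
#include <iostream>

#include "openvino/genai/llm_pipeline.hpp"
#include "openvino/genai/lora_adapter.hpp"  // assumed header name for the Adapter class

int main() {
    // Load a LoRA adapter from a safetensors file (path is a placeholder).
    ov::genai::Adapter adapter("adapter_model.safetensors");

    // Register the adapter at pipeline construction via a property,
    // instead of a dedicated LLMPipeline constructor taking AdapterConfig.
    ov::genai::LLMPipeline pipe("TinyLlama", "CPU", ov::genai::adapters(adapter));

    // Select adapters per call, the same way generate() already accepts properties.
    std::cout << pipe.generate("What is LoRA?",
                               ov::genai::adapters(adapter),
                               ov::genai::max_new_tokens(64))
              << std::endl;
}
```

With adapter selection available per generate call, the dedicated AdapterConfig constructors on LLMPipeline become unnecessary.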

slyalin and others added 30 commits August 4, 2024 13:35
… for ReadValue. It works around a bug in CPU with state initialization.
…nd separate base LoRA classes reuse outside (e.g. in SD). Code duplication, not compiling.
…elines with new API (+ several WAs for CPU bugs exposed during testing with various adapters).
…aptersConfig methods. Compilation is broken.
…tensors and non-empty tensors of minimal size.
…e transformation activated by a new config flag.
… defaults, less need of explicit use of AdapterConfig, introduced mode instead of bool flags.
…DE_STATIC explicitly in SD and lcm. Sorted examples in lora sample from static to dynamic. Passed MODE_STATIC_RANK to internal rank calculation.
…ependency. Switch lora sample to CPU. Disable DEBUG_PRINT in Release configuration.
… behaviour in the scenario when adapters are switched without setting alpha.
…ed cached alpha constants -- creating them each time, because they are small.
…a couple of TODOs: migrated to InferRequestCache. Reshuffled the order of functions for a more fluent review.
…ngle class because there is no need for multiple classes in the hierarchy.

Device-dependent LoRA mode deduction in AdapterController. Passed device type in all places where applicable.
…nd compile time as well as LoRA preparation time separately.
@slyalin slyalin marked this pull request as draft September 16, 2024 22:07
@ilya-lavrenov ilya-lavrenov self-assigned this Sep 17, 2024
@ilya-lavrenov ilya-lavrenov added this to the 2024.5 milestone Sep 17, 2024