LoRA Adapter #871

Draft: slyalin wants to merge 35 commits into master from dynamic_lora
Changes from 1 commit
Commits (35)
0cd4cc2
Switch to LoRA adapter merge in compile_model. Re-implement LoRA tens…
slyalin Aug 4, 2024
1ce8850
Merge remote-tracking branch 'origin/master' into dynamic_lora
slyalin Aug 4, 2024
d916d99
WIP: New LoRA variants ported to both LLMs and SD
slyalin Aug 5, 2024
2642357
Enable LoRA via env var in text gen pipeline. Suppress too bulky debu…
slyalin Aug 7, 2024
0fa8e30
LoRA weights in states
slyalin Aug 8, 2024
aa0568e
Set state for infer_request and don't use Constant in init expression…
slyalin Aug 8, 2024
fffa5ff
Fixed SD with separate A and B
slyalin Aug 13, 2024
afa7a77
[WIP] LoRA prototype refactoring for better LLMPipeline integration a…
slyalin Aug 16, 2024
6127fef
[WIP] LoRA Refactoring
slyalin Aug 17, 2024
863e81b
[WIP] LoRA refactored: restored functionality of both text and SD pip…
slyalin Aug 19, 2024
55dba14
Clean up the greedy sample with LoRA implemented set method for LoRA …
slyalin Aug 20, 2024
bcc7782
[WIP] Added a dedicated LoRA LLMPipeline generation sample, adjust Ad…
slyalin Aug 21, 2024
e80580e
Polish AdaptersConfig methods. Start changing alphas in a more optimal…
slyalin Aug 21, 2024
5642179
[WIP] Completely OV-inference-based LoRA weights preparation.
slyalin Aug 23, 2024
57dd39d
Restored empty adapter functionality via two alternative ways: empty …
slyalin Aug 23, 2024
3ec4907
Added lora_greedy_causal_lm sample in build.
slyalin Aug 23, 2024
c87fdcf
Report compilation/preparation time in ms instead of sec.
slyalin Aug 23, 2024
d02fbee
Adjust sample to use adapters.remove instead of modifying alpha. Run …
slyalin Aug 23, 2024
08e3dcf
[WIP] Code refactoring: common part + SD
slyalin Aug 25, 2024
33321cb
Fixed old code path in SD with new multi-value lora command line para…
slyalin Aug 25, 2024
c4969fb
[WIP] Code clean up. Expose fuse mode in genai API by re-enabling fus…
slyalin Aug 25, 2024
0724a79
Migrated lcm sample to the new api. Fixed issue with old lora when al…
slyalin Aug 26, 2024
d7312d7
Simplifying high-level API: syntactic sugar for better short cuts and…
slyalin Aug 28, 2024
cae1592
Switched lora sample to GPU. Removed most of the debug output. Use MO…
slyalin Aug 29, 2024
863a050
Removed old LoRA implementation from SD/LCM samples, removed Eigen d…
slyalin Sep 12, 2024
1f5c2ac
Removed setting default alpha in Adapter because of possible confused…
slyalin Sep 12, 2024
504735a
Remove obsolete flags from AdapterConfig and use mode directly. Remov…
slyalin Sep 12, 2024
777084c
Switched image generation samples to their default mode -- MODE_FUSE.
slyalin Sep 12, 2024
dc84ba4
Extend image generation READMEs with a way to provide multiple LoRA ad…
slyalin Sep 13, 2024
5a10d6b
Merge remote-tracking branch 'origin/master' into dynamic_lora
slyalin Sep 13, 2024
75c5242
Removed obsolete parts in the code. Added more comments. Implemented …
slyalin Sep 13, 2024
7bfe9bd
Removed base class for AdapterControllerImpl moving all parts to a si…
slyalin Sep 16, 2024
9153494
More clear timings in image generation samples to report model load a…
slyalin Sep 16, 2024
48f90cd
Reverted changes in stable_diffusion Timer.
slyalin Sep 16, 2024
6764c18
Merge remote-tracking branch 'origin/master' into dynamic_lora
slyalin Sep 16, 2024
Removed setting default alpha in Adapter because of possible confused behaviour in the scenario when adapters are switched without setting alpha.
slyalin committed Sep 12, 2024
commit 1f5c2ac65018d098ae699023e0f66cd165c119e0
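In effect, this commit moves the blending weight (alpha) out of the Adapter object and into AdapterConfig; an adapter added without an explicit alpha now defaults to 1. A minimal before/after sketch, using the variable names that appear in the sample diff below:

// Before this commit: alpha was stored inside the Adapter itself.
Adapter adapter(adapter_path, 0.75);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, AdapterConfig::MODE_STATIC));

// After this commit: the Adapter holds only the weights; alpha is passed to AdapterConfig.
Adapter adapter(adapter_path);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, 0.75, AdapterConfig::MODE_STATIC));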
image_generation/lcm_dreamshaper_v7/cpp/src/main.cpp (2 changes: 1 addition & 1 deletion)
@@ -142,7 +142,7 @@ StableDiffusionModels compile_models(const std::string& model_path,
Timer t("Loading LoRA weights");
DEBUG_PRINT("Adapters registered:" << lora_path.size());
for(size_t i = 0; i < lora_path.size(); ++i) {
lora_config.add(ov::genai::Adapter(lora_path[i], i < alpha.size() ? alpha[i] : 0.75f)); // TODO: Consider using default alpha from LoRA file
lora_config.add(ov::genai::Adapter(lora_path[i]), i < alpha.size() ? alpha[i] : 0.75f); // TODO: Consider using default alpha from LoRA file
}
}

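The same list of command-line adapters could also be collected through the pair-vector AdapterConfig constructor declared in lora_adapter.hpp further down in this diff. A sketch under that assumption; lora_path and alpha are the sample's existing vectors and 0.75f its existing fallback:

// Build the (adapter, alpha) pairs first, then construct the config in one call.
std::vector<std::pair<ov::genai::Adapter, float>> adapters_with_alphas;
for (size_t i = 0; i < lora_path.size(); ++i) {
    adapters_with_alphas.emplace_back(ov::genai::Adapter(lora_path[i]),
                                      i < alpha.size() ? alpha[i] : 0.75f);
}
ov::genai::AdapterConfig lora_config(adapters_with_alphas);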
image_generation/stable_diffusion_1_5/cpp/src/main.cpp (2 changes: 1 addition & 1 deletion)
@@ -151,7 +151,7 @@ StableDiffusionModels compile_models(const std::string& model_path,
Timer t("Loading LoRA weights");
DEBUG_PRINT("Adapters registered:" << lora_path.size());
for(size_t i = 0; i < lora_path.size(); ++i) {
lora_config.add(ov::genai::Adapter(lora_path[i], i < alpha.size() ? alpha[i] : 0.75f)); // TODO: Consider using default alpha from LoRA file
lora_config.add(ov::genai::Adapter(lora_path[i]), i < alpha.size() ? alpha[i] : 0.75f); // TODO: Consider using default alpha from LoRA file
}
}

samples/cpp/lora_greedy_causal_lm/lora_greedy_causal_lm.cpp (23 changes: 7 additions & 16 deletions)
@@ -20,22 +20,22 @@ int main(int argc, char* argv[]) try {

std::cout << "MODE_STATIC" << std::endl;
{
Adapter adapter(adapter_path, 0.75);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, AdapterConfig::MODE_STATIC));
Adapter adapter(adapter_path);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, 0.75, AdapterConfig::MODE_STATIC));
std::cout << pipe.generate(prompt, max_new_tokens(100)) << std::endl;
}

std::cout << "MODE_STATIC_RANK" << std::endl;
{
Adapter adapter(adapter_path, 0.75);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, AdapterConfig::MODE_STATIC_RANK));
Adapter adapter(adapter_path);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, 0.75, AdapterConfig::MODE_STATIC_RANK));
std::cout << pipe.generate(prompt, max_new_tokens(100)) << std::endl;
}

std::cout << "MODE_DYNAMIC" << std::endl;
{
Adapter adapter(adapter_path, 0.75);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, AdapterConfig::MODE_DYNAMIC));
Adapter adapter(adapter_path);
LLMPipeline pipe(model_path, device, AdapterConfig(adapter, 0.75, AdapterConfig::MODE_DYNAMIC));
std::cout << pipe.generate(prompt, max_new_tokens(100)) << std::endl;
}

@@ -47,7 +47,7 @@ int main(int argc, char* argv[]) try {

std::cout << "MODE_AUTO/explicit alpha" << std::endl;
{
LLMPipeline pipe(model_path, device, Adapter(adapter_path, 0.75));
LLMPipeline pipe(model_path, device, AdapterConfig(Adapter(adapter_path), 0.75));
std::cout << pipe.generate(prompt, max_new_tokens(100)) << std::endl;
}

@@ -59,15 +59,6 @@ int main(int argc, char* argv[]) try {
         std::cout << pipe.generate(prompt, max_new_tokens(100)) << std::endl;
     }

-    std::cout << "MODE_AUTO/blended" << std::endl;
-    {
-        Adapter adapter1(adapter_path1, 0.5);
-        Adapter adapter2(adapter_path2, 0.25);
-        LLMPipeline pipe(model_path, device, {adapter1, adapter2});
-        std::cout << pipe.generate(prompt, adapters(adapter1), max_new_tokens(100)) << std::endl;
-        std::cout << pipe.generate(prompt, adapters(adapter2), max_new_tokens(100)) << std::endl;
-    }
-
     std::cout << "MODE_AUTO/blended with late alpha set" << std::endl;
     {
         Adapter adapter1 = Adapter(adapter_path1);
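The removed MODE_AUTO/blended block relied on the deleted Adapter(path, alpha) constructor. A hypothetical sketch of the same scenario written against the constructors this commit introduces; the actual replacement, if any, is outside the lines shown in this hunk:

// Per-adapter alphas now travel through AdapterConfig instead of Adapter.
Adapter adapter1(adapter_path1);
Adapter adapter2(adapter_path2);
std::vector<std::pair<Adapter, float>> blend{{adapter1, 0.5f}, {adapter2, 0.25f}};
LLMPipeline pipe(model_path, device, AdapterConfig(blend));
std::cout << pipe.generate(prompt, adapters(adapter1), max_new_tokens(100)) << std::endl;
std::cout << pipe.generate(prompt, adapters(adapter2), max_new_tokens(100)) << std::endl;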
src/cpp/include/openvino/genai/lora_adapter.hpp (2 changes: 2 additions & 0 deletions)
@@ -78,6 +78,8 @@ struct OPENVINO_GENAI_EXPORTS AdapterConfig {
     std::vector<std::shared_ptr<ov::op::v0::Constant>> alpha_constants;

     AdapterConfig (const Adapter& adapter, Mode mode = MODE_AUTO) : AdapterConfig(std::vector<Adapter>{adapter}, mode) {}
+    AdapterConfig (const Adapter& adapter, float alpha, Mode mode = MODE_AUTO) : AdapterConfig(std::vector<std::pair<Adapter, float>>{{adapter, alpha}}, mode) {}
+    AdapterConfig (const Adapter& adapter, double alpha, Mode mode = MODE_AUTO) : AdapterConfig(adapter, float(alpha), mode) {}
     AdapterConfig (const std::vector<Adapter>& adapters, Mode mode = MODE_AUTO);
     AdapterConfig (const std::vector<std::pair<Adapter, float>>& adapters, Mode mode = MODE_AUTO);

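A short usage sketch for the new overloads; the adapter path is illustrative. The double overload forwards to the float one, so a plain literal such as 0.75 works without an explicit cast:

Adapter adapter("my_adapter.safetensors");                           // illustrative path
AdapterConfig with_alpha(adapter, 0.75, AdapterConfig::MODE_AUTO);   // double literal, forwarded to the float overload
AdapterConfig implicit_alpha(adapter);                               // alpha falls back to 1 (see lora_adapter.cpp below)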
src/cpp/src/lora_adapter.cpp (23 changes: 5 additions & 18 deletions)
@@ -676,30 +676,17 @@ namespace genai {

 class Adapter::Impl {
 public:
-    Impl(const std::string& path, std::optional<float> default_alpha = std::nullopt) : // FIXME: Pass lora patterns
-        tensors(group_lora_tensors(read_safetensors(path), default_lora_patterns())), // FIXME: Accept directory with the config as well
-        default_alpha(default_alpha) {
-    }
-
-    std::optional<float> get_default_alpha () const {
-        return default_alpha;
-    }
+    Impl(const std::string& path) :
+        tensors(group_lora_tensors(read_safetensors(path), default_lora_patterns()))
+    {}

     LoRATensors tensors;
-    std::optional<float> default_alpha;
 };

-Adapter::Adapter(const std::string& path, float default_alpha) :
-    m_pimpl(std::make_shared<Adapter::Impl>(path, default_alpha)) {
-}
-
 Adapter::Adapter(const std::string& path) :
     m_pimpl(std::make_shared<Adapter::Impl>(path)) {
 }

-std::optional<float> Adapter::get_default_alpha() const {
-    return m_pimpl->get_default_alpha();
-}

 struct AdapterControllerImpl {
@@ -1245,7 +1232,7 @@ AdapterConfig::AdapterConfig (const std::vector<Adapter>& adapters, Mode mode) :
     decompose_mode();
     alphas.reserve(adapters.size());
     for(const auto& adapter: adapters) {
-        auto const alpha = adapter.get_default_alpha().value_or(1);
+        auto const alpha = 1;
         alphas.push_back(alpha);
         alpha_constants.push_back(alpha_as_constant(alpha));
     }
@@ -1279,7 +1266,7 @@ AdapterConfig& AdapterConfig::add(const Adapter& adapter, float alpha) {
 }

 AdapterConfig& AdapterConfig::add(const Adapter& adapter) {
-    return add(adapter, adapter.get_default_alpha().value_or(1));
+    return add(adapter, 1);
 }

 AdapterConfig& AdapterConfig::set_alpha(const Adapter& adapter, float alpha) {
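With get_default_alpha gone, an adapter added to a config without a weight now always gets alpha = 1, and any other value has to be set explicitly. A minimal sketch, assuming AdapterConfig can be default-constructed (its full constructor list is not shown in this diff) and an illustrative adapter_path:

Adapter style(adapter_path);          // adapter_path is illustrative
AdapterConfig config;                 // assumption: default construction is available
config.add(style);                    // alpha is now 1, no longer read from the Adapter
config.set_alpha(style, 0.6f);        // explicit override via AdapterConfig::set_alpha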