Allow user to specify huggingface link or local path to pretrained lora weights #3572

Merged · 18 commits from pretrained_adapter_weights into master · Sep 12, 2023

Conversation

@Infernaught (Contributor) commented Aug 31, 2023:

Allows the user to specify a Hugging Face link or a local path to adapter weights. Tested using Arnav's Code Alpaca V3 model (loaded the V3 adapter weights using this change and ran the results through human-eval -- scores matched those of the original V3 model).
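
For context, a minimal sketch of how a user might point a LoRA adapter at pretrained weights after this change; the base model and adapter id below are illustrative placeholders, and the config is trimmed to the essentials rather than a complete example:

```python
from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "facebook/opt-350m",  # illustrative base model
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "adapter": {
        "type": "lora",
        # New in this PR: a Hugging Face repo id or a local directory
        # containing previously trained LoRA adapter weights.
        "pretrained_adapter_weights": "username/my-lora-adapter",  # hypothetical id
    },
}

model = LudwigModel(config)
```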

Resolved review threads on ludwig/models/llm.py and ludwig/schema/llms/peft.py.
github-actions bot commented Sep 1, 2023:

Unit Test Results

6 files ±0 · 6 suites ±0 · 1h 12m 13s ⏱️ (-17m 25s)
34 tests (-2,792): 29 passed (-2,759), 5 skipped (-7), 0 failed (-26)
88 runs (-2,781): 72 passed (-2,750), 16 skipped (-5), 0 failed (-26)

Results for commit effcce0. ± Comparison against base commit 63f4924.

♻️ This comment has been updated with latest results.

@jeffkinnison (Contributor) left a comment:

Just one additional comment; it shouldn't be blocking if we want to land this.

}
config_obj = ModelConfig.from_dict(config)
assert config_obj.input_features[0].preprocessing.max_sequence_length is None
assert config_obj.output_features[0].preprocessing.max_sequence_length is None


def test_load_pretrained_adapter_weights():
Contributor:

A couple of tests we should add (possibly in a follow-up PR; a rough sketch follows this list):

  • Checking a null input
  • Checking an invalid weights path
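
A hedged sketch of what those two tests might look like; the helper, config fields, and base model below are illustrative rather than the merged test code, and the second test only checks that the value survives schema validation (a fuller version would build the model and expect a load-time error):

```python
from ludwig.schema.model_config import ModelConfig


def _llm_config(pretrained_adapter_weights):
    # Minimal illustrative LLM config; only the fields relevant to the
    # discussion are filled in, and the base model is arbitrary.
    return {
        "model_type": "llm",
        "base_model": "facebook/opt-125m",
        "input_features": [{"name": "prompt", "type": "text"}],
        "output_features": [{"name": "response", "type": "text"}],
        "adapter": {
            "type": "lora",
            "pretrained_adapter_weights": pretrained_adapter_weights,
        },
    }


def test_null_pretrained_adapter_weights():
    # A null value should simply mean "no pretrained adapter".
    config_obj = ModelConfig.from_dict(_llm_config(None))
    assert config_obj.adapter.pretrained_adapter_weights is None


def test_invalid_pretrained_adapter_weights_path():
    # An arbitrary string passes schema validation; the failure for a bad
    # path would surface later, when the adapter weights are actually loaded.
    config_obj = ModelConfig.from_dict(_llm_config("/not/a/real/path"))
    assert config_obj.adapter.pretrained_adapter_weights == "/not/a/real/path"
```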

Comment on lines 76 to 81
target_modules: Optional[list] = schema_utils.List(
str,
default=None,
allow_none=True,
description="List of modules to apply Lora to. If None, apply to all modules.",
)
Contributor:

Is this needed?

@Infernaught (Author):

I recall this causing an error if this wasn't set.

@arnavgarg1 (Contributor) commented Sep 6, 2023:

Got it! It would be good to know exactly what the error was so we can understand it and leave a comment explaining it - that might be useful when we come back to this in the future.

@Infernaught (Author):

If I recall correctly, there was an error involving target_modules not being a parameter of a LoraConfig.

Comment on lines 275 to 278
try:
loss.requires_grad = True
except RuntimeError:
pass
Contributor:

Why do we have to add this?

@Infernaught (Author):

This was a workaround I added because, when the LoRA weights were loaded in, some of the loss functions did not have requires_grad set to True. However, without the try/except block, this would also try to set requires_grad to True for some intermediate loss functions, which isn't valid.

Contributor:

Questions:

  1. Why would loading LoRA weights result in loss functions with requires_grad != True?
  2. What does the training error look like when some loss functions have requires_grad != True?
  3. Do you have an example of an intermediate loss function that raises an error when you try to set requires_grad=True?

At a minimum, we should add a comment explaining why this is here, e.g.:

"When loading adapter weights from huggingface or a local path, some of the loss functions do not have requires_grad=True. requires_grad=True is necessary for back-propagation, but we wrap this in a try/except because some intermediate losses like __ raise an error if requires_grad is explicitly set to True in this way."

Contributor:

Can you elaborate a bit more? What are the intermediate loss functions? I'm also not totally sure how wrapping the model with the adapter is causing these issues.

Contributor:

I agree with @justinxzhao and have the same questions as well.

@Infernaught (Author):

  1. I'm not quite sure why this was the case. My hypothesis is that, when the LoRA weights are loaded, they overwrite some of the parameters of certain layers.
  2. When some loss functions have requires_grad != True, training stops and errors out.
  3. I don't have an example of this, but I think my terminology was incorrect here. Specifically, requires_grad can only be changed on leaf variables, so if there is a loss that is not a leaf node, setting requires_grad=True would cause an error.
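
A minimal, self-contained PyTorch sketch of that leaf-variable restriction (an illustration of the behavior being described, not code from this PR):

```python
import torch

# requires_grad can be toggled freely on a leaf tensor.
leaf = torch.zeros(3)
leaf.requires_grad = True  # OK: leaf tensor

# A tensor produced by operating on a tensor that requires grad is a non-leaf
# (intermediate) tensor; assigning requires_grad on it raises a RuntimeError,
# which is what the try/except in the PR is guarding against.
intermediate = leaf.sum() * 2.0
try:
    intermediate.requires_grad = True
except RuntimeError as e:
    print(f"Cannot set requires_grad on a non-leaf tensor: {e}")
```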

Contributor:

By default, when a PEFT pretrained adapter is loaded in, is it set to training mode or inference mode? Does that maybe have something to do with requires_grad not being set?

@justinxzhao (Contributor) commented Sep 7, 2023:

Ah, definitely worth checking if the module is being loaded in eval mode, which would explain requires_grad=False. Some useful references:
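
As a hedged illustration of that suggestion (separate from the references above, and not the code in this PR): peft's PeftModel.from_pretrained loads adapters for inference by default and exposes an is_trainable flag; the base model and adapter location below are placeholders.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# By default, from_pretrained loads the adapter in inference mode, which
# leaves the LoRA parameters with requires_grad=False. Passing
# is_trainable=True keeps them trainable for further fine-tuning.
model = PeftModel.from_pretrained(
    base,
    "some-user/some-lora-adapter",  # hypothetical HF repo id or local path
    is_trainable=True,
)

# Sanity check: at least some parameters should still require gradients.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors")
```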

@@ -282,6 +282,7 @@
GENERATION = "generation"
PROMPT = "prompt"
ADAPTER = "adapter"
PRETRAINED_WEIGHTS = "pretrained_weights"
Contributor:

Nit: it might be clearer to call it pretrained_adapter_weights, since pretrained weights also come from the model - just to avoid confusion.

@Infernaught (Author):

On it

Comment on lines 237 to 238
if param_name is None:
continue
@arnavgarg1 (Contributor) commented Sep 6, 2023:

When would param_name be None? This dictionary holds the parameters for the PEFT adapter that we already have in the schema, right? E.g., for LoRA, it will have r, alpha, bias, etc. If so, I'd assume a value can be None, but a name being None feels a bit odd.

@Infernaught (Author):

You're correct. This should be param_value.
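
A minimal sketch of the corrected pattern, assuming the loop is building keyword arguments for peft's LoraConfig from a schema-style dict (the dict contents and names here are illustrative, not the exact PR code):

```python
from peft import LoraConfig

# Hypothetical dict produced from the adapter schema; entries whose value is
# None (e.g. target_modules) are skipped so LoraConfig falls back to its own
# defaults instead of receiving explicit Nones.
schema_params = {"r": 8, "lora_alpha": 16, "bias": "none", "target_modules": None}

config_kwargs = {}
for param_name, param_value in schema_params.items():
    if param_value is None:  # skip on the *value*, not the name
        continue
    config_kwargs[param_name] = param_value

lora_config = LoraConfig(**config_kwargs)
print(lora_config)
```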

@Infernaught changed the title from "Add functionality for pretrained lora weights" to "Allow user to specify huggingface link or local path to pretrained lora weights" on Sep 7, 2023.
@@ -477,7 +477,7 @@ def check_llm_finetuning_output_feature_config(config: "ModelConfig"): # noqa:
if config.model_type != MODEL_LLM:
return

-    if config.trainer.type != "finetune":
+    if config.trainer.type != "finetune" and config.adapter.pretrained_adapter_weights is not None:
Contributor:

Should this be an OR? Why does specifying pretrained_adapter_weights no longer require that the first output feature be TEXT?

Or is it that we want to make it so that using the none trainer type doesn't require an output feature?

if config.trainer.type == "none":
    return

CC: @arnavgarg1

@Infernaught (Author):

I think this was an oversight on my part. I was trying to go through the code to see where my change might break something down the line, and I might have gotten a little overzealous.

@@ -493,6 +493,9 @@ def check_llm_finetuning_trainer_config(config: "ModelConfig"): # noqa: F821
if config.model_type != MODEL_LLM:
return

if config.trainer.type == "none" and config.adapter.pretrained_adapter_weights is not None:
Contributor:

Should this more simply be:

if config.trainer.type == "none":
    # The NoneTrainer for ZS is valid.
    return

@Infernaught (Author):

But in this case, we would load in untrained LoRA weights if pretrained adapter weights weren't specified in the config, right? Would that be a problem?

Additional resolved review threads on ludwig/models/llm.py and tests/integration_tests/test_llm.py.
@justinxzhao (Contributor) left a comment:

Thanks!

@Infernaught merged commit 6178b48 into master on Sep 12, 2023, and deleted the pretrained_adapter_weights branch.