Allow user to specify huggingface link or local path to pretrained lora weights #3572

Merged: 18 commits, Sep 12, 2023
Changes from 1 commit
1 change: 1 addition & 0 deletions ludwig/constants.py
@@ -281,6 +281,7 @@
GENERATION = "generation"
PROMPT = "prompt"
ADAPTER = "adapter"
PRETRAINED_WEIGHTS = "pretrained_weights"
Contributor: Nit: it might be clearer to call it pretrained_adapter_weights, since pretrained weights also come from the model! Just to avoid confusion.

Contributor (author): On it


# CrossEntropyLoss for LLMs
IGNORE_INDEX_TOKEN_ID = -100
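For context, the new constant backs the config field that lets users point the LoRA adapter at pretrained weights. Below is a hedged sketch of how that looks in a Ludwig config dict; it mirrors the integration test added later in this PR, and the base_model value is a placeholder rather than anything taken from this diff.

# Hedged sketch of a Ludwig LLM config that loads pretrained LoRA adapter
# weights, mirroring the integration test added in this PR. The base_model
# value is a placeholder; "Infernaught/test_adapter_weights" is the adapter
# repo used by that test.
config = {
    "model_type": "llm",
    "base_model": "<hf-model-id-or-local-path>",  # placeholder
    "input_features": [{"name": "input", "type": "text"}],
    "output_features": [{"name": "output", "type": "text"}],
    "adapter": {
        "type": "lora",
        # Hugging Face repo id or local directory produced by PEFT's
        # save_pretrained() (adapter_config.json plus the adapter weights).
        "pretrained_weights": "Infernaught/test_adapter_weights",
    },
    "trainer": {"type": "finetune", "batch_size": 8, "epochs": 2},
}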
34 changes: 29 additions & 5 deletions ludwig/models/llm.py
@@ -221,12 +221,36 @@ def initialize_adapter(self):
"`finetune` or remove the adapter config."
)

from peft import get_peft_model, TaskType
from peft import get_peft_model

pretrained = False
if self.config_obj.adapter.pretrained_weights:
print(f"PRETRAINED_WEIGHTS: {self.config_obj.adapter.pretrained_weights}")
# If pretrained adapter weights are provided, we want to load them into the model
from peft import MODEL_TYPE_TO_PEFT_MODEL_MAPPING, PeftConfig

pretrained = True
peft_config = PeftConfig.from_pretrained(self.config_obj.adapter.pretrained_weights)
peft_dict = peft_config.to_dict()
for param_name, param_value in self.config_obj.adapter.to_config().to_dict().items():
if param_name is None:
continue
Contributor @arnavgarg1 (Sep 6, 2023): When would param_name be None? This dictionary holds the parameters for the PEFT adapter that we already have in the schema, right? For example, for LoRA it will have r, alpha, bias, etc. If so, I'd assume the value can be None, but the name being None feels a bit odd.

Contributor (author): You're correct. This should be param_value.


if param_name not in peft_dict:
setattr(peft_config, param_name, param_value)

self.model = MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type].from_pretrained(
self.model, self.config_obj.adapter.pretrained_weights
)
else:
# If no pretrained adapter is provided, we want to load untrained weights into the model
from peft import TaskType

peft_config = self.config_obj.adapter.to_config(
task_type=TaskType.CAUSAL_LM, tokenizer_name_or_path=self.model_name
)
self.model = get_peft_model(self.model, peft_config)
peft_config = self.config_obj.adapter.to_config(
task_type=TaskType.CAUSAL_LM, tokenizer_name_or_path=self.model_name
)

self.model = get_peft_model(self.model, peft_config, pretrained=pretrained)

logger.info("==================================================")
logger.info("Trainable Parameter Summary For Fine-Tuning")
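Outside of Ludwig, the new branch boils down to a standard PEFT pattern: read the adapter's PeftConfig, then attach the pretrained adapter to a base model. A hedged sketch follows, using the public PeftModel.from_pretrained entry point instead of the MODEL_TYPE_TO_PEFT_MODEL_MAPPING lookup; the adapter id is a placeholder.

# Hedged sketch of the PEFT calls the new branch relies on.
# PeftModel.from_pretrained dispatches to the same task-specific classes
# that MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type] resolves to.
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

adapter_id = "some-user/some-lora-adapter"  # HF repo id or local path (placeholder)

# adapter_config.json records the base model and the LoRA hyperparameters
# (r, lora_alpha, target_modules, ...), which is what the loop above merges
# with the values from the Ludwig schema.
peft_config = PeftConfig.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)

# Wrap the base model and load the pretrained adapter weights into it.
model = PeftModel.from_pretrained(base_model, adapter_id)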
12 changes: 12 additions & 0 deletions ludwig/schema/llms/peft.py
@@ -69,6 +69,18 @@ class LoraConfig(BaseAdapterConfig):
description="Bias type for Lora.",
)

pretrained_weights: Optional[str] = schema_utils.String(
default="none",
description="Path to pretrained weights for Lora.",
)

target_modules: Optional[list] = schema_utils.List(
str,
default=None,
allow_none=True,
description="List of modules to apply Lora to. If None, apply to all modules.",
)
Contributor: Is this needed?

Contributor (author): I recall this causing an error if this wasn't set.

Contributor @arnavgarg1 (Sep 6, 2023): Got it! It would be good to know exactly what the error was so we can understand it and also leave a comment to explain it; that might be useful when we come back to it in the future.

Contributor (author): If I recall correctly, there was an error involving target_modules not being a parameter of a LoraConfig.


def to_config(self, task_type: str = None, **kwargs) -> "PeftConfig":
from peft import LoraConfig as _LoraConfig

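For reference, here is a hedged illustration of the peft.LoraConfig that to_config() ultimately builds from these schema fields; the r, alpha, and dropout values and the target_modules names are placeholders, not Ludwig defaults.

# Hedged illustration of the underlying peft.LoraConfig; all values are
# placeholders rather than Ludwig's defaults.
from peft import LoraConfig, TaskType

peft_lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.CAUSAL_LM,
    # Restrict LoRA to specific submodules; with None, PEFT falls back to
    # its per-architecture defaults where they exist.
    target_modules=["q_proj", "v_proj"],
)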
1 change: 1 addition & 0 deletions ludwig/trainers/trainer.py
@@ -266,6 +266,7 @@ def closure():
targets, model_outputs, self.regularization_type, self.regularization_lambda
)
loss = loss / self.gradient_accumulation_steps
loss.requires_grad = True

# Begin the backward pass
variables = self.dist_model.parameters()
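For context on the surrounding lines: dividing the loss by gradient_accumulation_steps keeps the summed gradients of several micro-batches equivalent to one averaged large-batch gradient. A generic PyTorch sketch of that pattern, not Ludwig's trainer:

# Generic gradient-accumulation sketch (not Ludwig's trainer). Each
# micro-batch loss is scaled so that the accumulated gradients match those
# of a single large averaged batch.
def accumulate_and_step(model, optimizer, loss_fn, micro_batches, accumulation_steps):
    optimizer.zero_grad()
    for i, (inputs, targets) in enumerate(micro_batches):
        loss = loss_fn(model(inputs), targets) / accumulation_steps
        loss.backward()  # gradients accumulate in the parameters' .grad buffers
        if (i + 1) % accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()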
31 changes: 31 additions & 0 deletions tests/integration_tests/test_llm.py
@@ -18,6 +18,7 @@
MODEL_TYPE,
OUTPUT_FEATURES,
PREPROCESSING,
PRETRAINED_WEIGHTS,
PROMPT,
TRAINER,
TYPE,
@@ -481,6 +482,36 @@ def test_llama_rope_scaling():
assert model.model.config.rope_scaling["factor"] == 2.0


def test_load_pretrained_adapter_weights():
Contributor: A couple of tests we should add (possibly in a follow-up PR):

  • Checking a null input
  • Checking an invalid weights path

(A hedged sketch of these follow-up tests appears after the diff below.)
from peft import PeftModel
from transformers import PreTrainedModel

config = {
MODEL_TYPE: MODEL_LLM,
BASE_MODEL: TEST_MODEL_NAME,
INPUT_FEATURES: [text_feature(name="input", encoder={"type": "passthrough"})],
OUTPUT_FEATURES: [text_feature(name="output")],
TRAINER: {
TYPE: "finetune",
BATCH_SIZE: 8,
EPOCHS: 2,
},
ADAPTER: {TYPE: "lora", PRETRAINED_WEIGHTS: "Infernaught/test_adapter_weights"},
BACKEND: {TYPE: "local"},
}

print(ModelConfig)
config_obj = ModelConfig.from_dict(config)
model = LLM(config_obj)

assert model.config_obj.adapter.pretrained_weights
assert model.config_obj.adapter.pretrained_weights == "Infernaught/test_adapter_weights"

model.prepare_for_training()
assert not isinstance(model.model, PreTrainedModel)
assert isinstance(model.model, PeftModel)


def _compare_models(model_1: torch.nn.Module, model_2: torch.nn.Module) -> bool:
# Source: https://discuss.pytorch.org/t/check-if-models-have-same-weights/4351/6
for key_item_1, key_item_2 in zip(model_1.state_dict().items(), model_2.state_dict().items()):
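Following up on the review note above, a hedged sketch of the two suggested tests; the _base_lora_config() helper is hypothetical, and the expected failure mode for an invalid path is an assumption, not behavior verified against Ludwig.

# Hedged sketch of the follow-up tests suggested in review. The shared
# _base_lora_config() helper is hypothetical, and the exact exception raised
# for an invalid path is an assumption.
import pytest


def test_pretrained_adapter_weights_null():
    # With no pretrained_weights set, the adapter should fall back to freshly
    # initialized LoRA weights instead of raising.
    config = _base_lora_config(adapter={TYPE: "lora"})
    model = LLM(ModelConfig.from_dict(config))
    model.prepare_for_training()


def test_pretrained_adapter_weights_invalid_path():
    # A nonexistent repo id or local path should surface an error when the
    # adapter weights are fetched.
    config = _base_lora_config(
        adapter={TYPE: "lora", PRETRAINED_WEIGHTS: "not-a-real-user/does-not-exist"}
    )
    model = LLM(ModelConfig.from_dict(config))
    with pytest.raises(Exception):
        model.prepare_for_training()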