[Config] Validation in CI + CLI utility #468
Conversation
Looks great overall, thanks for adding this additional checking! A couple of comments inline.
conf = OmegaConf.create(self.invalid_config_string())
OmegaConf.save(conf, dest)

args = f"tune validate --config {dest}".split()
Wonder if tune validate will throw at the first invalid formation it detects, or if it should wait and find all of them, then throw a single error listing all of the issues.
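A collect-then-raise version could look roughly like the sketch below. `ConfigError` and `validate_all` are hypothetical names for illustration, not torchtune's actual API; the sketch just shows accumulating every signature-binding failure before raising once:

```python
import inspect


class ConfigError(Exception):
    """Hypothetical aggregate error carrying all validation failures."""

    def __init__(self, errors):
        self.errors = errors
        super().__init__("\n".join(str(e) for e in errors))


def validate_all(components):
    # components: list of (callable, kwargs) pairs parsed from the config
    errors = []
    for fn, kwargs in components:
        try:
            inspect.signature(fn).bind(**kwargs)
        except TypeError as e:
            errors.append(e)  # keep going instead of raising immediately
    if errors:
        raise ConfigError(errors)
```

This way a user with several typos fixes them in one pass instead of re-running validation once per error.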
dummy: 3
"""

def test_validate(self):
how does this test differ from the one testing the tune CLI?
It doesn't; maybe I can just mock the actual validate call, since it's already covered in its own unit test.
from torchtune.config._instantiate import _has_component, _instantiate_node


def validate(cfg: DictConfig) -> None:
If I pass in a config that has `blah: 8`, where "blah" is just some random string that doesn't mean anything to torchtune's configs, will the system throw an invalid-parameter error or similar? Basically the case I'm thinking of is: if the user makes a typo and sets "dtyype" instead of "dtype", do we silently continue, not informing the user that their dtype isn't what they may think it is?
It's hard to distinguish a random, unused parameter from a parameter that has a default value you're overriding. For random strings, if the recipe does not use the parameter, there will be no error. It is difficult to enforce not having random parameters because we don't have defined dataclasses/params/structured configs for the recipes, so I don't think it's worth it. In the case of typos, if the parameter is actually used in the recipe, the recipe itself should throw.
Force-pushed from 6539fc6 to 50e923e
Force-pushed from 50e923e to 8805e70
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/468
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit c17834f with merge base 9c75d48.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Thanks for creating this PR. I think this is a good first step for config validation. I do think we should be very explicit about the errors this catches vs. the ones it doesn't, specifically around missing fields. But I acknowledge that's probably quite a bit harder, and some validation is a lot better than none. No major concerns from my side on landing this version of validation.
https://docs.python.org/3/library/argparse.html#the-add-argument-method
"""
assert not kwargs.get("required", False), "Required not supported"
Why did we have this to begin with?
_component_ = _get_component_from_path(nodedict.get("_component_"))
kwargs = {k: v for k, v in nodedict.items() if k != "_component_"}
sig = inspect.signature(_component_)
sig.bind(**kwargs)
This is cool
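The binding trick can be seen in isolation. In the sketch below, `get_dtype` is a local stand-in (not the real `torchtune.utils.get_dtype`), used only to show how `inspect.signature(...).bind(**kwargs)` rejects keyword arguments a component doesn't accept, without ever calling the component:

```python
import inspect


def get_dtype(dtype: str = "fp32"):
    # stand-in component; only the signature matters for validation
    return dtype


sig = inspect.signature(get_dtype)
sig.bind(dtype="bf16")  # matches the signature: no error

try:
    sig.bind(dtype="bf16", dummy=3)  # 'dummy' is not a parameter
except TypeError as e:
    print(e)  # TypeError names the unexpected keyword argument
```

Because `bind` only inspects the signature, validation stays cheap: no model gets built just to find out a keyword is misspelled.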
except TypeError as e:
    if "missing a required argument" in str(e):
        sig.bind_partial(**kwargs)
Just curious, does this actually help catch missing arguments? Seems we are giving a lot of leeway here. E.g. if I have:

class ModelClass:
    def __init__(self, a: float, b: str):
        self.a = a
        self.b = b
    ...

Then a config like

model:
  _component_: model.path.ModelClass
  a: something

will pass this check, right? I guess in general we would need to know whether the instantiation in the recipe is instantiate(cfg.model) or instantiate(cfg.model, b='something i construct inside the recipe'), which we can't really know on the config side alone. (Not sure there's a way around this, just pointing it out.)
Yeah, I don't think we'll be able to know, as you pointed out. We'll have to rely on an instantiation error from the recipe itself to catch arguments that are actually missing, as opposed to arguments the recipe supplies at instantiation time.
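As a minimal, self-contained illustration of the leeway being discussed, using the reviewer's `ModelClass` example: `bind()` flags the missing argument, while the `bind_partial()` fallback lets the incomplete config through on the assumption that the recipe fills in the rest later:

```python
import inspect


class ModelClass:
    def __init__(self, a: float, b: str):
        self.a = a
        self.b = b


sig = inspect.signature(ModelClass)

# bind() rejects the incomplete config...
try:
    sig.bind(a=0.5)
except TypeError as e:
    print(e)  # complains that 'b' is missing

# ...but bind_partial() accepts it, since 'b' might be supplied by the
# recipe at instantiation time, e.g. instantiate(cfg.model, b=...)
sig.bind_partial(a=0.5)  # no error
```

So validation can prove a config has *extra* arguments, but it cannot prove a config is *complete* without knowing how the recipe calls `instantiate`.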
@@ -0,0 +1,84 @@
#!/usr/bin/env python3
Maybe a dumb q: why do we need the shebang here?
I don't think this is needed, just copied over from another file's license stuff
test2:
  _component_: torchtune.utils.get_dtype
  dtype: fp32
  dummy: 3
Isn't this redundant? (Maybe I am missing something)
OK, I see how this is copied from config/test_validate.py. I agree with the discussion below; I would consolidate these in one place.
with pytest.raises(ConfigError) as excinfo:
    config.validate(conf)
exc_config = excinfo.value
for e in exc_config.errors:
Maybe assert length here too
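The suggested length assertion, as a self-contained sketch without the pytest harness. `ConfigError` and `fake_validate` here are hypothetical stand-ins for torchtune's real `config.validate` machinery; the point is only that asserting the error count catches a silently dropped error that a loop over the errors would miss:

```python
class ConfigError(Exception):
    """Stand-in for the real exception, carrying a list of errors."""

    def __init__(self, errors):
        self.errors = errors


def fake_validate(conf):
    # pretend both components in the config are malformed
    raise ConfigError([TypeError("test1: bad arg"), TypeError("test2: bad arg")])


try:
    fake_validate({})
except ConfigError as exc_config:
    # the length check suggested in review: if validation ever starts
    # swallowing one of the two errors, this assertion fails loudly
    assert len(exc_config.errors) == 2
    for e in exc_config.errors:
        assert "bad arg" in str(e)
```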
Context
There is currently no testing around any of our config files that we expose to the user. E2E recipe tests typically use their own test configs. In order to have fully tested configs, we need to add the following to the CI:
This PR addresses the first gap.
Addresses #466.
Changelog
Added a tune validate CLI utility: tune validate --config my_config.yaml
Test plan
- Added unit tests: pytest tests
- tune validate --config recipes/configs/alpaca_llama2_full_finetune.yaml