-
Notifications
You must be signed in to change notification settings - Fork 482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add torchtune CLI documentation page #1052
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1052
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 38299a3 with merge base 71741df (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent documentation, this really sets the example of what our API usage pages should look like.
docs/source/tune_cli.rst
Outdated
**Specify model files you don't want to download** | ||
|
||
Some checkpoint directories can be very large and it can eat up a lot of bandwith and local storage to download the all of the files every time, even if you might | ||
not need a lot of them. This is especially common when the same checkpoint exists in different formats. You can specify patterns to ignore, by default we ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not need a lot of them. This is especially common when the same checkpoint exists in different formats. You can specify patterns to ignore, by default we ignore | |
not need a lot of them. This is especially common when the same checkpoint exists in different formats. You can specify patterns to ignore to prevent downloading files with matching names. By default we ignore |
docs/source/tune_cli.rst
Outdated
Run a recipe | ||
------------ | ||
|
||
The ``tune run <recipe> --config <config>`` is a conveniance wrapper around `torchrun <https://pytorch.org/docs/stable/elastic/run.html>`_. ``tune run`` allows you to specify your |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ``tune run <recipe> --config <config>`` is a conveniance wrapper around `torchrun <https://pytorch.org/docs/stable/elastic/run.html>`_. ``tune run`` allows you to specify your | |
The ``tune run <recipe> --config <config>`` is a convenient wrapper around `torchrun <https://pytorch.org/docs/stable/elastic/run.html>`_. ``tune run`` allows you to specify by name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using it as a noun like convenience function
docs/source/tune_cli.rst
Outdated
------------ | ||
|
||
The ``tune run <recipe> --config <config>`` is a conveniance wrapper around `torchrun <https://pytorch.org/docs/stable/elastic/run.html>`_. ``tune run`` allows you to specify your | ||
recipe and config by name to use library versions, or by path to use your local recipes/configs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recipe and config by name to use library versions, or by path to use your local recipes/configs. | |
a built-in recipe or config or by path one of your local recipes/configs. |
**Specifying distributed (torchrun) arguments** | ||
|
||
``tune run`` supports launching distributed runs by passing through arguments preceding the recipe directly to torchrun. This follows the pattern used by torchrun | ||
of specifying distributed and host machine flags before the script (recipe). For a full list of available flags for distributed setup, see the `torchrun docs <https://pytorch.org/docs/stable/elastic/run.html>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's worth discussing the two most common flags, nnodes and nproc_per_node?
.. code-block:: bash | ||
|
||
tune run <RECIPE> --config <CONFIG> epochs=1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can add a note here that links to config tutorial for more override options. But this is also clear as is, so your call
... | ||
|
||
|
||
**Download a gated model** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be underlined so it renders as a H3 block. This will show up in the side outline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
H3 blocks render identically to H1 blocks which I found very confusing when scrolling through the page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to underline this with ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
so that it renders correctly as a subheading and shows up in the sidebar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just need to fix the subheadings rendering but otherwise looks great!
Co-authored-by: Philip Bontrager <[email protected]>
Context
What is the purpose of this PR? Is it to
Changelog
Added CLI document page outlining all the functionality of the tune cli. Also added a reference to models to make it linkable.
Test plan
Rendered documentation and proofread it.
Joint PR from @joecummings and @pbontrager