
Allow BF16 dtype support on CPU #1218

Merged 1 commit into pytorch:main on Jul 26, 2024

Conversation

@sanchitintel (Contributor) commented Jul 24, 2024

Description

PyTorch supports the BF16 dtype on CPUs. If a CPU lacks BF16-related ISAs such as AVX512_BF16 and AMX_BF16, PyTorch falls back to BF16 <-> FP32 conversions, and the compute happens in FP32.
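For illustration (not part of this PR), BF16 tensors already work on CPU in stock PyTorch; the snippet below is a minimal sketch of that behavior. On CPUs without AVX512_BF16/AMX_BF16, the matmul is computed via FP32 internally:

    import torch

    # BF16 tensors can be created and used on CPU; without AVX512_BF16/AMX_BF16
    # support, PyTorch upconverts to FP32 for the actual compute.
    a = torch.randn(4, 4, dtype=torch.bfloat16, device="cpu")
    b = torch.randn(4, 4, dtype=torch.bfloat16, device="cpu")
    print((a @ b).dtype)  # torch.bfloat16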

Changelog

The BF16 dtype support check now returns True for CPUs.
Previously, if PyTorch was installed without CUDA support and the device type was set to cpu in a config YAML file, the quantize.py recipe failed: the calls to utils.get_dtype() did not pass the device argument, so it defaulted to None and utils.get_dtype() raised an error saying the device does not support BFloat16.
Added the device argument to the calls to utils.get_dtype() (a rough sketch of the pattern follows below).
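For illustration only — torchtune's actual utils.get_dtype implementation may differ — the failure mode and the fix described above roughly follow this pattern:

    import torch

    def get_dtype(dtype=None, device=None):
        # Hypothetical sketch, not torchtune's real code: resolve a string dtype
        # and validate BF16 support for the target device.
        torch_dtype = {"fp32": torch.float32, "bf16": torch.bfloat16}.get(dtype, torch.float32)
        if torch_dtype is torch.bfloat16:
            if device is not None and device.type == "cpu":
                # With this PR, BF16 is accepted on CPU (compute falls back to FP32).
                return torch_dtype
            if not (torch.cuda.is_available() and torch.cuda.is_bf16_supported()):
                raise RuntimeError("Device does not support BFloat16.")
        return torch_dtype

    # Before the fix: resolving "bf16" with device=None raised on CPU-only builds.
    # After the fix: the recipe passes the device explicitly.
    dtype = get_dtype("bf16", device=torch.device("cpu"))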

Test plan

Verified manually that the quantization recipe on CPU no longer fails in utils.get_dtype().
I could add a unit test for the CPU device, if required (a possible sketch follows below). Thanks!
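A possible unit test for the CPU path (a sketch only; the exact torchtune utils API is an assumption here) might look like:

    import torch
    from torchtune import utils

    def test_get_dtype_bf16_on_cpu():
        # With this change, BF16 should be accepted when the device is explicitly CPU.
        assert utils.get_dtype("bf16", device=torch.device("cpu")) == torch.bfloat16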

pytorch-bot bot commented Jul 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1218

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c832dd6 with merge base 6e4809a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot

Hi @sanchitintel!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@facebook-github-bot added the CLA Signed label on Jul 24, 2024
@joecummings (Contributor)

@ebsmothers Any reason not to do this?

@joecummings (Contributor) left a comment

@sanchitintel Great PR! Could you just run lint for these files?

@sanchitintel (Contributor, Author) commented Jul 24, 2024

Hi @joecummings, sorry, I haven't verified this change (and it doesn't look correct). I'll request review after verifying it, and will remove Draft mode for this PR. Thanks!

@sanchitintel force-pushed the patch-1 branch 3 times, most recently from 55d9518 to 12c2dad, on July 24, 2024 at 21:18
@sanchitintel (Contributor, Author) commented Jul 24, 2024

Hi @joecummings, the PR is now ready for review. Thanks!

On my end, though, running pre-commit install to set up the linter fails with importlib.metadata.PackageNotFoundError: No package metadata was found for pre-commit, even though I installed pre-commit:

@sanchitintel marked this pull request as ready for review on July 24, 2024 at 22:22
@SalmanMohammadi (Collaborator) commented Jul 24, 2024

Thanks for adding this @sanchitintel : ) pretty neat change

Perhaps one suggestion while we're here. RE this comment:

    # TODO (rohan-varma): prefer to use get_default_device() here to figure out whether user is training on
    # CPU or GPU, but it is not supported in versions of torch we test.

It looks like get_default_device has landed in stable - not sure whether it's worth just switching to it and omitting the device param?

@ebsmothers (Contributor)

Thanks for the PR! I partially agree with @SalmanMohammadi -- get_default_device is probably the cleanest way to handle this. But I also think we should still pass the device explicitly once we infer it in the recipe (otherwise we have to infer it in the dtype utilities which creates needless separation on where defaults are defined).

So I think we can delete _get_device_type_from_env and just replace its usage here with a call to get_default_device. Then when we call get_device in the recipe (e.g. here for the generate recipe) we can pass that explicitly to get_dtype as you've done here.
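Roughly, the recipe-side pattern being proposed here would be the following (a sketch only; the exact torchtune call sites and config fields are assumptions):

    import torch
    from torchtune import utils
    from omegaconf import DictConfig

    def setup_dtype(cfg: DictConfig) -> torch.dtype:
        # Resolve the device once in the recipe, then pass it explicitly to
        # get_dtype so the dtype utility does not have to infer it.
        device = utils.get_device(device=cfg.device)
        return utils.get_dtype(cfg.dtype, device=device)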

@ebsmothers (Contributor)

Ah sorry @sanchitintel I did not look at the code for get_default_device closely enough. I actually don't think we should use this after all (this is why our GPU unit tests are failing). This just gives the default device that tensors will be allocated to (which is generally CPU, even if there are GPUs available) rather than telling us if there are CUDA devices available. For example, on my machine with GPUs:

>>> import torch
>>> torch.get_default_device()
device(type='cpu')
>>> torch.cuda.is_available()
True

So this is actually doing something different than our existing _get_device_type_from_env. Sorry for the thrash on this, but I think you should revert to the previous version of the PR and we can just land that. Please also remove the TODO referenced by @SalmanMohammadi as this turned out to be misleading.
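For context, availability-based detection (the role _get_device_type_from_env plays) is closer to this sketch than to torch.get_default_device():

    import torch

    def _infer_device_type() -> str:
        # Availability-based detection: reports whether CUDA devices exist,
        # unlike torch.get_default_device(), which only reports the current
        # default allocation device (usually CPU).
        return "cuda" if torch.cuda.is_available() else "cpu"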

@sanchitintel (Contributor, Author) commented Jul 25, 2024

Thanks for your prompt feedback, @SalmanMohammadi & @ebsmothers!

I reverted the change pertaining to torch.get_default_device, and also removed the related comment.

@ebsmothers (Contributor) left a comment

Looks good, thanks for fixing this!

@ebsmothers merged commit e101420 into pytorch:main on Jul 26, 2024
29 checks passed