Add support for groupwise quantization for int8 weight only quantization #1121

jerryzh168 · 2024-10-18T23:57:11Z

Summary:
This adds groupwise support to int8 weight-only quantization, so that torchchat's own int8 weight-only quantization can be deprecated: https://github.com/pytorch/torchchat/blob/ecc628da7c32c486742d92a751ed045b2a2194be/torchchat/utils/quantize.py#L582

Test Plan:
python test/integration/test_integration.py -k test_weight_only_groupwise_quant
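
For readers unfamiliar with the term, the following is a minimal, dependency-free sketch of what groupwise symmetric int8 weight quantization generally means: each weight row is split into groups of `group_size` values and each group gets its own scale. It is an illustration only, not the code added in this PR; the function names `quantize_groupwise_int8` and `dequantize_groupwise_int8` are hypothetical, and the choice of `max_abs / 127` as the scale is an assumption.

```python
from typing import List, Tuple


def quantize_groupwise_int8(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float]]:
    """Symmetric int8 quantization with one scale per group (illustrative only)."""
    if group_size <= 0 or len(row) % group_size != 0:
        raise ValueError("len(row) must be a positive multiple of group_size")

    q_values: List[int] = []
    scales: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Assumed scale choice: map the largest magnitude in the group to 127.
        # An all-zero group gets scale 1.0 to avoid dividing by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for v in group:
            q = round(v / scale)
            q_values.append(max(-128, min(127, q)))  # clamp to the int8 range
    return q_values, scales


def dequantize_groupwise_int8(
    q_values: List[int], scales: List[float], group_size: int
) -> List[float]:
    """Reconstruct approximate floats using the scale of each value's group."""
    return [q * scales[i // group_size] for i, q in enumerate(q_values)]


if __name__ == "__main__":
    weights = [0.1, -0.2, 0.05, 0.4, 10.0, -5.0, 2.5, 0.0]
    q, s = quantize_groupwise_int8(weights, group_size=4)
    print(q, s)
    print(dequantize_groupwise_int8(q, s, group_size=4))
```

Setting `group_size` equal to the row length reduces this sketch to one scale per row. Smaller groups keep large values in one part of a row from inflating the scale used for small values elsewhere, at the cost of storing more scales.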

Reviewers:

Subscribers:

Tasks:

Tags:


pytorch-bot · 2024-10-18T23:57:15Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1121

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit cf7eafa with merge base 3475aed:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

… loading per stage and future perf measurements (pytorch#1121) * add TrackTime, monitor perf for weight loading per stage * add CUDATrackTime * ruff formatting * add device for CUDATrackTime per PR feedback * add comment re: cuda context, ruff format

facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 18, 2024

jerryzh168 requested review from vmpuri and Jack-Khuu October 18, 2024 23:57

jainapurva approved these changes Oct 19, 2024


jerryzh168 merged commit bc2aaaf into pytorch:main Oct 19, 2024
17 checks passed

jerryzh168 deleted the int8wo-groupwise branch October 19, 2024 01:09

jerryzh168 mentioned this pull request Oct 24, 2024

Revert "Add support for groupwise quantization for int8 weight only quantization" #1164

Closed

yanbing-j mentioned this pull request Dec 18, 2024

INT8 has a poor performance with groupsize > 0 in Torchchat, compared with BF16 and INT8 groupsize == 0 pytorch/torchchat#1427

Closed
