[MKLDNN] Enable signed int8 support for convolution. #13697
Conversation
@marcoabreu Looks like CI has trouble building this PR. Some builds failed to compile at this line. Update: I found the reason; the right side returns from mshadow::Shape::Size(void), which is index_t.
@KellenSunderland could you help take a look at the CI issue?
FYI @yoel-shapiro |
Hi, I'm still getting an issue when using the "int8" setup: I am able to run calibration, but when I use the quantized output model I get the following error, specifically when trying to read the model's predictions: Traceback (most recent call last):
@yoel-shapiro Thanks for trying the MKLDNN int8 solution. Did you apply this PR? This PR removed the line 346 check and made many changes to support int8. Please try int8 with this PR.
@yoel-shapiro The PR is updated to address the issue you mentioned. Please try again. Thanks for reporting this.
fixed! |
@marcoabreu Sorry to bother you. Currently, the PR is blocked by a CI failure, as @ZhennanQin mentioned above.
The API change that @sergeykolychev pointed out needs to be addressed first. Requesting changes to make that clear.
Thanks for the suggestions and comments @szha @reminisce @sergeykolychev
API concern is addressed. Thanks for the quick turnaround.
@reminisce As you have concerns about the entropy change, I removed the parts that were for the uint8 fix, as that is not the major task of this PR. The other entropy changes re-apply the entropy fix that was removed by mistake. Please review again. Thanks.
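For context, the entropy calibration discussed in these comments chooses a quantization threshold by minimizing the KL divergence between the original activation histogram and its quantized approximation. A minimal sketch of the KL computation (a hypothetical helper for illustration, not the code in this PR):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two histograms given as lists of non-negative counts.

    Both histograms are normalized to probability distributions first;
    bins where either side is zero are skipped, as is typical in
    calibration code.
    """
    sp, sq = sum(p), sum(q)
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0 and qi > 0:
            pi, qi = pi / sp, qi / sq
            total += pi * math.log(pi / qi)
    return total

print(kl_divergence([1, 1], [1, 1]))  # identical distributions give 0.0
```

A calibration pass would evaluate this divergence for many candidate clipping thresholds and keep the one with the smallest value.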
@reminisce
@xinyu-intel Thanks for the verification. I have approved PR #14060, but I cannot merge it because it hasn't passed the CI test.
PR #14060 is merged. Please rebase to trigger CI again. Thanks. |
@szha @reminisce @KellenSunderland @TaoLv all comments are resolved now. |
* Enable s8s8 support for MKLDNN convolution.
* Fix cpp build
* Fix build.
* Fix build
* Remove openmp min/max reduction for windows build
* Add mkldnn_OIhw4i16o4i_s8s8 support
* Add all s8s8 weight format
* Change ssd quantize script.
* Update
* Manually cast mshadow shape size to size_t
* Fix merge.
* Fix perl package.
* Retrigger CI
* Fix GPU test
* Fix GPU test
* Rerun CI
* Rerun CI
* Rerun CI
* Rerun CI
* Remove weight_channelwise_scale from params.
* Fix
* Keep API compatible.
* Rerun CI
* Rerun CI
* Rerun CI
* Rerun CI
* Address comments.
* fix.
* Address debug build.
* Add comment for next_impl
* Rerun ci
* Add new api MXExecutorSetMonitorCallbackEX
* Add default value for monitor_all for cpp header.
* Rerun CI
* fix
* script change for uint8.
* trigger ci
* trigger ci
Description
Major changes:
* Add a new out_type option `auto`, which automatically determines the out_type according to calibration information: if a negative value is detected in the calibration min, out_type will be int8; otherwise, out_type will be uint8. The default value is `auto`.
* Add `monitor_all` for the monitor module to allow monitoring both input and output of operators. This is used for collecting calibration information of in_data to support quantizing the first layer of the graph.

@pengzhao-intel @TaoLv @zheng-da @reminisce @azai91
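The two changes described above can be sketched in plain Python. These are illustrative stand-ins only, not MXNet's actual implementation; the names `choose_out_type` and `ToyMonitor` are hypothetical:

```python
def choose_out_type(calib_min):
    """Mimic out_type='auto': pick int8 if the calibration min is
    negative (signed range needed), otherwise uint8."""
    return 'int8' if calib_min < 0 else 'uint8'

class ToyMonitor:
    """Toy stand-in for the monitor module: with monitor_all=True it
    records operator inputs as well as outputs, which is what lets
    calibration observe in_data of the first layer."""
    def __init__(self, monitor_all=False):
        self.monitor_all = monitor_all
        self.records = []

    def observe(self, name, array, is_input):
        if is_input and not self.monitor_all:
            return  # default behaviour: outputs only
        self.records.append((name, array))

mon = ToyMonitor(monitor_all=True)
mon.observe('conv0_data', [0.1, -0.2], is_input=True)
mon.observe('conv0_output', [1.5, 2.0], is_input=False)
print(len(mon.records))  # both input and output were recorded
```

With `monitor_all=False` the input observation would be dropped, matching the pre-PR behaviour of only monitoring operator outputs.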
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.