[MKLDNN] Enable signed int8 support for convolution. #13697
Conversation
@marcoabreu Looks like CI has trouble building this PR. Some builds failed to compile at this line. Update: I found the reason; the right side returns from mshadow::Shape::Size(void), which is index_t.
@KellenSunderland could you help take a look at the CI issue?
FYI @yoel-shapiro |
Hi, I'm still getting an issue when using the "int8" setup: I am able to run calibration, but when I use the quantized output model I get the following error, specifically when trying to read the model's predictions: Traceback (most recent call last):
@yoel-shapiro Thanks for trying the MKLDNN int8 solution. Did you apply this PR? This PR removed the line 346 check and made many changes to support int8. Please try int8 with this PR.
@yoel-shapiro The PR is updated to address the issue you mentioned. Please try again. Thanks for reporting this.
fixed! |
@marcoabreu Sorry to bother you. Currently, the PR is blocked by a CI failure, as @ZhennanQin mentioned above.
The API change that @sergeykolychev pointed out needs to be addressed first. Requesting changes to make that clear.
Thanks for the suggestions and comments @szha @reminisce @sergeykolychev
API concern is addressed. Thanks for the quick turnaround.
@reminisce As you have concerns about the entropy change, I removed the parts that were for the uint8 fix, as that is not the major task of this PR. The other entropy changes re-apply the entropy fix that was removed by mistake. Please review again. Thanks.
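For context, the entropy calibration discussed in these comments chooses a quantization threshold by minimizing the KL divergence between the original activation histogram and its quantized approximation. A minimal sketch of the KL computation (a hypothetical helper for illustration, not the code in this PR):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two histograms given as lists of non-negative counts.

    Both histograms are normalized to probability distributions first;
    bins where either side is zero are skipped, as is typical in
    calibration code.
    """
    sp, sq = sum(p), sum(q)
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0 and qi > 0:
            pi, qi = pi / sp, qi / sq
            total += pi * math.log(pi / qi)
    return total

print(kl_divergence([1, 1], [1, 1]))  # identical distributions give 0.0
```

A calibration pass would evaluate this divergence for many candidate clipping thresholds and keep the one with the smallest value.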
@reminisce
@xinyu-intel Thanks for the verification. I have approved PR #14060, but I cannot merge it because it hasn't passed the CI test.
PR #14060 is merged. Please rebase to trigger CI again. Thanks. |
@szha @reminisce @KellenSunderland @TaoLv all comments are resolved now. |
* Enable s8s8 support for MKLDNN convolution.
* Fix cpp build
* Fix build.
* Fix build
* Remove openmp min/max reduction for windows build
* Add mkldnn_OIhw4i16o4i_s8s8 support
* Add all s8s8 weight format
* Change ssd quantize script.
* Update
* Manually cast mshadow shape size to size_t
* Fix merge.
* Fix perl package.
* Retrigger CI
* Fix GPU test
* Fix GPU test
* Rerun CI
* Rerun CI
* Rerun CI
* Rerun CI
* Remove weight_channelwise_scale from params.
* Fix
* Keep API compatible.
* Rerun CI
* Rerun CI
* Rerun CI
* Rerun CI
* Address comments.
* fix.
* Address debug build.
* Add comment for next_impl
* Rerun ci
* Add new api MXExecutorSetMonitorCallbackEX
* Add default value for monitor_all for cpp header.
* Rerun CI
* fix
* script change for uint8.
* trigger ci
* trigger ci
Description
Major changes:
* Add a new out_type option `auto`, which automatically determines the out_type according to calibration information: if a negative value is detected in the calibration min, out_type will be int8; otherwise, out_type will be uint8. The default value is `auto`.
* Add `monitor_all` for the monitor module to allow monitoring both input and output of operators. This is used for collecting calibration information of in_data to support quantizing the first layer of the graph.

@pengzhao-intel @TaoLv @zheng-da @reminisce @azai91
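The two changes described above can be sketched in plain Python. These are illustrative stand-ins only, not MXNet's actual implementation; the names `choose_out_type` and `ToyMonitor` are hypothetical:

```python
def choose_out_type(calib_min):
    """Mimic out_type='auto': pick int8 if the calibration min is
    negative (signed range needed), otherwise uint8."""
    return 'int8' if calib_min < 0 else 'uint8'

class ToyMonitor:
    """Toy stand-in for the monitor module: with monitor_all=True it
    records operator inputs as well as outputs, which is what lets
    calibration observe in_data of the first layer."""
    def __init__(self, monitor_all=False):
        self.monitor_all = monitor_all
        self.records = []

    def observe(self, name, array, is_input):
        if is_input and not self.monitor_all:
            return  # default behaviour: outputs only
        self.records.append((name, array))

mon = ToyMonitor(monitor_all=True)
mon.observe('conv0_data', [0.1, -0.2], is_input=True)
mon.observe('conv0_output', [1.5, 2.0], is_input=False)
print(len(mon.records))  # both input and output were recorded
```

With `monitor_all=False` the input observation would be dropped, matching the pre-PR behaviour of only monitoring operator outputs.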
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.