Fix mkldnn backend when using naive engine #15089
Conversation
LGTM
Could you explain why this modification fixes the problem?
@zheng-da Sure. The root cause is, Now let's look at the conv weight case,
At this point, the weight is still in the default format, so a reorder is registered inside the mkldnn stream but not submitted. And then,
When naive_engine is activated, this will reorder the weight in place into mkldnn format immediately. So the weight is reordered twice before the mkldnn conv kernel launches, causing an incorrect result. The fix is quite simple: change the order of the above code to ensure
happens prior to
So for Do we have a plan to adjust the naive engine
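The double reorder described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not MXNet or MKLDNN code: `ToyStream`, `ToyWeight`, `to_blocked`, and `get_kernel_weight` are hypothetical names, the "reorder" is a simple list rotation, and the ordering used in `run_fixed` (in-place reorder first, then fetching the kernel input) is an assumption about what the reordered code achieves.

```python
def to_blocked(data):
    # Hypothetical stand-in for a default -> mkldnn layout reorder:
    # rotate the list left by one so that applying it twice is visibly wrong.
    return data[1:] + data[:1]


class ToyStream:
    """Hypothetical stand-in for the mkldnn stream: ops run only on submit()."""

    def __init__(self):
        self.pending = []

    def register(self, op):
        self.pending.append(op)

    def submit(self):
        for op in self.pending:
            op()
        self.pending.clear()


class ToyWeight:
    """Hypothetical weight array that tracks whether it is already reordered."""

    def __init__(self, data):
        self.data = list(data)
        self.is_blocked = False

    def reorder_in_place(self):
        # Runs immediately, as the in-place reorder does under the naive engine.
        if not self.is_blocked:
            self.data = to_blocked(self.data)
            self.is_blocked = True


def get_kernel_weight(weight, stream):
    # Returns the buffer the kernel will read. If the weight is still in the
    # default layout, a reorder is registered on the stream but not yet run;
    # it reads weight.data only when the stream is submitted.
    out = {"data": None}
    if weight.is_blocked:
        out["data"] = weight.data
    else:
        stream.register(lambda: out.update(data=to_blocked(weight.data)))
    return out


def run_buggy(values):
    weight, stream = ToyWeight(values), ToyStream()
    kernel_in = get_kernel_weight(weight, stream)  # deferred reorder registered
    weight.reorder_in_place()                      # immediate in-place reorder
    stream.submit()                                # deferred reorder runs on reordered data
    return kernel_in["data"]


def run_fixed(values):
    weight, stream = ToyWeight(values), ToyStream()
    weight.reorder_in_place()                      # reorder first
    kernel_in = get_kernel_weight(weight, stream)  # nothing left to register
    stream.submit()
    return kernel_in["data"]
```

With the input `[1, 2, 3, 4]`, `run_fixed` returns `[2, 3, 4, 1]` (one reorder), while `run_buggy` returns `[3, 4, 1, 2]` (the reorder applied twice), which mirrors the incorrect result seen before the conv kernel launch.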
@mxnet-label-bot add [MKLDNN]
@zheng-da @eric-haibin-lin please help review :)
LGTM. Shall we add some documentation in the code, too?
@ZhennanQin NaiveEngine's
In addition to the docs, should we add a test for it?
We need a separate test that sets the naive engine env var before mxnet is imported.
@szha @eric-haibin-lin We had better create a naive engine CI for all UTs, not only for some mkldnn tests. The naive engine should be able to pass all UTs.
I am going to merge this PR to fix the problem inside the MKLDNN implementation under the naive engine.
* Fix mkldnn backend when using naive engine
* Rerun CI
* Rerun CI
* Rerun CI
* Add comment
Description
This fixes #15078 and resolves all mkldnn-related failures mentioned in #15005.
@pengzhao-intel @TaoLv @eric-haibin-lin @zheng-da