@mxnet-label-bot add [pr-work-in-progress]
Shall we also add a cudnn_off flag to this op?
@TaoLv @pengzhao-intel @ptrendx @DickJC123 could you guys help review?
Thanks for the review, @eric-haibin-lin @TaoLv. @pengzhao-intel @ptrendx @DickJC123 I'm holding off on updating the PR until you get a chance to review it.
@szha sorry, I am on vacation and don't have enough time to look into the details. @TaoLv has done the review, so please go ahead and move forward with this PR. Happy Chinese New Year @szha @eric-haibin-lin @TaoLv :)
Thank you. My comments are addressed.
* cudnn dropout
* test dropout as stateful op
* add cudnn_off
* refactor
* fix bug when using inf forward
* turn on cudnn in gluon
* reuse dropout state space
* dropout passthrough
* address comments
I'm not able to reproduce the speedup with the test case; see #13825 (comment).
@roywei by default, cudnn_off is turned on, so cuDNN dropout is not used. You need to turn it off to benefit from cuDNN dropout.
Description
Use dropout in CuDNN
Tested on p3.2x (V100). Test case:
Reported timings: 46ms, 4.3ms, 48ms, 15ms
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
cudnnSetDropoutDescriptor is an expensive call because it initializes random-number state on each of the streaming multiprocessors of a GPU. Since cuDNN 7, cudnnRestoreDropoutDescriptor has been available, so the initialized state space can be cached and reused. This descriptor is currently used in both the Dropout op and the RNN op. We need a mechanism for caching it so that initialization on each stream happens only once, since the same state can be shared among operators.
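The caching idea described above can be sketched without a GPU. This is only an illustration of the pattern, not the code in this PR: `DropoutStateCache`, `DropoutDesc`, and `fake_init_state` are hypothetical names, with `fake_init_state` standing in for the expensive cudnnSetDropoutDescriptor initialization, and building a `DropoutDesc` from an already-initialized state standing in for the cheap cudnnRestoreDropoutDescriptor path.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class DropoutDesc:
    """Hypothetical descriptor: a dropout rate plus a reference to shared state."""
    state: Any
    p: float


class DropoutStateCache:
    """Initializes the dropout state once per stream and reuses it afterwards."""

    def __init__(self, init_state: Callable[[int, int], Any]) -> None:
        self._init_state = init_state      # expensive step (assumed stand-in)
        self._states: Dict[int, Any] = {}  # stream id -> initialized state
        self.init_calls = 0                # counts how often the expensive step ran

    def descriptor(self, stream_id: int, p: float, seed: int = 0) -> DropoutDesc:
        state = self._states.get(stream_id)
        if state is None:
            # First request on this stream: pay the initialization cost once.
            state = self._init_state(stream_id, seed)
            self.init_calls += 1
            self._states[stream_id] = state
        # Cheap path: build a descriptor that points at the cached state.
        return DropoutDesc(state=state, p=p)


def fake_init_state(stream_id: int, seed: int) -> dict:
    # Placeholder for the per-multiprocessor RNG state initialization.
    return {"stream": stream_id, "seed": seed}


cache = DropoutStateCache(fake_init_state)
dropout_desc = cache.descriptor(stream_id=0, p=0.5)  # e.g. requested by a Dropout op
rnn_desc = cache.descriptor(stream_id=0, p=0.3)      # e.g. requested by an RNN op
other_desc = cache.descriptor(stream_id=1, p=0.5)    # a different stream
```

Here the two operators on stream 0 share one initialized state while using different dropout rates, and the second stream triggers its own one-time initialization. Keying the cache by stream is an assumption taken from the "once per stream" wording above.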