-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
@eric-haibin-lin @szha for review |
CUBE_CONSTANT = 0.044715 | ||
TWO_OVER_PI = 0.7978845608028654 | ||
def g(x): | ||
return TWO_OVER_PI * (x + CUBE_CONSTANT * np.power(x, 3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Often xxx is cheaper than x ** 3. Also, if one takes inspiration from https://en.wikipedia.org/wiki/Horner's_method , which is often more numerically stable, then g(x) could be
return TWO_OVER_PI * x * (1 + CUBE_CONSTANT * x * x)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the actual implementation so performance is not critical here... I'll change the internal mshadow math kernel in C++ based on the link you provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check this part for the backend implementation: https://github.com/apache/incubator-mxnet/blob/3bf1279fcd5efb80c0d098583ae213cbd2867b66/src/operator/mshadow_op.h#L134-L147. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hendrycks Thanks for your prompt reply! I'll wait for @eric-haibin-lin and @szha to give a review before we merge this in.
3bf1279
to
e0a565c
Compare
@mxnet-label-bot add [Feature request, Operator, pr-awaiting-review] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great contribution! One comment about the macro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Remember to also add test for the new gluon block
@szha Gluon test added, also fixed a problem with existing SELU test. |
347e355
to
26d6fba
Compare
fdda420
to
b6cfc08
Compare
@szha Build finally passed, good for merge? |
Description
Address #12984, operator from https://arxiv.org/abs/1606.08415.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
Flakiness also checked: