Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[BUGFIX] fix ELU function will appear nan when calculating the gradient #14673

Merged
merged 6 commits into from
Apr 23, 2019

Conversation

fierceX
Copy link
Contributor

@fierceX fierceX commented Apr 11, 2019

Description

fix ELU function will appear nan when calculating the gradient

e.g:

elu = nn.ELU()
elu.initialize()
x = nd.ones((2, 3))
x.attach_grad()
x[0] = 100
x[1] = -1
with autograd.record():
    y = elu(x)

y.backward()
print(y)
print(x.grad)

output is

[[100.         100.         100.        ]
 [ -0.63212055  -0.63212055  -0.63212055]]
<NDArray 2x3 @cpu(0)>

[[       nan        nan        nan]
 [0.36787945 0.36787945 0.36787945]]
<NDArray 2x3 @cpu(0)>

After experiments, it was found that the where and the exp were used together and that this bug would occur

e.g:

x = nd.ones((2, 3))
x.attach_grad()
x[0] = 100
x[1] = 5

with autograd.record():
    y = nd.where(x > 0, x, nd.exp(x))
y.backward()
print(y)
print(x.grad)

output:

[[100. 100. 100.]
 [  5.   5.   5.]]
<NDArray 2x3 @cpu(0)>

[[nan nan nan]
 [ 1.  1.  1.]]
<NDArray 2x3 @cpu(0)>

When the exp calculation appears inf, even if where does not select the value, but the gradient will still be nan,so this PR does not completely fix the problem, but based on this, modified the ELU calculation method so that it does not appear nan

@fierceX fierceX requested a review from szha as a code owner April 11, 2019 07:43
@@ -158,7 +158,8 @@ def __init__(self, alpha=1.0, **kwargs):
self._alpha = alpha

def hybrid_forward(self, F, x):
return F.where(x > 0, x, self._alpha * (F.exp(x) - 1.0))
_x = F.where(x < 0, x, F.zeros_like(x))
return F.where(x > 0, x, self._alpha * (F.exp(_x) - 1.0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!
Actually we do have ELU implemented in the backend so you can call it directly
Please use:

return F.LeakyReLU(x, act_type='elu', slope=self._alpha)

here instead and also you want to change: https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_gluon.py#L1183 to:

return mx.nd.expm1(x) if x <= 0.0 else x

so that the test would still pass after this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bug does not affect the forward calculation, but occurs when the gradient is calculated backwards.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fierceX Yes I understand, and I'm presenting the recommended way to fix this issue to you in the above comment.

@haojin2 haojin2 added the pr-awaiting-response PR is reviewed and waiting for contributor to respond label Apr 12, 2019
@haojin2
Copy link
Contributor

haojin2 commented Apr 17, 2019

@fierceX You also need to change tests/python/unittest/test_gluon.py Line 1183 to:

return mx.nd.expm1(x) if x <= 0.0 else x

so that the test could pass, thanks!

@@ -158,7 +158,7 @@ def __init__(self, alpha=1.0, **kwargs):
self._alpha = alpha

def hybrid_forward(self, F, x):
return F.where(x > 0, x, self._alpha * (F.exp(x) - 1.0))
F.LeakyReLU(x, act_type='elu', slope=self._alpha)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you have an extra whitespace here, please get rid of that and this is good to go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, less return.

@@ -158,7 +158,7 @@ def __init__(self, alpha=1.0, **kwargs):
self._alpha = alpha

def hybrid_forward(self, F, x):
return F.where(x > 0, x, self._alpha * (F.exp(x) - 1.0))
return F.LeakyReLU(x, act_type='elu', slope=self._alpha)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The extra space before return

Copy link
Contributor

@haojin2 haojin2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will merge once the test passes.

@fierceX
Copy link
Contributor Author

fierceX commented Apr 18, 2019

Note: There should still be a bug in the where operator.

@haojin2
Copy link
Contributor

haojin2 commented Apr 18, 2019

@fierceX We'll surely investigate that.

@haojin2
Copy link
Contributor

haojin2 commented Apr 19, 2019

@fierceX Can you rebase and do a force push?

@haojin2 haojin2 merged commit 494c29e into apache:master Apr 23, 2019
@haojin2
Copy link
Contributor

haojin2 commented Apr 23, 2019

@fierceX Merged, thanks for your contribution!

haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-response PR is reviewed and waiting for contributor to respond
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants