This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Avoid adding SegfaultLogger if process already has sig handler. #13842

Merged
merged 1 commit into from
Jan 15, 2019


frankfliu
Contributor

Description

In the current implementation, when MXNET_USE_SIGNAL_HANDLER=1 we override the signal handler regardless of whether the process already has one.
This breaks the calling process's behavior and causes the process to exit unexpectedly.
An example use case is libmxnet.so loaded into a Java process via JNI or JNA: the JVM will crash
due to SegfaultLogger.

In this PR, we will not register SegfaultLogger if there is a signal handler registered.
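
A minimal sketch of the idea described above, assuming a POSIX system: query the current SIGSEGV disposition with `sigaction` and install a logger only if nothing is set. The names `SegfaultLoggerSketch`, `HasSegvHandler`, and `InstallSegvLoggerIfUnset` are hypothetical and chosen for illustration; this is not necessarily the code merged in this PR.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

// Hypothetical stand-in for SegfaultLogger. A real logger would print a stack
// trace; this sketch only restores the default action and re-raises.
static void SegfaultLoggerSketch(int sig) {
  signal(sig, SIG_DFL);
  raise(sig);
}

// Returns true if SIGSEGV has any non-default disposition (a handler or SIG_IGN).
bool HasSegvHandler() {
  struct sigaction current = {};
  // Passing nullptr as the new action only queries the current one.
  if (sigaction(SIGSEGV, nullptr, &current) != 0) {
    return true;  // Assumption: on error, be conservative and do not override.
  }
  if (current.sa_flags & SA_SIGINFO) {
    return true;  // A three-argument handler (sa_sigaction) is in use.
  }
  return current.sa_handler != SIG_DFL;
}

// Installs the logger only when the process has not registered anything.
// Returns true if this call installed it.
bool InstallSegvLoggerIfUnset() {
  if (HasSegvHandler()) {
    return false;
  }
  struct sigaction sa = {};
  sa.sa_handler = SegfaultLoggerSketch;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Self-check: after one attempt a handler exists, so a second attempt is a no-op.
void SelfCheck() {
  InstallSegvLoggerIfUnset();
  assert(HasSegvHandler());
  assert(!InstallSegvLoggerIfUnset());
}
```

With this pattern, a host such as a JVM that registers its own SIGSEGV handler before loading the library keeps that handler, while a process without one still gets the logger.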

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@frankfliu
Contributor Author

frankfliu commented Jan 11, 2019

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Jan 11, 2019
@anirudhacharya
Member

@mxnet-label-bot add [pr-awaiting-review]

@lanking520
Member

@frankfliu Have you tried building Scala with that flag set to 1?
@szha @anirudh2290 @apeforest Please take a look here.

@lanking520 lanking520 self-requested a review January 11, 2019 18:48
Member

@lanking520 lanking520 left a comment

Following an offline discussion with @frankfliu: this change helps the JVM and any other process that consumes libmxnet.so avoid being broken by multiple signal-handler registrations. Ideally we should keep the handler enabled for all builds, since the MXNet engine provides useful information in the segfault log.

This will not have any side effects on the Python package, as it does not register a signal handler in the process.
Reference:
https://stackoverflow.com/questions/17102919/is-it-valid-to-have-multiple-signal-handlers-for-same-signal
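
To illustrate why multiple registrations conflict, here is a small sketch, assuming a POSIX system: a signal has a single disposition per process, so a second `sigaction` call replaces the first handler rather than adding to it. It uses SIGUSR1 instead of SIGSEGV so it is safe to run, and `HostHandler`, `LibraryHandler`, `Install`, and `LibraryDisplacesHost` are hypothetical names.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

using Handler = void (*)(int);

static volatile sig_atomic_t g_host_calls = 0;
static volatile sig_atomic_t g_lib_calls = 0;

// Stand-ins for a host process handler (e.g. a JVM) and a library handler.
static void HostHandler(int) { g_host_calls = g_host_calls + 1; }
static void LibraryHandler(int) { g_lib_calls = g_lib_calls + 1; }

// Installs `handler` for SIGUSR1 and returns the handler it replaced.
Handler Install(Handler handler) {
  struct sigaction sa = {};
  struct sigaction old = {};
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGUSR1, &sa, &old);
  return old.sa_handler;
}

// Returns true if registering the library handler displaced the host's.
bool LibraryDisplacesHost() {
  Handler original = Install(HostHandler);     // host registers first
  Handler previous = Install(LibraryHandler);  // library registers second

  const sig_atomic_t host_before = g_host_calls;
  const sig_atomic_t lib_before = g_lib_calls;
  raise(SIGUSR1);  // returns after the installed handler has run

  const bool only_library_ran =
      (g_lib_calls == lib_before + 1) && (g_host_calls == host_before);

  Install(original);  // restore whatever was there before the demo
  return previous == HostHandler && only_library_ran;
}
```

The value returned by `Install` is the only record of the displaced handler; a library that discards it leaves the host with no way to receive the signal.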

Contributor

@apeforest apeforest left a comment

LGTM

@apeforest
Contributor

FYI: there is a stackoverflow discussion related to this: https://stackoverflow.com/questions/11871693/checking-for-installed-signal-handler

Contributor

@piyushghai piyushghai left a comment

I saw two StackOverflow links going around in the comments on the PR.
Can these be added as comments in the code around the if block here?

It's fine for them to be in the PR as well, but having them in the code will improve its readability :)

@lanking520 lanking520 merged commit bc7ea31 into apache:master Jan 15, 2019
@larroy
Contributor

larroy commented Jan 15, 2019

Good catch. So the JVM has a handler for segfault installed?

@frankfliu frankfliu deleted the seg branch January 18, 2019 05:37
stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
…he#13842)
lanking520 pushed a commit to lanking520/incubator-mxnet that referenced this pull request Apr 26, 2019
…he#13842)
lanking520 pushed a commit to lanking520/incubator-mxnet that referenced this pull request Apr 30, 2019
…he#13842)
zachgk pushed a commit to zachgk/incubator-mxnet that referenced this pull request May 16, 2019
…he#13842)
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
…he#13842)