-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send stream failures up the promise chain #1
Conversation
Prior to the change, race conditions could cause errors on the stream to be lost. The symptom for npm users would be the infamous "cb() never called" error.
The test in the original PR fails consistently for me without this patch, and passes consistently with it. It's a race condition, so yeah, I wouldn't be surprised if the test passes spuriously sometimes, but it's still better than nothing. Reading through the code, this is clearly a race condition, and the correct way to fix it. Good job. I'll cut a release with this shortly. |
Hey, interesting news! So, as of this commit, the The problem was that the test was installing a package from the test registry, and getting the wrong integrity value. But, always the same integrity value was desired, and always the same one was showing up instead, so it seemed pretty deterministic. Git bisect found this as the first bad commit. After a few hours of digging, I discovered that it was getting an integrity value from the package-lock.json file from a prior test's install which wasn't cleaned up. (The test was removing ⚡️ this patch exposes a class of integrity issues which were completely ignored before ⚡️ Good job. I'm even more convinced that this was a legit problem before you found and fixed it. Thanks! |
That's great! I suspected this fix was probably going to solve a bunch of difficult-to-diagnose issues, but I did not know how far it would reach. |
The problematic code prior to this PR is this:
There's a race condition in this code. The problem is that neither of the
tarStream.on('error', reject)
lines are capable of reporting all errors withtarStream
. If atarStream
error happens prior to the outer resolve being called, then the firsttarStream.on('error'...
will pass the error up correctly. (Actually, I don't think this case is possible. At some point I had instrumented the callback passed to both calls totarStream.on('error'...
and the callback passed tosetImmediate(...)
, and it never ever ever ever ever happened that the firsttarStream.on('error'
caught an error prior to the call scheduled withsetImmediate
.)However, for those errors that cannot be detected immediately, the first
tarStream.on('error'
call is useless. By the time these errors are raised, the resolve call scheduled bysetImmediate
will have happened, and thus even if the firsttarStream.on('error'
calls reject, this will have no effect whatsoever because the promise has already resolved.So there is a window of time between the first
tarStream.on('error'
and the second one where if atarStream
error occurs, it won't be sent up the promise chain. How big a window it is depends on how longrimraf
andmkdirp
take to do their work. If atarStream
error is raised during that window, the first error event handler cannot do anything about it, and the 2nd event handler is not setup yet so it cannot do anything either.When I first filed this issue in the wrong place I had included a test, but further testing indicates that the test was not reliable. Replicating race conditions in a stable way is extremely hard unless the test can get control of all the conditions that participate in the race so that the events that lead to the failure can always be replicated in the same order. I thought I had it, but nope... hope springs eternal I guess.
The problem I initially ran into was that I was running
npm
in a Docker build wheregit
was not installed, and I got the utterly non-descriptivecb() never called
error message. There was no other message with it. I've seen cases wherecb() never called
is accompanied with some more information but in my case that was the sole error message and--loglevel=silly
did not help either. The versions involved were Node v12.5.0 and npm v6.9.2. This macro-issue can be replicated in the following way:Create a new temporary directory for the test, and move into it.
We want to create a directory that contains just enough executables to allow
npm
to run overall but does not allow it to rungit
specifically:package.json
with this:(The main thing here is to have a dependency that requires the use of
git
.)(This first run of
npm install
to do a real install is necessary. IfPATH=jail
is used here,npm
will error but the error message is useful.)This results in:
Running the same, with the PR being proposed here, results in:
Again, this is a race condition so I would not be surprised if some folks try the above and don't get
cb() never called
.