Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate flaky test test-http-client-timeout-event #2555

Closed
joaocgreis opened this issue Aug 26, 2015 · 7 comments
Closed

Investigate flaky test test-http-client-timeout-event #2555

joaocgreis opened this issue Aug 26, 2015 · 7 comments
Assignees
Labels
arm Issues and PRs related to the ARM platform. build Issues and PRs related to build files or the CI. test Issues and PRs related to the tests.

Comments

@joaocgreis
Copy link
Member

Examples of failures:

@joaocgreis joaocgreis changed the title test-http-client-timeout-event Investigate flaky test test-http-client-timeout-event Aug 26, 2015
@joaocgreis joaocgreis added build Issues and PRs related to build files or the CI. test Issues and PRs related to the tests. labels Aug 26, 2015
@brendanashworth brendanashworth added the arm Issues and PRs related to the ARM platform. label Aug 26, 2015
@brendanashworth
Copy link
Contributor

There seems to be a race condition between timers. I'll try to send in a PR in the coming week.

@brendanashworth brendanashworth self-assigned this Sep 11, 2015
@Trott
Copy link
Member

Trott commented Sep 11, 2015

Possible solution might be to use common.platformTimeout() on those timeouts to put a little more space between them on the Pi 1 so that they run in the expected order.

@brendanashworth
Copy link
Contributor

@Trott good tip, thanks. I ended up doing the platform timeout for part of it, but refractored some of it to allow for ending early and reducing racy-ness. Here's the diff if you want it, I'll submit the PR once I get access to a pi to double check:

diff --git a/test/parallel/test-http-client-timeout-event.js b/test/parallel/test-http-client-timeout-event.js
index c9d6594..77776eb 100644
--- a/test/parallel/test-http-client-timeout-event.js
+++ b/test/parallel/test-http-client-timeout-event.js
@@ -25,16 +25,24 @@ server.listen(options.port, options.host, function() {
     server.close();
   });

-  var timeout_events = 0;
   req.setTimeout(1);
-  req.on('timeout', function() {
-    timeout_events += 1;
-  });
-  setTimeout(function() {
+  req.on('timeout', common.mustCall(function() {
+    clearTimeout(timer);
+
     req.destroy();
-    assert.equal(timeout_events, 1);
-  }, 100);
-  setTimeout(function() {
-    req.end();
-  }, 50);
+  }, 1));
+
+  // Emit the request immediately, and once the request is given a socket,
+  // ensure that the timeout has been emitted after ~100ms. If the timeout
+  // is called before that, the test will exit early.
+  req.end();
+
+  var timer;
+  req.on('socket', function() {
+    // If the timer is called, then the timeout is either taking an
+    // extraordinarily long time, or it won't fire.
+    timer = setTimeout(function() {
+      throw new Error('timeout was not called');
+    }, common.platformTimeout(100));
+  });
 });

@brendanashworth
Copy link
Contributor

I wasn't able to see any flakiness before or after on a raspberry pi after 1000 runs each, so the PR may just be in the blind.

cc @rvagg I'm done with the machine, thanks!

@Trott
Copy link
Member

Trott commented Nov 19, 2015

@brendanashworth Stress test on current master shows that this test is still flaky: https://ci.nodejs.org/job/node-stress-single-test/24/nodes=pi1-raspbian-wheezy/console

Fails with:

assert.js:89
  throw new assert.AssertionError({
  ^
AssertionError: 0 == 1
    at null._onTimeout (/home/iojs/build/workspace/node-stress-single-test/nodes/pi1-raspbian-wheezy/test/parallel/test-http-client-timeout-event.js:35:12)
    at Timer.listOnTimeout (timers.js:92:15)

If you've got a publicly available branch on GitHub with your proposed solution, we can run that through the stress test and see if it fails or not.

@Trott
Copy link
Member

Trott commented Nov 22, 2015

Just subbing in common.platform() got rid of the flakiness. 3421 runs without a failure: https://ci.nodejs.org/job/node-stress-single-test/nodes=pi1-raspbian-wheezy/27/console It was set for 9999 runs, but that would have taken over 11 hours (at the rate of 4 seconds per test on Raspberry Pi) so it looks like Rod manually killed it at that point.

@brendanashworth I'll submit a PR with just the common.platform() changes. If you'd like to submit your changes here instead, I'll close that PR.

Trott added a commit to Trott/io.js that referenced this issue Nov 22, 2015
Use common.platformTimeout() to make test more reliable on Raspberry Pi.

Fixes: nodejs#2555
@Trott
Copy link
Member

Trott commented Nov 22, 2015

#3968

@Trott Trott closed this as completed in 487de19 Nov 23, 2015
Trott added a commit that referenced this issue Dec 1, 2015
Use common.platformTimeout() to make test more reliable on Raspberry Pi.

Fixes: #2555
PR-URL: #3968

Reviewed-By: Brendan Ashworth <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Roman Klauke <[email protected]>
Trott added a commit that referenced this issue Dec 4, 2015
Use common.platformTimeout() to make test more reliable on Raspberry Pi.

Fixes: #2555
PR-URL: #3968

Reviewed-By: Brendan Ashworth <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Roman Klauke <[email protected]>
Trott added a commit that referenced this issue Dec 5, 2015
Use common.platformTimeout() to make test more reliable on Raspberry Pi.

Fixes: #2555
PR-URL: #3968

Reviewed-By: Brendan Ashworth <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Roman Klauke <[email protected]>
Trott added a commit that referenced this issue Dec 17, 2015
Use common.platformTimeout() to make test more reliable on Raspberry Pi.

Fixes: #2555
PR-URL: #3968

Reviewed-By: Brendan Ashworth <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Roman Klauke <[email protected]>
Trott added a commit that referenced this issue Dec 23, 2015
Use common.platformTimeout() to make test more reliable on Raspberry Pi.

Fixes: #2555
PR-URL: #3968

Reviewed-By: Brendan Ashworth <[email protected]>
Reviewed-By: Roman Reiss <[email protected]>
Reviewed-By: Roman Klauke <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arm Issues and PRs related to the ARM platform. build Issues and PRs related to build files or the CI. test Issues and PRs related to the tests.
Projects
None yet
Development

No branches or pull requests

3 participants