Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TimeoutErrors are possible if the response is available but not processed #8

Open
jsocol opened this issue Apr 17, 2017 · 8 comments
Open

Comments

@jsocol
Copy link

jsocol commented Apr 17, 2017

If a request is made and a response comes back, but the response is not handled before the timeout, a TimeoutError is thrown. This can occur when CPU-bound code blocks the event loop. To demonstrate, run a fast webserver in one process, and use this example code:

'use strict';

const sleep = require('sleep');  // npm install sleep
const cu = require('copilot-util');  // npm install copilot-util

function main() {
  cu.http.request({
    'host': 'localhost',
    'port': 5000,
    'path': '/',
    'timeout': 2000  // Set to less than the "sleep"
  }).then(function (response) {
    console.log(response.statusCode);
  });
  // Need to give this enough time to let the request start,
  // setImmediate caused the event loop to block before the request
  // began, but with setTimeout(..., 100) the request was often handled
  // before the sleep began
  setTimeout(function () {
    sleep.msleep(4000);  // blocks the event loop for 4 seconds
  }, 10);
}

main();

In the webserver process, you can observe a response getting written with a 200 status code, but because the promise timeout is handled before the response, we still get a TimeoutError:

Unhandled rejection TimeoutError: Timeout of 2000ms reached (/)
    at ClientRequest.rejectWithTimeoutError (/Users/james/code/tmp/node_modules/copilot-util/lib/http/request/_timeout_listener.js:8:14)
    at ClientRequest.g (events.js:291:16)
    at emitNone (events.js:86:13)
    at ClientRequest.emit (events.js:185:7)
    at Socket.emitRequestTimeout (_http_client.js:565:42)
    at Socket.g (events.js:291:16)
    at emitNone (events.js:91:20)
    at Socket.emit (events.js:185:7)
    at Socket._onTimeout (net.js:342:8)
    at ontimeout (timers.js:380:14)
    at tryOnTimeout (timers.js:244:5)
    at Timer.listOnTimeout (timers.js:214:5)
@jsocol
Copy link
Author

jsocol commented Apr 17, 2017

I'm pretty sure this is related to bluebird, because replacing copilot-util with Node's http module does not exhibit the same behavior, e.g. the timeout passes but the response is still handled correctly:

'use strict';

const sleep = require('sleep');  // npm install sleep
const http = require('http');

function main() {
  http.get({
    'host': 'localhost',
    'port': 5000,
    'path': '/',
    'timeout': 2000  // Set to less than the "sleep"
  }, function (response) {
    console.log(response.statusCode);
  });
  // Need to give this enough time to let the request start,
  // setImmediate caused the event loop to block before the request
  // began, but with setTimeout(..., 100) the request was often handled
  // before the sleep began
  setTimeout(function () {
    sleep.msleep(4000);  // blocks the event loop for 4 seconds
  }, 10);
}

main();

@pgoldrbx
Copy link
Contributor

pgoldrbx commented Jun 30, 2017

@jsocol @paulfryzel — did we come up with a proposal on next steps for this issue?

I'd like to experiment with updating bluebird and seeing if we can reproduce the issue. There are some patches released after the current version that attempt to use native Promise if available to resolve some issues around callbacks not being executed.

e.g. Version 3.3.5 includes:
petkaantonov/bluebird@b8e580c

@pgoldrbx
Copy link
Contributor

Update: When using the native Promise instead of bluebird, wrapped around the core http module, I still get the same timeout.

@jsocol
Copy link
Author

jsocol commented Jul 1, 2017

I don't necessarily think there's anything to do here other than document it. The timeout on the promise is a different timeout than the request, and is arguably behaving correctly since the promise hasn't settled within that time. Open to other suggestions, of course.

@pgoldrbx
Copy link
Contributor

pgoldrbx commented Jul 1, 2017

I think maybe @jsvk had some thoughts on this as well

@jsvk
Copy link

jsvk commented Jul 5, 2017

one thing we can do is setImmediate the timeout check so that it happens in the next tick, which will allow us to check the resolution of the promise before processing it as a timeout.

it won't fix the fact that the end-user is waiting more than 15 seconds for a request, but it will help alleviate some of the confusion around this issue from people seeing reports of some requests taking >15000ms that didn't actually take that long

I also think it would be cleaner; we should check that a response was received before writing it off as a timeout just because it wasn't lucky enough to respond fast enough to be on a previous tick

we had the same issue with our caching layer; it waits 100ms for Redis to respond and if it takes longer, we fall back to the database. We used setTimeout to track the timeout and had the same issue of misreporting timeouts

we added a single setImmediate to push the timeout check onto the next tick and it cut our 'timeout rate' in half (if I remember correctly)

here's the PR for reference: https://github.com/CondeNast-Copilot/tsugu-service/pull/285

https://github.com/CondeNast/copilot-util/compare/setImmediate-timeout-handler

here's a branch where I added the setImmediate, it seems to work more closely to what we expect, but there's still a couple of quirks:

  1. The "timeout" period is now extended to however long your timeout is + however long it takes for the loop to become unblocked. This means that
    • if your timeout is 2 seconds, but you block the loop for 4 seconds, your request has effectively 4 seconds to complete.
    • if your timeout is 2 seconds, but you block the loop for 1 second, your request has the expected 2 seconds to complete.
  2. Some logic needs to be added in the timeout handler as it will still fire regardless of if the response completed

@nosamanuel
Copy link

We're already passing the timeout option to the HTTP request in https://github.com/CondeNast/copilot-util/blob/master/lib/http/request/index.js#L42.

Why is the onTimeout error in https://github.com/CondeNast/copilot-util/blob/master/lib/http/request/index.js#L48 needed at all?

@jsocol
Copy link
Author

jsocol commented Jul 27, 2017

Iirc that's only a response timeout, not a connect timeout. The promise timeout covers all cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants