Socket timeouts with HTTP/1.1 pipelining under load #438
Comments
We've observed this as well in a high-traffic environment that is basically a cache, where most requests are served within 200-400 µs at a rate of around 5 req/ms. Today we rolled back to Cowboy and the errors dropped to 0.
Interesting! This is one of those 'this should never happen' cases. A repro for this would be extremely handy! @atavistock your stacktrace notes Bandit version 1.6.0. Is that because you're running off a branch? As I'm sure you're aware, pipeline support only landed in 1.6.1.
It was the branch that fixed the HTTP/1.1 pipelining problem, taken before it was released as 1.6.1, so the version still read 1.6.0, but it's the same code as 1.6.1.
Dug through logs to find some representative examples of how this error (or one similar to it) shows up in our systems. We're using EMQX (an MQTT broker) and a Phoenix app as the authentication/authorization webhook backend for it. On October 31st, we switched from Cowboy to Bandit. Here is a typical example; we see an error on both the client side and the server side (~17 million occurrences over the month we had Bandit in production):
There was one crash within the month we had Bandit in production, and it was after we upgraded to 1.6.1. Here are the logs from it:
@KazW I just fixed the 'one instance' error you indicated above; that should be (and now is) silently eaten since it occurs during Bandit's error fallback routines.
@KazW I don't think your header timeout errors are related to the original issue that @atavistock filed here. I'd be interested in knowing whether you have seen any client-side evidence of the header timeout / connection refusal errors when running Cowboy. Cowboy is quiet by default in the face of most errors (read timeouts included), so it's possible that they happen there as well if you're only looking at server logs.
@atavistock do you have any details about what transport (HTTP/HTTPS) and topology (localhost/direct/proxied) is inducing those
@mtrudel That's why I included the client-side logs; all the errors on the client side vanished the moment we reverted to Cowboy.
Hello! I am experiencing an issue with Phoenix that, I believe, is related to Bandit and might be related to this issue? I posted steps to reproduce on the Elixir Forum. The issue I encounter is only with Bandit and only with
Edit: The issue I was encountering has disappeared since
Hi @webdevcat-me - I think it's a different issue. The issue I had only happens when the client is sending requests via HTTP/1.1 pipelining (which is kind of unusual in the real world anyway); in every other case the exception I was seeing does not occur.
My mistake, @atavistock, I didn't catch that
This is a pretty esoteric edge case, but putting this issue in for informational purposes.
In an ongoing effort around performance and load testing, I've found that with HTTP/1.1 pipelining under load we're seeing an exception in about 1 in 1,000 requests. It looks like the connection is closed prematurely, but it's distinctly not a client-side reset (ECONNRESET) or due to a timeout threshold (ETIMEDOUT).
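For anyone trying to build a repro, here is a minimal sketch in Python of what a pipelining client does: it writes several requests on one connection without waiting for responses, then reads everything back and counts how many complete responses arrived. This is not the load-testing tool used above. The function names (`build_pipeline`, `split_responses`, `send_pipelined`) are hypothetical, and the parser assumes every response carries a Content-Length header (no chunked encoding).

```python
from __future__ import annotations

import socket


def build_pipeline(host: str, paths: list[str]) -> bytes:
    """Concatenate GET requests into one payload (HTTP/1.1 pipelining)."""
    requests = []
    for i, path in enumerate(paths):
        lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
        if i == len(paths) - 1:
            # Ask the server to close after the last response so the
            # read loop in send_pipelined() terminates.
            lines.append("Connection: close")
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(requests).encode("ascii")


def split_responses(raw: bytes) -> list[tuple[int, bytes]]:
    """Split a byte stream into (status, body) pairs.

    Assumption: bodies are delimited by Content-Length only.
    A truncated trailing response is dropped, not returned.
    """
    responses = []
    pos = 0
    while pos < len(raw):
        head_end = raw.find(b"\r\n\r\n", pos)
        if head_end == -1:
            break  # headers never completed: connection closed early
        head = raw[pos:head_end].decode("iso-8859-1")
        status_line, *header_lines = head.split("\r\n")
        status = int(status_line.split(" ", 2)[1])
        length = 0
        for line in header_lines:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        body_start = head_end + 4
        body = raw[body_start:body_start + length]
        if len(body) < length:
            break  # body cut short
        responses.append((status, body))
        pos = body_start + length
    return responses


def send_pipelined(host: str, port: int, paths: list[str],
                   timeout: float = 5.0) -> list[tuple[int, bytes]]:
    """Send all requests at once, then read until the server closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_pipeline(host, paths))
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return split_responses(b"".join(chunks))
```

Calling `send_pipelined("localhost", 4000, ["/"] * 10)` against a local server (host, port, and path are placeholders) and comparing the length of the result with the number of requests sent gives a rough signal of a prematurely closed connection: fewer complete responses than requests means the stream ended early.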