Concurrent requests slightly larger than 4096 bytes are slow #2062

Closed
ssendev opened this issue Jan 10, 2022 · 20 comments
Labels
triage A bug report being investigated

Comments

@ssendev

ssendev commented Jan 10, 2022

Description

A response slightly larger than 4096 bytes is significantly slower than other responses when more than one request is made concurrently.

httperf --server localhost --port 8000 --uri /$bytes --num-calls $n

bytes [ms/req] n=1 10 100
4096 1.6 1.3 1.2
4097 1.4 38.1 41.7
10000 1.9 17.9 16.0
40960 4.9 7.2 5.4
409600 79.3 52.6 41.5
4096000 469.8 422.5 418.0

To Reproduce

// main.rs
#[macro_use]
extern crate rocket;

#[get("/<size>")]
fn size(size: usize) -> String {
    // Build a body of exactly `size` spaces. Looping over `s.capacity()` could
    // overshoot, since `with_capacity` only guarantees *at least* `size` bytes.
    " ".repeat(size)
}

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![size])
}

# Cargo.toml
[package]
name = "test"
version = "0.1.0"
edition = "2021"

[dependencies]
rocket = "0.5.0-rc.1"

Expected Behavior

No large jumps in response time.

Environment:

  • OS Distribution and Kernel: Fedora 5.15.6-100.fc34.x86_64
  • Rocket Version: 0.5.0-rc.1
@ssendev ssendev added the triage A bug report being investigated label Jan 10, 2022
@somehowchris

Did you run that as debug or release?

@ssendev
Author

ssendev commented Jan 10, 2022

I have to admit these numbers were run in debug, but the original issue surfaced in production.

The 40 ms spike at 4097 bytes remains in a release build; if anything, it's even more pronounced, since the larger requests did speed up.
I also found that the speed recovers abruptly at 8430 bytes.

for b in 4096 4097 8429 8430 10000 40960 409600 4096000; do for n in 1 10 100; do \
  echo -n "b=$b n=$n "; httperf --server localhost --port 8000 \
  --uri /$b  --num-calls $n | grep "ms/req"; done; done
bytes [ms/req] n=1 10 100
4096 0.8 0.6 0.1
4097 0.3 37.4 40.6
8429 0.3 37.8 40.8
8430 0.3 4.3 3.0
10000 0.4 4.3 2.6
40960 0.3 0.2 3.9
409600 1.4 17.5 10.6
4096000 11.9 18.8 24.2

@somehowchris

OK, so it's the same for optimized release builds. May I ask, have you run any perf benchmarks? Just to see whether it has something to do with memory pressure or CPU overhead.

@ssendev
Author

ssendev commented Jan 10, 2022

Running with 1 worker gives the same result.
I didn't see any obvious differences in cargo flamegraph / hotspot between 4096 and 4097 bytes, but it complains that the sample rate is too low.

Always allocating the same amount still gives the same result:

#[get("/<size>")]
fn size(size: usize) -> String {
    let mut s = String::with_capacity(4_000_000);
    if size == 4096 {
        s += " ";
    }
    for _ in 0..size {
        s += " ";
    }
    if size == 4096 {
        s.truncate(size);
    }
    s
}

Returning a &'static str with manually pasted content of the correct size gives the same result as well.

@kolbma
Contributor

kolbma commented Jan 10, 2022

@ssendev Am I missing the point, or is this code bound to be slower with 4096 than with any other size...

@somehowchris

somehowchris commented Jan 10, 2022

I may have identified the problem. Since Sergio is kind of away until the middle of this year for personal reasons, I have created a fork to work on. At least with the httperf script you provided, I couldn't notice any delays anymore.

[dependencies]
rocket = { git = "https://github.com/somehowchris/Rocket", tag = "0.5.0-patch.1"}

It includes the fix.

@ssendev
Author

ssendev commented Jan 10, 2022

@kolbma

Yes, there is a sharp cliff above 4096.
It shows up every time and can already be observed with --num-calls 2, though then it's only 20 ms; with 10 it has grown to 37 ms, and it won't go much above 40 ms even with 1000. With 4096 bytes, however, 100000 calls round to 0.0 ms/req (20503.9 req/s):

time httperf --server localhost --port 8000 --uri /4096  --num-calls 100000
real	4,958
user	2,976
sys	1,541
maxmem	8 MB
faults	0
time httperf --server localhost --port 8000 --uri /4097  --num-calls 1000
real	40,996
user	33,519
sys	7,286
maxmem	8 MB
faults	0

Requests stay slow up to a size of 8429 bytes; at 8430 there is another cliff, and response times drop from ~40 ms back to 1.7 ms:

time httperf --server localhost --port 8000 --uri /8429  --num-calls 1000
real	41,005
user	33,381
sys	7,468
maxmem	8 MB
faults	0
time httperf --server localhost --port 8000 --uri /8430  --num-calls 10000
real	16,524
user	13,117
sys	3,224
maxmem	8 MB
faults	0

Note that --num-calls varies between these runs to keep the total runtime in check.

@somehowchris
I tried your branch but can't see any difference. There is only one commit, which updates dependencies; was there supposed to be another one?

@kolbma
Contributor

kolbma commented Jan 10, 2022

I've put it up at https://github.com/kolbma/rocket-issue-2062, with the handler modified to

#[get("/<size>")]
fn size(size: usize) -> &'static str {

But I see some more slow requests there... Have a look at n=2... Also, these are not consistently slow in relation to content size...

Have you tried a different benchmark tool?

b=4096 n=1 Request rate: 1677.7 req/s (0.6 ms/req)
b=4096 n=2 Request rate: 1212.1 req/s (0.8 ms/req)
b=4096 n=10 Request rate: 2947.3 req/s (0.3 ms/req)
b=4096 n=100 Request rate: 4558.9 req/s (0.2 ms/req)
b=4097 n=1 Request rate: 1127.5 req/s (0.9 ms/req)
b=4097 n=2 Request rate: 45.0 req/s (22.2 ms/req)
b=4097 n=10 Request rate: 25.4 req/s (39.3 ms/req)
b=4097 n=100 Request rate: 23.0 req/s (43.6 ms/req)
b=8429 n=1 Request rate: 871.1 req/s (1.1 ms/req)
b=8429 n=2 Request rate: 45.2 req/s (22.1 ms/req)
b=8429 n=10 Request rate: 25.2 req/s (39.6 ms/req)
b=8429 n=100 Request rate: 23.0 req/s (43.6 ms/req)
b=8430 n=1 Request rate: 871.1 req/s (1.1 ms/req)
b=8430 n=2 Request rate: 1769.7 req/s (0.6 ms/req)
b=8430 n=10 Request rate: 202.8 req/s (4.9 ms/req)
b=8430 n=100 Request rate: 389.8 req/s (2.6 ms/req)
b=10000 n=1 Request rate: 1672.4 req/s (0.6 ms/req)
b=10000 n=2 Request rate: 45.8 req/s (21.8 ms/req)
b=10000 n=10 Request rate: 214.8 req/s (4.7 ms/req)
b=10000 n=100 Request rate: 389.8 req/s (2.6 ms/req)
b=40960 n=1 Request rate: 1101.4 req/s (0.9 ms/req)
b=40960 n=2 Request rate: 22.6 req/s (44.2 ms/req)
b=40960 n=10 Request rate: 915.4 req/s (1.1 ms/req)
b=40960 n=100 Request rate: 1080.8 req/s (0.9 ms/req)
b=409600 n=1 Request rate: 225.6 req/s (4.4 ms/req)
b=409600 n=2 Request rate: 231.9 req/s (4.3 ms/req)
b=409600 n=10 Request rate: 77.9 req/s (12.8 ms/req)
b=409600 n=100 Request rate: 185.4 req/s (5.4 ms/req)
b=4096000 n=1 Request rate: 12.7 req/s (78.7 ms/req)
b=4096000 n=2 Request rate: 29.3 req/s (34.1 ms/req)
b=4096000 n=10 Request rate: 30.1 req/s (33.2 ms/req)
b=4096000 n=100 Request rate: 25.0 req/s (40.0 ms/req)

Another run...

b=4096 n=1 Request rate: 1181.8 req/s (0.8 ms/req)
b=4096 n=2 Request rate: 1167.5 req/s (0.9 ms/req)
b=4096 n=10 Request rate: 2947.1 req/s (0.3 ms/req)
b=4096 n=100 Request rate: 3472.5 req/s (0.3 ms/req)
b=4097 n=1 Request rate: 1519.7 req/s (0.7 ms/req)
b=4097 n=2 Request rate: 43.1 req/s (23.2 ms/req)
b=4097 n=10 Request rate: 25.1 req/s (39.8 ms/req)
b=4097 n=100 Request rate: 23.0 req/s (43.5 ms/req)
b=8429 n=1 Request rate: 811.6 req/s (1.2 ms/req)
b=8429 n=2 Request rate: 45.9 req/s (21.8 ms/req)
b=8429 n=10 Request rate: 25.2 req/s (39.6 ms/req)
b=8429 n=100 Request rate: 23.0 req/s (43.6 ms/req)
b=8430 n=1 Request rate: 1356.9 req/s (0.7 ms/req)
b=8430 n=2 Request rate: 48.7 req/s (20.5 ms/req)
b=8430 n=10 Request rate: 225.9 req/s (4.4 ms/req)
b=8430 n=100 Request rate: 473.5 req/s (2.1 ms/req)
b=10000 n=1 Request rate: 944.2 req/s (1.1 ms/req)
b=10000 n=2 Request rate: 47.9 req/s (20.9 ms/req)
b=10000 n=10 Request rate: 223.8 req/s (4.5 ms/req)
b=10000 n=100 Request rate: 394.1 req/s (2.5 ms/req)
b=40960 n=1 Request rate: 1128.7 req/s (0.9 ms/req)
b=40960 n=2 Request rate: 1109.9 req/s (0.9 ms/req)
b=40960 n=10 Request rate: 110.9 req/s (9.0 ms/req)
b=40960 n=100 Request rate: 692.7 req/s (1.4 ms/req)
b=409600 n=1 Request rate: 229.7 req/s (4.4 ms/req)
b=409600 n=2 Request rate: 266.6 req/s (3.8 ms/req)
b=409600 n=10 Request rate: 120.6 req/s (8.3 ms/req)
b=409600 n=100 Request rate: 224.9 req/s (4.4 ms/req)
b=4096000 n=1 Request rate: 12.6 req/s (79.4 ms/req)
b=4096000 n=2 Request rate: 27.6 req/s (36.2 ms/req)
b=4096000 n=10 Request rate: 24.1 req/s (41.5 ms/req)
b=4096000 n=100 Request rate: 26.8 req/s (37.3 ms/req)
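
For reading the httperf lines above: the rate and the per-request time appear to be reciprocals of each other, which is also why a very fast run shows up as 0.0 ms/req. A tiny sketch of that arithmetic follows; `ms_per_req` and `rounded` are hypothetical helper names for illustration, not anything httperf exposes.

```rust
/// Converts a request rate (requests per second) into milliseconds per request.
fn ms_per_req(req_per_sec: f64) -> f64 {
    1000.0 / req_per_sec
}

/// Formats a duration in milliseconds with one decimal place,
/// matching the precision of the figures quoted in this thread.
fn rounded(ms: f64) -> String {
    format!("{:.1}", ms)
}
```

For example, `rounded(ms_per_req(20503.9))` yields "0.0", so a sub-0.05 ms request time is simply below the displayed precision.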

@ssendev
Author

ssendev commented Jan 10, 2022

That the slowdowns aren't linearly correlated to content size is what makes this strange. The sharp edge at 4KiB makes me think it's some buffer that causes scheduling issues.

I just tried ApacheBench (ab -k -c 1 -n 100 http://localhost:8000/4097), and there the problem only occurs if -k (KeepAlive) is present.

@kolbma
Contributor

kolbma commented Jan 10, 2022

I've tried wrk... The first value is the average... You can also see the drop at 4097 and the recovery at 8430...

b=4096 n=1		    Req/Sec     4.90k   192.57     5.11k    83.33%
b=4096 n=2		    Req/Sec     4.38k   250.42     4.98k    72.13%
b=4096 n=10		    Req/Sec     1.54k    95.76     1.99k    87.66%
b=4096 n=100		    Req/Sec   151.30     27.81   555.00     89.34%
b=4097 n=1		    Req/Sec    26.97     18.93   121.00     93.33%
b=4097 n=2		    Req/Sec    27.03     19.32   128.00     93.33%
b=4097 n=10		    Req/Sec    26.67     18.68   121.00     93.33%
b=4097 n=100		    Req/Sec    25.75     14.92   141.00     92.66%
b=8429 n=1		    Req/Sec    23.27      5.45    40.00     70.00%
b=8429 n=2		    Req/Sec    23.25      5.48    40.00     70.00%
b=8429 n=10		    Req/Sec    23.02      5.36    40.00     72.00%
b=8429 n=100		    Req/Sec    22.50      4.74    40.00     76.76%
b=8430 n=1		    Req/Sec   424.20    131.67   730.00     70.00%
b=8430 n=2		    Req/Sec   480.78    160.94   830.00     71.67%
b=8430 n=10		    Req/Sec   330.81    104.48   760.00     70.86%
b=8430 n=100		    Req/Sec    87.55     66.76   535.00     87.71%
b=10000 n=1		    Req/Sec   323.00    123.98   636.00     73.33%
b=10000 n=2		    Req/Sec   422.53    132.79     0.90k    70.00%
b=10000 n=10		    Req/Sec   399.53    117.16   820.00     71.71%
b=10000 n=100		    Req/Sec    88.93     63.40   690.00     88.01%
b=40960 n=1		    Req/Sec    59.53     35.75   160.00     70.00%
b=40960 n=2		    Req/Sec    54.45     32.91   171.00     70.00%
b=40960 n=10		    Req/Sec    40.58     29.48   191.00     88.20%
b=40960 n=100		    Req/Sec    55.47     42.02   350.00     82.38%
b=409600 n=1		    Req/Sec    46.67     14.70    80.00     66.67%
b=409600 n=2		    Req/Sec    40.80     15.40    80.00     73.33%
b=409600 n=10		    Req/Sec    28.17      9.88    70.00     80.86%
b=409600 n=100		    Req/Sec    10.33      3.80    60.00     78.97%
b=4096000 n=1		    Req/Sec    18.67      6.81    30.00     53.33%
b=4096000 n=2		    Req/Sec    19.48      6.49    30.00     59.02%
b=4096000 n=10		    Req/Sec    11.20      3.98    20.00     77.78%
b=4096000 n=100		    Req/Sec     1.29      1.99    20.00     93.33%

@kolbma
Contributor

kolbma commented Jan 11, 2022

@ssendev wrote: "That the slowdowns aren't linearly correlated to content size is what makes this strange. The sharp edge at 4KiB makes me think it's some buffer that causes scheduling issues."

4096 is the default chunk size in Rocket. But the window up to 8430 is strange.
https://github.com/SergioBenitez/Rocket/blob/8cae077ba1d54b92cdef3e171a730b819d5eeb8e/core/lib/src/response/body.rs#L108
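
To make the chunk boundary concrete, here is a rough sketch that assumes the body is handed on in pieces of at most the default chunk size (4096 bytes). `chunk_sizes` is a hypothetical helper written for illustration; it is not Rocket's code and says nothing about how hyper groups the pieces afterwards.

```rust
/// Splits a body of `len` bytes into pieces of at most `max_chunk` bytes
/// and returns the size of each piece, in order.
fn chunk_sizes(len: usize, max_chunk: usize) -> Vec<usize> {
    assert!(max_chunk > 0, "max_chunk must be non-zero");
    let mut sizes = Vec::new();
    let mut remaining = len;
    while remaining > 0 {
        // Take a full chunk if possible, otherwise whatever is left.
        let n = remaining.min(max_chunk);
        sizes.push(n);
        remaining -= n;
    }
    sizes
}
```

Under that assumption a 4096-byte body is a single piece, while a 4097-byte body becomes a full piece followed by a 1-byte tail.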

@kolbma
Contributor

kolbma commented Jan 11, 2022

If you change DEFAULT_MAX_CHUNK, the window moves. So if you set it to 8430, say, the slowdown starts at 8431.
I can't see any special handling of this in the Rocket code. It is based on hyper and h2, but I've already updated both crates to their latest versions without any change.
I'll look into this further after some sleep...

@kolbma
Contributor

kolbma commented Jan 11, 2022

It doesn't work correctly above a byte count of 8430 either!

Look at the Net I/O output of httperf. This is much too slow.

for b in 4060 4096 4097 8192 8430 12288 12500 ; do 
    echo -n "$b:  "
    httperf  --server localhost --port 8000 --uri /$b --num-calls 10 --num-conns 10 | grep "Net I" 
done

With 3 x max_chunk_size == 12288 bytes it is as fast as with 4096 bytes and, because of the bigger "file", more efficient in Net I/O.

4060:  Net I/O: 1409.4 KB/s (11.5*10^6 bps)
4096:  Net I/O: 1361.7 KB/s (11.2*10^6 bps)
4097:  Net I/O: 106.9 KB/s (0.9*10^6 bps)
8192:  Net I/O: 207.2 KB/s (1.7*10^6 bps)
8430:  Net I/O: 529.2 KB/s (4.3*10^6 bps)
12288:  Net I/O: 2553.8 KB/s (20.9*10^6 bps)
12500:  Net I/O: 836.6 KB/s (6.9*10^6 bps)

Is there anything wrong with this?
https://github.com/SergioBenitez/Rocket/blob/8cae077ba1d54b92cdef3e171a730b819d5eeb8e/core/lib/src/ext.rs#L117

Does anyone have an idea why it starts to become faster again after the additional 4334 bytes (4096 + 4334 = 8430)?
Or any other idea what the cause might be?

I see some more differences in speed there that I can't really explain. It looks like there is some race condition and blocking.
How do you trace the polling of the tokio crate?

@ssendev
Author

ssendev commented Jan 11, 2022

https://github.com/tokio-rs/console advertises itself as capable of it

The fact that the delay is 40ms also made me think of this https://vorner.github.io/2020/11/06/40-ms-bug.html

@kolbma
Contributor

kolbma commented Jan 11, 2022

Disabling http1_writev doesn't make a difference.

In the meantime I've set up some code for bare hyper with the same functionality, responding to GET /<size>.
There is no problem with hyper alone.

  • hyper.sh port 3000 calls: 500 conns: 1
    4097: Net I/O: 23718.9 KB/s (194.3*10^6 bps)
    4097: Request rate: 5727.0 req/s (0.2 ms/req)

I'm wondering how keep-alive fits into the picture with the different byte counts,
because the problem happens only when the connection is reused, e.g. for 500 GET requests.

4096 ok...

  • rocket.sh port 8000 calls: 500 conns: 1
    4096: Net I/O: 19174.5 KB/s (157.1*10^6 bps)
    4096: Request rate: 4463.5 req/s (0.2 ms/req)

4097 slow....

  • rocket.sh port 8000 calls: 500 conns: 1
    4097: Net I/O: 97.8 KB/s (0.8*10^6 bps)
    4097: Request rate: 22.8 req/s (43.9 ms/req)

Creating 500 connections for each 1 GET request is ok...

  • rocket.sh port 8000 calls: 1 conns: 500
    4097: Net I/O: 7457.8 KB/s (61.1*10^6 bps)
    4097: Request rate: 1735.6 req/s (0.6 ms/req)

But 250 connections each reused for a 2nd concurrent GET is a bottleneck...

  • rocket.sh port 8000 calls: 250 conns: 2
    4097: Net I/O: 98.1 KB/s (0.8*10^6 bps)
    4097: Request rate: 22.8 req/s (43.8 ms/req)

There should be more req/s than with calls: 500 conns: 1, because there is less TCP connection overhead.

@kolbma
Contributor

kolbma commented Jan 11, 2022

So the time is lost in the concurrent call of

https://github.com/SergioBenitez/Rocket/blob/8cae077ba1d54b92cdef3e171a730b819d5eeb8e/core/lib/src/server.rs#L280

In my debug build, on a connection with 2 requests, one request waits for about 290 milliseconds. The other takes about 2 or 3 milliseconds.

I've stripped the handle function
https://github.com/SergioBenitez/Rocket/blob/8cae077ba1d54b92cdef3e171a730b819d5eeb8e/core/lib/src/server.rs#L23
down to...

Some(run().await)

to temporarily remove the unwinding machinery.
The String handler itself is not the culprit; it runs for about 100 microseconds.

The black-box part called by route.handler.handle(request, data) involves a lot of future pinning and unsafe memory handling, with many safety notes (the caller of the method or function has to make sure a requirement is fulfilled), in order to handle the async machinery.
I need to see what Rocket's codegen does there.

@kolbma
Contributor

kolbma commented Jan 12, 2022

After all the code inspection and timing... I no longer believe there is an issue in Rocket...

If you check the network traffic and its timing, you'll see this for 2 requests in one keep-alive connection with content length 4096:

[Image: 4096_calls2_conns1]

And this for 2 requests in one keep-alive connection with content length 4097:

[Image: 4097_calls2_conns1]

The data, 4399 bytes including protocol overhead, is always sent by the Rocket server quickly after the GET request.

For content length 4096 there is a single TCP segment, while for 4097 there is a second one.
The cause might be in the async handling between Rocket, with its max_chunk_size, and sending data via Hyper.
When using Hyper directly, this is handled differently, and it also pushes the data in a single TCP segment, as with 4096 bytes.
But both are OK.

Back to what's going on...
In the bottom image you can see that frame 13 is sent more than 44 ms after frame 12. So the server on port 8000 is waiting for the ACK from the client, which here is httperf.
httperf received the data more or less instantly, because the long-awaited ACK carries the same timestamp in its echo (not visible in the images).
It looks like the benchmark tool doesn't handle keep-alive correctly.
While the first request is ACKed correctly, it seems to wait for more data to fill up its own buffer when there are multiple segments, although it could ACK the data already received.
To confirm this, you can set the --recv-buffer argument to e.g. 1, and there is no slowdown from waiting on the ACK.
What exactly the value you set here means, I don't know. If you set it too high (13000 is too high, 12000 is OK) I think it switches back to the default and the ACK is delayed again.
But httperf --recv-buffer 9000 --server localhost --port 8000 --uri /4097 --num-calls 2 --num-conns 1 produces frames assembled much as without the option, but ACKed quickly as expected...

[Image: 4097_calls2_conns1_recv-buffer-9000]

  • rocket.sh port 8000 calls: 1000 conns: 10
    4096: Net I/O: 1396.7 KB/s (11.4*10^6 bps)
    4096: Request rate: 325.1 req/s (3.1 ms/req)

  • rocket.sh port 8000 calls: 1000 conns: 10
    4097: Net I/O: 1332.6 KB/s (10.9*10^6 bps)
    4097: Request rate: 310.1 req/s (3.2 ms/req)

With bigger byte counts there is still a problem with the tool...
Curiously, the same run with calls 4 conns 1 is sometimes slow and sometimes fast...

  • rocket.sh port 8000 calls: 4 conns: 1
    8430: Net I/O: 546.2 KB/s (4.5*10^6 bps)
    8430: Request rate: 64.0 req/s (15.6 ms/req)

  • rocket.sh port 8000 calls: 4 conns: 1
    8430: Net I/O: 2247.0 KB/s (18.4*10^6 bps)
    8430: Request rate: 263.5 req/s (3.8 ms/req)

With only 3 calls it is always fast. With 5 calls it is always slow.

The problem is the same as above... When it is slow, there is a wait of over 44 ms for the ACK of the received data.
But it no longer starts with the 2nd request; here it is sometimes the 4th, but always the 5th.
With byte count 8000 and recv-buffer 12000 it is always fast again...

  • rocket.sh port 8000 calls: 1000 conns: 10
    8000: Net I/O: 2169.0 KB/s (17.8*10^6 bps)
    8000: Request rate: 267.5 req/s (3.7 ms/req)

All calls were made against a debug build with timing output, so they are much slower than against a release build of Rocket.

@rbtcollins

@kolbma that's the 40 ms bug that vorner wrote up

  • if you set tcp_nodelay on the socket, the full response will reach the client without waiting for an ACK.
    OR
  • disabling delayed acks requires a per-recv() call, which is tedious, and disabling that is usually a problem anyway.

The issue here is the combination of small content and multiple packets. The writev setting having no effect may just mean that something similar is having the same impact as writev. I suggest running strace: if the syscalls made by Rocket apps split headers and body for small content, this problem will occur.

See https://en.wikipedia.org/wiki/Nagle%27s_algorithm and https://datatracker.ietf.org/doc/html/draft-minshall-nagle

The answer is to ensure that userspace - rocket - always writes either a complete response, or at least a packet worth of data.
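
As a rough illustration of "write a complete response", here is a std-only sketch that coalesces separate header and body writes into one underlying write with `std::io::BufWriter`. `CountingSink`, `send_unbuffered` and `send_coalesced` are hypothetical names; the sink only stands in for a socket, and this is not how Rocket or hyper write responses.

```rust
use std::io::{self, BufWriter, Write};

/// Stand-in for a socket: counts how often `write` is called on it.
/// On a real socket each call could end up as its own small segment.
struct CountingSink {
    writes: usize,
    bytes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.bytes += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Header and body go to the sink as two separate writes.
/// Returns (write calls on the sink, total bytes).
fn send_unbuffered(header: &[u8], body: &[u8]) -> io::Result<(usize, usize)> {
    let mut sink = CountingSink { writes: 0, bytes: 0 };
    sink.write_all(header)?;
    sink.write_all(body)?;
    Ok((sink.writes, sink.bytes))
}

/// Header and body are buffered and reach the sink in one write on flush.
/// Returns (write calls on the sink, total bytes).
fn send_coalesced(header: &[u8], body: &[u8]) -> io::Result<(usize, usize)> {
    let mut sink = CountingSink { writes: 0, bytes: 0 };
    {
        // Capacity is chosen to hold the whole response with room to spare,
        // so neither write bypasses the buffer.
        let capacity = header.len() + body.len() + 1;
        let mut out = BufWriter::with_capacity(capacity, &mut sink);
        out.write_all(header)?;
        out.write_all(body)?;
        out.flush()?;
    }
    Ok((sink.writes, sink.bytes))
}
```

The trade-off is memory: buffering a whole response only makes sense for small bodies, which is the case this issue is about.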

@kolbma
Contributor

kolbma commented Jan 23, 2022

@rbtcollins PR welcome!

@SergioBenitez
Member

To summarize, this is due to the interaction between two TCP optimizations: Nagle's algorithm and delayed ACKs. From Wikipedia:

The additional wait time introduced by the delayed ACK can cause further delays when interacting with certain applications and configurations. If Nagle's algorithm is being used by the sending party, data will be queued by the sender until an ACK is received. If the sender does not send enough data to fill the maximum segment size (for example, if it performs two small writes followed by a blocking read) then the transfer will pause up to the ACK delay timeout. [...]

For example, consider a situation where Bob is sending data to Carol. Bob's socket layer has less than a complete packet's worth of data remaining to send. Per Nagle's algorithm, it will not be sent until he receives an ACK for the data that has already been sent. At the same time, Carol's application layer will not send a response until it gets all of the data. If Carol is using delayed ACKs, her socket layer will not send an ACK until the timeout is reached.

If the application is transmitting data in smaller chunks and expecting periodic acknowledgment replies, this negative interaction can occur. To prevent this delay, the application layer needs to continuously send data without waiting for acknowledgment replies. Alternatively, Nagle's algorithm may be disabled by the application on the sending side.
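
The sender-side rule in the quote can be written down as a small predicate. This is a deliberately simplified model for illustration, not a kernel implementation; `nagle_sends_now` and the MSS value used with it are assumptions.

```rust
/// Simplified Nagle decision for one pending segment.
///
/// A segment goes out immediately if Nagle is disabled (`nodelay`), if it
/// fills a whole segment (`segment_len >= mss`), or if nothing previously
/// sent is still waiting for an ACK. Otherwise it is held back until the
/// outstanding data has been acknowledged.
fn nagle_sends_now(segment_len: usize, mss: usize, unacked_bytes: usize, nodelay: bool) -> bool {
    nodelay || segment_len >= mss || unacked_bytes == 0
}
```

With a hypothetical MSS of 65000, a first 4096-byte write on an idle connection is sent at once, while a following 1-byte write is held as long as those 4096 bytes are unacknowledged. If the receiver delays its ACK, that hold is where the pause comes from.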

@rbtcollins wrote: "The answer is to ensure that userspace - rocket - always writes either a complete response, or at least a packet worth of data."

What are you suggesting, precisely? Rocket does not control the data size: this is entirely up to the application author.

As far as I can tell, there are really only two solutions, assuming you don't control the client:

  1. Enable TCP_NODELAY, which disables Nagle's, preventing the problem altogether.
  2. Issue a flush after each chunk which should force the underlying networking stack to write the data without waiting for an ACK. The flush would need to be written in this loop: https://github.com/SergioBenitez/Rocket/blob/4fcb57b704c3754cdd67028220457bb27d6ab128/core/lib/src/server.rs#L154-L157

Though it seems ideal, I cannot see any mechanism to do 2) with Hyper's Channel API (https://docs.rs/hyper/0.14.17/hyper/body/struct.Sender.html). Thus, it seems we should likely go with 1), which is simply a Server builder option (https://docs.rs/hyper/0.14.17/hyper/server/struct.Builder.html#method.tcp_nodelay).
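
For anyone who wants to try option 1 outside Rocket and hyper, here is a std-only sketch using `TcpStream::set_nodelay`. It mimics a 4097-byte body sent as a 4096-byte write plus a 1-byte tail over a single reused connection. The `serve` and `run` names, the 1-byte "request" and the two-write split are assumptions made for illustration; the actual timings depend on the OS and are not claimed here.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Serves `rounds` responses on one connection, each split into two writes:
/// a 4096-byte chunk followed by a 1-byte tail.
fn serve(listener: TcpListener, rounds: usize, nodelay: bool) -> io::Result<()> {
    let (mut stream, _) = listener.accept()?;
    // The mitigation under discussion: `true` disables Nagle's algorithm.
    stream.set_nodelay(nodelay)?;
    let chunk = [b' '; 4096];
    let mut request = [0u8; 1];
    for _ in 0..rounds {
        stream.read_exact(&mut request)?; // wait for the 1-byte "request"
        stream.write_all(&chunk)?; // first write: a full chunk
        stream.write_all(b" ")?; // second write: the small tail
    }
    Ok(())
}

/// Runs `rounds` request/response exchanges over a single connection and
/// returns (total body bytes received, elapsed wall-clock time).
fn run(rounds: usize, nodelay: bool) -> io::Result<(usize, Duration)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || serve(listener, rounds, nodelay));

    let mut client = TcpStream::connect(addr)?;
    let mut body = vec![0u8; 4097];
    let mut received = 0;
    let start = Instant::now();
    for _ in 0..rounds {
        client.write_all(b"?")?;
        client.read_exact(&mut body)?;
        received += body.len();
    }
    let elapsed = start.elapsed();
    server.join().expect("server thread panicked")?;
    Ok((received, elapsed))
}
```

Calling `run(100, false)` and `run(100, true)` from a `main` and printing the elapsed times gives a quick way to check whether the delay appears on a given machine.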


5 participants