Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server-side rendering performance degradation with renderToPipeableStream #24232

Open
SuperOleg39 opened this issue Mar 31, 2022 · 22 comments
Open
Labels
React 18 Bug reports, questions, and general feedback about React 18 Status: Unconfirmed A potential issue that we haven't yet confirmed as a bug Type: Discussion

Comments

@SuperOleg39
Copy link

Hello!

When switching from renderToString to renderToPipeableStream, I run load tests on the application, and found a decrease in server throughput, from 50 to 15 RPS, and an increase in response timings.
When profiling the CPU, I see a large overhead on the internal work of the stream, specifically the methods Writable.write and Writable.uncork.
All these method calls together take more than twice as much CPU time (about 50-60ms) as rendering my test page (about 15-20ms)

Also, I don't want to give the HTML to the client in the stream, this approach has some disadvantages.
So I have to buffer the data, and it slows down the application a bit more.

CPU profiler in production mode:

Снимок экрана 2022-03-31 в 16 21 00

CPU profiler in development mode:

Снимок экрана 2022-03-31 в 16 21 36

My custom Writable stream with buffering:

class HtmlWritable extends Writable {
  chunks = [];
  html = '';

  getHtml() {
    return this.html;
  }

  _write(chunk, encoding, callback) {
    this.chunks.push(chunk);
    callback();
  }

  _final(callback) {
    this.html = Buffer.concat(this.chunks).toString();
    callback();
  }
}

And rendering flow:

import { renderToPipeableStream } from 'react-dom/server';
 
new Promise((resolve, reject) => {
  const htmlWritable = new HtmlWritable();

  const { pipe, abort } = renderToPipeableStream(renderResult, {
    onAllReady() {
      pipe(htmlWritable);
    },
    onError(error) {
      reject(error);
    },
  });

  htmlWritable.on('finish', () => {
    resolve(htmlWritable.getHtml());
  });
});
@SuperOleg39 SuperOleg39 added the Status: Unconfirmed A potential issue that we haven't yet confirmed as a bug label Mar 31, 2022
@SuperOleg39
Copy link
Author

@timneutkens, sorry for disturbing, perhaps you have done stress tests on Next.js in combination with react@18 and streaming rendering?
Is there a difference compared to renderToString, regarding the number of requests per second and the timing of the responses?

@bvaughn
Copy link
Contributor

bvaughn commented Mar 31, 2022

Also, I don't want to give the HTML to the client in the stream, this approach has some disadvantages.

Could you elaborate on this?

@SuperOleg39
Copy link
Author

Could you elaborate on this?

Yes, my team maintain SSR meta-framework, so we tested HTML streaming couple of times, and have some feedback in general (non-React specific):

  • Inconsistent behaviour between browsers when loading a page and MPA transitions:
    Safari behaves unpredictably, regardless of rendering type, may blink white screen.
    Firefox - no white screen when streaming, waiting for full page load.
    Chrome - white screen if the stream responds for more than 5 seconds, or if the stream is given the body tag, and then starts streaming rendering a new page.
    Actual problem for us, because we have a lot of different applications on the same domain.
  • There is no possibility to make a server redirect after the first byte is sent to the client
  • It is also not possible to update the meta tags if they have already been sent to the client
  • There is no clear answer as to how this will affect SEO

@gaearon gaearon added the React 18 Bug reports, questions, and general feedback about React 18 label Mar 31, 2022
@sebmarkbage
Copy link
Collaborator

sebmarkbage commented Mar 31, 2022

The perf numbers you mentioned. Are those with the polyfilled Writable that buffers that you mentioned above or with native Node.js streams?

It sounds like you’re saying that your Writable polyfill is slow since that’s where the bulk of the time is? Do you want help writing a faster one? Or are you saying that it’s faster with that polyfill than the native Node.js streams?

Even if you want to buffer it, do you actually need it to be converted back to a string or can it stay as binary?

@SuperOleg39
Copy link
Author

The perf numbers you mentioned. Are those with the polyfilled Writable that buffers that you mentioned above or with native Node.js streams?

I'm using a native stream module.

Even if you want to buffer it, do you actually need it to be converted back to a string or can it stay as binary?

Buffer.toString method takes slightly more than 1ms, so I haven't tried to optimize it yet.
Buffer.concat slows down response more significantly already, seems like a more promising place to optimize.
In both cases, it is not a significant cost compared to the code that calls the flushSubtree and comleteWriting methods

@SuperOleg39
Copy link
Author

Looks like the big part of a problem in large number of Buffer.from calls in Writable internals:

Снимок экрана 2022-03-31 в 19 12 25

@sebmarkbage
Copy link
Collaborator

Can you share a code example that includes the native stream module? Ideally the whole test case.

@SuperOleg39
Copy link
Author

Also, tried a quick test of new Next.js app, production version, react@17 vs react@18 + experimental streaming, on the similar page component that I tested with my framework, here is results:

  • react@17 - 20rps
  • react@18 + streaming - 10rps

But the CPU profile is harder, I can't find obviously slowed down things.

Test command example - npx autocannon -c 50 -d 10 http://localhost:3000/

@SuperOleg39
Copy link
Author

Can you share a code example that includes the native stream module? Ideally the whole test case.

Sorry, do you have some reference sandbox with renderToPipeableStream, like this demo, but with latest react version?
I can make a reproduction on top of such an example.

@gaearon
Copy link
Collaborator

gaearon commented Mar 31, 2022

It's copied from here so you should be able to take this one: https://github.com/facebook/react/tree/main/fixtures/ssr2

It uses a local build but you can replace it with 18 in package.json.

@sebmarkbage
Copy link
Collaborator

sebmarkbage commented Mar 31, 2022

If you run the same benchmark but with your custom HtmlWritable with buffering. What kind of numbers do you get?

Similarly, if you run React 18 with renderToString.

Just to narrow down how much of the perf issues are with the writable vs the rest.

I don't doubt that native Node.js streams have a lot of overhead in the _write calls. There's a lot of stuff going on in there. In theory they wouldn't have to be though. It's not optimized for lots of small calls.

Web Streams have even more overhead in every chunk call. So we added our own buffering to avoid enqueueing so often.

We could do the same for Node streams but there's a principle question there. If we do that, we remove optimization opportunities for user space. There's nothing inherent in the Node stream API that it has to have so much overhead. So should it be our responsibility or the responsibility of the recipient stream?

One way could be to build an intermediate stream API, like the one you have, that buffers for a bit and then flushes to the underlying one periodically just like our Web Streams buffering does.

@SuperOleg39
Copy link
Author

It's copied from here so you should be able to take this one: https://github.com/facebook/react/tree/main/fixtures/ssr2

Thanks! Create few examples from this - https://github.com/SuperOleg39/react-ssr-perf-test

Similarly, if you run React 18 with renderToString.

At first, I made comparsion between renderToString with 17 and 18 versions (call it react@18 legacy).
React 17 - 50 RPS
React 18 - 70 RPS, and I only changed a react and react-dom versions.

Just to narrow down how much of the perf issues are with the writable vs the rest.

Then I tested renderToPipeableStream (call it react@18 stream)
I got timings around 1.5s for responses, problem was in app.use(compress()); middleware:

Снимок экрана 2022-03-31 в 23 36 17

After removal this middleware, I got around 2 RPS, 300+ ms CPU work for request. Still a big overhead for stream internals:

Снимок экрана 2022-03-31 в 23 38 30

If you run the same benchmark but with your custom HtmlWritable with buffering. What kind of numbers do you get?

At last, tried buffering (call it react@18 buffering), and got around 15 RPS.
CPU profile:

Снимок экрана 2022-03-31 в 23 40 47

@SuperOleg39
Copy link
Author

I don't doubt that native Node.js streams have a lot of overhead in the _write calls. There's a lot of stuff going on in there. In theory they wouldn't have to be though. It's not optimized for lots of small calls.

It seems to be(

@SuperOleg39
Copy link
Author

SuperOleg39 commented Mar 31, 2022

react@18 legacy CPU profile, for comparsion (20ms for render, and that's with garbage collector):

Снимок экрана 2022-03-31 в 23 46 37

So, everywere React rendering are fast, around 15ms for example page component, and the response time grows in proportion to the number of writes to the stream.

@gaearon
Copy link
Collaborator

gaearon commented Apr 7, 2022

Ongoing work in #24291.

@gaearon
Copy link
Collaborator

gaearon commented Apr 11, 2022

The initial set of fixes are in 18.1.0-next-df5d32f23-20220411 so you can give that a try.
Here's my results with your benchmark.

17

┌─────────┬────────┬────────┬────────┬────────┬───────────┬───────┬────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg       │ Stdev │ Max    │
├─────────┼────────┼────────┼────────┼────────┼───────────┼───────┼────────┤
│ Latency │ 174 ms │ 424 ms │ 578 ms │ 848 ms │ 426.69 ms │ 85 ms │ 861 ms │
└─────────┴────────┴────────┴────────┴────────┴───────────┴───────┴────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬───────┬────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg   │ Stdev  │ Min     │
├───────────┼─────────┼─────────┼─────────┼─────────┼───────┼────────┼─────────┤
│ Req/Sec   │ 108     │ 108     │ 115     │ 117     │ 114.5 │ 2.34   │ 108     │
├───────────┼─────────┼─────────┼─────────┼─────────┼───────┼────────┼─────────┤
│ Bytes/Sec │ 28.3 MB │ 28.3 MB │ 30.1 MB │ 30.7 MB │ 30 MB │ 612 kB │ 28.3 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴───────┴────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 1145  │
└──────┴───────┘

18+legacy

Before

┌─────────┬────────┬────────┬────────┬────────┬───────────┬──────────┬────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg       │ Stdev    │ Max    │
├─────────┼────────┼────────┼────────┼────────┼───────────┼──────────┼────────┤
│ Latency │ 129 ms │ 268 ms │ 401 ms │ 537 ms │ 270.63 ms │ 53.38 ms │ 551 ms │
└─────────┴────────┴────────┴────────┴────────┴───────────┴──────────┴────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev  │ Min     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┼─────────┤
│ Req/Sec   │ 178     │ 178     │ 182     │ 185     │ 182     │ 2.2    │ 178     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┼─────────┤
│ Bytes/Sec │ 46.6 MB │ 46.6 MB │ 47.7 MB │ 48.5 MB │ 47.7 MB │ 574 kB │ 46.6 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 1820  │
└──────┴───────┘

After

┌─────────┬────────┬────────┬────────┬────────┬───────────┬──────────┬────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg       │ Stdev    │ Max    │
├─────────┼────────┼────────┼────────┼────────┼───────────┼──────────┼────────┤
│ Latency │ 160 ms │ 263 ms │ 376 ms │ 525 ms │ 265.25 ms │ 50.08 ms │ 539 ms │
└─────────┴────────┴────────┴────────┴────────┴───────────┴──────────┴────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev  │ Min     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┼─────────┤
│ Req/Sec   │ 182     │ 182     │ 186     │ 188     │ 185.6   │ 1.5    │ 182     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┼─────────┤
│ Bytes/Sec │ 47.7 MB │ 47.7 MB │ 48.7 MB │ 49.3 MB │ 48.6 MB │ 392 kB │ 47.7 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 1856  │

18+buffering

Before

┌─────────┬────────┬─────────┬─────────┬─────────┬────────────┬───────────┬─────────┐
│ Stat    │ 2.5%   │ 50%     │ 97.5%   │ 99%     │ Avg        │ Stdev     │ Max     │
├─────────┼────────┼─────────┼─────────┼─────────┼────────────┼───────────┼─────────┤
│ Latency │ 355 ms │ 2025 ms │ 2507 ms │ 2809 ms │ 1677.05 ms │ 591.21 ms │ 2958 ms │
└─────────┴────────┴─────────┴─────────┴─────────┴────────────┴───────────┴─────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev   │ Min     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Req/Sec   │ 13      │ 13      │ 26      │ 39      │ 25.8    │ 8.48    │ 13      │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Bytes/Sec │ 3.41 MB │ 3.41 MB │ 6.81 MB │ 10.2 MB │ 6.76 MB │ 2.22 MB │ 3.41 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 258   │

After

┌─────────┬────────┬────────┬────────┬────────┬──────────┬───────┬─────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg      │ Stdev │ Max     │
├─────────┼────────┼────────┼────────┼────────┼──────────┼───────┼─────────┤
│ Latency │ 163 ms │ 536 ms │ 620 ms │ 626 ms │ 529.2 ms │ 93 ms │ 1121 ms │
└─────────┴────────┴────────┴────────┴────────┴──────────┴───────┴─────────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev  │ Min     │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┼─────────┤
│ Req/Sec   │ 86      │ 86      │ 92      │ 95      │ 91.9    │ 2.3    │ 86      │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┼─────────┤
│ Bytes/Sec │ 22.5 MB │ 22.5 MB │ 24.1 MB │ 24.9 MB │ 24.1 MB │ 601 kB │ 22.5 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 919   │
└──────┴───────┘

18+stream

Before

┌─────────┬─────────┬─────────┬──────────┬──────────┬────────────┬──────────┬──────────┐
│ Stat    │ 2.5%    │ 50%     │ 97.5%    │ 99%      │ Avg        │ Stdev    │ Max      │
├─────────┼─────────┼─────────┼──────────┼──────────┼────────────┼──────────┼──────────┤
│ Latency │ 7592 ms │ 9956 ms │ 10742 ms │ 10768 ms │ 9794.71 ms │ 857.4 ms │ 10768 ms │
└─────────┴─────────┴─────────┴──────────┴──────────┴────────────┴──────────┴──────────┘
┌───────────┬─────┬──────┬─────┬─────────┬─────────┬─────────┬─────────┐
│ Stat      │ 1%  │ 2.5% │ 50% │ 97.5%   │ Avg     │ Stdev   │ Min     │
├───────────┼─────┼──────┼─────┼─────────┼─────────┼─────────┼─────────┤
│ Req/Sec   │ 0   │ 0    │ 0   │ 28      │ 5.56    │ 9.55    │ 5       │
├───────────┼─────┼──────┼─────┼─────────┼─────────┼─────────┼─────────┤
│ Bytes/Sec │ 0 B │ 0 B  │ 0 B │ 25.3 MB │ 5.02 MB │ 8.62 MB │ 4.52 MB │
└───────────┴─────┴──────┴─────┴─────────┴─────────┴─────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 50    │
└──────┴───────┘

After

┌─────────┬────────┬────────┬────────┬────────┬───────────┬──────────┬────────┐
│ Stat    │ 2.5%   │ 50%    │ 97.5%  │ 99%    │ Avg       │ Stdev    │ Max    │
├─────────┼────────┼────────┼────────┼────────┼───────────┼──────────┼────────┤
│ Latency │ 200 ms │ 496 ms │ 568 ms │ 989 ms │ 491.61 ms │ 93.45 ms │ 997 ms │
└─────────┴────────┴────────┴────────┴────────┴───────────┴──────────┴────────┘
┌───────────┬─────────┬─────────┬───────┬─────────┬───────┬────────┬─────────┐
│ Stat      │ 1%      │ 2.5%    │ 50%   │ 97.5%   │ Avg   │ Stdev  │ Min     │
├───────────┼─────────┼─────────┼───────┼─────────┼───────┼────────┼─────────┤
│ Req/Sec   │ 98      │ 98      │ 99    │ 100     │ 99    │ 0.9    │ 98      │
├───────────┼─────────┼─────────┼───────┼─────────┼───────┼────────┼─────────┤
│ Bytes/Sec │ 25.8 MB │ 25.8 MB │ 26 MB │ 26.3 MB │ 26 MB │ 242 kB │ 25.8 MB │
└───────────┴─────────┴─────────┴───────┴─────────┴───────┴────────┴─────────┘
┌──────┬───────┐
│ Code │ Count │
├──────┼───────┤
│ 200  │ 990   │
└──────┴───────┘

@SuperOleg39
Copy link
Author

Impressive work, thanks!

Is there any issues / PR with upcoming related improvements, which I can subscribe for?

@gaearon
Copy link
Collaborator

gaearon commented Apr 11, 2022

I'm not sure if there are any particular ones planned but @gnoff might be able to provide more details. I think it would make sense to track them in this issue as well.

@SuperOleg39
Copy link
Author

Good idea 👍

If we can reach (in perspective) the same perf for streaming render, without this "penalty", it will be bright future for HTML streaming in general :)

@gnoff
Copy link
Collaborator

gnoff commented Apr 11, 2022

I'm going to continue working on this and will likely have future PRs but can't say exactly when. If you look for titles with [Fizz] in the title that's likely where they will be. If I remember I'll ping you here or tag you in the future work

@SuperOleg39
Copy link
Author

Nice, thanks you!

@gaearon
Copy link
Collaborator

gaearon commented Apr 27, 2022

The initial fix is in 18.1.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
React 18 Bug reports, questions, and general feedback about React 18 Status: Unconfirmed A potential issue that we haven't yet confirmed as a bug Type: Discussion
Projects
None yet
Development

No branches or pull requests

5 participants