Greatly speed up 'advanced' ipc receiving with big messages #42931
This looks really good! Just left a suggestion.
// We read the uint manually here, because this is faster than first converting
// it to a buffer and using `readUInt32BE` on that.
const size =
  messageBufferHead[0] << 24 |
  messageBufferHead[1] << 16 |
  messageBufferHead[2] << 8 |
  messageBufferHead[3];
- // We read the uint manually here, because this is faster than first converting
- // it to a buffer and using `readUInt32BE` on that.
- const size =
-   messageBufferHead[0] << 24 |
-   messageBufferHead[1] << 16 |
-   messageBufferHead[2] << 8 |
-   messageBufferHead[3];
+ const size = Buffer.prototype.readUInt32BE.call(messageBufferHead, 0) + 4;
This might work as well? I added the plus 4 as well. That way there's no need to add it later on. This might require a new variable name as well.
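For readers comparing the two variants in this suggestion, here is a minimal sketch of both reads on a length-prefixed frame. The names `payload`, `header`, and `head` are hypothetical, and the use of a plain Uint8Array on the receiving side is an assumption made only to show why the method is borrowed with `.call`.

```javascript
// A length-prefixed frame: 4-byte big-endian payload length, then the payload.
const payload = Buffer.from('example payload');
const header = Buffer.alloc(4);
header.writeUInt32BE(payload.length, 0);

// Assumption for illustration: the receiver holds a plain Uint8Array,
// which has no readUInt32BE method of its own.
const head = new Uint8Array(Buffer.concat([header, payload]));

// Variant 1: manual read with shifts, as in the original diff.
// This yields the payload size only.
const sizeManual = head[0] << 24 | head[1] << 16 | head[2] << 8 | head[3];

// Variant 2: borrow the Buffer method and add the 4 header bytes right away,
// so the result is the size of the whole frame.
const fullMessageSize = Buffer.prototype.readUInt32BE.call(head, 0) + 4;

console.log(sizeManual, fullMessageSize);
```

With the `+ 4` folded in, a later check can compare the buffered length against the full frame size directly instead of adding the header length at each comparison.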
I have applied these suggestions, since I can't see any noticeable performance impact from them after some testing, and they make the code easier to read.
@lpinca Is it normal for the osx tests to fail, or is that because of this change?
while (messageBufferHead.length >= 4) {
  // We call `readUInt32BE` manually here, because this is faster than first converting
  // it to a buffer and using `readUInt32BE` on that.
  const fullMessageSize = Buffer.prototype.readUInt32BE.call(messageBufferHead, 0) + 4;
const { readUInt32BE } = require('internal/buffer');
// ...
const fullMessageSize = ReflectApply(readUInt32BE, messageBufferHead, [0]) + 4;
Change has been added
It's normal for some of them to crash randomly, and I don't know why.
I've added some comments regarding the usage of primordials, which is documented at https://github.com/nodejs/node/blob/master/doc/contributing/primordials.md
if (messageBuffer.length < 4 + size) {
  break;
}
channel[kMessageBuffer].push(readData);
Usage of primordials should be preferred for any new code. Full documentation can be found at https://github.com/nodejs/node/blob/master/doc/contributing/primordials.md
ArrayPrototypePush(channel[kMessageBuffer], readData);
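As background for this suggestion, the sketch below shows the general idea behind a primordial such as ArrayPrototypePush: a reference to the built-in captured up front, so later tampering with Array.prototype does not affect it. The real primordials are internal to Node.js core; the stand-in built here with Function.prototype.call.bind is only a user-land approximation, and `messageBuffer` is a hypothetical name.

```javascript
// User-land stand-in for a primordial. Assumption: this only approximates the
// internal helper; it is not how application code obtains primordials.
const ArrayPrototypePush = Function.prototype.call.bind(Array.prototype.push);

const messageBuffer = [];
const originalPush = Array.prototype.push;

// Simulate user code tampering with the built-in method.
Array.prototype.push = function patchedPush() {
  throw new Error('Array.prototype.push was patched');
};

let methodCallFailed = false;
try {
  // Uses the reference captured before the patch, so it still works.
  ArrayPrototypePush(messageBuffer, 'chunk-1');
  // Looks up the patched method at call time, so it throws.
  messageBuffer.push('chunk-2');
} catch {
  methodCallFailed = true;
} finally {
  // Restore the built-in so the rest of the program is unaffected.
  Array.prototype.push = originalPush;
}

console.log(messageBuffer, methodCallFailed); // [ 'chunk-1' ] true
```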
ArrayPrototypePush is mentioned in the list of primordials with known performance issues (https://github.com/nodejs/node/blob/master/doc/contributing/primordials.md#primordials-with-known-performance-issues), so I'm not sure if it's worth using it here given that this PR is about speeding things up.
I think it's also a little ambiguous what "new code" means here - new APIs (this one is not a new API) or just new lines of code that are not moved in from a different place.
After some testing it seems that the primordial function is about 12% slower than the normal one. However, this does not seem to affect the speed of the ipc in any noticeable way, probably because most processing time is spent elsewhere. So I don't see any problem with using the primordial function.
LGTM
@himself65 Is there anything else I need to do to get this merged? Or do I just have to wait until a maintainer merges this?
@lpinca Hi there, sorry to bother, but this PR has now been a month without any activity. Is this normal, or do I have to do something else to get this merged?
This would need a fully green Jenkins CI run before landing.
cc @nodejs/child_process
while (messageBufferHead.length >= 4) {
  // We call `readUInt32BE` manually here, because this is faster than first converting
  // it to a buffer and using `readUInt32BE` on that.
  const fullMessageSize = ReflectApply(readUInt32BE, messageBufferHead, [0]) + 4;
Drive-by suggestion: read the int inline, that's both faster and shorter (and arguably easier to read)
- const fullMessageSize = ReflectApply(readUInt32BE, messageBufferHead, [0]) + 4;
+ const b = messageBufferHead;
+ const fullMessageSize = b[0] * 0x1000000 + b[1] * 65536 + b[2] * 256 + b[3];
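A small sketch of how the inline read in this suggestion compares with the shift-based read from earlier in the thread. The helper names `readWithShifts` and `readWithMultiply` are hypothetical. The suggested line as quoted does not add the 4 header bytes; the sketch adds them separately, on the assumption that a variable called fullMessageSize is meant to include them.

```javascript
// Two ways to read a 32-bit big-endian unsigned integer from a byte array.
function readWithShifts(b) {
  // Bitwise operators work on signed 32-bit integers in JavaScript.
  return b[0] << 24 | b[1] << 16 | b[2] << 8 | b[3];
}

function readWithMultiply(b) {
  // Plain arithmetic stays in the unsigned range.
  return b[0] * 0x1000000 + b[1] * 65536 + b[2] * 256 + b[3];
}

const small = Uint8Array.of(0x00, 0x00, 0x01, 0x02); // 258
const large = Uint8Array.of(0x80, 0x00, 0x00, 0x00); // 2 ** 31, high bit set

console.log(readWithShifts(small), readWithMultiply(small)); // 258 258
console.log(readWithShifts(large), readWithMultiply(large)); // -2147483648 2147483648

// Assumption: the full frame size is the payload size plus the 4-byte header.
const fullMessageSize = readWithMultiply(small) + 4;
console.log(fullMessageSize); // 262
```

The two reads agree whenever the first byte is below 0x80; whether larger sizes occur in practice is not discussed in this thread.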
Force-pushed from 7e80934 to e2247c8.
Ah great, all checks have now finally passed.
PR-URL: #42931
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Zeyu "Alex" Yang <[email protected]>
Landed in 94020ac...8db79cc.
Problem
I was running into an issue where I needed to transfer large buffers between a child_process and the main process with 'advanced' serialization, and this was surprisingly slow.
Solution
After some debugging and profiling I found that the Buffer functions, mainly Buffer.concat, were to blame, so I changed the code to use them as little as possible.
Instead of concatenating each and every incoming part, the code now stores the parts in an array and concatenates them only once the full message has been collected.
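The approach described above can be sketched roughly as follows. This is not the code from the PR: `createFrameReader`, `frame`, and the variable names are hypothetical, and the framing (a 4-byte big-endian length followed by the payload) is assumed from the diff snippets in the review.

```javascript
// Collects incoming chunks in an array and concatenates them only once a
// complete length-prefixed message is available.
function createFrameReader(onMessage) {
  const chunks = [];      // received chunks, not yet concatenated
  let bufferedLength = 0; // total number of bytes across `chunks`

  return function onChunk(chunk) {
    if (chunk.length === 0) return;
    chunks.push(chunk);
    bufferedLength += chunk.length;

    while (bufferedLength >= 4) {
      // Only if the 4-byte header itself is split across chunks do we merge
      // the leading chunks until the header is contiguous.
      while (chunks[0].length < 4) {
        chunks.splice(0, 2, Buffer.concat([chunks[0], chunks[1]]));
      }
      const fullMessageSize = chunks[0].readUInt32BE(0) + 4;
      if (bufferedLength < fullMessageSize) break; // wait for more data

      // A single concat once the whole message has arrived.
      const all = chunks.length === 1 ?
        chunks[0] : Buffer.concat(chunks, bufferedLength);
      onMessage(all.subarray(4, fullMessageSize));

      // Keep any bytes that belong to the next message.
      chunks.length = 0;
      bufferedLength = all.length - fullMessageSize;
      if (bufferedLength > 0) chunks.push(all.subarray(fullMessageSize));
    }
  };
}

// Hypothetical helper that builds one frame for the demonstration below.
function frame(payload) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

const received = [];
const onChunk = createFrameReader((message) => received.push(message.toString()));
const stream = Buffer.concat([frame(Buffer.from('hello')), frame(Buffer.from('world!'))]);
// Feed the stream in 3-byte pieces to mimic data arriving in small parts.
for (let i = 0; i < stream.length; i += 3) onChunk(stream.subarray(i, i + 3));
console.log(received); // [ 'hello', 'world!' ]
```

Concatenating on every incoming part copies the already-buffered bytes again each time, so the cost grows with the number of parts per message; deferring the concat keeps it to one copy per message in the common case.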
Tests
I wrote a small test where I transferred different buffer sizes from a child process to the parent, and timed how long it took before the message was received in the parent.
(The scripts I used to test this: test-ipc.zip)
Before fix:
After fix: