Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net.connection.onread handler: may crash on socket end #31823

Closed
animetosho opened this issue Feb 17, 2020 · 6 comments
Closed

net.connection.onread handler: may crash on socket end #31823

animetosho opened this issue Feb 17, 2020 · 6 comments
Labels
net Issues and PRs related to the net subsystem.

Comments

@animetosho
Copy link

animetosho commented Feb 17, 2020

  • Version: reproduced on v12.15.0 and v13.8.0
  • Platform: Linux debian 4.19.0-5-amd64 deps: update openssl to 1.0.1j #1 SMP Debian 4.19.37-5+deb10u2 (2019-08-08) x86_64 GNU/Linux
  • Subsystem: net

When using an onread handler for net connections (issue #25436), it seems that the following assertion error may occur when the connection is closed:

node[14962]: ../src/stream_base.cc:497:virtual void node::CustomBufferJSListener::OnStreamRead(ssize_t, const uv_buf_t&): Assertion `(buf.base) == (buffer_.base)' failed.
 1: 0x9dab80 node::Abort() [node]
 2: 0x9dac07  [node]
 3: 0xa860b2 node::CustomBufferJSListener::OnStreamRead(long, uv_buf_t const&) [node]
 4: 0xa90716 node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [node]
 5: 0x12e416f  [node]
 6: 0x12ea228  [node]
 7: 0x12d85ab uv_run [node]
 8: 0xa1d120 node::NodeMainInstance::Run() [node]
 9: 0x9acb78 node::Start(int, char**) [node]
10: 0x7f72e736e09b __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
11: 0x94a115  [node]
Aborted

What steps will reproduce the bug?

This test requires a test server - I used the following code to run a test server on localhost:9999:

require('net')
.createServer(conn => conn.write('test'))
.listen({host: 'localhost', port: 9999});

The actual test script (which connects to localhost:9999):

const net = require('net');
const opts = {
	host: 'localhost',
	port: 9999,
	onread: {
		buffer: Buffer.allocUnsafe(4096),
		callback: (size, buf) => {
			console.log('Data received');
		}
	}
};

let count = 0;
function doTest() {
	console.log('=== Attempt ' + (++count) + ' ===')
	const socket = net.connect(opts, () => console.log('Connected'));
	socket.once('end', () => console.log('End received'));
	socket.once('close', () => {
		doTest(); // if we got here, everything worked, so just repeat the test until it fails
	});
	socket.end();
}
doTest();

How often does it reproduce? Is there a required condition?

It doesn't happen that frequently - it usually takes around 4-30 attempts, using the test script above, for a crash to happen.

It seems that it is necessary for the server to send a message to cause the issue, even though the crash appears to be on connection end. The console output suggests that the crash occurs after the data is received, but before the end event is received.

I haven't looked too much into the cause, but it doesn't seem to occur if the server and client are in the same script.

What is the expected behavior?

Test script should endlessly loop without crashing (if it doesn't crash within a few hundred thousand cycles, I'd say it's successful).

Output looks like:

=== Attempt 1 ===
Connected
Data received
End received
=== Attempt 2 ===
Connected
Data received
End received
=== Attempt 3 ===
Connected
Data received
End received
...

What do you see instead?

=== Attempt 25 ===
Connected
Data received
End received
=== Attempt 26 ===
Connected
Data received
node[14962]: ../src/stream_base.cc:497:virtual void node::CustomBufferJSListener::OnStreamRead(ssize_t, const uv_buf_t&): Assertion `(buf.base) == (buffer_.base)' failed.
 1: 0x9dab80 node::Abort() [node]
 2: 0x9dac07  [node]
 3: 0xa860b2 node::CustomBufferJSListener::OnStreamRead(long, uv_buf_t const&) [node]
 4: 0xa90716 node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [node]
 5: 0x12e416f  [node]
 6: 0x12ea228  [node]
 7: 0x12d85ab uv_run [node]
 8: 0xa1d120 node::NodeMainInstance::Run() [node]
 9: 0x9acb78 node::Start(int, char**) [node]
10: 0x7f72e736e09b __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
11: 0x94a115  [node]
Aborted

Update: I've tried this on a few other systems. Haven't been able to reproduce on Node v13.8.0 on Windows x64. On another Linux x64 machine, it seems to have taken hundreds of cycles to reproduce.

@addaleax addaleax added the net Issues and PRs related to the net subsystem. label Feb 18, 2020
@addaleax
Copy link
Member

@animetosho Do you think you could get a core dump for these? Or are you able to build modified versions of Node.js yourself? It would be great to know what the values of nread, buf.base and buffer_.base are in this case, i.e. applying

diff --git a/src/stream_base.cc b/src/stream_base.cc
index eaccfc995c74..ab69fa7c49e1 100644
--- a/src/stream_base.cc
+++ b/src/stream_base.cc
@@ -507,6 +507,7 @@ uv_buf_t CustomBufferJSListener::OnStreamAlloc(size_t suggested_size) {
 
 
 void CustomBufferJSListener::OnStreamRead(ssize_t nread, const uv_buf_t& buf) {
+  fprintf(stderr, "OnStreamRead(%zd, %p), buffer_.base = %p\n", nread, buf.base, buffer_.base);
   CHECK_NOT_NULL(stream_);
   CHECK_EQ(buf.base, buffer_.base);
 

and then running seeing what the output before the failure is would be very helpful.

(Alternatively, that information can probably be gathered from a core dump.)

Fwiw, I can’t produce on x64 Linux, so … @nodejs/platform-macos

@animetosho
Copy link
Author

Thanks for the response.

I tried running it in GDB, but it looks like it can't identify those parameters. Doesn't look like the official binary packages include debug symbols (which would be useful to have...).

Unfortunately I don't have access to any fast machines for a while, so compiling Node takes forever here, but I can try.

If it's of any use, I've uploaded a core dump here. This is run with the official Node v13.8.0 binary so you should be able to load it with that.

@animetosho
Copy link
Author

So I managed to get a compile working, though I question the accuracy of this build as it segfaults a lot. For a failure case, I get the following output:

=== Attempt 1 ===
Connected
OnStreamRead(4, 0x3c27d50), buffer_.base = 0x3c27d50
Data received
OnStreamRead(-4095, (nil)), buffer_.base = 0x3c27d50
./node[25196]: ../src/stream_base.cc:512:virtual void node::CustomBufferJSListener::OnStreamRead(ssize_t, const uv_buf_t&): Assertion `(buf.base) == (buffer_.base)' failed.
 1: 0x8b0520
 2: 0x8b05ad
 3: 0x971b34
 4: 0x979dc4
 5: 0x118ac81
 6: 0x1190d88
 7: 0x117f074
 8: 0x8f7c34
 9: 0x8805ca
10: 0x6007b0
11: 0x80d6aa
Aborted (core dumped)

BethGriggs pushed a commit that referenced this issue Apr 7, 2020
If the `onread` socket option is used and a `POLLHUP` event is received,
libuv returns `UV_EOF` along with a `NULL` buffer in the read callback,
causing the crash. Deal with this case.

Fixes: #31823

PR-URL: #32590
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
targos pushed a commit that referenced this issue Apr 12, 2020
If the `onread` socket option is used and a `POLLHUP` event is received,
libuv returns `UV_EOF` along with a `NULL` buffer in the read callback,
causing the crash. Deal with this case.

Fixes: #31823

PR-URL: #32590
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
targos pushed a commit that referenced this issue Apr 22, 2020
If the `onread` socket option is used and a `POLLHUP` event is received,
libuv returns `UV_EOF` along with a `NULL` buffer in the read callback,
causing the crash. Deal with this case.

Fixes: #31823

PR-URL: #32590
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@carlhoerberg
Copy link

Just wanted to mention that I got this in Node 16.13.1 too. Seems hard to reproduce, only happens in our production environment with hundreds of different TLS connections..

node[4]: ../src/stream_base.cc:586:virtual void node::CustomBufferJSListener::OnStreamRead(ssize_t, const uv_buf_t&): Assertion `(buf.base) == (buffer_.base)' failed.
1: 0xb00d90 node::Abort() [node]
2: 0xb00e0e  [node]
3: 0xbdc98a  [node]
4: 0xcb524e node::crypto::TLSWrap::OnStreamRead(long, uv_buf_t const&) [node]
5: 0xbe99f8 node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [node]
6: 0x1565fd8  [node]
7: 0x1566620  [node]
8: 0x156c7a4  [node]
9: 0x155ac68 uv_run [node]
10: 0xa3dfe5 node::SpinEventLoop(node::Environment*) [node]
11: 0xb42dc6 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [node]
12: 0xac4812 node::Start(int, char**) [node]
13: 0x7fc9165c70b3 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
14: 0xa3bf3c  [node]

Reverting to .on("data") doesn't seem to have any performance penalty that we can measure so that has been our solution.

@animetosho
Copy link
Author

Just here to add that, whilst I'm unable to reproduce it myself, I've had someone else also report that it's still happening (issue referenced directly above your comment). Have no clue what the cause is, as it doesn't sound like a common case, but it seems not reliable enough to rely on.

@porsager
Copy link

Experiencing the same issue here. Would have been nice to leverage the benefits of onread, but with an error like this it doesn't seem safe to use. Perhaps this issue could be reopened, or should I open another one?

abhishekumar-tyagi pushed a commit to abhishekumar-tyagi/node that referenced this issue May 5, 2024
If the `onread` socket option is used and a `POLLHUP` event is received,
libuv returns `UV_EOF` along with a `NULL` buffer in the read callback,
causing the crash. Deal with this case.

Fixes: nodejs/node#31823
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
net Issues and PRs related to the net subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants