Segmentation fault during tests of Node 10.15.3 on PowerPC musl #27068

Closed
awilfox opened this issue Apr 3, 2019 · 13 comments
Labels
ppc Issues and PRs related to the Power architecture. v8 engine Issues and PRs related to the V8 dependency.

Comments

@awilfox
Contributor

awilfox commented Apr 3, 2019

I've been doing some really deep digging on this issue to attempt to make it as easy as possible for an expert to fix, but sadly, I am not that expert; I've never done anything with V8 before.

I'm the project lead of Adélie Linux, and we're hoping to bring Node.js to our distribution. We are using the musl libc instead of glibc (like Void/musl and others). One of our primary CPU architectures is the 64-bit PowerPC. Eight tests fail on this platform:

=== release test-fs-watch-close-when-destroyed ===                            
Path: parallel/test-fs-watch-close-when-destroyed
internal/fs/watchers.js:173
    throw error;
    ^

Error: EMFILE: too many open files, watch '/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.16/watched-directory'
    at FSWatcher.start (internal/fs/watchers.js:165:26)
    at Object.watch (fs.js:1253:11)
    at Object.<anonymous> (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-close-when-destroyed.js:15:20)
    at Module._compile (internal/modules/cjs/loader.js:701:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
    at Module.load (internal/modules/cjs/loader.js:600:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
    at Function.Module._load (internal/modules/cjs/loader.js:531:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:754:12)
    at startup (internal/bootstrap/node.js:283:19)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-close-when-destroyed.js
=== release test-fs-watch-encoding ===                                  
Path: parallel/test-fs-watch-encoding
internal/fs/watchers.js:173
    throw error;
    ^

Error: EMFILE: too many open files, watch '/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.19'
    at FSWatcher.start (internal/fs/watchers.js:165:26)
    at Object.watch (fs.js:1253:11)
    at Object.<anonymous> (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-encoding.js:45:21)
    at Module._compile (internal/modules/cjs/loader.js:701:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
    at Module.load (internal/modules/cjs/loader.js:600:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
    at Function.Module._load (internal/modules/cjs/loader.js:531:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:754:12)
    at startup (internal/bootstrap/node.js:283:19)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-encoding.js
=== release test-fs-watch-enoent ===                              
Path: parallel/test-fs-watch-enoent
assert.js:85
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Input A expected to strictly equal input B:
+ expected - actual

- 'EMFILE: too many open files, watch \'/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.2/non-existent\''
+ 'ENODEV: no such device, watch \'/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.2/non-existent\''
    at Object.validateError (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-enoent.js:27:14)
    at expectedException (assert.js:570:19)
    at expectsError (assert.js:659:16)
    at Function.throws (assert.js:690:3)
    at Object.<anonymous> (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-enoent.js:36:10)
    at Module._compile (internal/modules/cjs/loader.js:701:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
    at Module.load (internal/modules/cjs/loader.js:600:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
    at Function.Module._load (internal/modules/cjs/loader.js:531:3)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-enoent.js
=== release test-fs-watchfile ===                             
Path: parallel/test-fs-watchfile
internal/fs/watchers.js:173
    throw error;
    ^

Error: EMFILE: too many open files, watch '/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.12/watch'
    at FSWatcher.start (internal/fs/watchers.js:165:26)
    at Object.watch (fs.js:1253:11)
    at /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watchfile.js:95:8
    at /usr/src/packages/user/node/src/node-v10.15.3/test/common/index.js:340:15
    at FSReqWrap.args [as oncomplete] (fs.js:140:20)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watchfile.js
=== release test-process-euid-egid ===                                        
Path: parallel/test-process-euid-egid
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-process-euid-egid.js
--- CRASHED (Signal: 11) ---
=== release test-process-uid-gid ===                                          
Path: parallel/test-process-uid-gid
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-process-uid-gid.js
--- CRASHED (Signal: 11) ---
=== release test-querywrap ===                                                
Path: async-hooks/test-querywrap
Command: out/Release/node --expose-gc /usr/src/packages/user/node/src/node-v10.15.3/test/async-hooks/test-querywrap.js
--- CRASHED (Signal: 11) ---
=== release test-tlswrap ===                                 
Path: async-hooks/test-tlswrap
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/async-hooks/test-tlswrap.js
--- CRASHED (Signal: 11) ---
[03:14|% 100|+ 2423|-   8]: Done                                              

I believe the EMFILE failures are related to customisations that we have made to the musl library, so this issue does not cover them. This issue is about test-querywrap.

This crash happens every time. A core dump, from Release or Debug node binaries, can be provided if helpful. The crash happens with the following backtrace:

(gdb) bt
#0  __pthread_mutex_lock (m=0x5870) at src/thread/pthread_mutex_lock.c:5
#1  0x000000010144dc50 in v8::base::LockNativeHandle (mutex=<optimized out>) at ../deps/v8/src/base/platform/mutex.cc:130
#2  v8::base::RecursiveMutex::Lock (this=<optimized out>) at ../deps/v8/src/base/platform/mutex.cc:130
#3  0x0000000100cca420 in v8::internal::ExecutionAccess::Lock (isolate=<optimized out>) at ../deps/v8/src/isolate.h:1773
#4  v8::internal::ExecutionAccess::ExecutionAccess (isolate=<optimized out>, this=<synthetic pointer>) at ../deps/v8/src/isolate.h:1769
#5  v8::internal::StackGuard::CheckAndClearInterrupt (this=0x101d8ef40, flag=<optimized out>) at ../deps/v8/src/execution.cc:409
#6  0x0000000100cca9e8 in v8::internal::StackGuard::HandleInterrupts (this=0x101d8ef40) at ../deps/v8/src/execution.cc:509
#7  0x0000000100fdbe28 in v8::internal::NativeRegExpMacroAssembler::CheckStackGuardState (isolate=0x101d896a0, start_index=<optimized out>, is_direct_call=<optimized out>, return_address=0x3fffffffd040, re_code=0x2af21ec6ae1, subject=0x3fffffffd080, 
    input_start=0x3fffffffd090, input_end=0x3fffffffd098) at ../deps/v8/src/isolate.h:898
#8  0x0000000101212afc in v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=<optimized out>, re_code=<optimized out>, re_frame=<error reading variable: value has been optimized out>)
    at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1156
#9  0x000002af21c88190 in ?? ()

Breaking at the CheckStackGuardState function:

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffd3c0, re_code=0x3c2bdfc6ae1, re_frame=70368744166464) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1156
1156    static T* frame_entry_address(Address re_frame, int frame_offset) {
(gdb) x/2x re_frame+176
0x3fffffffd4f0: 0x00000001
0x3fffffffd4f4: 0x01d48140
(gdb) c
Continuing.

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffd0d0, re_code=0x3c2bdfc6ae1, re_frame=70368744165712) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1156
1156    static T* frame_entry_address(Address re_frame, int frame_offset) {
(gdb) x/2x re_frame+176
0x3fffffffd200: 0x00000001
0x3fffffffd204: 0x01d896a0
(gdb) c
Continuing.

Thread 1 "node" received signal SIGSEGV, Segmentation fault.
__pthread_mutex_lock (m=0x5870) at src/thread/pthread_mutex_lock.c:5
5               if ((m->_m_type&15) == PTHREAD_MUTEX_NORMAL

This shows that the isolate_ pointer in the frame is being corrupted. Next I tried using the Debug build. I had to merge a single-line patch from Node master (which removed the ra != r0 assertion from D_TYPE instructions in the PPC assembler), and also correct the V8 memory allocator to align properly on PPC: for debugging purposes only, the hint at heap/spaces.cc:130 had to be changed from GetRandomMmapAddr() to (GetRandomMmapAddr() & ~(alignment-1)). Using this new Debug build and hardware watchpoints, I was able to find:

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffcfd0, re_code=0x384bb6107a1, re_frame=70368744165456) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1165
1165          frame_entry<Isolate*>(re_frame, kIsolate),
(gdb) x/2x re_frame+176
0x3fffffffd100: 0x00000001      0x04608de0
(gdb) watch *0x3fffffffcb04
Hardware watchpoint 2: *0x3fffffffcb04
(gdb) c
Continuing.
Thread 1 "node" hit Hardware watchpoint 5: *0x3fffffffcb04

Old value = -13472
New value = 73009584
0x0000000102c28c3c in v8::internal::RegExpStack::stack_base (this=0x0) at ../deps/v8/src/regexp/regexp-stack.h:47
47        Address stack_base() {
(gdb) bt
#0  0x0000000102c28c3c in v8::internal::RegExpStack::stack_base (this=0x0) at ../deps/v8/src/regexp/regexp-stack.h:47
#1  0x0000000102c2a424 in v8::internal::NativeRegExpMacroAssembler::Execute (code=0x3afd98107a1, input=0x9c967027b9, start_offset=13, input_start=0x9c967027dd "\356\375\257", input_end=0x9c967027dd "\356\375\257", output=0x104611b6c, output_size=2, isolate=0x104608de0)
    at ../deps/v8/src/regexp/regexp-macro-assembler.cc:286
#2  0x0000000102c2a33c in v8::internal::NativeRegExpMacroAssembler::Match (regexp_code=..., subject=..., offsets_vector=0x104611b6c, offsets_vector_length=2, previous_index=13, isolate=0x104608de0) at ../deps/v8/src/regexp/regexp-macro-assembler.cc:263
#3  0x0000000102bec8dc in v8::internal::RegExpImpl::IrregexpExecRaw (regexp=..., subject=..., index=13, output=0x104611b6c, output_size=2) at ../deps/v8/src/regexp/jsregexp.cc:478
#4  0x0000000102becbf0 in v8::internal::RegExpImpl::IrregexpExec (regexp=..., subject=..., previous_index=13, last_match_info=...) at ../deps/v8/src/regexp/jsregexp.cc:571
#5  0x0000000102beae60 in v8::internal::RegExpImpl::Exec (regexp=..., subject=..., index=13, last_match_info=...) at ../deps/v8/src/regexp/jsregexp.cc:196
#6  0x0000000102d00890 in v8::internal::__RT_impl_Runtime_RegExpExec (args=..., isolate=0x104608de0) at ../deps/v8/src/runtime/runtime-regexp.cc:918
#7  0x0000000102d003f0 in v8::internal::Runtime_RegExpExec (args_length=4, args_object=0x3fffffffd110, isolate=0x104608de0) at ../deps/v8/src/runtime/runtime-regexp.cc:906
#8  0x000003afd96e7250 in ?? ()
(gdb) x/i $pc
=> 0x102c28c3c <v8::internal::RegExpStack::stack_base()+24>:    std     r31,-8(r1)
(gdb) x/i $pc-4
   0x102c28c38 <v8::internal::RegExpStack::stack_base()+20>:    std     r30,-16(r1)
(gdb) info registers
r0             0x102c2a424         4341277732
r1             0x3fffffffcb10      70368744164112
r2             0x1045c5d00         4368129280
r3             0x10464a920         4368673056
r4             0x0                 0
r5             0xd                 13
r6             0x9c967027dd        672538830813
r7             0x9c967027dd        672538830813
r8             0x2                 2
r9             0x10464a920         4368673056
r10            0x104681b20         4368898848
r11            0x3fffffffd118      70368744165656
r12            0x3ffff7fa2c00      70368609577984
r13            0x3ffff80069a8      70368609986984
r14            0x4                 4
r15            0x102d002a0         4342153888
r16            0x1                 1
r17            0xd00000000         55834574848
r18            0x1                 1
r19            0x0                 0
r20            0xd00000000         55834574848
r21            0x17                23
r22            0x44                68
r23            0x4                 4
r24            0x4                 4
r25            0xf                 15
r26            0x85d28fe519        574763296025
r27            0x104608de0         4368403936
r28            0x3afd96b9348       4053801866056
r29            0x3afd98107a1       4053803272097
r30            0x1045a09b0         4367976880
r31            0x3fffffffcb10      70368744164112
pc             0x102c28c3c         0x102c28c3c <v8::internal::RegExpStack::stack_base()+24>
msr            0x900000004800d032  10376293542669635634
cr             0x40000222          1073742370
lr             0x102c2a424         0x102c2a424 <v8::internal::NativeRegExpMacroAssembler::Execute(v8::internal::Code*, v8::internal::String*, int, unsigned char const*, unsigned char const*, int*, int, v8::internal::Isolate*)+144>
ctr            0x3ffff7fa2c00      70368609577984
xer            0x0                 0
fpscr          0xa3004100          2734702848
vscr           0x10000             65536
vrsave         0xffffffff          -1
orig_r3        0xc00000000000c120  -4611686018427338464
trap           0xd00               3328
(gdb) c
Continuing.

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffc9d0, re_code=0x3afd98107a1, re_frame=70368744163920) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1165
1165          frame_entry<Isolate*>(re_frame, kIsolate),
(gdb) x/2x re_frame+176
0x3fffffffcb00: 0x00000001      0x045a09b0
(gdb) x/60x re_frame
0x3fffffffca50: 0x00000000      0x0000000f      0x00000085      0xd28fe519
0x3fffffffca60: 0x00000001      0x04608de0      0x000003af      0xd96b9348
0x3fffffffca70: 0x000003af      0xd98107a1      0x00000001      0x045a09b0
0x3fffffffca80: 0x00003fff      0xffffca90      0x00000001      0x02c2a81c
0x3fffffffca90: 0x00003fff      0xffffcb10      0x00000001      0x04608de0
0x3fffffffcaa0: 0x000003af      0xd9810800      0x00000001      0x045c5d00
0x3fffffffcab0: 0x00003fff      0xffffcb10      0x00003fff      0xffffcac0
0x3fffffffcac0: 0x00003fff      0xffffcb10      0x00003fff      0xffffcad0
0x3fffffffcad0: 0x00000001      0x02c2a75c      0x00000001      0x0464a920
0x3fffffffcae0: 0x00000001      0x02c38e48      0x000003af      0xd98107a1
0x3fffffffcaf0: 0x00000001      0x04608de0      0x000003af      0xd98107a1
0x3fffffffcb00: 0x00000001      0x045a09b0      0x00003fff      0xffffcb10
0x3fffffffcb10: 0x00003fff      0xffffcc10      0x00003fff      0xffffcb20
0x3fffffffcb20: 0x00000001      0x02c2a48c      0x00003fff      0xffffcb90
0x3fffffffcb30: 0x00003fff      0xffffcbd8      0x0000009c      0x967027b9
(gdb) x/2x re_frame+16
0x3fffffffca60: 0x00000001      0x04608de0

It appears that the frame pointer is being improperly adjusted at some point. The Isolate pointer is being written at re_frame+16, when it belongs at re_frame+176.
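The watchpoint hit can be cross-checked against the register dump with a little arithmetic. The sketch below is only a reading aid built from values copied out of the gdb session above; the constant and helper names are made up for illustration and are not V8 identifiers. It checks that the `std r30,-16(r1)` store in the RegExpStack::stack_base prologue lands on re_frame+176 of the following CheckStackGuardState hit, and that the watchpoint's new value is the low word of r30.

```cpp
#include <cstdint>

// Values copied from the gdb session above; names are illustrative only.
constexpr std::uint64_t kR1 = 0x3fffffffcb10ULL;       // r1 when the watchpoint fired
constexpr std::uint64_t kR30 = 0x1045a09b0ULL;         // r30 when the watchpoint fired
constexpr std::uint64_t kReFrame = 70368744163920ULL;  // re_frame at the next breakpoint hit
constexpr std::uint64_t kWatched = 0x3fffffffcb04ULL;  // address passed to `watch`
constexpr int kIsolateOffset = 176;                    // offset the code reads the Isolate* from

// Effective address of a PPC D-form store such as `std rS, disp(rA)`.
constexpr std::uint64_t StoreAddress(std::uint64_t base, std::int64_t disp) {
  return base + static_cast<std::uint64_t>(disp);
}

// Join the two 32-bit words that `x/2x` prints for one 64-bit big-endian value.
constexpr std::uint64_t JoinBigEndianWords(std::uint32_t hi, std::uint32_t lo) {
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Low 32 bits of a 64-bit value (stored at offset +4 on a big-endian target).
constexpr std::uint32_t LowWord(std::uint64_t value) {
  return static_cast<std::uint32_t>(value & 0xffffffffULL);
}

// The r30 spill at -16(r1) targets exactly re_frame+176.
static_assert(StoreAddress(kR1, -16) == kReFrame + kIsolateOffset,
              "std r30,-16(r1) writes to re_frame+176");
// The watched address is the low word of that 8-byte slot.
static_assert(kWatched == kReFrame + kIsolateOffset + 4,
              "watchpoint covers the low word of the slot");
// "New value = 73009584" is the low word of r30.
static_assert(LowWord(kR30) == 73009584u, "new value matches r30");
// The words later read back from re_frame+176 are r30, not the Isolate.
static_assert(JoinBigEndianWords(0x00000001u, 0x045a09b0u) == kR30,
              "slot contents equal r30");
```

If these checks are read correctly, the slot at re_frame+176 is ordinary stack space that a later C++ prologue reuses as a register save slot, which would be consistent with the offset being wrong rather than the value being overwritten by mistake.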

In a Release build, the object being created at re_frame+176 is a RegExpStackScope; its first member is capacity, which is set to 0; trying to use this object as an Isolate is causing the NULL pointer dereference. The Debug build crashes with a different backtrace, because the object is a RegExpStack, which has valid pointers (but not correct pointers):

(gdb) bt
#0  0x0000000101b3a2a4 in v8::internal::HandleScope::HandleScope (this=0x3fffffffc8a0, isolate=0x1045a09b0) at ../deps/v8/src/handles-inl.h:34
#1  0x0000000102c29b80 in v8::internal::NativeRegExpMacroAssembler::CheckStackGuardState (isolate=0x1045a09b0, start_index=21, is_direct_call=false, return_address=0x3fffffffc9d0, re_code=0xe2164907a1, subject=0x3fffffffca10, input_start=0x3fffffffca20, input_end=0x3fffffffca28) at ../deps/v8/src/regexp/regexp-macro-assembler.cc:168

If I manually set the Isolate pointer at re_frame+176 to the correct value, and continue, then the third time it checks the StackGuard state, the Isolate pointer is correct when it enters the method. The fourth time, it is incorrect again. This repeats every even-numbered time the StackGuard state is checked. If I manually correct the value on each even-numbered attempt, the test passes correctly.

I am unsure on how to proceed to actually fix this bug. The V8 regexp code is, frankly, a twisty maze that is very hard for me to follow; it is spread out amongst multiple files and even some headers. I am hopeful that the Node community may know of a way to find the root issue so that we may fix it.
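For reference, the alignment workaround mentioned above for the Debug build, (GetRandomMmapAddr() & ~(alignment-1)), is the usual round-down-to-alignment mask. The sketch below uses hypothetical helper names rather than V8 functions, and it assumes the alignment is a power of two, since the mask trick does not work otherwise.

```cpp
#include <cstdint>

// True when x is a non-zero power of two.
constexpr bool IsPowerOfTwo(std::uintptr_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Round addr down to a multiple of alignment.
// Only meaningful when alignment is a power of two.
constexpr std::uintptr_t AlignDown(std::uintptr_t addr, std::uintptr_t alignment) {
  return addr & ~(alignment - 1);
}

static_assert(IsPowerOfTwo(0x1000), "4 KiB is a power of two");
static_assert(AlignDown(0x12345, 0x1000) == 0x12000, "low bits are cleared");
static_assert(AlignDown(0x12000, 0x1000) == 0x12000, "aligned input is unchanged");
```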

@gireeshpunathil
Member

Very interesting!

When you say it miscalculates r31 on every other iteration, do you have evidence of whether it goes to JS land and comes back to C++, or is all of this happening within C++ sequences?

If there is JS involvement, how about taking compilation traces with --print-code and searching or pattern matching around the frame pointer to see if you can spot the bad one? I guess frame pointers are almost always adjusted only in the prologue.

@addaleax added the ppc and v8 engine labels Apr 3, 2019
@addaleax
Member

addaleax commented Apr 3, 2019

Since this seems to affect only V8 itself: It would be interesting to see if you could reproduce this with newer versions of Node or V8, maybe try to find a pure-JS/V8-only reproduction (using d8) and/or report this to the V8 bug tracker as well?

@mscdex
Contributor

mscdex commented Apr 3, 2019

Did you try the current node master branch to see if the situation is any different?

@awilfox
Contributor Author

awilfox commented Apr 3, 2019

@gireeshpunathil I'm not experienced enough with V8 internals to tell if it is going back to "JavaScript land" or if it is simply JITing the regex itself (à la PCRE-JIT). It is definitely leaving C++ for generated code, though, as some other watchpoints hit ::Execute and some were in generated code.

@addaleax I'm not sure what d8 is. Further investigation led me to the v8 docs; I can try to reproduce it using the Node test JS as a starting point if it helps. Monorail hasn't worked for me for weeks (I've just tried it again):

Content Security Policy: Ignoring “'unsafe-inline'” within script-src: ‘strict-dynamic’ specified  (unknown)
Content Security Policy: Ignoring “https://bugs.chromium.org” within script-src: ‘strict-dynamic’ specified  (unknown)

Even if it did, I don't have a Google account, so I don't think I'd be able to file it.

@mscdex I'm going to try master later this morning to see if anything is different. I just wanted to report the bug while all of this was still fresh in my mind, before I started working on other things and forgot all the details. (Our goal is obviously to package a release, not master - hence my tests were done with 10.15.3.)

@awilfox
Contributor Author

awilfox commented Apr 4, 2019

Yep, master sure is different. Lots more crashes :(

=== release test-eslint-alphabetize-errors ===                                
Path: parallel/test-eslint-alphabetize-errors
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-alphabetize-errors.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-eslint-check ===                        
Path: parallel/test-eslint-eslint-check
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-eslint-check.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-crypto-check ===         
Path: parallel/test-eslint-crypto-check
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-crypto-check.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-inspector-check ===      
Path: parallel/test-eslint-inspector-check
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-inspector-check.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-required-modules ===                   
Path: parallel/test-eslint-required-modules
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-required-modules.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-prefer-util-format-errors ===                
Path: parallel/test-eslint-prefer-util-format-errors
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-prefer-util-format-errors.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-lowercase-name-for-primitive ===                   
Path: parallel/test-eslint-lowercase-name-for-primitive
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-lowercase-name-for-primitive.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-prefer-common-expectserror ===                     
Path: parallel/test-eslint-prefer-common-expectserror
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-prefer-common-expectserror.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-prefer-assert-methods ===                       
Path: parallel/test-eslint-prefer-assert-methods
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-prefer-assert-methods.js
--- CRASHED (Signal: 11) ---
=== release test-eslint-require-buffer ===                              
Path: parallel/test-eslint-require-buffer
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-eslint-require-buffer.js
--- CRASHED (Signal: 11) ---
=== release test-fs-watch-encoding ===                                        
Path: parallel/test-fs-watch-encoding
internal/fs/watchers.js:174
    throw error;
    ^

Error: EMFILE: too many open files, watch '/home/awilcox/Code/contrib/node/test/.tmp.25'
    at FSWatcher.start (internal/fs/watchers.js:166:26)
    at Object.watch (fs.js:1312:11)
    at Object.<anonymous> (/home/awilcox/Code/contrib/node/test/parallel/test-fs-watch-encoding.js:47:22)
    at Module._compile (internal/modules/cjs/loader.js:824:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:835:10)
    at Module.load (internal/modules/cjs/loader.js:693:32)
    at Function.Module._load (internal/modules/cjs/loader.js:620:12)
    at Function.Module.runMain (internal/modules/cjs/loader.js:888:10)
    at internal/main/run_main_module.js:17:11
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-fs-watch-encoding.js
=== release test-fs-watchfile ===                                      
Path: parallel/test-fs-watchfile
internal/fs/watchers.js:174
    throw error;
    ^

Error: EMFILE: too many open files, watch '/home/awilcox/Code/contrib/node/test/.tmp.45/watch'
    at FSWatcher.start (internal/fs/watchers.js:166:26)
    at Object.watch (fs.js:1312:11)
    at /home/awilcox/Code/contrib/node/test/parallel/test-fs-watchfile.js:95:8
    at /home/awilcox/Code/contrib/node/test/common/index.js:369:15
    at FSReqCallback.oncomplete (fs.js:152:20)
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-fs-watchfile.js
=== release test-http-pipeline-requests-connection-leak ===                   
Path: parallel/test-http-pipeline-requests-connection-leak
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-http-pipeline-requests-connection-leak.js
--- CRASHED (Signal: 11) ---
=== release test-process-euid-egid ===                                        
Path: parallel/test-process-euid-egid
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-process-euid-egid.js
--- CRASHED (Signal: 11) ---
=== release test-process-uid-gid ===                                          
Path: parallel/test-process-uid-gid
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-process-uid-gid.js
--- CRASHED (Signal: 11) ---
=== release test-repl-top-level-await ===                                     
Path: parallel/test-repl-top-level-await
Command: out/Release/node --expose-internals --experimental-repl-await /home/awilcox/Code/contrib/node/test/parallel/test-repl-top-level-await.js
--- CRASHED (Signal: 11) ---
=== release test-repl-underscore ===                                          
Path: parallel/test-repl-underscore
Command: out/Release/node /home/awilcox/Code/contrib/node/test/parallel/test-repl-underscore.js
--- CRASHED (Signal: 11) ---
=== release test-readline-interface ===                                       
Path: parallel/test-readline-interface
Command: out/Release/node --expose_internals /home/awilcox/Code/contrib/node/test/parallel/test-readline-interface.js
--- CRASHED (Signal: 11) ---
=== release test-querywrap ===                                                
Path: async-hooks/test-querywrap
Command: out/Release/node --expose-gc /home/awilcox/Code/contrib/node/test/async-hooks/test-querywrap.js
--- CRASHED (Signal: 11) ---
=== release test-tlswrap ===            
Path: async-hooks/test-tlswrap
Command: out/Release/node /home/awilcox/Code/contrib/node/test/async-hooks/test-tlswrap.js
--- CRASHED (Signal: 11) ---
[02:59|% 100|+ 2590|-  20]: Done                                              
make[1]: *** [Makefile:275: jstest] Error 1
make: *** [Makefile:304: test-only] Error 2

@mhdawson
Member

mhdawson commented Apr 5, 2019

@awilfox Thanks for reporting. Once we see whether it is still reproducible on master, we'll make sure to get the team that works on V8 for PPC engaged as well.

@awilfox
Contributor Author

awilfox commented Apr 5, 2019

@mhdawson As noted, it is reproducible on master. (See my last comment, directly above yours.)

I will note that if you would like to reproduce it yourself, you will need to apply a single change, because musl uses the ELFv2 ABI:

diff --git a/deps/v8/src/ppc/constants-ppc.h b/deps/v8/src/ppc/constants-ppc.h
index 016bc71d26..930e755054 100644
--- a/deps/v8/src/ppc/constants-ppc.h
+++ b/deps/v8/src/ppc/constants-ppc.h
@@ -21,7 +21,7 @@
 #endif
 
 #if V8_HOST_ARCH_PPC && \
-    (V8_OS_AIX || (V8_TARGET_ARCH_PPC64 && V8_TARGET_BIG_ENDIAN))
+    (V8_OS_AIX || (V8_TARGET_ARCH_PPC64 && _CALL_ELF == 1))
 #define ABI_USES_FUNCTION_DESCRIPTORS 1
 #else
 #define ABI_USES_FUNCTION_DESCRIPTORS 0
@@ -33,13 +33,13 @@
 #define ABI_PASSES_HANDLES_IN_REGS 0
 #endif
 
-#if !V8_HOST_ARCH_PPC || !V8_TARGET_ARCH_PPC64 || V8_TARGET_LITTLE_ENDIAN
+#if !V8_HOST_ARCH_PPC || !V8_TARGET_ARCH_PPC64 || _CALL_ELF == 2
 #define ABI_RETURNS_OBJECT_PAIRS_IN_REGS 1
 #else
 #define ABI_RETURNS_OBJECT_PAIRS_IN_REGS 0
 #endif
 
-#if !V8_HOST_ARCH_PPC || (V8_TARGET_ARCH_PPC64 && V8_TARGET_LITTLE_ENDIAN)
+#if !V8_HOST_ARCH_PPC || (V8_TARGET_ARCH_PPC64 && _CALL_ELF == 2)
 #define ABI_CALL_VIA_IP 1
 #else
 #define ABI_CALL_VIA_IP 0
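To make the effect of the patch easier to see, here is a rough model of the preprocessor conditions as ordinary functions. The `Target` struct and the function names are illustrative and not part of V8; `call_elf` set to 0 models `_CALL_ELF` being undefined, which the preprocessor also evaluates as 0.

```cpp
// Rough model of the conditions in the patch above.
// Field and function names are illustrative, not V8 identifiers.
struct Target {
  bool host_ppc;    // V8_HOST_ARCH_PPC
  bool aix;         // V8_OS_AIX
  bool ppc64;       // V8_TARGET_ARCH_PPC64
  bool big_endian;  // V8_TARGET_BIG_ENDIAN
  int call_elf;     // _CALL_ELF (0 models "not defined")
};

// Original ABI_USES_FUNCTION_DESCRIPTORS condition: keyed on endianness.
constexpr bool UsesFunctionDescriptorsOld(Target t) {
  return t.host_ppc && (t.aix || (t.ppc64 && t.big_endian));
}

// Patched conditions: keyed on the ELF ABI version instead.
constexpr bool UsesFunctionDescriptors(Target t) {
  return t.host_ppc && (t.aix || (t.ppc64 && t.call_elf == 1));
}
constexpr bool ReturnsObjectPairsInRegs(Target t) {
  return !t.host_ppc || !t.ppc64 || t.call_elf == 2;
}
constexpr bool CallsViaIp(Target t) {
  return !t.host_ppc || (t.ppc64 && t.call_elf == 2);
}

// 64-bit big-endian PowerPC Linux with musl (ELFv2), as in this report.
constexpr Target kMuslPpc64Be = {true, false, true, true, 2};
// 64-bit little-endian PowerPC Linux (ELFv2).
constexpr Target kPpc64Le = {true, false, true, false, 2};

// The endianness-based check selects function descriptors for big-endian ELFv2.
static_assert(UsesFunctionDescriptorsOld(kMuslPpc64Be), "old check");
// The _CALL_ELF-based checks treat it like any other ELFv2 target.
static_assert(!UsesFunctionDescriptors(kMuslPpc64Be), "patched check");
static_assert(ReturnsObjectPairsInRegs(kMuslPpc64Be), "patched check");
static_assert(CallsViaIp(kMuslPpc64Be), "patched check");
// Little-endian ELFv2 gets the same answers before and after the patch.
static_assert(!UsesFunctionDescriptorsOld(kPpc64Le), "unchanged for LE");
static_assert(!UsesFunctionDescriptors(kPpc64Le), "unchanged for LE");
```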

@john-yan

john-yan commented Apr 5, 2019

Hello @awilfox, I work on V8 on pLinux. Thanks for your very detailed report. Here are my questions:

  • Are there any ABI (Application Binary Interface) differences between Adélie Linux and other PPC Linux distributions like Ubuntu? Any reference to ELFv2?

Note: please check the ABI-related constants in src/regexp/ppc/regexp-macro-assembler-ppc.h

   // Offsets from frame_pointer() of function parameters and stored registers.
   static const int kFramePointer = 0;
 
   // Above the frame pointer - Stored registers and stack passed parameters.
   // Register 25..31.
   static const int kStoredRegisters = kFramePointer;
   // Return address (stored from link register, read into pc on return).
   static const int kReturnAddress = kStoredRegisters + 7 * kPointerSize;
   static const int kCallerFrame = kReturnAddress + kPointerSize;
   // Stack parameters placed by caller.
   static const int kIsolate =
       kCallerFrame + kStackFrameExtraParamSlot * kPointerSize;
       
   // Below the frame pointer.
   // Register parameters stored by setup code.
   static const int kDirectCall = kFramePointer - kPointerSize;
   static const int kStackHighEnd = kDirectCall - kPointerSize;
   static const int kNumOutputRegisters = kStackHighEnd - kPointerSize;
   static const int kRegisterOutput = kNumOutputRegisters - kPointerSize;
   static const int kInputEnd = kRegisterOutput - kPointerSize;
   static const int kInputStart = kInputEnd - kPointerSize;
   static const int kStartIndex = kInputStart - kPointerSize;
   static const int kInputString = kStartIndex - kPointerSize;
   // When adding local variables remember to push space for them in
   // the frame in GetCode.
   static const int kSuccessfulCaptures = kInputString - kPointerSize;
   static const int kStringStartMinusOne = kSuccessfulCaptures - kPointerSize;
   // First register address. Following registers are below it on the stack.
   static const int kRegisterZero = kStringStartMinusOne - kPointerSize;
  • Is there any way we can reproduce this issue locally on our machine (Ubuntu), e.g. with a Docker image?

@john-yan

john-yan commented Apr 5, 2019

Further, I recommend building V8 and running all of the V8 tests to make sure it works on your distribution. The steps to build and test V8 master can be found here: https://github.com/john-yan/Docker-V8-Build. You might want to test V8 7.3 instead, which is the stable version. To do that, run

cd v8 && git checkout branch-heads/7.3 && gclient sync

Run this right after fetch v8, and remove the --no-history option from the fetch v8 command.

@awilfox
Contributor Author

awilfox commented Apr 5, 2019

It looks like kStackFrameExtraParamSlot was defined incorrectly because V8 incorrectly assumes all PPC64 BE systems are AIX.

Correcting it to use _CALL_ELF == 2 instead of V8_TARGET_LITTLE_ENDIAN fixes these crashes. Thank you for pointing me at that; I missed that definition.
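As a rough cross-check of that diagnosis, the layout constants quoted above can be evaluated for both ABIs. The slot counts used here (14 for the AIX/ELFv1-style frame, 12 for ELFv2) are an assumption based on the two ABIs' minimum frame sizes and are not quoted anywhere in this thread; `IsolateOffset` is a hypothetical helper, not a V8 function.

```cpp
// Mirrors the constants quoted from regexp-macro-assembler-ppc.h.
constexpr int kPointerSize = 8;  // 64-bit PowerPC
constexpr int kFramePointer = 0;
constexpr int kStoredRegisters = kFramePointer;
constexpr int kReturnAddress = kStoredRegisters + 7 * kPointerSize;
constexpr int kCallerFrame = kReturnAddress + kPointerSize;

// ASSUMED values of kStackFrameExtraParamSlot (not stated in the thread).
constexpr int kAssumedSlotsElfV1 = 14;  // AIX / ELFv1-style frame
constexpr int kAssumedSlotsElfV2 = 12;  // ELFv2 frame

// Same formula as kIsolate, with the slot count as a parameter.
constexpr int IsolateOffset(int stack_frame_extra_param_slot) {
  return kCallerFrame + stack_frame_extra_param_slot * kPointerSize;
}

// 176 is the offset the crashing build was reading from.
static_assert(IsolateOffset(kAssumedSlotsElfV1) == 176, "ELFv1-style offset");
// Under the ELFv2 assumption the Isolate* would sit 16 bytes lower.
static_assert(IsolateOffset(kAssumedSlotsElfV2) == 160, "ELFv2 offset");
// In the x/60x dump, 0x00000001 0x04608de0 appears at 0x3fffffffcaf0,
// which is re_frame (0x3fffffffca50) plus 160.
static_assert(0x3fffffffcaf0ULL - 0x3fffffffca50ULL == 160, "dump offset");
```

Under those assumptions the numbers line up with the debugging session: the code reads re_frame+176, while the Isolate value 0x104608de0 is visible at re_frame+160 in the frame dump.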

I'll have one of our team members that has signed Google's CLA write and submit better patches upstream. Thank you again very much for the pointer!

@awilfox closed this as completed Apr 5, 2019
@john-yan

john-yan commented Apr 6, 2019

@awilfox As we have already dropped Linux PPC BE support in V8, some features still might not work.

@bdragon28

FreeBSD PowerPC64 is also working on moving from AIX (ELFv1) to ELFv2 BE. _CALL_ELF is the correct way to detect the ABI in use and is not directly related to the endianness.
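A minimal sketch of what detecting the ABI separately from the byte order could look like. It relies on `_CALL_ELF`, `__powerpc64__`, `__BYTE_ORDER__` and `__ORDER_BIG_ENDIAN__` as predefined by GCC and Clang; the enum and function names are made up for illustration, and other compilers may need different checks.

```cpp
// Illustrative names; only the predefined macros come from the compiler.
enum class PpcAbi { kNotPpc64, kElfV1, kElfV2, kUnknown };

// ABI detection: depends only on the target and _CALL_ELF.
constexpr PpcAbi DetectPpcAbi() {
#if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2
  return PpcAbi::kElfV2;
#elif defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 1
  return PpcAbi::kElfV1;
#elif defined(__powerpc64__)
  return PpcAbi::kUnknown;  // e.g. a toolchain that does not define _CALL_ELF
#else
  return PpcAbi::kNotPpc64;
#endif
}

// Byte order detection: a separate question from the ABI.
constexpr bool IsBigEndian() {
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
  return __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__;
#else
  return false;  // macros unavailable; treated as little-endian in this sketch
#endif
}

// Human-readable name for logging or diagnostics.
constexpr const char* AbiName(PpcAbi abi) {
  return abi == PpcAbi::kElfV2   ? "ELFv2"
         : abi == PpcAbi::kElfV1 ? "ELFv1"
         : abi == PpcAbi::kUnknown ? "unknown ppc64 ABI"
                                   : "not ppc64";
}
```

On a big-endian ELFv2 system such as the one in this report, `DetectPpcAbi()` would be expected to yield `PpcAbi::kElfV2` while `IsBigEndian()` is true, which is the combination an endianness-only check cannot express.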

@john-yan

john-yan commented Apr 6, 2019

@bdragon28 I agree with you. I am just saying endianness might be a concern down the road in terms of v8 support.

7 participants