Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo test for an LD_PRELOAD dylib causes runtime error in 1.81.0 #130210

Closed
DavSanchez opened this issue Sep 10, 2024 · 6 comments · Fixed by #131233
Closed

cargo test for an LD_PRELOAD dylib causes runtime error in 1.81.0 #130210

DavSanchez opened this issue Sep 10, 2024 · 6 comments · Fixed by #131233
Assignees
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@DavSanchez
Copy link

DavSanchez commented Sep 10, 2024

Problem

Hi! I have a project where I define a dylib to use as LD_PRELOAD by linking an extern "C" function to .init_array. I was using Rust 1.80.1 and unit tests were working without issues.

However, since bumping to 1.81.0, our CI (Linux runners) started failing though no significant changes had been added to the code. The thing is, if you built the project with either version and passed the resulting .so as LD_PRELOAD it seemed to work without issues.

I have set up a minimal reproducer that you can inspect here. It is just a "hello world" lib.rs with a simple unit and integration test, though these tests can be removed and the output will be the same when running the steps below. The reproducer's README.md includes sample commands ran through docker, as I wrote it from a different OS.

Steps

  1. Clone the repository and check the code at src/lib.rs.
  2. Run cargo test with Rust 1.80.1. Should work.
  3. Run cargo test with Rust 1.81.0. Should fail with fatal runtime error: thread::set_current should only be called once per thread.
  4. Run cargo build and call some command passing the resulting .so as LD_PRELOAD, such as LD_PRELOAD=target/debug/libpreload_tests.so ls -lah. Should work for Rust 1.80.1.
  5. Run cargo build and call some command passing the resulting .so as LD_PRELOAD, such as LD_PRELOAD=target/debug/libpreload_tests.so ls -lah. Should work for Rust 1.81.0.

Possible Solution(s)

I have noticed that if I add #[cfg(not(test))] to the static that stores the extern "C" function and is linked to .init_array the tests run without issues in both versions, but I'm not sure if this is a proper solution and the way to work with testing LD_PRELOAD dylibs or if I'm masking some other problem.

Notes

I have opened the issue here because I think the failure might lie in how the "test runner executable" is built and ran. If you read the outputs I pasted on the reproducer's README.md for the cargo test runs, you'll see that the behavior linked to .init_array is present in the test runner as the log line "HOLA FROM LD_PRELOAD!" is shown at the beginning, in both Rust toolchain versions.

Please let me know if this is not the proper place and point me to the appropriate one to open the report.

While I am not very knowledgeable about the internals that might have caused this, I'd like to figure this out, particularly to make sure that this cannot happen on an eventual "production" usage with the compiled project and not only on tests, given the actual problem I'm working with would require performing more than just a println! and any crash during the preload prevents the actual process from running (placing it in /etc/ld.so.preload could seemingly brick a host!).

I get that this kind of lower level, run-before-main behaviors can be pretty hard to do well, so I really want to be sure I'm doing things right.

Version

cargo 1.81.0 (2dbb1af80 2024-08-20)
release: 1.81.0
commit-hash: 2dbb1af80a2914475ba76827a312e29cedfa6b2f
commit-date: 2024-08-20
host: aarch64-unknown-linux-gnu
libgit2: 1.8.1 (sys:0.19.0 vendored)
libcurl: 8.8.0-DEV (sys:0.4.73+curl-8.8.0 vendored ssl:OpenSSL/1.1.1w)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Debian 12.0.0 [64-bit]
@DavSanchez DavSanchez added the C-bug Category: This is a bug. label Sep 10, 2024
@DavSanchez DavSanchez changed the title Integration testing for an LD_PRELOAD dylib causes runtime error in 1.81.0 cargo test for an LD_PRELOAD dylib causes runtime error in 1.81.0 Sep 10, 2024
@DavSanchez DavSanchez changed the title cargo test for an LD_PRELOAD dylib causes runtime error in 1.81.0 cargo test for an LD_PRELOAD dylib causes runtime error in 1.81.0 Sep 10, 2024
@ehuss ehuss transferred this issue from rust-lang/cargo Sep 10, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 10, 2024
@ehuss
Copy link
Contributor

ehuss commented Sep 10, 2024

Thanks for the report and the detailed reproduction! I have transferred this to rust-lang/rust since this isn't a cargo issue.

I believe this was changed by #124881. cc @Sp00ph and @joboet.

@DavSanchez
Copy link
Author

Thanks @ehuss! Yeah I was in doubt of where to put it, but since I imagined that cargo test managed how the test runner gets built I opened the issue there. Sorry for the inconvenience.

@bjorn3
Copy link
Member

bjorn3 commented Sep 11, 2024

@jieyouxu jieyouxu added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 11, 2024
@DavSanchez
Copy link
Author

cc https://users.rust-lang.org/t/ld-preload-with-init-array-fatal-runtime-error-thread-set-current-should-only-be-called-once-per-thread/117264/8

Hi! From that linked thread that I opened we have had a bit of discussion. Just so I have it clear and see if it makes sense to have this bug report open or not:

if I got this right, the change implies that calling println! initializes the main thread, so when the time of actually reaching main occurs the main thread initialization is attempted again, so it fails and the whole thing aborts?

Given the above is true about the println! call, is this the expected behavior?

Then, this would be impacting at times where I both define a function in .init_array and an entrypoint (I assume in the same executable binary), such as running the test, because a test runner executable will contain both the entrypoint and the function linked to .init_array.

This is the same doubt written in the Solutions section above: is a way to test code linked to .init_array to just make it conditional over not being in a test so it doesn't conflict with the test runner entry-point? I wonder if I might be masking something else that can bite me later.

When compiling the actual library and just using it as LD_PRELOAD this wouldn’t happen?

Like, the runtime loaded for a separate library as LD_PRELOAD would not be the same that for the binary I would execute. Did I get this right?

The answer by u/CAD97 to the above question is probably. Aside from the testing suggested, should I fear any other conflicts of this kind? Does the output type dylib or cdylib make any difference here?

@saethlin saethlin added A-libtest Area: `#[test]` / the `test` library and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Sep 18, 2024
@saethlin saethlin added the C-gub Category: the reverse of a compiler bug is generally UB label Sep 18, 2024
@joboet
Copy link
Member

joboet commented Oct 1, 2024

This falls in the "use before and after main()" category. In short, we don't really guarantee anything in this environment. That said, println! is one of those features which we have tried to make universally available in the past (e.g. we ensure that the stdio buffer is flushed to ensure output after main is visible), so I think it makes sense to fix this. I have an idea for fixing this, it's just currently blocked on #127912.

@joboet joboet added A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows and removed A-libtest Area: `#[test]` / the `test` library C-gub Category: the reverse of a compiler bug is generally UB labels Oct 1, 2024
@joboet joboet self-assigned this Oct 1, 2024
@petersalomonsen
Copy link

if I got this right, the change implies that calling println! initializes the main thread, so when the time of actually reaching main occurs the main thread initialization is attempted again, so it fails and the whole thing aborts?

Given the above is true about the println! call, is this the expected behavior?

I can confirm this when trying to running a test binary compiled to WebAssembly, where I've overridden the methods for getting environment variables. I had some println! calls in there, which started failing in 1.81.0. Removing the println calls makes the test pass again.

joboet added a commit to joboet/rust that referenced this issue Oct 4, 2024
Fixes rust-lang#130210.

Since rust-lang#124881, `ReentrantLock` uses `ThreadId` to identify threads. This has the unfortunate consequence of breaking uses of `Stdout` before main: Locking the `ReentrantLock` that synchronizes the output will initialize the thread ID before the handle for the main thread is set in `rt::init`. But since that would overwrite the current thread ID, `thread::set_current` triggers an abort.

This PR fixes the problem by using the already initialized thread ID for constructing the main thread handle and allowing `set_current` calls that do not change the thread's ID.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 12, 2024
std: fix stdout-before-main

Fixes rust-lang#130210.

Since rust-lang#124881, `ReentrantLock` uses `ThreadId` to identify threads. This has the unfortunate consequence of breaking uses of `Stdout` before main: Locking the `ReentrantLock` that synchronizes the output will initialize the thread ID before the handle for the main thread is set in `rt::init`. But since that would overwrite the current thread ID, `thread::set_current` triggers an abort.

This PR fixes the problem by using the already initialized thread ID for constructing the main thread handle and allowing `set_current` calls that do not change the thread's ID.
tgross35 added a commit to tgross35/rust that referenced this issue Oct 12, 2024
std: fix stdout-before-main

Fixes rust-lang#130210.

Since rust-lang#124881, `ReentrantLock` uses `ThreadId` to identify threads. This has the unfortunate consequence of breaking uses of `Stdout` before main: Locking the `ReentrantLock` that synchronizes the output will initialize the thread ID before the handle for the main thread is set in `rt::init`. But since that would overwrite the current thread ID, `thread::set_current` triggers an abort.

This PR fixes the problem by using the already initialized thread ID for constructing the main thread handle and allowing `set_current` calls that do not change the thread's ID.
@bors bors closed this as completed in 9f91c50 Oct 12, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Oct 12, 2024
Rollup merge of rust-lang#131233 - joboet:stdout-before-main, r=tgross35

std: fix stdout-before-main

Fixes rust-lang#130210.

Since rust-lang#124881, `ReentrantLock` uses `ThreadId` to identify threads. This has the unfortunate consequence of breaking uses of `Stdout` before main: Locking the `ReentrantLock` that synchronizes the output will initialize the thread ID before the handle for the main thread is set in `rt::init`. But since that would overwrite the current thread ID, `thread::set_current` triggers an abort.

This PR fixes the problem by using the already initialized thread ID for constructing the main thread handle and allowing `set_current` calls that do not change the thread's ID.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants