Clang vs wasm32-{emscripten,wasi} rustc C ABI mismatch w.r.t. "singleton" unions #121408
Comments
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-medium
I've run into this problem again, but this time it's a C callback that's returning the union. As I don't control it, I can't change its signature to manually match the ABI. I think I'll have to move more processing code into my C support library.
A possible workaround might be to use a technically incorrect function signature on the Rust side that is lowered to the correct ABI on wasm targets by current (and future) rustc - pretending the callbacks get an out-pointer instead of returning by value / receive arguments by reference instead of by value. Pretty ugly and wasm-specific, but maybe less ugly than additional C code that is pointless on non-wasm platforms?
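A minimal sketch of what that workaround could look like, not taken from the linked project. The field layout of `DispatchRock`, the callback shape, and the names `RegisterCallback`, `call_register`, `sample_callback` and `demo` are all assumptions made for illustration. It also assumes the indirect return is lowered to an out-pointer passed as the first parameter on wasm32.

```rust
use std::os::raw::c_void;

// Assumed layout: a 32-bit integer overlapping a pointer.
#[repr(C)]
#[derive(Clone, Copy)]
pub union DispatchRock {
    pub num: u32,
    pub ptr: *mut c_void,
}

// Non-wasm targets: declare the callback exactly as C does (return by value).
#[cfg(not(target_arch = "wasm32"))]
pub type RegisterCallback =
    unsafe extern "C" fn(obj: *mut c_void, objclass: u32) -> DispatchRock;

// wasm32: deliberately "wrong" signature that pretends the callback takes an
// out-pointer, which is assumed to match how Clang lowers the by-value return.
#[cfg(target_arch = "wasm32")]
pub type RegisterCallback =
    unsafe extern "C" fn(out: *mut DispatchRock, obj: *mut c_void, objclass: u32);

#[cfg(not(target_arch = "wasm32"))]
pub unsafe fn call_register(cb: RegisterCallback, obj: *mut c_void, objclass: u32) -> DispatchRock {
    unsafe { cb(obj, objclass) }
}

#[cfg(target_arch = "wasm32")]
pub unsafe fn call_register(cb: RegisterCallback, obj: *mut c_void, objclass: u32) -> DispatchRock {
    // Reserve space for the result and let the callee fill it in.
    let mut out = std::mem::MaybeUninit::<DispatchRock>::uninit();
    unsafe {
        cb(out.as_mut_ptr(), obj, objclass);
        out.assume_init()
    }
}

// Stand-ins for the C callback so the sketch runs on its own.
#[cfg(not(target_arch = "wasm32"))]
unsafe extern "C" fn sample_callback(_obj: *mut c_void, objclass: u32) -> DispatchRock {
    DispatchRock { num: objclass + 1 }
}

#[cfg(target_arch = "wasm32")]
unsafe extern "C" fn sample_callback(out: *mut DispatchRock, _obj: *mut c_void, objclass: u32) {
    unsafe { out.write(DispatchRock { num: objclass + 1 }) }
}

pub fn demo() -> u32 {
    let rock = unsafe { call_register(sample_callback, std::ptr::null_mut(), 41) };
    // Reading a union field is unsafe; `num` is the field that was written.
    unsafe { rock.num }
}

fn main() {
    println!("rock.num = {}", demo());
}
```

The platform-conditional code is confined to the type alias and one wrapper, so callers can use `call_register` without caring about the target. The wasm32 signature would have to be removed again once rustc matches Clang for this union.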
@hanna-kruppe Hmm, that might work, but it would mean I'd need to add platform conditional code in a number of places, vs a small C shim which works everywhere.

Edit: I'm having trouble even getting a shim working. It works fine when compiled to x86_64, but I can't get it to work in wasm - it's like the DispatchRock is having its value modified.

@hanna-kruppe On the chance that you have any time to take a look, I pushed my broken code to https://github.com/curiousdannii/emglken/tree/remglk_rs_broken_unions

This builds and runs glulxe:
```
./src/build.sh
./bin/emglken.js tests/glulxercise.ulx
```
Then enter anything. You should see something like this (60664 turns into 60684 and then into 54940)
Where as in x86_64 it works (280594112 is retained through it all):
|
Hmm, just found a potential solution: add a dummy union variant on the Rust side that is 64 bits. I think that means it won't be considered a singleton union, but C just ignores the extra word. I might be able to remove all my manual shimming this way? Yep, that seems to work perfectly! And when this bug eventually gets fixed, all I'll need to do is remove the dummy variant. Much cleaner. :)
…to the DispatchRock union so that it won't be considered a singleton union
That's pretty risky. There may be targets where it causes Rust and C to disagree on whether the union should be passed/returned in memory. More importantly, both sides will now disagree on how large the type is on targets with 32-bit pointers. Like other ABI mismatches, that may not cause breakage immediately, but it will fail in very spectacular ways sooner or later. For example, when Rust returns the union to C via an out-pointer, either because the source code is written like that or because that's the ABI lowering for returning by value, Rust may write a full eight bytes while the C side only reserved four bytes. Adjacent data on the stack or elsewhere will then be clobbered. In simple cases, returning the union may be compiled down to storing four bytes because it's obvious to the optimizer that only a four-byte field is initialized (that's why I had to add -Zmir-opt-level=0 to the linked example). But you shouldn't rely on that because you'll constantly be one refactoring or compiler update away from a very "fun" debugging session.
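The size disagreement can be illustrated with a small sketch. The type names `RockAsC` and `RockWithDummy` are made up, and a `u32` field stands in for a 32-bit pointer so that the numbers are the same when run on a 64-bit host. This only shows the layout mismatch, not the actual clobbering.

```rust
use std::mem::size_of;

// What the C side is assumed to see on a target with 32-bit pointers:
// an integer overlapping a pointer-sized field (modelled here as u32).
#[allow(dead_code)]
#[repr(C)]
union RockAsC {
    num: u32,
    ptr: u32, // stand-in for a 32-bit `void *`
}

// The Rust-side declaration with the extra 64-bit dummy variant.
#[allow(dead_code)]
#[repr(C)]
union RockWithDummy {
    num: u32,
    ptr: u32, // stand-in for a 32-bit `void *`
    _dummy: u64,
}

fn main() {
    // C reserves 4 bytes for the value ...
    println!("C view:    {} bytes", size_of::<RockAsC>());
    // ... while Rust believes the type occupies 8 bytes, so a full copy
    // through an out-pointer may write 4 bytes past what C allocated.
    println!("Rust view: {} bytes", size_of::<RockWithDummy>());
}
```

Adding the same dummy field to the C declaration would make both sizes agree again, at the cost of changing the C type.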
Here's another example that exhibits the problem even when compiled with optimizations, simplified from code in the linked commit. Because the union value is loaded from memory, not constructed in-place, it's always returned by copying eight bytes, so I think you already have the bug I predicted in my last comment.
If I also added the dummy variant to the C union that would negate the risk, right? Or do you think the compiler is smart enough to optimise that away? It wouldn't optimise it away on both sides? I could also make my library write to the dummy variant if that would help prevent it from being optimised away.
Both C and Rust will follow the type layout and ABI rules for the type you've written down (modulo bugs such as this one and assuming |
Some notes from looking into this today (updating my atrophied knowledge of rustc's layout/ABI internals along the way):
All of this makes me think there's probably a chance for a quick and dirty fix by just special casing unions somehow. At least, if nobody cares to delve further into the details of how Clang handles empty structs and arrays in all cases. I'm not itching to write a patch, though, at least not until #119183 is settled.
On wasm32-unknown-emscripten and wasm32-wasi, rustc implements the C ABI for some unions incorrectly, i.e., different from Clang. Minimized example:
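The original snippet did not survive in this copy of the thread. As a hypothetical stand-in, the sketch below shows the kind of code that could be compiled with both rustc and Clang (using the equivalent C declarations) to compare the generated signatures. It assumes that a `repr(C)` union with two same-sized scalar fields is among the affected types. The names `U`, `make_union` and `unwrap_union` are illustrative.

```rust
// Hypothetical union with two same-sized scalar fields.
#[repr(C)]
#[derive(Clone, Copy)]
pub union U {
    pub a: u32,
    pub b: u32,
}

// Returns the union by value across the C ABI.
pub extern "C" fn make_union(x: u32) -> U {
    U { a: x }
}

// Receives the union by value across the C ABI.
pub extern "C" fn unwrap_union(u: U) -> u32 {
    // Reading a union field is unsafe; `a` is the field written above.
    unsafe { u.a }
}

fn main() {
    let u = make_union(7);
    println!("{}", unwrap_union(u));
}
```

Running this natively only confirms the round trip. The mismatch itself shows up when the parameter and result types of the two functions are inspected in the wasm output of each compiler.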
I expected to see this happen: the resulting wasm code should pass and return the union indirectly, i.e. by pointers, as described in the C ABI document and implemented in Clang (compiler explorer).
Instead, this happened: the union is passed and returned as a single scalar (i32). See the previous compiler explorer link, and I also see it locally for wasm32-wasi (too lazy to install a whole emscripten toolchain):
The definition of "singleton" union in the C ABI document ("recursively contains just a single scalar value") may be considered ambiguous, but clearly Clang interprets it differently from rustc, so something will have to give. I have not tried to exhaustively explore in which cases they differ; the above example may not be the only one.
Compare and contrast #71871 - as discussed there, the emscripten and wasi targets have long since been fixed to match Clang's ABI, with only wasm32-unknown-unknown lagging behind. However, it seems that the fixed C ABI on emscripten and wasi targets is still incorrect in some cases around unions.
cc @curiousdannii, who encountered this in a real project (rust-lang/cc-rs#954)
Meta
rustc +nightly --version --verbose:
(Also happens on 1.76 stable.)