-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement ARM32 atomic intrinsics #97792
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Diff results for #97792Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 2,238,105 contexts (829,328 MinOpts, 1,408,777 FullOpts). MISSED contexts: base: 71,274 (3.08%), diff: 72,559 (3.14%) Overall (-122,174 bytes)
MinOpts (+532 bytes)
FullOpts (-122,706 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (-0.41% to -0.04%)
MinOpts (-0.10% to -0.06%)
FullOpts (-0.43% to -0.04%)
Details here |
Diff results for #97792Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 2,238,105 contexts (829,328 MinOpts, 1,408,777 FullOpts). MISSED contexts: base: 71,274 (3.08%), diff: 72,559 (3.14%) Overall (-141,104 bytes)
MinOpts (+532 bytes)
FullOpts (-141,636 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (-0.41% to -0.04%)
MinOpts (-0.10% to -0.06%)
FullOpts (-0.42% to -0.04%)
Details here |
Just a heads up. The tests fail on device. I am looking into it with Michal to see if we can find the root cause. |
1d24a02 fixed most of the failures, there's still incorrect sign extension of |
Diff results for #97792Throughput diffsThroughput diffs for linux/arm64 ran on windows/x64MinOpts (+0.00% to +0.01%)
Throughput diffs for osx/arm64 ran on windows/x64MinOpts (+0.00% to +0.01%)
Details here |
Diff results for #97792Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 2,237,693 contexts (829,328 MinOpts, 1,408,365 FullOpts). MISSED contexts: base: 71,275 (3.08%), diff: 72,560 (3.14%) Overall (-71,018 bytes)
MinOpts (+2,774 bytes)
FullOpts (-73,792 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (-0.41% to -0.04%)
MinOpts (-0.10% to -0.06%)
FullOpts (-0.42% to -0.04%)
Details here |
Diff results for #97792Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 2,237,693 contexts (829,328 MinOpts, 1,408,365 FullOpts). MISSED contexts: base: 71,275 (3.08%), diff: 72,560 (3.14%) Overall (-71,018 bytes)
MinOpts (+2,774 bytes)
FullOpts (-73,792 bytes)
Details here Throughput diffsThroughput diffs for osx/arm64 ran on windows/x64MinOpts (+0.00% to +0.01%)
Throughput diffs for windows/arm64 ran on windows/x64MinOpts (+0.00% to +0.01%)
Details here Throughput diffs for linux/arm ran on windows/x86Overall (-0.41% to -0.04%)
MinOpts (-0.10% to -0.06%)
FullOpts (-0.42% to -0.04%)
Details here |
Diff results for #97792Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 2,238,104 contexts (829,328 MinOpts, 1,408,776 FullOpts). MISSED contexts: base: 71,275 (3.08%), diff: 72,560 (3.14%) Overall (-70,778 bytes)
MinOpts (+2,774 bytes)
FullOpts (-73,552 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (-0.41% to -0.04%)
MinOpts (-0.10% to -0.06%)
FullOpts (-0.42% to -0.04%)
Details here |
Diff results for #97792Assembly diffsAssembly diffs for linux/arm ran on windows/x86Diffs are based on 2,238,104 contexts (829,328 MinOpts, 1,408,776 FullOpts). MISSED contexts: base: 71,275 (3.08%), diff: 72,560 (3.14%) Overall (-70,778 bytes)
MinOpts (+2,774 bytes)
FullOpts (-73,552 bytes)
Details here Throughput diffsThroughput diffs for linux/arm ran on windows/x86Overall (-0.41% to -0.04%)
MinOpts (-0.10% to -0.06%)
FullOpts (-0.42% to -0.04%)
Details here |
Passes the tests on Raspberry Pi 5 now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an unnecessary sign extension of the value
parameter in CompareExchange
(short
):
0004e2c0 <System.Threading.Interlocked__CompareExchange_0>:
4e2c0: b508 push {r3, lr}
4e2c2: b20b sxth r3, r1
4e2c4: b212 sxth r2, r2
4e2c6: f3bf 8f5f dmb sy
4e2ca: e8d0 ef5f ldrexh lr, [r0]
4e2ce: fa0f fe8e sxth.w lr, lr
4e2d2: 4596 cmp lr, r2
4e2d4: d103 bne.n 4e2de <System.Threading.Interlocked__CompareExchange_0+0x1e>
4e2d6: e8c0 3f51 strexh r1, r3, [r0]
4e2da: 2900 cmp r1, #0
4e2dc: d1f5 bne.n 4e2ca <System.Threading.Interlocked__CompareExchange_0+0xa>
4e2de: f3bf 8f5f dmb sy
4e2e2: 4670 mov r0, lr
4e2e4: bd08 pop {r3, pc}
It's not really a correctness issue though. Otherwise, LGTM. Thanks!
I'll get back to working on my PRs again in June, currently I'm occupied with some other stuff. |
@kunalspathak I've unified the ARM64 and ARM32 handling here, could you rereview? |
@kunalspathak, it is ready to review again. |
@MichalPetryka, we are short-staffed. Is it ok to merge this to .NET 10? |
Yes, I think it is prudent to wait on this one, unless there's some key benefit to having this in .NET 9 that we're overlooking. |
We don't have time to work on this for .NET 9, so we will review it in .NET 10. |
@MichalPetryka can you resolve the conflicts please? |
This pull request has been automatically marked |
This pull request will now be closed since it had been marked |
Implements Interlocked Exchange, Add and CompareExchange as "must expand" intrinsics on ARM32 for types equal to or smaller than pointer size.
Contributes to #86915.
Fixes #9982.