20%+ regression in ReleaseFast performance since 0.11.0 #17768
Comments
mmm, interesting. Looks like reverting the change does indeed restore the perf.

    --- a/src/codegen/llvm.zig
    +++ b/src/codegen/llvm.zig
    @@ -10528,7 +10528,8 @@ pub const FuncGen = struct {
             if (isByRef(elem_ty, mod)) {
                 return self.loadByRef(ptr, elem_ty, ptr_alignment, access_kind);
             }
    -        return self.loadTruncate(access_kind, elem_ty, ptr, ptr_alignment);
    +        //return self.loadTruncate(access_kind, elem_ty, ptr, ptr_alignment);
    +        return self.wip.load(access_kind, try o.lowerType(elem_ty), ptr, ptr_alignment, "");
         }
         const containing_int_ty = try o.builder.intType(@intCast(info.packed_offset.host_size * 8));

a quick look with

    var slot: u4 = 2;
    while (slot <= 6) : (slot += 1) {
        const id = side.order[slot - 1];
        if (id == 0 or side.pokemon[id - 1].hp == 0) continue;
        out[n] = .{ .type = .Switch, .data = slot };
        n += 1;
    }

The loop (and others in the function) was previously unrolled by LLVM, and no longer is.
I've tried to poke at it a little bit, but I have no idea how to fix this. Short of doing some kind of range propagation pass or something, I'm not quite sure how it is possible to distinguish between "this is a nice clean local variable" and "this may contain some uninitialized leftover bits" (as in #14200). (But then, I know nothing about LLVM or Zig internals, so maybe there's a way...) Of course, as a workaround, changing the loop counter to
Thanks for digging in, @xxxbxxx! It's nice to know that I can work around this with
Changing all non-power-of-2 loop counters to
That's a misunderstanding: the "change that didn't cause any regression" is indeed the change triggering the performance issue...
forcing llvm to truncate the padding bits prevents some optimizations. fixes 370662c see ziglang#17768
per ziglang/zig#17768 (comment) this removes truncation on the loop index and allows LLVM to better simplify the code. This claws back half of the 10% regression, though there is still a 5% gap.
Required as a workaround for ziglang/zig#17768 - currently the performance is around 20% slower on master, which is unacceptable. This is obviously somewhat of a bummer (RIP 3a62fd3) and will be a pretty big maintenance headache, but it's been almost a year without (positive) progress on the Zig side, so there aren't really any better options.
Clean `make integration` on the following builds: 0.11.0, 0.12.0, 0.12.0-dev.789+e6590fea1, 0.12.0-dev.866+3a47bc715, 0.12.0-dev.876+aaf46187a, 0.12.0-dev.1396+f6de3ec96, 0.12.0-dev.1879+e19219fa0, 0.12.0-dev.2036+fc79b22a9, 0.12.0-dev.2665+919a3bae1, 0.12.0-dev.3644+05d975576, 0.12.1, 0.13.0, 0.13.0-dev.8+c352845e8, 0.13.0-dev.39+f6f7a47aa, 0.13.0-dev.46+3648d7df1, 0.14.0-dev.23+d9bd34fd0, 0.14.0-dev.564+75cf7fca9, 0.14.0-dev.994+9f46abf59
I just spent the day reverting my project to 0.11.0. As noticed by @Inqnuam, I'm now actually seeing a 20%+ regression, not just 10%, and @xxxbxxx's suggestion above no longer seems to move the needle much at all. My project builds with the Zig compiler at HEAD and all the way back to 0.11.0. I don't know how long that will be possible in the wake of breaking language changes (it seems like only breaking standard library and build system changes have occurred since 0.11.0), but currently I feel it would serve as a uniquely suitable testbed for someone interested in attempting to improve compiler performance or the performance of compiled output. I would be very surprised if other projects haven't also experienced at least some sort of slowdown since 0.11.0. I just imagine such a slowdown is harder to attribute directly to Zig than it is for my project, due to the difficulties of supporting multiple Zig versions and how much a project's code and feature set would usually change over time.
Zig Version
0.12.0-dev.876+aaf46187a
Steps to Reproduce and Observed Behavior
I noticed a regression on my project's benchmark and bisected it via nightlies - zig-macos-aarch64-0.12.0-dev.866+3a47bc715 produces code that runs ~10% faster than zig-macos-aarch64-0.12.0-dev.876+aaf46187a.
The benchmark tool uses std.time.Timer to perform its own internal timing, which it prints out as the first number there (the second two numbers are for confirming the benchmarks are computing the same results).
Alternatively, since the regression is large enough, you can literally just use time (though this is measuring a different thing than the internal benchmark timer, that's kind of unimportant from Zig's POV).
Expected Behavior
There not to be a regression :)
From 3a47bc7...aaf4618 I'm guessing #17391 is a likely suspect?
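For anyone wanting to reproduce this kind of comparison, the internal timing described above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual benchmark tool: `workload` is a hypothetical stand-in for the real benchmark, and the output format (elapsed nanoseconds first, then a result value for checking that both builds compute the same thing) is an assumption modelled on the description. It relies only on `std.time.Timer.start` and `Timer.read` from the standard library.

```zig
const std = @import("std");

/// Hypothetical stand-in for the real benchmark workload (an assumption
/// for illustration; the project's benchmark is not shown in this issue).
fn workload(iterations: u64) u64 {
    var sum: u64 = 0;
    var i: u64 = 0;
    while (i < iterations) : (i += 1) {
        // Wrapping arithmetic so large iteration counts cannot overflow-trap.
        sum +%= i *% i;
    }
    return sum;
}

pub fn main() !void {
    // Start a monotonic timer; this fails only if no clock is available.
    var timer = try std.time.Timer.start();
    const result = workload(10_000_000);
    const elapsed_ns = timer.read();

    // Elapsed time first, then the result, so that runs built with
    // different compiler versions can be checked for identical results.
    std.debug.print("{d} {d}\n", .{ elapsed_ns, result });
}
```

Building the same file with `-O ReleaseFast` on each nightly and comparing the first number gives a rough version of the measurement above. A toy loop like this one is unlikely to show the regression itself.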