Optimize string comparison by using memcmp #28338
Very nice find @bors r+
📌 Commit bbad2d5 has been approved by
I wonder how you slid in without being noticed by Travis or the highfive bot.
How strange! https://status.github.com/ doesn't report any issues, so it's spooky.
@erickt Can you add a comment in the code about the optimization so somebody doesn't re-'fix' it later?
LLVM seems to be having some trouble optimizing the iterator-based string comparison method into some equivalent of memcmp. This explicitly calls out to the memcmp intrinsic in order to allow LLVM to generate better code. In some manual benchmarking, this memcmp-based approach is 20 times faster than the iterator approach.
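For readers who want to see the two shapes side by side, here is a minimal sketch, not the code from this PR. `eq_iter` and `eq_bulk` are hypothetical names. The bulk version uses byte-slice `==` as a safe stand-in for a direct memcmp call; that the standard library lowers this to a memcmp-style comparison is an assumption about current toolchains, not something stated in this thread.

```rust
/// Iterator-based equality: walks both strings one byte at a time.
/// (Hypothetical helper, named for illustration only.)
fn eq_iter(a: &str, b: &str) -> bool {
    a.len() == b.len() && a.bytes().zip(b.bytes()).all(|(x, y)| x == y)
}

/// Bulk equality: check the lengths, then compare the whole byte slices in
/// one operation. Assumption: the slice `==` here compiles down to a
/// memcmp-style call, which is the effect the PR gets by calling memcmp.
fn eq_bulk(a: &str, b: &str) -> bool {
    a.len() == b.len() && a.as_bytes() == b.as_bytes()
}

fn main() {
    let pairs = [
        ("", ""),
        ("abc", "abc"),
        ("abc", "abd"),
        ("abc", "abcd"),
        ("héllo", "héllo"),
    ];
    // Both implementations must agree on every pair.
    for (a, b) in pairs.iter() {
        assert_eq!(eq_iter(a, b), eq_bulk(a, b));
    }
    println!("both implementations agree");
}
```

The length check comes first in both versions, so strings of different lengths are rejected before any bytes are compared.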
@brson: Done. How's it look?
@bors r+
📌 Commit fbd91a7 has been approved by
FYI, I had a bug in my benchmarks where LLVM 3.6 was still optimizing a good chunk of code away even in light of [...]. After I refactored the tests to test all 26 items and factored the search methods out into another crate to make sure LLVM couldn't do that trick again, I saw a much more modest 8-22% speedup with memcmp compared to the iterator comparison method.
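A rough sketch of that kind of benchmark hygiene, under stated assumptions: this is not the benchmark used for the numbers above. `keys`, `find`, and `run` are hypothetical names, the 26 keys are made up, `#[inline(never)]` is only a single-file stand-in for moving the search routine into another crate, and `std::hint::black_box` is one way to discourage the optimizer from deleting the work.

```rust
use std::hint::black_box;
use std::time::Instant;

/// 26 hypothetical keys, "key_a" through "key_z".
fn keys() -> Vec<String> {
    (b'a'..=b'z').map(|c| format!("key_{}", c as char)).collect()
}

/// Hypothetical search routine. `#[inline(never)]` stands in for placing it
/// in a separate crate, so it is not specialized for a single call site.
#[inline(never)]
fn find(haystack: &[String], needle: &str) -> Option<usize> {
    haystack.iter().position(|s| s.as_str() == needle)
}

/// Looks up every key in every round (not just one), and passes inputs and
/// results through `black_box` so the lookups are not optimized away.
fn run(haystack: &[String], rounds: usize) -> usize {
    let mut hits = 0;
    for _ in 0..rounds {
        for needle in haystack {
            let found = find(black_box(haystack), black_box(needle.as_str()));
            if black_box(found).is_some() {
                hits += 1;
            }
        }
    }
    hits
}

fn main() {
    let haystack = keys();
    let start = Instant::now();
    let hits = run(&haystack, 10_000);
    println!("{} hits in {:?}", hits, start.elapsed());
}
```

Returning the hit count from `run` gives the caller something observable to check, which also makes it harder for the compiler to treat the loop as dead code.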
Thanks @erickt. Still a fine win.