Auto vectorize replace_copy, replace_copy_if #4431
Curiously, the AVX2 auto-vectorization does indeed work even when the source and destination element sizes differ; this is achieved using … Surprisingly, the compiler goes that far and still misses the final piece needed to vectorize without this PR.
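To make the mixed-size case concrete, here is a minimal sketch of the loop shape under discussion. It is not the code from this PR or from the STL; `replace_copy_sketch` and `widen_and_replace_demo` are hypothetical names chosen for illustration. The loop does exactly one store per iteration, with the stored value picked by a conditional expression, and it reads 8-bit elements while writing 32-bit ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the STL's implementation.
// Copies src[0..n) to dst[0..n), writing new_val wherever the source
// element equals old_val. In and Out may have different sizes.
template <class In, class Out>
void replace_copy_sketch(const In* src, Out* dst, std::size_t n, In old_val, Out new_val) {
    assert((src != nullptr && dst != nullptr) || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        // One unconditional store; only the stored value depends on the data.
        dst[i] = (src[i] == old_val) ? new_val : static_cast<Out>(src[i]);
    }
}

// Usage: 8-bit source, 32-bit destination.
inline std::vector<std::uint32_t> widen_and_replace_demo() {
    const std::vector<std::uint8_t> src{1, 2, 3, 2, 5};
    std::vector<std::uint32_t> dst(src.size());
    replace_copy_sketch<std::uint8_t, std::uint32_t>(src.data(), dst.data(), src.size(), 2, 1000);
    return dst;
}
```

Whether a given compiler vectorizes this loop depends on the compiler version, the optimization level, and the target architecture flags.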
This reverts commit fa29d26.
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
Thanks for helping the compiler, said the developer who once gave a talk titled "Don't Help The Compiler"! 😹 🤪 🚀
Work around DevCom-10606350
Benchmark result before:
After:
After with /arch:AVX2:
The anomaly in the rc_if<uint64_t> AVX2 case is that the compiler generates AVX512 code and checks for AVX512 in __isa_available. My CPU doesn't have AVX512, so the scalar path runs; it is still a bit faster because it is branchless. This applies specifically to unsigned 64-bit elements only.
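As an illustration of what "branchless" means here, the sketch below contrasts two source-level forms of a replace_copy_if loop over 64-bit unsigned elements. Both functions and the demo are hypothetical and are not taken from this PR; the actual code the optimizer emits for either form (a conditional move, a vector blend, or a real branch) is up to the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical branchy form: which store executes depends on the data.
template <class Pred>
void replace_copy_if_branchy(const std::uint64_t* src, std::uint64_t* dst, std::size_t n,
                             Pred pred, std::uint64_t new_val) {
    for (std::size_t i = 0; i < n; ++i) {
        if (pred(src[i])) {
            dst[i] = new_val;
        } else {
            dst[i] = src[i];
        }
    }
}

// Hypothetical select form: a single store whose value is chosen
// by a conditional expression.
template <class Pred>
void replace_copy_if_select(const std::uint64_t* src, std::uint64_t* dst, std::size_t n,
                            Pred pred, std::uint64_t new_val) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = pred(src[i]) ? new_val : src[i];
    }
}

// Usage: replace even values with 0 using both forms and compare the results.
inline bool forms_agree_demo() {
    const std::vector<std::uint64_t> src{7, 8, 9, 10};
    std::vector<std::uint64_t> a(src.size()), b(src.size());
    const auto is_even = [](std::uint64_t x) { return x % 2 == 0; };
    replace_copy_if_branchy(src.data(), a.data(), src.size(), is_even, 0);
    replace_copy_if_select(src.data(), b.data(), src.size(), is_even, 0);
    assert(a.size() == b.size());
    return a == b && a == std::vector<std::uint64_t>{7, 0, 9, 0};
}
```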