SSE2 vectorization for bitset::to_string
#3960
Co-authored-by: Alcaro <[email protected]>
Stop pretending these are meaningful
Thanks! 😻 (And sorry for taking so long to review this. 🐌) I pushed a merge with |
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
Thanks for optimizing this function! 🚀 🚀 🎉
Resolves #3858
@Alcaro suggested the original approach of forming the mask with
_mm_and_si128(_Vec4, _mm_set1_epi64x(0x0102040810204080))
and spreading 2 bytes across the low and high 8 bytes of the SSE vector via repeated _mm_unpacklo_epi8
Results
Without vectorization:
With vectorization:
(The 15, char and 7, wchar_t results are variation between runs, but there's still a strong indication that the others are better with vectorization.)