Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve call_once performance by using Init­Once­Begin­Initialize / InitOnceComplete #688

Merged
merged 42 commits into from
May 11, 2020

Conversation

AlexGuteniev
Copy link
Contributor

Implement #606

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner April 6, 2020 17:10
stl/src/xonce2.cpp Outdated Show resolved Hide resolved
@CaseyCarter CaseyCarter added the performance Must go faster label Apr 6, 2020
stl/inc/xcall_once.h Outdated Show resolved Hide resolved
stl/src/xonce2.cpp Outdated Show resolved Hide resolved
stl/src/xonce2.cpp Outdated Show resolved Hide resolved
stl/inc/xcall_once.h Outdated Show resolved Hide resolved
stl/inc/mutex Outdated Show resolved Hide resolved
stl/src/xonce2.cpp Outdated Show resolved Hide resolved
stl/inc/mutex Outdated Show resolved Hide resolved
stl/inc/mutex Outdated Show resolved Hide resolved
stl/inc/mutex Outdated Show resolved Hide resolved
stl/inc/mutex Outdated Show resolved Hide resolved
stl/inc/mutex Outdated Show resolved Hide resolved
@AlexGuteniev
Copy link
Contributor Author

I have an idea that this change is not needed.
Instead should proceed with #593 , and implement call_once based on atomic wait.
It would be better than whatever I'm proposing here.
As on fast path it will not call any API at all, it would be header only!
Just need #593 to make supporting Windows XP.

@AlexGuteniev
Copy link
Contributor Author

Ah, forgot about ODR compatibility, when should coexist with an instance of old STL. Then the proposal in above comment is for vNext only. Let's proceed with this PR, and for proposed improvement we'll have #707

@AlexGuteniev AlexGuteniev requested a review from cbezault May 6, 2020 19:42
Copy link
Contributor

@cbezault cbezault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unblocking my review.

@@ -397,13 +398,18 @@ set(STATIC_SOURCES
${SOURCES_SATELLITE_CODECVT_IDS}
)

add_library(std_init_once_begin_initialize OBJECT IMPORTED)
add_library(std_init_once_complete OBJECT IMPORTED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbezault Why two object libraries instead of one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason. We could make them a single library and give it a nice logical name.

@StephanTLavavej
Copy link
Member

@BillyONeal has created Microsoft-internal MSVC-PR-246931 which needs a few Setup changes.

@AlexGuteniev
Copy link
Contributor Author

@StephanTLavavej , does ready to merge status mean that it was (internally) confirmed that InitOnceBeginInitialize fails only in abnormal conditions, not in low resource conditions, so it is the right thing to call abort on its failure and mark call_once as noexcept conditionally?

@StephanTLavavej
Copy link
Member

No, I missed that that was an open question - thanks for bringing it up! @BillyONeal, what's the status of that?

@BillyONeal
Copy link
Member

To the best of my reading of the Windows sources the only failing condition for this API are invalid parameters. (Since Vista the kernel team understands that failure to allocate resources for a synchronization primitive is generally unrecoverable, and so they moved the resource allocation to the thread data structure rather than the lock data structure -- that's why SRWLOCK is a pointer)

Whether they want to commit to that as in 'documented on MSDN' is another question, but I don't see what a user could hypothetically do to recover here.

@AlexGuteniev
Copy link
Contributor Author

To the best of my reading of the Windows sources

Fair enough.

So, as you inspected sources, then practically I agree with current resolution not to throw on failures.

And then possibly the same approach can be used for my WaitOnAddress / WakeByAddress issue.

(Though it still may be undesirable.
#705 resolution is against undocumented functions, so undocumented implementation details of documented functions might also be not in spirit of the resolution.

For critical sections the behavior was explicitly documented that EnterCriticalSection does not throw on low resources anymore. For AcquireSRWLock* the behavior is also implicitly documented by not providing ways to fail. But for this one, it cannot be concluded from the documentation that no-failure rule extrapolates)

but I don't see what a user could hypothetically do to recover here.

Precondition failure is unrecoverable.
But if it failed due to the lack of resource, it could have been recoverable: exception handling frees kernel object handles, and on next attempt there may be less kernel handles in use, so the next attempt may succeed.

@AlexGuteniev
Copy link
Contributor Author

To be clear, I'm satisfied with the current resolution of this in STL source.

I'm not satisfied with Windows documentation maintenance quality, and with #705 resolution that seems to assume more serious approach to documentation usage that is being taken in practice for documentation maintenance.

@BillyONeal
Copy link
Member

What is CHPE anyway?

I know it stands for 'Compiled Hybrid Portable Executable', that it behaves like another CPU as far as the libraries team is concerned, and is in support of the x86-on-arm64 emulator that's part of Windows on aarch64, but I don't really know details.

@AlexGuteniev
Copy link
Contributor Author

x86-on-arm64 emulator

If it is x86 emulator then probably it should have x86 ABI, so it would be more logical to copy x86 object, the same as ARM64 is a copy of x64

@BillyONeal
Copy link
Member

The mangling is different, so copying it wouldn't work. I asked the chpe folks what to do here and this is what they said.

@BillyONeal BillyONeal merged commit 9a0a17e into microsoft:master May 11, 2020
@BillyONeal
Copy link
Member

@AlexGuteniev Thanks for your contribution!

@pmisik
Copy link
Contributor

pmisik commented Oct 1, 2020

Hi

Sorry for asking question about legacy Windows XP.
Does this change mean that VS 16.7 drops support for Windows XP for binaries linked statically with runtime?
Redist binaries looks correct to me as they import table do not contains

KERNEL32.dll!InitOnceComplete
KERNEL32.dll!InitOnceBeginInitialize

In case I compile binary statically linked with runtime those imports are present in final executable.

Is it intentional or is it bug?
If it is intentional, is there any announcement of Windows XP support drop in STL we can refer to managers, business partners, customers…?
Is there any recommended workaround to fix this issue to target Windows XP?
Where can I find aliasobj.exe? What it actually does? I cannot find it in VS16.7, Windows SDK, Widows WDK. Do I need install specific component of Visual Studio (which one)?
Do you have any public info/plan about supported platforms in vNext, timeline…?

@BillyONeal
Copy link
Member

BillyONeal commented Oct 1, 2020

Does this change mean that VS 16.7 drops support for Windows XP for binaries linked statically with runtime?

XP support was deprecated in VS2017, and was dropped in VS2019 RTM. Despite lack of support, it might have "happened to work" because our runtime DLLs needed to still work, since the runtime DLLs are also shared with VS2015.

Note that due to the termination of all SHA-1 root signing certificates, we won't be able to update the redist on XP at all, so you'll be seeing Vista dependencies there soon (#1194). I think we have slated to keep the latest redist that worked on XP around for 2015 and 2017 customers.

Redist binaries looks correct to me as they import table do not contains

Right, because in this case the dependency comes from the import library rather than our DLLs.

Is there any recommended workaround to fix this issue to target Windows XP?

VS2019 comes with an option to install the VS2017 compiler which still supported XP:

image

Do you have any public info/plan about supported platforms in vNext, timeline…?

I think some folks have been pushing to set that release to Win7 but as far as I know Vista has most of the things we actually need.

@BillyONeal
Copy link
Member

(Sorry, to clarify, when I said VS2019 above I mean "the v142 compilers first shipped in VS2019. The VS2019 IDE still supports targeting XP because it can natively use the v141_xp compiler.)

@StephanTLavavej
Copy link
Member

I will be demanding advocating strongly for Win7 being the minimum platform for vNext as it will allow us to use TryAcquireSRWLockExclusive. Any attempt to support Vista implies the sort of perf-harming switching logic that we currently have; to impose such costs for the lifetime of vNext for an OS which is no longer supported right now makes no sense whatsoever.

@sylveon
Copy link
Contributor

sylveon commented Oct 4, 2020

I would argue to target windows 8/8.1 minimum as the new C++20 synchronization stuff makes heavy use of WaitOnAddress, and Windows 7 is not in mainstream or extended support either.

@StephanTLavavej
Copy link
Member

Given Windows 7’s popularity, it seems unlikely that we could drop targeting; I suppose we could ask.

@BillyONeal
Copy link
Member

+1 on asking to target only Win8+ but also +1 on pessimism that that will fly. Server 2008 is still supported for a at least 3 more years if folks move their workloads into Azure. (see https://www.microsoft.com/en-us/cloud-platform/windows-server-2008 )

Any attempt to support Vista implies the sort of perf-harming switching logic

Most of that perf hammering is for the ConcRT stuff though -- for example std::mutex needing to be 10x as large as it needs to be. Given that we have the OK from the kernel folks to implement our own TryAcquireSRWLockExclusive using the actual SRWLOCK data structure if TryAcquireSRWLockExclusive is not present, it would cost at most 1 extra branch for a function which isn't that commonly used, so not a huge issue.

@AlexGuteniev
Copy link
Contributor Author

it would cost at most 1 extra branch for a function which isn't that commonly used, so not a huge issue.

slightly more. GetProcAddress instead of IAT costs Control Flow Guard.

plus may cost spurious AppVerifier stops on Vista.

still reasonable price.

@pmisik
Copy link
Contributor

pmisik commented Oct 6, 2020

Windows Vista does not have backported support for SHA2 codesigning certificates however Windows Server 2008 has.
https://support.microsoft.com/en-us/help/4472027/2019-sha-2-code-signing-support-requirement-for-windows-and-wsus
I think you won't be able to update the redist on Vista.
There is also expiring KM cross-certificate issue. So, vendors making drivers or code that needs to be signed with page hashes won’t be able issue update targeting legacy Windows without SHA2 codesign support.
https://docs.microsoft.com/en-us/windows-hardware/drivers/install/software-publisher-certificate
So, I would say deprecation of Windows Vista(2K8) based kernel may be fair trade compromise to keep high adoption of next Visual Studio version/not harm performance much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants