-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Batch disk operations #7895
base: master
Are you sure you want to change the base?
Conversation
@@ -98,6 +99,22 @@ func main() { | |||
extraconfig.SetLogLevel(log.DebugLevel) | |||
} | |||
|
|||
if vchConfig.Diagnostics.DebugLevel > 4 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is here solely to allow capture of the raw soap requests issued to vSphere.
This may be a useful capability to keep, but likely needs tidying up somewhat.
The disk-specific portions of this should be preserved, but the batching logic should be replaced with the general-purpose batching from #8034 . |
This adds disk batching to the endpointVM so operations such as volume filesystem creation can occur in parallel. It also adjusts the volume cache to remove a lock that seralizes all creates. This is NOT complete as of this commit - it requires singleflight or similar wrapping the volume create path, and also consideration as to the other inspect/delete operations as they apply to a partially completed volume. This should also allow an operation to specify if it can tolerate some latency. The desire behind this is so that disk detach operations for pull can be deferred and batched with the attach operation for the subsequent child disk which should halve the number of reconfigure operations associated with a pull request.
This adds singleflight for device listing to avoid large numbers of duplicate calls after having batched disk reconfigure operations. This also adds debug logic for capturing the govmomi soap output at debug > 4
This was a minor change to resolve a compile error from failure to reference the `shared` return value from singleflight. I don't know if that's actually needed, but adjusted to preserve both the TODO comment and a separate return path based on shared vs unique result. Corrects a missing package name in a type reference, and a spelling mistake to allow for compilation. A basic test at this point shows we can pull images, create containers with volumes attached, and parallel creation of volumes (via multiple concurrent `docker volume create` commands) shows batching of reconfigure operations occurring.
15ba6ec
to
41de10a
Compare
Observed an outofspace question causing a stun of the VCH until answering the question via the UI. In this case the question was msg.hbacommon.outofspace That has NOT been fully resolved. The following steps have been taken: 1. switch to retry on transient errors in vm.reconfigure in disk manager 2. added Retry answer to the question so the operation responds in a predictable manner instead of stunning the VCH indefinitely. Additionally async error handlers have been added on some disk management paths to allow for increased parallelization of attach/detact.
41de10a
to
90ec04e
Compare
8e38c19
to
8169ff9
Compare
8169ff9
to
b5abe95
Compare
Prior test was unnecessarily complex and resulting in a non-useful message in error case The automation token can be added into drone using the drone cli: * drone secret add or * login to the drone webui, go to profile->secrets, and update via the box in the top right https://ci-vic.vmware.com/vmware/vic/settings/secrets at time of writing
b5abe95
to
ee23edb
Compare
This adds disk batching to the endpointVM so operations such as volume
filesystem creation can occur in parallel.
It also adjusts the volume cache to remove a lock that seralizes all
creates. This is NOT complete as of this commit - it requires singleflight
or similar wrapping the volume create path, and also consideration as to
the other inspect/delete operations as they apply to a partially completed
volume.
This should also allow an operation to specify if it can tolerate some
latency. The desire behind this is so that disk detach operations for pull
can be deferred and batched with the attach operation for the subsequent
child disk which should halve the number of reconfigure operations
associated with a pull request.
This has additional modifications to the docker personality processing for
multiple anonymous volumes to parallelize the volume creation.
This means that most concurrent volume operations batch, reducing
container creation time by ~50% with 10 anonymous volumes in my tests.
Related: #5677
Fixes: #7929