Isolate envtest tests to local processes, rather than requiring a cluster #319

squaremo · 2022-08-31T16:41:23Z

Proposed changes

This changes the envtest (Kubernetes environment for unit testing) configuration so that it runs an API server against etcd in a local process, rather than expecting an existing cluster. This has a couple of benefits:

coincidences and flakes are fewer, because the whole environment is created fresh each time
no chance of conflicting with a controller running in the cluster
there's no need to spin up a cluster in a cloud platform (or use e.g., kind) for integration tests

The downsides are:

there's a startup time associated with running etcd and kube-api-server each test run, which you may not have if using e.g., Docker Desktop
if you rely on other controllers to be running for a test to succeed -- e.g., the replicaset controller creating pods, so that a Deployment reaches a ready state -- that may have to be faked with a Kubernetes client. (On the other hand, usually you would want this to be explicit in your test, anyway)

This PR also makes test fail much faster, when they fail, and fixes up a couple of things that were exposed by tests timing out:

a goroutine leak
a race hazard between foreground deletion and the controller's finalizer handling

squaremo · 2022-09-13T09:10:59Z

As a follow-up: #323

stack72

Thank you for this :) the original setup was very much cut and paste from other sources so this makes it more targetted

This commit changes the envtest (Kubernetes API harness) setup to use a fake Kubernetes API rather than the (presumed) existing cluster. This insulates the tests from coincidences -- for example, data that's required for the test and happens to be present in the test author's cluster, but nowhere else. It also ensures the code under test is the code in the working directory, and not the code as it was when last deployed to the cluster. Signed-off-by: Michael Bridgen <[email protected]>

Signed-off-by: Michael Bridgen <[email protected]>

The timeout for uses of `Eventually` is 10 minutes, but a whole test case times out after 10 minutes -- so any failures will 1. take 10 minutes, and 2. tend to trigger the more opaque whole test case timeout, where you get a process dump rather than a test failure mentioning the line number. This commit reduces the timeout for indvidual `Eventually` calls to 1 minute -- if this turns out to be optimistic in specific cases, some can be increased (or it can be taken as a signal that the test should be reconsidered). Signed-off-by: Michael Bridgen <[email protected]>

Generally we expect Pulumi stacks to take O(minute) to run; but Kubernetes API operations should be near instanteneous, so if they are going to fail, they can fail quickly. Note that deleting Stack objects is a Kubernetes API operation, but will wait for the controller to run a finalizer, which is a stack operation -- so these ops get a stack-proportioned timeout. Signed-off-by: Michael Bridgen <[email protected]>

The func runCmd uses io.Pipe for both stderr and stdout of the command it runs, so writes on either can be echoed to the log as they happen; and two goroutines for doing that echoing. In process dumps, it's apparent that these goroutines leak -- they are blocked on Scan, which is presumably waiting for an EOF. This commit replaces the explicit pipe plumbing with `Command.StdoutPipe()` and `Command.StderrPipe()`. The Command machinery knows to close the readers when the process has finished writing, so Scan no longer blocks indefinitely. It's also not necessary to start _two_ new goroutines, since we already have one -- that in which we're running. Signed-off-by: Michael Bridgen <[email protected]>

Foreground propagation propagation relies on a finalizer, which interacts with the operator's finalizer in such a way that prevents deletion indefinitely and for tests to time out (I'm not sure exactly why to be honest, that's just what I observe). It shouldn't be necessary, since the test code in question goes on to wait for the deletion to have taken effect anyway. Signed-off-by: Michael Bridgen <[email protected]>

squaremo · 2022-09-13T10:33:06Z

(Just rebased after #322, since there were adjacent changes)

Signed-off-by: Michael Bridgen <[email protected]>

squaremo force-pushed the isolate-envtest branch 4 times, most recently from 9fbc2a8 to 2bb60fe Compare September 9, 2022 21:55

squaremo added the impact/no-changelog-required This issue doesn't require a CHANGELOG update label Sep 9, 2022

squaremo marked this pull request as ready for review September 9, 2022 22:03

squaremo force-pushed the isolate-envtest branch from 2bb60fe to 0194c9c Compare September 12, 2022 14:43

squaremo requested a review from stack72 September 12, 2022 21:50

squaremo mentioned this pull request Sep 13, 2022

Test matrix of supported Kubernetes versions #323

Open

squaremo requested a review from roothorp September 13, 2022 09:11

stack72 approved these changes Sep 13, 2022

View reviewed changes

squaremo added 6 commits September 13, 2022 11:32

Remove cluster setup from github workflow

0a0f632

Signed-off-by: Michael Bridgen <[email protected]>

squaremo force-pushed the isolate-envtest branch from 0194c9c to 6eda2fe Compare September 13, 2022 10:32

squaremo removed the impact/no-changelog-required This issue doesn't require a CHANGELOG update label Sep 13, 2022

Changelog entry for goroutine leak fix

f37ebad

Signed-off-by: Michael Bridgen <[email protected]>

roothorp approved these changes Sep 13, 2022

View reviewed changes

squaremo merged commit 607f398 into master Sep 13, 2022

squaremo deleted the isolate-envtest branch September 13, 2022 13:19

squaremo mentioned this pull request Sep 16, 2022

Fix GCP service account pulumi/ci-mgmt#251

Closed

squaremo mentioned this pull request Oct 4, 2022

Parallelise and randomize tests #331

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Isolate envtest tests to local processes, rather than requiring a cluster #319

Isolate envtest tests to local processes, rather than requiring a cluster #319

squaremo commented Aug 31, 2022 •

edited

Loading

squaremo commented Sep 13, 2022

stack72 left a comment

squaremo commented Sep 13, 2022

Isolate envtest tests to local processes, rather than requiring a cluster #319

Isolate envtest tests to local processes, rather than requiring a cluster #319

Conversation

squaremo commented Aug 31, 2022 • edited Loading

Proposed changes

squaremo commented Sep 13, 2022

stack72 left a comment

Choose a reason for hiding this comment

squaremo commented Sep 13, 2022

squaremo commented Aug 31, 2022 •

edited

Loading