Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native multi-arch ClusterBuildStrategy #1634

Merged

Conversation

aleskandro
Copy link
Contributor

@aleskandro aleskandro commented Jun 26, 2024

Changes

This PR proposes a sample multi-arch native ClusterBuildStrategy. The ClusterBuildStrategy runs as a main orchestrator pod. It creates one "agent" job for each architecture requested by the Build. The agent jobs are responsible for building the container image and coordinate with the orchestrator pod via messages in FIFO pipes sent through the kubectl client.

The agent jobs run through the following coordination barriers:

  1. Wait for the assets to be uploaded (source code, pull secrets, etc...)
  2. Waiting for the image download to start
  3. Waiting for the image download to finish

The orchestrator pod:

  1. Create the agent jobs
  2. Upload the assets and signal the completion of the assets upload through a write to the FIFO pipe in each agent job
  3. Start the download of the oci archive
  4. Inform the agent jobs the download is completed (so that they can complete)
  5. Create a manifest-list and push it to the final registry.

The service account that runs the strategy must be able to create, list, get and watch Jobs and Pods. It also needs to be allowed the create verb for the pods/exec resource. In OKD/Openshift, the service account must also be able to create pods requiring the privileged SecurityContextConstraints.

Relates to #1119

/kind feature

Submitter Checklist

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

Add sample build strategy to orchestrate multi-arch container image builds. The service account executing this build strategy must have the ability to manage Jobs and Pods, as well as have the ability to exec into Pods. 

/cc @adambkaplan

@pull-request-size pull-request-size bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 26, 2024
@openshift-ci openshift-ci bot requested a review from adambkaplan June 26, 2024 10:59
@openshift-ci openshift-ci bot added the release-note Label for when a PR has specified a release note label Jun 26, 2024
Copy link
Member

@adambkaplan adambkaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@aleskandro thanks for the PR! I took the liberty of updating the PR description to make the release notes linter happy, and use the word "agent" to promote inclusive language.

🤔 I'll need to check if our test suite automatically runs a smoke test for each sample build strategy, or if that is something we need to add to the e2e test code directly.

Copy link
Contributor

openshift-ci bot commented Jun 27, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 27, 2024
@SaschaSchwarze0
Copy link
Member

🤔 I'll need to check if our test suite automatically runs a smoke test for each sample build strategy, or if that is something we need to add to the e2e test code directly.

Needs to be added specifically. Though, what would we test in GitHub actions, there is no way to create a KinD cluster with nodes of different architectures ...

@aleskandro
Copy link
Contributor Author

aleskandro commented Jun 27, 2024

🤔 I'll need to check if our test suite automatically runs a smoke test for each sample build strategy, or if that is something we need to add to the e2e test code directly.

Needs to be added specifically. Though, what would we test in GitHub actions, there is no way to create a KinD cluster with nodes of different architectures ...

I think for the matter of testing, it is good enough to build a single-arch image into a manifest-list, i.e., having only one element in the architectures list of the Build object. For sure some for loops will execute only once, but that should be ok?

What do you think?

kube_version=$(curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
"https:${KUBERNETES_PORT#tcp:}/version" | sed -n 's/^.*gitVersion.*"v\(.*\)+.*$/\1/p')
arch=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')
curl -L -o /usr/local/bin/kubectl \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just adding a note for this as this will be executed every time a new build is spawn and reaches an outside network. I'm not sure if we can have an image with kubectl (and how we should consider the right version to use).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion we had live - agree that this step can be avoided if a single image with all needed binaries is published.

@aleskandro aleskandro force-pushed the multiarch-native-builds branch from ac2ea3e to e674b83 Compare July 8, 2024 12:04
@aleskandro aleskandro force-pushed the multiarch-native-builds branch 5 times, most recently from c021cd6 to 5373a51 Compare July 19, 2024 23:45
@aleskandro aleskandro requested a review from adambkaplan July 23, 2024 19:25
kube_version=$(curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
"https:${KUBERNETES_PORT#tcp:}/version" | sed -n 's/^.*gitVersion.*"v\(.*\)+.*$/\1/p')
arch=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')
curl -L -o /usr/local/bin/kubectl \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion we had live - agree that this step can be avoided if a single image with all needed binaries is published.

@aleskandro aleskandro force-pushed the multiarch-native-builds branch 7 times, most recently from b135d1d to ee72f5e Compare July 24, 2024 21:45
@@ -0,0 +1,14 @@
# NOTE: This is needed secifically in OKD/OpenShift environments.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @adambkaplan

Thanks @dorzel

@openshift-ci openshift-ci bot requested a review from adambkaplan July 24, 2024 21:56
@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 25, 2024
@aleskandro aleskandro force-pushed the multiarch-native-builds branch from ee72f5e to 433160e Compare July 25, 2024 11:09
@aleskandro
Copy link
Contributor Author

/unhold

(added by mistake)

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 25, 2024
@aleskandro
Copy link
Contributor Author

This failure on e2e (v1.27.11, v0.59.2) seems unrelated:

   Copying blob sha256:df2c3b2eb7cc63351bb32f26457bbe0402af8082548f26975f0c329bc7841881
  Copying blob sha256:efe636eac583776a8a114d50fef15bc65b648f3d2bb53326cf1f21cc5ef2b3ae
  Copying blob sha256:fe17849545bb51455d3f7c8773ded2dbb1d6668a85bd00564573a4b88afd36f6
  Error: creating build container: copying system image from manifest list: writing blob: adding layer with blob "sha256:fe17849545bb51455d3f7c8773ded2dbb1d6668a85bd00564573a4b88afd36f6": creating read-only layer with ID "fdd09facb36803af0795aedc41a5c13fdb1c835d7ff211ddfedac7813bd2dd80": no space left on device

This other on e2e (v1.27.11, v0.53.5), too, as the same test seemed to pass with v1alpha1.

• [FAILED] [95.119 seconds]
Using One-Off Builds Embed BuildSpec in BuildRun [It] should build an image using Buildpacks and a Git source
/home/runner/work/build/build/test/e2e/v1beta1/e2e_one_off_builds_test.go:64

  Timeline >>
  2024-07-25T14:06:34Z 1 Still waiting for build run 'onoff-fjrst' to succeed.
  [FAILED] in [It] - /home/runner/work/build/build/test/e2e/v1beta1/validators_test.go:134 @ 07/25/24 14:07:09.039
  2024-07-25T14:07:09Z 1 Print failed BuildRun's log
  [PANICKED] in [AfterEach] - /opt/hostedtoolcache/go/1.21.12/x64/src/runtime/panic.go:261 @ 07/25/24 14:07:09.041
  << Timeline

  [FAILED] BuildRun status doesn't move to Succeeded
  Expected
      <v1.ConditionStatus>: False
  not to equal
      <v1.ConditionStatus>: False
  In [It] at: /home/runner/work/build/build/test/e2e/v1beta1/validators_test.go:134 @ 07/25/24 14:07:09.039

  Full Stack Trace
    github.com/shipwright-io/build/test/e2e/v1beta1_test.validateBuildRunToSucceed.func1()
    	/home/runner/work/build/build/test/e2e/v1beta1/validators_test.go:134 +0x322
    reflect.Value.call({0x1a327a0?, 0xc000688390?, 0xc0007f5a20?}, {0x1d9a3e0, 0x4}, {0x3060d60, 0x0, 0xc0007f5ab2?})
    	/opt/hostedtoolcache/go/1.21.12/x64/src/reflect/value.go:596 +0xce7
    reflect.Value.Call({0x1a327a0?, 0xc000688390?, 0x0?}, {0x3060d60?, 0x1?, 0x3?})
    	/opt/hostedtoolcache/go/1.21.12/x64/src/reflect/value.go:380 +0xb9
    github.com/onsi/gomega/internal.(*AsyncAssertion).buildActualPoller.func3()
    	/home/runner/work/build/build/vendor/github.com/onsi/gomega/internal/async_assertion.go:325 +0x11f
    github.com/onsi/gomega/internal.(*AsyncAssertion).match(0xc0001f6d90, {0x2090c18?, 0xc0004e2b80}, 0x1, {0xc0004e2b90, 0x1, 0x1})
    	/home/runner/work/build/build/vendor/github.com/onsi/gomega/internal/async_assertion.go:540 +0x82c
    github.com/onsi/gomega/internal.(*AsyncAssertion).Should(0xc0001f6d90, {0x2090c18, 0xc0004e2b80}, {0xc0004e2b90, 0x1, 0x1})
    	/home/runner/work/build/build/vendor/github.com/onsi/gomega/internal/async_assertion.go:145 +0x86
    github.com/shipwright-io/build/test/e2e/v1beta1_test.validateBuildRunToSucceed(0xc0001a04d0, 0xc0002d8f00)
    	/home/runner/work/build/build/test/e2e/v1beta1/validators_test.go:144 +0x362
    github.com/shipwright-io/build/test/e2e/v1beta1_test.glob..func4.2.2()
    	/home/runner/work/build/build/test/e2e/v1beta1/e2e_one_off_builds_test.go:80 +0x450

  There were additional failures detected.  To view them in detail run ginkgo -vv

@adambkaplan
Copy link
Member

@SaschaSchwarze0 do you know if GitHub actions run the matrixed tests on the same machine? That could explain the out of disk issue.

@SaschaSchwarze0
Copy link
Member

@SaschaSchwarze0 do you know if GitHub actions run the matrixed tests on the same machine? That could explain the out of disk issue.

I would assume not given a common use case for matrix is also to use it for the os itself.

Disk space had not been an issue since we started to use https://github.com/shipwright-io/build/blob/v0.13.0/.github/workflows/ci.yml#L135-L144. I might not have looked too carefully, but did we have it somewhere else outside of this PR? Running concurrent jobs, letting them running until the result is streamed over (and therefore duplicated) to the buildrun pod and not deleting them (not seeing a kubectl delete for them and relying on ttlSecondsAfterFinished of one day) certainly can be disk-intensive.

@aleskandro
Copy link
Contributor Author

@SaschaSchwarze0 do you know if GitHub actions run the matrixed tests on the same machine? That could explain the out of disk issue.

I would assume not given a common use case for matrix is also to use it for the os itself.

Disk space had not been an issue since we started to use https://github.com/shipwright-io/build/blob/v0.13.0/.github/workflows/ci.yml#L135-L144. I might not have looked too carefully, but did we have it somewhere else outside of this PR? Running concurrent jobs, letting them running until the result is streamed over (and therefore duplicated) to the buildrun pod and not deleting them (not seeing a kubectl delete for them and relying on ttlSecondsAfterFinished of one day) certainly can be disk-intensive.

Builds and BuildRuns should be deleted after each case, according to

AfterEach(func() {
if CurrentSpecReport().Failed() {
printTestFailureDebugInfo(testBuild, testBuild.Namespace, testID)
} else if buildRun != nil {
validateServiceAccountDeletion(buildRun, testBuild.Namespace)
}
if buildRun != nil {
testBuild.DeleteBR(buildRun.Name)
buildRun = nil
}
if build != nil {
testBuild.DeleteBuild(build.Name)
build = nil
}
})

I'm running it a few times locally to see if I find anything that can be related to my changes... I can't re-run it here without your help though.

@SaschaSchwarze0
Copy link
Member

Builds and BuildRuns should be deleted after each case

Indeed they are. I missed the ownerreference in the job you are creating.

"https:${KUBERNETES_PORT#tcp:}/version" | sed -n 's/^.*gitVersion.*"v\(.*\)\+".*$/\1/p')
arch=$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')
curl -L -o /usr/local/bin/kubectl \
"https://storage.googleapis.com/kubernetes-release/release/v${kube_version}/bin/linux/${arch}/kubectl"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into this failing when kubernetes version was reported by the server as v1.28.9+416ecaf. Google wants only v1.28.9 it seems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dorzel good point, I just submitted a change that should cover that case too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commits adds a sample multi-arch native ClusterBuildStrategy.
The ClusterBuildStrategy runs as a main orchestrator pod. It creates one slave job for each architecture requested by the Build. The slave jobs are responsible for building the container image and coordinate with the orchestrator pod via messages in FIFO pipes sent through the kubectl client.

The slave jobs run through the following coordination barriers:
1. Wait for the assets to be uploaded (source code, pull secrets, etc...)
2. Waiting for the image download to start
3. Waiting for the image download to finish

The orchestrator pod:
1. Create the slave jobs
2. Upload the assets and signal the completion of the assets upload through a write to the FIFO pipe in each slave job
3. Start the download of the oci archive
4. Inform the slave jobs the download is completed
5. Creates a manifest-list and push it to the final registry.

The service account that runs the strategy must be able to list, get and watch jobs and pods. It also needs to be allowed the create verb for the pods/exec resource, and create for the jobs one.
In OKD/Openshift, the service account must also be able to create pods requiring the privileged SecurityContextConstraints.
… and E2E cases

When objects are not cluster-scoped, like rolebindings or local build strategies, if the TEST_NAMESPACE variable is set, the objects should be applied to that given namespace. If not, apply to the current context's namespace.
Some test cases were also implementing with an additional creation step for the same buildstrategies defined in the same folder recursively applied by the install-strategies target. This commit deletes those function calls to rely on the Makefile target for creating the non-cluster-scoped build strategies too.
@aleskandro aleskandro force-pushed the multiarch-native-builds branch from 433160e to be5660b Compare July 30, 2024 17:06
Copy link
Member

@SaschaSchwarze0 SaschaSchwarze0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 4, 2024
@SaschaSchwarze0 SaschaSchwarze0 added this to the release-v0.14.0 milestone Aug 4, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 78a3c7b into shipwright-io:main Aug 4, 2024
16 checks passed
@SaschaSchwarze0 SaschaSchwarze0 added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Label for when a PR has specified a release note size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants