Inherit scheduling for workloads from Compliance Operator manager #756

Merged
merged 1 commit into from
Dec 10, 2021

@JAORMX JAORMX commented Dec 9, 2021

When the Compliance Operator was first written, some assumptions were made
that were specific to OpenShift. One of these was that the
`node-role.kubernetes.io/` label prefix was widely used by other
distros. An even worse assumption was that the `master` role was always
available. These assumptions hold true for OpenShift, but not for other
distros like EKS.

This PR removes these assumptions by making every workload inherit the
same `nodeSelector` and `tolerations` from the controller manager. This
is done by passing this information to every controller. However, using
the information is optional for each controller.
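
As a rough illustration of the inheritance described above, here is a minimal sketch in Go. It is not the PR's code: `Toleration`, `PodSpec`, `SchedulingInfo`, and `applyScheduling` are hypothetical local stand-ins (the real operator would use the `corev1` types), chosen so the example runs without Kubernetes dependencies.

```go
package main

import "fmt"

// Toleration is a local stand-in for corev1.Toleration (assumption:
// only the fields needed for the illustration are included).
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// SchedulingInfo is a hypothetical container for the values the
// manager passes to every controller.
type SchedulingInfo struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// PodSpec is a local stand-in for the scheduling fields of corev1.PodSpec.
type PodSpec struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// applyScheduling copies the inherited values into spec. It copies the
// map and slice so later changes to the workload do not alter the
// shared SchedulingInfo.
func applyScheduling(spec *PodSpec, si SchedulingInfo) {
	spec.NodeSelector = make(map[string]string, len(si.NodeSelector))
	for k, v := range si.NodeSelector {
		spec.NodeSelector[k] = v
	}
	spec.Tolerations = append([]Toleration(nil), si.Tolerations...)
}

func main() {
	// Example values only; a real manager would read these from its own pod.
	si := SchedulingInfo{
		NodeSelector: map[string]string{"kubernetes.io/os": "linux"},
		Tolerations:  []Toleration{{Key: "dedicated", Operator: "Exists", Effect: "NoSchedule"}},
	}
	var workload PodSpec
	applyScheduling(&workload, si)
	fmt.Printf("%+v\n", workload)
}
```

A controller that does not want the inherited scheduling would simply skip the call to `applyScheduling`, which matches the "optional" note above.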

When CO starts, the manager queries its own pod to fetch this
information. Failure to do so results in an error.
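
The startup lookup could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: `PodReader`, `fakeReader`, `fetchSchedulingInfo`, and the namespace and pod names are all hypothetical, and the in-memory reader stands in for a real API client so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Toleration and Pod are local stand-ins for the corev1 types.
type Toleration struct{ Key, Operator, Effect string }

type Pod struct {
	Name, Namespace string
	NodeSelector    map[string]string
	Tolerations     []Toleration
}

// SchedulingInfo holds what the manager hands to the controllers.
type SchedulingInfo struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// PodReader is a hypothetical, minimal view of an API reader.
type PodReader interface {
	GetPod(namespace, name string) (*Pod, error)
}

// fakeReader is an in-memory PodReader keyed by "namespace/name".
type fakeReader map[string]*Pod

func (f fakeReader) GetPod(namespace, name string) (*Pod, error) {
	p, ok := f[namespace+"/"+name]
	if !ok {
		return nil, errors.New("pod not found: " + namespace + "/" + name)
	}
	return p, nil
}

// fetchSchedulingInfo reads the manager's own pod and extracts its
// nodeSelector and tolerations, returning an error if the pod cannot be read.
func fetchSchedulingInfo(r PodReader, namespace, name string) (SchedulingInfo, error) {
	pod, err := r.GetPod(namespace, name)
	if err != nil {
		return SchedulingInfo{}, fmt.Errorf("getting manager pod: %w", err)
	}
	return SchedulingInfo{NodeSelector: pod.NodeSelector, Tolerations: pod.Tolerations}, nil
}

func main() {
	reader := fakeReader{"example-namespace/manager-pod": &Pod{
		Name:         "manager-pod",
		Namespace:    "example-namespace",
		NodeSelector: map[string]string{"kubernetes.io/os": "linux"},
	}}
	si, err := fetchSchedulingInfo(reader, "example-namespace", "manager-pod")
	if err != nil {
		// Mirrors the described behavior: startup fails if the lookup fails.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", si)
}
```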

Since we fetch the operator name and namespace from constants, the
constants were modified to no longer depend on k8sutils. That
library will no longer be available in future versions
of operator-sdk, so it's better to move away from it now, little by
little.
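
One way such a value could be resolved without k8sutils is sketched below. This is only a guess at the general shape, not what the PR does: the environment variable name, the default namespace, and the `operatorNamespace` helper are all hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

const (
	// Hypothetical default; not the operator's actual namespace constant.
	defaultOperatorNamespace = "example-namespace"
	// Hypothetical environment variable name used for overrides.
	namespaceEnvVar = "OPERATOR_NAMESPACE"
)

// operatorNamespace returns the namespace from the environment when set
// and non-empty, falling back to the compiled-in default otherwise.
// The lookup function is injected so the logic can be tested without
// touching the process environment.
func operatorNamespace(lookup func(string) (string, bool)) string {
	if ns, ok := lookup(namespaceEnvVar); ok && ns != "" {
		return ns
	}
	return defaultOperatorNamespace
}

func main() {
	fmt.Println(operatorNamespace(os.LookupEnv))
}
```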

Signed-off-by: Juan Antonio Osorio Robles [email protected]

@JAORMX JAORMX requested review from jhrozek, Vincent056 and mrogers950 and removed request for jhrozek and Vincent056 December 9, 2021 07:40
@openshift-ci openshift-ci bot requested a review from Vincent056 December 9, 2021 07:40
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 9, 2021
@JAORMX JAORMX requested a review from rhmdnd December 9, 2021 07:40
@JAORMX JAORMX changed the title from "Inherit schedling for workloads from Compliance Operator manager" to "Inherit scheduling for workloads from Compliance Operator manager" Dec 9, 2021
@rhmdnd rhmdnd left a comment

Quick initial pass, but I'd like to try this out on an EKS cluster.

@@ -219,8 +220,14 @@ func RunOperator(cmd *cobra.Command, args []string) {
		os.Exit(1)
	}

	getSIErr, si := getSchedulingInfo(ctx, mgr.GetAPIReader())
	if getSIErr != nil {
		log.Error(getSIErr, "Getting control plane schedling info")

nit: scheduling*

- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 120

OK, so by removing these here, we're inheriting a set of global tolerations from somewhere?

Was this here to prevent pods from getting evicted on node updates? @JAORMX

@Vincent056 Vincent056 left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 10, 2021
openshift-ci bot commented Dec 10, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JAORMX, Vincent056

@openshift-merge-robot openshift-merge-robot merged commit 78f9bc8 into openshift:master Dec 10, 2021