allow karpenter to filter and create nodes with any labels #1046

Open
elmiko opened this issue Feb 26, 2024 · 51 comments · May be fixed by #1908
Labels
kind/feature Categorizes issue or PR as related to a new feature. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@elmiko
Contributor

elmiko commented Feb 26, 2024

Description

What problem are you trying to solve?

As a karpenter user running on openshift, i would like new nodes to have the label node-role.kubernetes.io/worker so that my cluster automation behaves as i expect. Karpenter does not allow certain labels, notably non-well-known labels under the kubernetes.io prefix, to be used for filtering or to be applied to new nodes. If karpenter could filter for, and add, any label, this would solve my problem.

How important is this feature to you?

This is fairly important for openshift as the label is used frequently. It is possible to work around this, but for long term usage on openshift a permanent solution would be better.

Openshift is not unique in this respect either; other cluster infrastructures have similar requirements. If karpenter is adding labels after a node has been created (i.e. not as a flag to the kubelet), then it should be allowed to add any label that the user requests.

This issue was discussed at the Feb. 26 2024 community meeting.
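
to make the restriction concrete, here is a minimal illustrative sketch (my own simplification, not karpenter's actual code, and with the well-known list heavily abbreviated) of the kind of check that rejects these labels for both filtering and node creation:

package main

import (
	"fmt"
	"strings"
)

// Abbreviated stand-in for the well-known keys karpenter accepts.
var wellKnown = map[string]bool{
	"kubernetes.io/arch":               true,
	"kubernetes.io/os":                 true,
	"node.kubernetes.io/instance-type": true,
	"topology.kubernetes.io/zone":      true,
}

// Any non-well-known key under these domains is rejected, both in NodePool
// requirements (filtering) and in template labels (node creation).
var restrictedDomains = []string{"kubernetes.io", "k8s.io"}

func allowed(key string) bool {
	if wellKnown[key] {
		return true
	}
	domain := key
	if i := strings.Index(key, "/"); i >= 0 {
		domain = key[:i]
	}
	for _, d := range restrictedDomains {
		if domain == d || strings.HasSuffix(domain, "."+d) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allowed("node-role.kubernetes.io/worker")) // false: the label this issue needs
	fmt.Println(allowed("example.com/team"))               // true: unrestricted domain
}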

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@elmiko elmiko added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 26, 2024
@elmiko elmiko changed the title allow karpenter to create nodes with non-well-known kubernetes labels allow karpenter to filter and create nodes with non-well-known kubernetes labels Feb 26, 2024
@jonathan-innis
Member

Looks like there are a couple other issues that were asking for similar functionality here: aws/karpenter-provider-aws#1941 and #660

@jonathan-innis
Member

What's the security implication of allowing users to set this custom label? From a little sleuthing, it doesn't seem to be any different from the node-restriction.kubernetes.io label that we already support today.

@sftim

sftim commented Mar 1, 2024

As Karpenter is a Kubernetes project, I'm wary of even indirectly encouraging use of unregistered labels.

If node-role.kubernetes.io/worker should become an official label, let's get that registered first - it's not a particularly difficult process. However, registration first, then adoption.

“OpenShift uses it” does not count as grounds to add support into Kubernetes code.


OpenShift folks, if you want a label to mark nodes as OpenShift workers, you can make one, eg openshift.example/worker: true

@sftim

sftim commented Mar 1, 2024

security implication

Nodes either can or can't set arbitrary labels on themselves, depending on what controls you've set. So long as we leave the kubelet to set that label, we're fine. If Karpenter starts setting a label on a node, it could provide a way to bypass restrictions on what labels nodes can have (although, letting someone write to your NodePools is already bad practice).

@elmiko
Contributor Author

elmiko commented Mar 5, 2024

As Karpenter is a Kubernetes project, I'm wary of even indirectly encouraging use of unregistered labels.

this makes me wonder if the kubelet should be producing warnings/errors when the user attempts to use an unregistered label under a well-known prefix?

karpenter may be preventing something that the kubelet allows.

If node-role.kubernetes.io/worker should become an official label, let's get that registered first - it's not a particularly difficult process. However, registration first, then adoption.

total agreement from me, but this is historical as well. openshift has been using the label for a long time; i only mention this to help add context around the effort to change the value.

“OpenShift uses it” does not count as grounds to add support into Kubernetes code.

agreed, but let's take openshift out of the equation. as a user, i would like karpenter to create nodes with unregistered labels from the kubernetes.io prefix, i can do this today on the kubelet command line and i would like karpenter to do the same for me.

i think if this sounds like an unreasonable request at the karpenter level, then we should consider making the kubelet produce warnings when this happens because we are now asking karpenter to gate activity that the kubelet allows.

OpenShift folks, if you want a label to mark nodes as OpenShift workers, you can make one, eg openshift.example/worker: true

no objection from me, but i think this is orthogonal to the true issue here. i only mentioned openshift to help illuminate the use case.

So long as we leave the kubelet to set that label, we're fine. If Karpenter starts setting a label on a node, it could provide a way to bypass restrictions on what labels nodes can have (although, letting someone write to your NodePools is already bad practice).

i think this question is important, if karpenter is adding the label after node creation, then i totally agree with it preventing unregistered labels from being added. but if karpenter is passing the labels down to the kubelet configuration, then i think we should reconsider.

thanks for the thoughtful discussion @sftim and @jonathan-innis

@elmiko
Contributor Author

elmiko commented Mar 5, 2024

one additional thought on this: even if karpenter cannot create nodes with the unregistered labels, it would be nice for it to know when a node will be created with that label so that it can properly predict which nodes should be created.

essentially, i am fine with karpenter not placing the unregistered label on the node, but i should be able to tell karpenter "hey, if you see these pods, put them on nodes with this label, and oh btw, these node classes will have that label".

@sftim

sftim commented Mar 5, 2024

We have code in Kubernetes to prevent the node setting disallowed labels. You can use Karpenter to bypass that; it's your cluster.

However, should we encourage it? I say we should not. Upstream Kubernetes and in-project code should abide by our architecture standards, even if individual end users intend to disregard them.

TL;DR: offering an in-project security footgun feels like an obviously bad choice.

@elmiko
Contributor Author

elmiko commented Mar 5, 2024

However, should we encourage it? I say we should not.

agreed

TL;DR: offering an in-project security footgun feels like an obviously bad choice.

just to be clear, do you consider the footgun to be: karpenter adding disallowed labels after the node is created, karpenter adding disallowed labels to the command line flag list for kubelet, or both?

@elmiko
Contributor Author

elmiko commented Mar 5, 2024

reading this a little deeper, https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction

i have a question about this interaction with the admission controllers, if a user has disabled the admission controller for node restriction, will that change be reflected in karpenter's restriction as well?

it seems like we would want karpenter to reflect what the admission controllers are doing, so if an update to a node object gets rejected wouldn't we want to surface that error to the user?

@sftim

sftim commented Mar 5, 2024

I suppose it's an outright footgun if a compromised node can, in any way, set labels on itself it shouldn't be able to, or cause those to be set.

I'm also opposed to tacitly encouraging people to use Kubernetes-namespaced labels that Kubernetes doesn't recognize. We shouldn't staff work that encourages doing more of the wrong thing.

@sftim

sftim commented Mar 5, 2024

i have a question about this interaction with the admission controllers, if a user has disabled the admission controller for node restriction, will that change be reflected in karpenter's restriction as well?

No; in general, Karpenter cannot know what admission controllers are in use.

@elmiko
Contributor Author

elmiko commented Mar 5, 2024

No; in general, Karpenter cannot know what admission controllers are in use.

perhaps i phrased that poorly, i'll try to restate.

imagine a user has the noderestriction admission controller disabled. the user configures karpenter to make changes that would otherwise be prevented by the admission controller. karpenter refuses to allow the changes because it has duplicated the default rules for the noderestriction admission controller. should karpenter be the component responsible for rejecting the configuration?

i'm wondering if karpenter should be returning the errors from updating nodes/pods instead of interposing itself for admission.

We shouldn't staff work that encourages doing more of the wrong thing.

agreed, and this makes a lot of sense to me.

@sftim

sftim commented Mar 5, 2024

should karpenter be the component responsible for rejecting the configuration?

I think so, because it's much easier to maintain an important security guarantee this way.
It sounds like you think Karpenter is an admission controller; it's not.

@elmiko
Contributor Author

elmiko commented Mar 5, 2024

It sounds like you think Karpenter is an admission controller; it's not.

i don't think i do, but i am trying to reconcile why the rules for the unregistered labels are encoded/validated in karpenter instead of deferring to the apiserver to return an error. (perhaps i am not understanding this flow properly).

@sftim

sftim commented Mar 6, 2024

i don't think i do, but i am trying to reconcile why the rules for the unregistered labels are encoded/validated in karpenter instead of deferring to the apiserver to return an error. (perhaps i am not understanding this flow properly).

The API server's restrictions (from the node restriction admission controller) apply exclusively to label changes made by nodes. Karpenter is not a node, so those restrictions wouldn't be enforced.

@elmiko
Contributor Author

elmiko commented Mar 6, 2024

exclusively to label changes made by nodes

that's what i was missing, thank you!

@engedaam
Contributor

/remove-label needs-triage

@k8s-ci-robot
Contributor

@engedaam: The label(s) /remove-label needs-triage cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/remove-label needs-triage

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@engedaam
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Mar 23, 2024
@elmiko
Contributor Author

elmiko commented Mar 26, 2024

happy to see this considered for inclusion, but given the conversation between @sftim and myself, i'm not sure this is a good idea to carry forward. it seems like a possible point for a security escalation.

@jonathan-innis jonathan-innis removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 13, 2024
@sftim

sftim commented Jul 18, 2024

I think we should ask the OpenShift folks to consider not using the wrong label, and leave it at that. OpenShift is entirely welcome to use OpenShift labels for node role, it just shouldn't bypass K8s processes for registering labels.

@sftim

sftim commented Jul 18, 2024

For people using Karpenter who want to work around this, they could use a different label (eg internal-node-role: worker) and a contrib controller that copies internal-node-role to whatever other label they want set. But we shouldn't make that contrib code part of Kubernetes either, because it'd be furthering a mistake.
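
A rough sketch of what that contrib controller could look like (hypothetical: internal-node-role is the made-up source key from above, the target role name is whatever its value holds, informers and error handling are omitted for brevity, and client-go is assumed):

package main

import (
	"context"
	"encoding/json"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

const sourceLabel = "internal-node-role" // unrestricted; set via NodePool

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	for {
		// Polling keeps the sketch short; a real controller would use an informer.
		nodes, err := client.CoreV1().Nodes().List(context.TODO(),
			metav1.ListOptions{LabelSelector: sourceLabel})
		if err == nil {
			for _, n := range nodes.Items {
				role := n.Labels[sourceLabel]
				if role == "" || role == "control-plane" {
					continue // never grant control-plane
				}
				target := "node-role.kubernetes.io/" + role
				if _, done := n.Labels[target]; done {
					continue // already copied
				}
				patch, _ := json.Marshal(map[string]any{
					"metadata": map[string]any{"labels": map[string]string{target: ""}},
				})
				// Succeeds because this controller, unlike the kubelet, is not
				// subject to NodeRestriction's self-label allowlist.
				client.CoreV1().Nodes().Patch(context.TODO(), n.Name,
					types.StrategicMergePatchType, patch, metav1.PatchOptions{})
			}
		}
		time.Sleep(30 * time.Second)
	}
}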

@diranged

So given that Karpenter already applies Taints to Node objects upon creation, I can't see why it would be any different to allow Karpenter to apply labels (including node-role.kubernetes.io/<role>).

@diranged

Kubernetes enforces the rules here; a node autoscaler, as a client of the Kubernetes API, isn't able by itself to bypass the security rule.

I don't believe that is true at all. For example, the Spot.io Ocean Cluster Controller absolutely labels our nodes with the node-role label for us on startup. The only rule here is that the node itself is unable to apply that label.

@sftim

sftim commented Jul 22, 2024 via email

@sftim

sftim commented Jul 22, 2024 via email

@diranged

Should node autoscalers try to set unofficial labels in the Kubernetes namespace? It's possible to code up, but I (still) don't recommend it; in fact, I actively oppose making that change to Karpenter

I should fix my comment - I want to use the kubernetes.io/role=xxx label - specifically defined here https://github.com/kubernetes/kubernetes/blob/v1.30.3/staging/src/k8s.io/kubectl/pkg/describe/interface.go#L33-L34 as:

// NodeLabelRole specifies the role of a node
NodeLabelRole = "kubernetes.io/role"

Can you elaborate on why you oppose Karpenter managing this label? I can't find anything in the Kubernetes code or documentation that indicates we shouldn't set this label.

@elmiko
Contributor Author

elmiko commented Jul 22, 2024

I think we should ask the OpenShift folks to consider not using the wrong label, and leave it at that. OpenShift is entirely welcome to use OpenShift labels for node role, it just shouldn't bypass K8s processes for registering labels.

i think this is sound advice and something i have advocated for internally at red hat. i only brought up openshift in my issue as it pertains to my specific problem here. i am not attempting to suggest that we need to change kubernetes core behaviors based on my needs, i only added it to help with context.

i appreciate the extended discussions from @sftim and the explanation for the security-related part of this issue.

my confusion is more around karpenter attempting to block everything under the kubernetes.io label prefix; i think this is similar to what @diranged is also pointing out. on this point, i'm fine to accept the community's decision on what karpenter should or should not allow, i was mostly looking to have an open discussion about it.

@sftim

sftim commented Jul 23, 2024 via email

@diranged

creating nodes that will run a kubelet that then self-labels

@sftim,
So that is true - if you are using an amiFamily like Bottlerocket where Karpenter creates/merges the userData... however, we were actually running in Custom mode, where Karpenter is not mutating or managing the userData at all, and it still sets the labels on the nodes. Karpenter is (I think) pre-creating the Node object and populating the labels that way... which would allow it to set the kubernetes.io/role label just fine.

@sftim

sftim commented Nov 22, 2024

Also see kubernetes/website#48803

The Kubernetes docs about the node-role.kubernetes.io/* labels turn out to be wrong.

@liggitt

liggitt commented Nov 22, 2024

Karpenter creates nodes that have labels by, AIUI, creating nodes that will run a kubelet that then self-labels.

If that is the case, then it is reasonable to limit the labels it accepts to ones the kubelet will be allowed to self-label with.

https://github.com/kubernetes/kubernetes/blob/35d098aaa0a912171c46345e2c0a7b54d6046ef4/plugin/pkg/admission/noderestriction/admission.go#L576 and https://github.com/kubernetes/kubernetes/blob/35d098aaa0a912171c46345e2c0a7b54d6046ef4/staging/src/k8s.io/kubelet/pkg/apis/well_known_labels.go#L67 are the effective restriction on the kubernetes.io and k8s.io labels kubelets can set.
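
Condensed, that check amounts to something like the following sketch, which calls the exported helper from the linked staging repo (behavior paraphrased in the comments; confirm against the sources above):

package main

import (
	"fmt"

	kubeletapis "k8s.io/kubelet/pkg/apis"
)

func main() {
	// NodeRestriction rejects kubelet self-labeling with kubernetes.io/k8s.io
	// keys unless IsKubeletLabel returns true for them.
	for _, key := range []string{
		"kubernetes.io/hostname",           // allowed: explicitly listed
		"node.kubernetes.io/instance-type", // allowed: explicitly listed
		"kubelet.kubernetes.io/whatever",   // allowed: whole namespace open to kubelets (key itself is made up)
		"node-role.kubernetes.io/worker",   // denied: roles are not self-settable
		"kubernetes.io/role",               // denied: not on the list
	} {
		fmt.Printf("%-34s kubelet may self-set: %v\n", key, kubeletapis.IsKubeletLabel(key))
	}
}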

@sftim

sftim commented Nov 22, 2024

Should Karpenter be allowed to set node role labels on behalf of the nodes it deploys? That's still a tricky one to answer.

In other words, Karpenter could make a NodeClaim, watch for a node, and then label that node with a role (maybe: never control-plane?)
Sounds feasible and nothing seems to disallow it.

@liggitt

liggitt commented Nov 22, 2024

From Kubernetes' perspective, something managing nodes is perfectly free to label with arbitrary node-role.kubernetes.io/* node roles as it wishes; the kubelet is just restricted from self-labeling that way.

I don't know enough about Karpenter to know if that makes sense here, or if it has a controller mechanism managing nodes where it would make sense to put node label addition logic.

@elmiko
Contributor Author

elmiko commented Nov 26, 2024

personally, i would love to see karpenter open up its allowed labels here, for reading and writing. currently there are some issues because karpenter uses the same detection logic to determine whether the labels can be used for selection (not even for writing).

i totally get the reasoning for why kubelets are restricted in this manner, but i think other controllers should be able to label after the node has been created. the only thing i'm not sure about here is whether karpenter labels after the node is created or if it tries to inject the label into the kubelet configuration. the latter would not be what i would advocate for.

@javanthropus

The ability to have karpenter set restricted labels would be helpful and makes administrative sense. Those labels are restricted in order to prevent a subverted node from updating its own labels to attract new pods to itself. Without the ability to set such labels as an admin, I'm forced to use unrestricted labels instead, and as a result a subverted node could set such labels for itself in order to attract sensitive pods. The pods, their secrets, and their service accounts' access permissions would all be vulnerable at that point, since the attacker would have direct access to all of them.

@sftim

sftim commented Dec 6, 2024

@elmiko have you got time to revise the issue description?

AIUI node-role.kubernetes.io/worker is official; when we added the code to kubectl and kubeadm, we missed updating the docs correctly to show this. So it's not that the labels are not well known; it's that they are not on the allow list for (official) keys that the kubelet is allowed to set on its own node.

@elmiko
Contributor Author

elmiko commented Dec 6, 2024

sure thing @sftim , were you thinking i should scope it tighter around the node-role.kubernetes.io/worker and remove mentions of the well-known labels in favor of things that should be on the allow list, or did you have something else in mind?

@javanthropus

@elmiko, if you do scope it tighter, please don't go any tighter than the prefix node-role.kubernetes.io/. That way we can define more roles than just "worker". We need additional node roles for secure workload segregation.

@elmiko
Contributor Author

elmiko commented Dec 6, 2024

@javanthropus thanks, i tend to agree with you in that i think this covers a larger group of interests than just my own, or the one label. imo, it covers the behavior pattern: if karpenter is adding these labels after the node is created, then should we stop it from adding anything?

@javanthropus

javanthropus commented Dec 6, 2024

I don't see a technical reason to stop it from adding arbitrary labels. In fact, maybe it's even a good idea for karpenter to enforce that all labels are always applied throughout the lifetime of a node in order to avoid drift on those labels should an administrator (or an attacker) try to change one on an arbitrary node. All karpenter would need to do to keep things simple is filter the restricted labels out of those given to the kubelet so that the kubelet doesn't complain about them, as sketched below.
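
A sketch of that split (assuming the admission plugin behaves as paraphrased here, and reusing the IsKubeletLabel helper from k8s.io/kubelet/pkg/apis linked earlier; confirm against those sources):

package labelsplit

import (
	"strings"

	kubeletapis "k8s.io/kubelet/pkg/apis"
)

// kubeletMaySet approximates NodeRestriction's rule: kubelets may set any
// label outside the kubernetes.io/k8s.io namespaces (node-restriction.kubernetes.io
// excepted), plus keys on the explicit kubelet allowlist.
func kubeletMaySet(key string) bool {
	ns := ""
	if i := strings.Index(key, "/"); i >= 0 {
		ns = key[:i]
	}
	if ns == "node-restriction.kubernetes.io" ||
		strings.HasSuffix(ns, ".node-restriction.kubernetes.io") {
		return false
	}
	k8sNS := ns == "kubernetes.io" || strings.HasSuffix(ns, ".kubernetes.io") ||
		ns == "k8s.io" || strings.HasSuffix(ns, ".k8s.io")
	return !k8sNS || kubeletapis.IsKubeletLabel(key)
}

// splitLabels partitions the desired node labels: kubelet-safe keys can go on
// the kubelet command line; the rest would be applied (and kept in sync) by a
// controller after the Node object exists.
func splitLabels(desired map[string]string) (kubeletArgs, controllerManaged map[string]string) {
	kubeletArgs = map[string]string{}
	controllerManaged = map[string]string{}
	for k, v := range desired {
		if kubeletMaySet(k) {
			kubeletArgs[k] = v
		} else {
			controllerManaged[k] = v
		}
	}
	return kubeletArgs, controllerManaged
}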

@sftim

sftim commented Dec 6, 2024

were you thinking i should scope it tighter around the node-role.kubernetes.io/worker and remove mentions of the well-known labels in favor of things that should be on the allow list, or did you have something else in mind?

I think the issue is now about allowing Karpenter to manage [all] node labels that the kubelet is denied from setting.

@elmiko
Contributor Author

elmiko commented Dec 6, 2024

thanks, i'll spend some time rewriting it.

@sftim

sftim commented Dec 6, 2024

As a side note to this issue: I'd love to see a Boolean input to the Helm charts (eg for the AWS Karpenter integration), where you have to opt in to the RBAC binding that lets Karpenter manage node labels directly.

AIUI Karpenter doesn't need any access to write to nodes in its default configuration, and if we're able to keep that the default, it's a good outcome for security.

@elmiko
Contributor Author

elmiko commented Dec 6, 2024

that sounds like a good idea to me

@elmiko elmiko changed the title allow karpenter to filter and create nodes with non-well-known kubernetes labels allow karpenter to filter and create nodes with all labels Dec 6, 2024
@elmiko
Contributor Author

elmiko commented Dec 6, 2024

@sftim changed the title and added an extra sentence about using any label. does that look good to you?

@elmiko elmiko changed the title allow karpenter to filter and create nodes with all labels allow karpenter to filter and create nodes with any labels Dec 6, 2024
@sftim

sftim commented Dec 6, 2024

I suggest dropping or rewording this bit:

As an openshift user, i require new nodes to take the label node-role.kubernetes.io/worker. Although this label uses the kubernetes.io prefix, it is not an official well-known label. Currently, karpenter does not allow NodePools to be created when a label with the kubernetes.io prefix is used on a non-well-known label for new node creation, or when checking predicates.

@enxebre
Member

enxebre commented Jan 8, 2025

node-role.kubernetes.io/ namespace was originally used and intended for conventional and organizational purposes. As captured in [1], it was not meant for core kube internal contractual consumption, but there's nothing preventing its usage by consumers - "The use of the node-role.kubernetes.io/* label namespace is reserved solely for end-user and external Kubernetes consumers".

I see managing kubelet labels as a task owned by the Service Provider persona via the bootstrap userdata implementation and the node restriction admission plugin.
The security boundary is already handled by the node restriction admission plugin. I wouldn't expect karpenter to duplicate this logic, potentially drifting inconsistently over time, especially since karpenter supports N-M kube versions.
I think any opinions and earlier validation are specific to, and should be owned by, the bootstrap userdata implementation.

I'm not quite seeing how Karpenter preventing the usage of the node-role.kubernetes.io selector will really harden security, given that, as mentioned above, any invalid input for kubelet labels will be stopped by the node restriction admission plugin if the bootstrap userdata implementation tries to pass it through via the kubelet. Besides, the node restriction admission plugin already "provide[s] a place under the kubernetes.io label namespace for kubelets to self-label with kubelet and node-specific labels" [2], so any workload can actually use that allowed selector and it'll pass through via kubelet labels.

I see hardened workload isolation as a Cluster Admin / Service Provider persona task. They can rely on the node-restriction.kubernetes.io/* label prefix to label their Node objects centrally for isolation purposes.

Overall based on the above, I'd expect core karpenter to not make any imposition, and let the bootstrap userdata implementations to impose their opinions and earlier validations for labels.

[1] https://github.com/kubernetes/enhancements/blob/master/keps/sig-architecture/1143-node-role-labels/README.md#goals

[2] https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/279-limit-node-access/README.md#proposal

@sftim

sftim commented Jan 8, 2025

Karpenter opted to pick the narrower option until the situation was cleared up. That's a sound choice if there's a worry that the other option could lead to a problem. Imagine if it turned out there was a risk but Karpenter users were already relying on the problem behavior.

Now what's needed is volunteer time to get things sorted, including the Kubernetes docs (so that it's clear to reviewers what is truly expected).

@enxebre enxebre linked a pull request Jan 9, 2025 that will close this issue
@enxebre
Member

enxebre commented Jan 9, 2025

I put up a PR and tried to clarify things in its description: #1908
