
k8sattributes processor instances should share K8s resource caches #36234

Open
swiatekm opened this issue Nov 6, 2024 · 7 comments
Labels: enhancement (New feature or request), priority:p2 (Medium), processor/k8sattributes (k8s Attributes processor)

Comments

swiatekm (Contributor) commented Nov 6, 2024

Component(s)

processor/k8sattributes

Is your feature request related to a problem? Please describe.

The k8sattributes processor maintains a local cache of K8s resources it uses to compute metadata. For example, it has a local copy of data for a Pod, so when a log record from that Pod arrives, it can attach Pod metadata to the record. Depending on the configuration and use case, these caches can consume a significant amount of memory relative to the collector's overall footprint.

Currently, each processor instance maintains its own set of caches, even in situations where they could easily be shared. Thus, each instance comes with significant additional memory consumption. Even just having three separate pipelines for metrics, logs and traces, each with a k8sattributes processor, results in 3x the memory consumption.

Describe the solution you'd like

k8sattributes processor instances should share informers. An informer is a local cache that actively keeps itself in sync with the state of the Kubernetes API server. The k8sattributes processor already uses informers, informers are designed to be shared, and client-go provides tooling (shared informer factories) to facilitate exactly this kind of sharing.
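
For illustration, here is a minimal, self-contained sketch of that client-go tooling (not the processor's actual code): a single SharedInformerFactory maintains one watch and one in-memory Pod cache, and any number of consumers, standing in for k8sattributes instances here, register handlers against it.

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// One factory: one watch connection and one in-memory Pod cache.
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()

	// Two independent consumers (think: two k8sattributes instances) register
	// handlers on the same shared informer instead of each running their own.
	for i := 0; i < 2; i++ {
		id := i
		podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
				pod := obj.(*corev1.Pod)
				fmt.Printf("consumer %d saw pod %s/%s\n", id, pod.Namespace, pod.Name)
			},
		})
	}

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
}
```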

In terms of implementation, I think we should use a similar approach to the memorylimiter processor, where the processor factory holds the shared state. As k8sattributes processors can have different filters set for their watched resources, we need a separate informer factory for each distinct set of filters, created on demand.

The biggest problem with this approach is managing the lifecycle of these informers. The tooling only allows us to shut them all down collectively, which may leave us with informers running unnecessarily until all k8sattributes processor instances are stopped. In practice, it should be fine to just have a simple counter of processor instances per informer factory and clean up when it reaches 0.
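
A minimal sketch of that lifecycle idea, assuming the processor factory holds one shared informer factory per distinct filter configuration and reference-counts its users; the registry/SharedFactory types and the filterKey parameter are hypothetical names, not existing processor code:

```go
package kube

import (
	"sync"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

// SharedFactory wraps one informer factory together with the stop channel its
// informers run on and the number of processor instances currently using it.
type SharedFactory struct {
	Factory informers.SharedInformerFactory
	stopCh  chan struct{}
	users   int
}

// Start runs any informers that have been requested from Factory so far.
func (f *SharedFactory) Start() { f.Factory.Start(f.stopCh) }

// registry is the state held by the processor factory: one SharedFactory per
// distinct filter configuration, keyed by a string derived from that config.
type registry struct {
	mu        sync.Mutex
	factories map[string]*SharedFactory
}

func newRegistry() *registry {
	return &registry{factories: map[string]*SharedFactory{}}
}

// Acquire returns the shared factory for the given filter key, creating it on
// first use and bumping its user count.
func (r *registry) Acquire(client kubernetes.Interface, filterKey string) *SharedFactory {
	r.mu.Lock()
	defer r.mu.Unlock()
	f, ok := r.factories[filterKey]
	if !ok {
		f = &SharedFactory{
			Factory: informers.NewSharedInformerFactory(client, 10*time.Minute),
			stopCh:  make(chan struct{}),
		}
		r.factories[filterKey] = f
	}
	f.users++
	return f
}

// Release decrements the user count and tears the factory down once the last
// k8sattributes instance using it has stopped.
func (r *registry) Release(filterKey string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	f, ok := r.factories[filterKey]
	if !ok {
		return
	}
	f.users--
	if f.users == 0 {
		close(f.stopCh)      // lets the informer goroutines exit
		f.Factory.Shutdown() // blocks until they have actually stopped
		delete(r.factories, filterKey)
	}
}
```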

Describe alternatives you've considered

There isn't really any alternative to sharing informers if we want to solve this problem. We could try to build our own solution for this, but I'd strongly prefer the one from client-go unless there's a very good reason to pass on it. We can consider doing so if cleanup ends up becoming a major issue.

Additional context

I'm limiting myself to informer sharing here. The processor also holds a local cache of computed metadata. Instead of sharing that cache, I'd rather switch from computing metadata on event notifications to computing it lazily. That can happen in its own issue, though.

swiatekm added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Nov 6, 2024
github-actions bot added the processor/k8sattributes (k8s Attributes processor) label on Nov 6, 2024
github-actions bot commented Nov 6, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

swiatekm (Contributor, Author) commented Nov 6, 2024

I'd like to work on this, so if there's no opposition to the idea itself, feel free to assign this to me.

TylerHelmuth (Member) commented
@swiatekm I'd like some before-and-after benchmarks for this effort; it sounds like it'll be a great performance win.

swiatekm (Contributor, Author) commented Nov 28, 2024

I've hit two snags in the course of implementation.

Sharing K8s Clients

Right now the processor creates its own K8s client, which it then passes to informer provider functions to create informers. If two processor instances want to use different K8s clients, then in principle I could create different informers for them. However, the clients are passed around as kubernetes.Interface values, which I have no way of telling apart. So I think K8s client instantiation will have to be moved into the informer providers. Currently the processor doesn't use the client for anything else, so this is just a bit of internal refactoring.
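
A hedged sketch of that refactoring: instead of the processor handing an opaque kubernetes.Interface to the informer providers, the provider receives a comparable config value and constructs the client itself. The APIConfig and InformerProvider definitions below are illustrative stand-ins, not the processor's current types:

```go
package kube

import (
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// APIConfig stands in for whatever comparable description of "how to reach the
// API server" the processor configuration carries (auth type, kubeconfig
// context, and so on). Unlike an opaque kubernetes.Interface, two values of
// this type can be compared to decide whether informers may be shared.
type APIConfig struct {
	AuthType string
}

// InformerProvider builds (or looks up) an informer factory for a given API
// config. Because it receives the config rather than a pre-built client,
// equal configs can be mapped to the same shared factory.
type InformerProvider func(cfg APIConfig) (informers.SharedInformerFactory, error)

// defaultInformerProvider moves client construction inside the provider. This
// sketch only handles in-cluster authentication, so cfg is not consulted here.
func defaultInformerProvider(cfg APIConfig) (informers.SharedInformerFactory, error) {
	restCfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	client, err := kubernetes.NewForConfig(restCfg)
	if err != nil {
		return nil, err
	}
	return informers.NewSharedInformerFactory(client, 10*time.Minute), nil
}
```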

Different transform functions between processors

The more serious problem is that we set transform functions on some informers, and that these transform functions can differ depending on processor configuration. In particular, we have a transform function for Pods where we ensure we only keep data we actually use for enrichment. If two processors want different Pod metadata, we have to ensure each gets an informer that actually contains the data they need. To make things more difficult, an informer can only have one transform function set, and it needs to be set before it's started.
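
To make the constraint concrete, here is a small standalone sketch using client-go's SetTransform: a shared informer accepts exactly one transform function, and it must be installed before the informer is started (SetTransform returns an error otherwise). The particular Pod fields kept below are illustrative, not the processor's actual rules:

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()

	// The transform runs on every object before it is stored in the cache.
	// Only one transform can be set per informer, so two processors that want
	// different Pod fields cannot both attach theirs to the same informer.
	err = podInformer.SetTransform(func(obj interface{}) (interface{}, error) {
		pod, ok := obj.(*corev1.Pod)
		if !ok {
			return obj, nil
		}
		// Keep only the metadata this (hypothetical) consumer needs.
		return &corev1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				Name:      pod.Name,
				Namespace: pod.Namespace,
				Labels:    pod.Labels,
				UID:       pod.UID,
			},
			Spec: corev1.PodSpec{NodeName: pod.Spec.NodeName},
		}, nil
	})
	if err != nil {
		// SetTransform fails if the informer has already been started.
		panic(err)
	}

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	fmt.Println("pod cache synced with stripped objects")
}
```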

I've come up with three possible solutions:

  1. Give them separate informers, giving up on sharing.
  2. Always store all the used data. Processors with the same needs lose out by storing data they don't need, but processors with different needs can now share.
  3. Globally synchronize creating informers for all the processors. This sounds difficult and error-prone, and wouldn't play well with more dynamic use cases, where we want to add or remove processors at runtime.

I'm currently going with 1, but I can see a config option being added for advanced users that would allow switching to 2.
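
For completeness, a hedged sketch of how option 2 could look if "all the used data" is taken as the union of the fields any configured processor needs (the reading also discussed further down in this thread); the FieldNeeds type and the merging logic are illustrative, not existing processor code:

```go
package kube

import corev1 "k8s.io/api/core/v1"

// FieldNeeds records which optional Pod fields a given processor instance
// wants to keep in the shared cache.
type FieldNeeds struct {
	Labels      bool
	Annotations bool
	NodeName    bool
}

// union merges the needs of all processors sharing one informer.
func union(all []FieldNeeds) FieldNeeds {
	var out FieldNeeds
	for _, n := range all {
		out.Labels = out.Labels || n.Labels
		out.Annotations = out.Annotations || n.Annotations
		out.NodeName = out.NodeName || n.NodeName
	}
	return out
}

// transformFor builds one transform that keeps the superset of fields, so
// every sharing processor finds what it needs, at the cost of some of them
// caching fields they never read.
func transformFor(needs FieldNeeds) func(obj interface{}) (interface{}, error) {
	return func(obj interface{}) (interface{}, error) {
		pod, ok := obj.(*corev1.Pod)
		if !ok {
			return obj, nil
		}
		stripped := &corev1.Pod{}
		stripped.Name = pod.Name
		stripped.Namespace = pod.Namespace
		stripped.UID = pod.UID
		if needs.Labels {
			stripped.Labels = pod.Labels
		}
		if needs.Annotations {
			stripped.Annotations = pod.Annotations
		}
		if needs.NodeName {
			stripped.Spec.NodeName = pod.Spec.NodeName
		}
		return stripped, nil
	}
}
```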

ChrsMark (Member) commented Dec 5, 2024

> 1. Give them separate informers, giving up on sharing.
> 2. Always store all the used data. Processors with the same needs lose out by storing data they don't need, but processors with different needs can now share.

Wouldn't 2 just be optimal and simpler? If I understand this correctly, you would just have a single informer containing the union of the required metadata? Or is there any specific drawback with this?

At a high level, a Collector instance would only need an informer covering the union of metadata required by all the K8s-API-related components. Would it be possible to provide an option for receiver and processor components to "subscribe" to an extension (similar to k8s_observer but more naive) and retrieve the metadata from there? That extension would be responsible for defining the "union" of metadata either dynamically (on startup) or statically through configuration (which would make it a feature for advanced users). This would possibly cover what was mentioned at #36604 (comment).

swiatekm (Contributor, Author) commented Dec 5, 2024

> Wouldn't 2 just be optimal and simpler? If I understand this correctly, you would just have a single informer containing the union of the required metadata? Or is there any specific drawback with this?
>
> At a high level, a Collector instance would only need an informer covering the union of metadata required by all the K8s-API-related components. Would it be possible to provide an option for receiver and processor components to "subscribe" to an extension (similar to k8s_observer but more naive) and retrieve the metadata from there? That extension would be responsible for defining the "union" of metadata either dynamically (on startup) or statically through configuration (which would make it a feature for advanced users). This would possibly cover what was mentioned at #36604 (comment).

The extension approach would indeed work, but it'd be much more complex, and also push some of the complexity to users, who'd need to configure an additional extension. That might be what we end up with, but in my opinion, as much of this resource sharing as possible should happen automatically, without users needing to do anything.

Sharing caches between instances of the same component, and also sharing K8s client sets between all components, can be accomplished without any of this additional complexity, and that's what I'd like to start with.

ChrsMark (Member) commented Dec 5, 2024

@swiatekm agreed! Iterating on this sounds good 👍🏼
