[ottl] Demonstrate simpler context constructors #36188

djaglowski · 2024-11-05T02:12:41Z

This is just a proof of concept PR, rather than an issue.

I've been working with OTTL contexts lately and noticed that some of the parameters seem to be redundant. Looking back at the commit history, I believe the context constructors were created prior to some refactoring in pdata packages, so the redundancy appears to have been inherited from upstream changes.

My question is whether or not we could introduce a simpler set of constructors at this point, and potentially deprecate the old ones, maybe even push these changes down into the functions (which could access the same information).

TylerHelmuth

I'm not opposed to refactoring these as they have been a source of pain as OTTL has grown. We've had to change them in the past to support new parameters.

I'd much rather introduce something that scales like Options or a struct or something that can easily take a new parameter without breaking the function signature. We keep saying "this is the last parameter we'll need bc that's all the proto defines" but then there are exceptions (like schema).

djaglowski · 2024-11-05T13:41:20Z

I see where you're coming from but I think with pdata not stable we can probably make some clearer decisions. Notably, we can separate it into two questions:

Is there any information we can remove from the contexts?

Here I think the answer is no. If we made anything optional it would force the functions to check for presense and act defensively, likely bail out. This is why I suggest changing the signature but not the information. The same information is available with fewer parameters because of changes to pdata that are now finalized. (e.g. don't need to pass in ScopeSpans and InstrumentationScope, because ScopeSpans has a 1:1 with InstrumentationScope and makes it available via Scope() method.

Will there be additional information we need to add to the contexts?

I can't give a clear answer here, but would note:

With pdata stable, it's less likely to be a problem going forwards.
If new information is needed, it might be added to the parameters already passed in.
If the new information is required, we shouldn't provide it via an optional mechanism.
If the new information is optional, we can add options to the same function, (or add a ..WithOpts version of the same).

TylerHelmuth · 2024-11-05T16:02:44Z

Your arguments are why we've stuck with parameters so far: we want to represent the information as required. I am ok continuing this pattern but since we're taking the opportunity to change the parameters it would be nice to do another review of pdata and make sure there isn't any remaining pieces of information we're missing in the TransformContext.

djaglowski · 2024-11-07T21:50:31Z

it would be nice to do another review of pdata and make sure there isn't any remaining pieces of information we're missing in the TransformContext.

I took a fresh look at the pdata packages and came away with a strategy that is pretty close to what I suggested, with an important difference.

First, I think it's helpful to separate this analysis into two sections. The first looks at signal-specific contexts (log, metric, datapoint, span, spanevent), and the second looks at signal-agnostic contexts (resource, scope). There is a fundamentally different consideration for how these two categories need to be defined, which I will describe later.

Signal Specific Contexts

The signal specific data models follow a consistent pattern. Obviously they are hierarchical, but they are somewhat difficult to understand at times because of all the overlapping names. I think it's easier to illustrate my point if we use a simple analogy, so imagine a tree structure where each node contains (1) multiple values and (2) child elements.

type Node struct {
    value1 any
    value2 any
    children []*Node
}

Our NewTransformContext signatures have evolved over time because the model has changed, but also because we've realized new requirements, e.g. need to modify []children, or need to access value2. The fundamental problem though is that we're effectively passing in a mixture of value1, value2, children, and children[i].

The solution is to construct contexts from the nodes themselves. More specifically, a context should be the sequences of nodes in the hierarchy that leads to the lowest level node. i.e. root, root.children[i], root.children[i].children[j], root.children[i].children[j].children[k]

Using plog as the example, these nodes would be:

plog.Logs (contains value State and children []ResourceLogs)
plog.ResourceLogs (contains values SchemaUrl and Resource, and children []ScopeLogs)
plog.ScopeLogs (contains values SchemaUrl and Scope, and children []ScopeLogs)
plog.LogRecord (contains many values, and no children - equivalent to empty children []*Node)

Based on this insight, my initial proposal falls short because it leaves out the root node, which means we don't have access to a value which may be useful, and we can't modify its slice of children even though we can modify other slices of children in the hierarchy. Now, I'm not certain we really need access to the State, but it might be useful in some cases. What's more likely to me is that we will eventually want to allow modifications to the slice of ResourceLogs, just as we have repeatedly found is necessary at lower levels of the hierarchy.

Therefore, my recommendation is that for signal-specific contexts, we should construct each context starting from top level object, and include exactly the elements of the hierarchy which directly contain slices of sub-elements. This ensures that all possible values and slices are accessible.

log: New(plog.Logs, plog.ResourceLogs, plog.ScopeLogs, plog.LogRecord)
metric: New(pmetric.Metrics, pmetric.ResourceMetrics, pmetric.ScopeMetrics, pmetric.Metric)
datapoint: New(pmetric.Metrics, pmetric.ResourceMetrics, pmetric.ScopeMetrics, pmetric.Metric, pmetric.DataPoint)
span: New(ptrace.Traces, ptrace.ResourceSpans, ptrace.ScopeSpans, ptrace.Span)
spanevent: New(ptrace.Traces, ptrace.ResourceSpans, ptrace.ScopeSpans, ptrace.Span, ptrace.SpanEvent)

Interestingly, the pattern is so consistent that we could even offer constructors based on indices. e.g. NewFromIndices(plog.Logs, 2, 5, 3) would open up the plog.Logs, find the ResourceLogs at index 2, open that up and find the ScopeLogs at index 5, open that up and find the LogRecord at index 3, and be able to build the same exact context. (Not sure, but this might cause problems while iterating. Something we could consider though.)

Signal Agnostic Contexts

To use another analogy, the difference between resource and scope contexts is similar to that between interfaces and structs. The resource and scope contexts are presented consistently across signals, but we're only able to do this by going into each different type of data and extracting from it that which is common. Specifically, ResourceSpans, ResourceMetrics, and ResoureLogs all contain a pcommon.Resource and a SchemaUrl, so the resource context is constructed from these common parts. Scopes follow exactly the same pattern. Importantly, there are no other common fields shared across the signals, so we cannot be missing anything in these contexts.

However, we could in theory define signal-specific versions of the resource and scope contexts, e.g. logresource. This has obvious downsides but I point it out to highlight the fact that if we did this we would follow exactly the same strategy that I've highlighted above, e.g logresource: New(plog.Logs, plog.ResourceLogs)

github-actions · 2024-11-22T05:21:23Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

TylerHelmuth · 2024-11-25T17:24:46Z

Therefore, my recommendation is that for signal-specific contexts, we should construct each context starting from top level object, and include exactly the elements of the hierarchy which directly contain slices of sub-elements. This ensures that all possible values and slices are accessible.

@djaglowski I like this approach. Due to increase interest in exposing the context.Context in OTTL I think add it as the first parameter makes sense.

github-actions · 2024-12-10T05:21:02Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

TylerHelmuth · 2024-12-16T17:18:06Z

@djaglowski if you've still got time for this we'd accept the change.

djaglowski · 2024-12-17T16:43:32Z

@djaglowski if you've still got time for this we'd accept the change.

Hopefully not long after the new year I can make this a priority.

github-actions · 2025-01-01T05:21:06Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

[ottl] Demonstrate simpler context constructors

bc0447d

djaglowski added the pkg/ottl label Nov 5, 2024

github-actions bot requested review from bogdandrutu, evan-bradley, kentquirk and TylerHelmuth November 5, 2024 02:12

TylerHelmuth reviewed Nov 5, 2024

View reviewed changes

github-actions bot added the Stale label Nov 22, 2024

github-actions bot removed the Stale label Nov 26, 2024

github-actions bot added the Stale label Dec 10, 2024

github-actions bot removed the Stale label Dec 17, 2024

github-actions bot added the Stale label Jan 1, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ottl] Demonstrate simpler context constructors #36188

[ottl] Demonstrate simpler context constructors #36188

djaglowski commented Nov 5, 2024

TylerHelmuth left a comment

djaglowski commented Nov 5, 2024

TylerHelmuth commented Nov 5, 2024

djaglowski commented Nov 7, 2024 •

edited

Loading

github-actions bot commented Nov 22, 2024

TylerHelmuth commented Nov 25, 2024 •

edited

Loading

github-actions bot commented Dec 10, 2024

TylerHelmuth commented Dec 16, 2024

djaglowski commented Dec 17, 2024

github-actions bot commented Jan 1, 2025

[ottl] Demonstrate simpler context constructors #36188

Are you sure you want to change the base?

[ottl] Demonstrate simpler context constructors #36188

Conversation

djaglowski commented Nov 5, 2024

TylerHelmuth left a comment

Choose a reason for hiding this comment

djaglowski commented Nov 5, 2024

TylerHelmuth commented Nov 5, 2024

djaglowski commented Nov 7, 2024 • edited Loading

Signal Specific Contexts

Signal Agnostic Contexts

github-actions bot commented Nov 22, 2024

TylerHelmuth commented Nov 25, 2024 • edited Loading

github-actions bot commented Dec 10, 2024

TylerHelmuth commented Dec 16, 2024

djaglowski commented Dec 17, 2024

github-actions bot commented Jan 1, 2025

djaglowski commented Nov 7, 2024 •

edited

Loading

TylerHelmuth commented Nov 25, 2024 •

edited

Loading