-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple processor metrics #10708
Comments
I would caution against automatically recording size, it's not that cheap to compute and the value of such metric is lower than plain counts. |
I agree. To clarify, I would expect that size metrics are disabled by default and the cost should only be incurred if enabled. Edit: What I mean by "automatically" here is that the individual components do not need to be concerned with recording them. They would be handled in a standard way by the collector's |
I have been working on offering guidance for metrics exposed by the collector here: #10365 From that guidance, my question would be: what attributes do you want to set on those metrics, and what cardinality will they have? It looks like you have none defined in your code sample, which would make identifying the processor used difficult. @codeboten has a RFC out for internal telemetry https://github.com/open-telemetry/opentelemetry-collector/pull/10695/files From that RFC, should we consider to prefix metrics with |
Generally, I would expect that the attributes should identify the specific instance of the processor within the collector. Today, that can be established with the following info:
That said, the pipeline ID may be unnecessary if we implement #10664. The reason is because if we instance processors by data type instead of pipeline, then the only identifying info in the pipeline ID is the data type itself but all proposed processor metrics are already differentiated by data type. Therefore, I do not believe we'd need anything besides the component ID, which we already have in Thanks for pointing out the RFC - I will take a look at it and consider how this proposal might fit. |
#### Description Implements ingoing and outgoing counts as described in #10708.
) #### Description Implements ingoing and outgoing counts as described in open-telemetry#10708.
Discussing with @codeboten and @mx-psi, we think it would make sense to add an |
Not sure this is the right place to ask this but is there a plan to provide parity with refused/dropped as it relates to the memory limiter processor? I might be missing something but I don't think I can reason about back pressure and data loss in the same way I was able to previously. |
It probably requires a separate proposal but I doubt there would be resistance to the idea. This proposal isn't arguing against that. I am just advocating for doing what can be done for all processors so that we have some uniform metrics. Generally speaking, we should add useful metrics to individual component, so I think you could propose to re-add refused/dropped. I also think refused/dropped may be universally relevant, even if theycannot be automatically recorded. Therefore, perhaps the end state is that we formalize the semantics of those metrics, but leave implementation to component authors. Personally, I think we need to get the basics in place before there is bandwidth to deal with the formal semantics, but having refused/dropped where they already were seems reasonable to me. |
Closing in favor of #11343 |
Is your feature request related to a problem? Please describe.
We currently define a set of metrics for processors which cannot be automatically recorded by the collector framework, and as far as I can tell are not actually recorded by any processors in core or contrib. In short, the metrics describe the number of items accepted, dropped, inserted, and refused. (I believe these metrics are loosely inspired by proposals such as open-telemetry/oteps#259, but no such design has actually landed.)
We also have proposals such as #10355 which would measure size of data according to the same categories, but again, it is not clear that there is any path to automatically recording these values. In the case of size metrics, it would additionally be very difficult for processor developers to accurately capture such measurements because it's unclear how to attribute inserted, dropped, or refused sizes as a portion of an overall payload.
Describe the solution you'd like
Looking at this from a collector implementation perspective, what we clearly can observe automatically, is the data being passed in and out of each processor. It would be difficult to determine precisely which parts of that data were inserted or deleted, but we can certainly count the number of items and measure the size of the payload (according to a standard such as
ProtoMarshaler.Sizer()
).Therefore, I propose that we should define new metrics in
processorhelper
which are automatically recorded. If another more nuanced proposal eventually lands, we can add those later.Importantly, in order to handle error propagation consistently, these metrics would be recorded after the processor successfully passes data to the next consumer. This can be accomplished within
processorhelper
by looking at data before and after the processing function, and taking into consideration the result of passing the data along to the next consumer.Specifically, I propose the following metrics per data type. I'm showing only logs but assume there would be equivalents for metrics and traces.
Incoming and outgoing counts.
Incoming and outgoing sizes. These would be measured only if enabled.
The text was updated successfully, but these errors were encountered: