new component: deltatocumulative processor #30479
Comments
I'm sponsoring this, as it's important for our Prometheus story. |
This is a duplicate of #29300, which I've also offered to sponsor. I'm glad to see there are others interested in this as well. The main thing I think we need to do in order to combine these proposals is decide whether or not the component will be split into two parts, as described here. I would appreciate your thoughts on this @sh0rez and @jpkrohling. @RichieSams, looks like there is already an implementation started so I would recommend pausing until we work out which approach we are going to use. Also cc: @0x006EA1E5 who proposed the split design. |
@djaglowski looking at the comment you referenced, it appears the proposal is the following:
is that correct? If so, that aligns perfectly with our work here, because #30705 implements exactly the behavior described in (1). |
@RichieSams I think I'm covered in terms of code writing, but a review of #30706 and #30707 would be incredibly helpful! Let me know if I can assist in #29461 in any way |
Hi, sorry just catching up with this. I'm very happy this is moving forward, and happy to help wherever I can... |
I've been trying to think of use-cases, edge cases etc.
I'm wondering, as long as this processor can handle case (1) above (well formatted delta datapoints missing a Put another way, what is the simplest possible implementation of this processor, assuming upstream "sanitisation"? |
I am not sh0rz, I am shorez, hahaha.
Beijing welcomes you, if you ever travel to China!
|
I think it makes sense for the configuration for this processor and #29461 to align with the All three processors would also need to have a similar "MetricIdentity" feature. Would it make sense to factor out this common code? There is also something somewhat similar in the As I understand it, we can use the metrics spec as a reference as to what is an "identifying" property when building the "Metric Identity" used to track a metric |
I agree we should probably look at creating a shared set of Using hashes of the metric attributes, etc. |
I was thinking about this during the implementation of this component and preliminarily decided against it. |
Exactly! I wrote a bit about this in the design doc: https://docs.google.com/document/d/1Oqwl5rDLqB6-Qgd6Hy1PXYZBAH4pkcdudxNA7bRkrIc/edit#heading=h.a00fffk0v68v
yeah I felt like this was generally useful while writing this, happy to factor this out to a common place! use-cases I can think of right now are |
When I first looked at this, my first thought was to use some kind of non-crypto hash too, but then I looked at It made me wonder, is the space saving worth the hashing cost (which has to be performed on every single input), vs the cost of using a string builder? How big do you make the hash to avoid collisions, etc? I don't know the answer to that, my feeling would be that any gain may be marginal at best, so for my POC implementation I just went with concat + separator (basically a copy / paste from the prometheus exporter). As I implied above, once I started to analyse this in more detail, I noticed that this is something the In general, my heuristic approach would be, if in doubt, keep it simple, and optimise later when you know there is a concrete need. That's not to say I'm against using a hash, just that I wasn't personally confident it was worth the cost of change etc, compared to simply factoring out and reusing the current Indeed, if the consensus is that a hash is a better fit here, I'm keen to hear it! 😄 |
But they're feeding the string directly into golang's hash function (because they're using the string as the key lookup in a map). So it will be basically identical. We're just skipping the step of having to keep the big byte array in memory / use a pool of buffers |
Doesn't the Is the hash used here not going to get occasional collisions? Collisions are a fair trade off in a map, as any collision will be resolved by the map also having the actual key to compare. But wouldn't you want a much bigger hash if it's intended as a UUID and you want to be sure there won't be any collisions? What is the expected benefit of using a hash? What are the risks/consequences of something going wrong? Just kind of worries me, when it is not even clear that there is a significant problem here that needs to be solved in the first place 🤷 |
sync.Map() (and golang native map) call Both solutions will do a hash. That's just how maps work.
Yes, which is fine. The std-lib map implementation needs to deal with hash collisions anyway. So this isn't a problem. FNV has "good-enough" hash properties for our use-case.
Just to clarify, we're comparing:
@sh0rez's approach is cleaner and makes it clearer what is happening, IMO. And we don't need to do large string builder manipulations.
The only risk IMO is forgetting to include a "unique" bit of data in the hash combination. But that should be trivial to audit. |
@sh0rez I guess it would be nice to know why the |
@RichieSams Sorry, I'm being slow today (I actually have mild COVID, so my brain is not really cooperating 😅)
But how can it deal with a collision if the hash is done to create the actual key? 🤔 In a map, hash collisions are just a performance hit, as the hash is just telling the map which bucket to look in, so a collision means that more than one item is in the bucket. Worst case scenario, the map iterates over the bucket items until it finds an exact match for the key. In the case of using a long string as the key, the map can always compare one long string to another if needed, and get the right item. In the (admittedly very rare) case of a new metric producing the same hash as another, already tracked metric, you'll get the original metric's tracker returned by the map. Are you then going to compare the tracked attributes etc against the incoming item, to check there hasn't been a collision? What do you do if there is a collision, track both items under the same key? You're then kind of implementing a map on top of a map. That's obviously not viable, so you're going to end up just saying that collisions happen so infrequently that you'll pretend they don't happen. So, for me, the consequences of something going wrong are, (very infrequently) sometimes you'll mangle the user's data, and add deltas from one metric to another's cumulative. You're sacrificing a tiny bit of correctness, so that you don't need to build and store the string. |
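The failure mode described here can be shown with a small sketch. It uses a deliberately weak, hypothetical hash (`weakHash`, a sum of bytes) purely to force a collision on demand; with a real 64-bit hash such as FNV the same thing would happen only very rarely. The type and function names are illustrative assumptions.

```go
package main

import "fmt"

// weakHash is a deliberately poor, hypothetical hash (sum of bytes, wrapping
// at 256). It exists only to make a collision easy to reproduce.
func weakHash(s string) uint8 {
	var sum uint8
	for i := 0; i < len(s); i++ {
		sum += s[i]
	}
	return sum
}

// sample is one incoming delta for a stream identified by id.
type sample struct {
	id    string
	delta float64
}

// accumulate tracks running totals twice: keyed by the full identity string
// and keyed by the hash of that string alone.
func accumulate(in []sample) (map[string]float64, map[uint8]float64) {
	byString := map[string]float64{}
	byHash := map[uint8]float64{}
	for _, s := range in {
		byString[s.id] += s.delta
		byHash[weakHash(s.id)] += s.delta
	}
	return byString, byHash
}

func main() {
	// Two distinct identities whose bytes sum to the same value.
	a, b := "metric.ab", "metric.ba"
	byString, byHash := accumulate([]sample{{a, 1}, {b, 10}, {a, 2}})

	fmt.Println(byString[a], byString[b])         // 3 10
	fmt.Println(len(byHash), byHash[weakHash(a)]) // 1 13
}
```

With the string key the map resolves the bucket collision itself, because it still holds the full key to compare against. With the hash as the key, both streams land in one entry and their deltas are summed together.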
Actually, looking at this, the code here is actually storing all the metric's attributes in the map (am I reading that right?) Something like this is required, as the emitted cumulative data points need to have all the same attributes as the source datapoint (I guess the alternative would be to copy from each incoming delta datapoint and not store it at all) So, wouldn't it be possible to have the name, all the attributes etc, as the key, then the map's item could just be the cumulative sum and the timestamps etc? 🤔 This way there isn't even a memory overhead, as you've just moved the data from the map's value to the key. Maybe that doesn't work for some reason? |
@0x006EA1E5 @RichieSams let me try to provide some clarity on metric identity and hashing. The OTel metrics spec defines the fields required to identify a metric, and also how to identify a single stream within a metric. I have written about that in my design doc: https://docs.google.com/document/d/1Oqwl5rDLqB6-Qgd6Hy1PXYZBAH4pkcdudxNA7bRkrIc/edit#heading=h.a00fffk0v68v. This is also implemented in Go has the concept of comparable data types. Those need to follow some data rules, but then can be directly used with Because Having those @0x006EA1E5 The The |
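As a rough illustration of the comparable-type idea, the sketch below uses a hypothetical `streamID` struct whose fields are all comparable, so values can be compared with `==` and used directly as map keys. The field selection and the helper names `hashAttrs` and `newStreamID` are assumptions for this example only; the fields that actually identify a metric and a stream are the ones listed in the OTel metrics spec and the design doc.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// streamID is a hypothetical identity type. Every field is comparable, so the
// struct works with == and as a map key without building a string.
type streamID struct {
	name  string
	unit  string
	attrs uint64 // hash of the sorted attribute set
}

// hashAttrs hashes attributes in sorted key order, so two maps with the same
// content produce the same value regardless of iteration order.
func hashAttrs(attrs map[string]string) uint64 {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(attrs[k]))
		h.Write([]byte{0})
	}
	return h.Sum64()
}

// newStreamID builds the identity from its parts.
func newStreamID(name, unit string, attrs map[string]string) streamID {
	return streamID{name: name, unit: unit, attrs: hashAttrs(attrs)}
}

func main() {
	x := newStreamID("http.requests", "1", map[string]string{"host": "a", "region": "eu"})
	y := newStreamID("http.requests", "1", map[string]string{"region": "eu", "host": "a"})
	fmt.Println(x == y) // true

	totals := map[streamID]float64{}
	totals[x] += 1
	totals[y] += 2
	fmt.Println(len(totals), totals[x]) // 1 3
}
```

Note that the attribute set is still reduced to a hash inside the struct, so the collision trade-off discussed earlier in the thread applies to that one field.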
@0x006EA1E5 I guess it would be nice to know why the cumulativetodeltaprocessor has this include/exclude config in the first place (PR: #8952). Perhaps it predates some of the other features that can do it better? That's a good question. Maybe @jpkrohling @codeboten @Aneurysm9 can clarify? |
It looks like @TylerHelmuth might be able to help here, here's the closest I could find about this: #5877 (comment) Today, I think I'd prefer to see a connector solution to deal with include/exclude though. I would prefer to not have this include/exclude in the new component unless we have concrete use-cases for it that cannot be done otherwise, or can only be done in a way that severely impacts performance. |
There is #25161 to address filter interface for receivers. Ultimately the cumulativetodelta allows specifying which metrics to compact because that gives users more control over their data. I tend to favor giving users absolute control, even if that means they can hurt themselves. The cumulativetodelta processor will convert everything if not You can also safely choose not to allow |
I agree with this because the alternative IMO leads to a mess of overlapping functionality. Include/exclude is not any more related to this processor than any other type of transformation which a user may need to do. We're better off decomposing the functionality so that users can manage their data however they want.
I don't think this actually gives users more control. It just pulls a specific type of control into the component. The problem is that you could make the same case for adding almost any type of processing to almost any component. The reason we have components is so that they can be composed as necessary. |
Sorry if this is a silly question, but what would be the best way to address the use case where we want to selectively process some telemetry? 🤔 Would we use the routingconnector and the forwardconnector? Something like:
? |
Yes, I believe we should be relying on connectors for routing telemetry to the appropriate processors. |
@sh0rez Can I ask what the plan is for the common functionality that we can expect to be shared with the implementation of #29461 (metric identity etc)? It looks like you are doing everything within the Do we intend to also refactor the I'm wondering, have you considered simply refactoring This could mean either simply adopting the The value of this would be we don't diverge between It seems to me that we will surely have to factor out common code at some point soon in any case (as #29461 is required to make the Is this a useful suggestion, or does it make things unnecessarily more difficult for you? |
yes I considered that, however I chose against doing so for two major reasons:
I factored out generic tracking in #31017 (comment) recently, which was then picked up by @RichieSams in #31089 (comment). Looking at that PR and his other work on the interval processor, the proposed api seems to be as versatile and composable as I had hoped for. Converting the cumulativetodelta at a later point should be rather straightforward as well, bringing the performance benefits there as well. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping |
…code owners (#33019)
**Description:** @jpkrohling and @djaglowski volunteered to be sponsors of the delta to cumulative processor, and @djaglowski also volunteered to be sponsor of the interval processor in relation to this. They should also be code owners.
From [CONTRIBUTING.md](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#adding-new-components):
```
A sponsor is an approver who will be in charge of being the official reviewer of the code and become a code owner for the component.
```
**Link to tracking Issue:**
#30479 - Delta to cumulative processor
#29461 - Interval processor
---------
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
I'm closing this, as the first versions of this component are done (even though it's still under development). |
Important
This component is in active development. See progress in #30705
The purpose and use-cases of the new component
In similar fashion to the existing cumulativetodeltaprocessor, this component aggregates delta samples to their respective cumulative counterpart. An extensive design doc has been written to explore this idea.
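As a minimal sketch of what aggregating delta samples into a cumulative counterpart means for a simple sum, assuming in-order, gap-free deltas and a plain string as the stream identity: the types `deltaPoint`, `cumulativePoint` and `accumulator` are hypothetical and are not the component's actual API. Out-of-order samples, resets, staleness and histograms are deliberately left out.

```go
package main

import "fmt"

// deltaPoint is a simplified delta sample: the change in value over
// the interval (start, end].
type deltaPoint struct {
	stream     string // simplified stream identity
	start, end int64  // timestamps in arbitrary units
	value      float64
}

// cumulativePoint is the running total since the first observed start time.
type cumulativePoint struct {
	start, end int64
	value      float64
}

// accumulator keeps one running total per stream.
type accumulator struct {
	state map[string]cumulativePoint
}

func newAccumulator() *accumulator {
	return &accumulator{state: map[string]cumulativePoint{}}
}

// add folds one delta into its stream's total and returns the cumulative
// point to emit. The start time is fixed by the first delta of the stream.
func (a *accumulator) add(d deltaPoint) cumulativePoint {
	cur, ok := a.state[d.stream]
	if !ok {
		cur = cumulativePoint{start: d.start}
	}
	cur.value += d.value
	cur.end = d.end
	a.state[d.stream] = cur
	return cur
}

func main() {
	acc := newAccumulator()
	deltas := []deltaPoint{
		{stream: "requests", start: 0, end: 10, value: 5},
		{stream: "requests", start: 10, end: 20, value: 3},
		{stream: "requests", start: 20, end: 30, value: 2},
	}
	for _, d := range deltas {
		c := acc.add(d)
		fmt.Println(c.start, c.end, c.value)
	}
	// Prints:
	// 0 10 5
	// 0 20 8
	// 0 30 10
}
```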
Example configuration for the component
Telemetry data types supported
Code Owner(s)
@gouthamve @sh0rez
Sponsor (optional)
@jpkrohling
Additional context
Prior discussion: