v1.12.0 exposes some high cardinality metrics #967

prymitive · 2022-01-19T11:29:18Z

It seems that v1.12.0 includes some extra high cardinality metrics, most likely autogenerated by Go itself thanks to #955.

Here's a full diff of metrics between 1.11 and 1.12 - https://gist.github.com/prymitive/3d16b79a8187448bf6e0d61db7336ca0

Having so many extra metrics exposed by default might cause problems for people with lots of Go services.

kakkoyun · 2022-01-20T05:28:15Z

Thanks for reporting. It seems like we need to think more about default bucket configuration.
@mknyszek what do you think?

kakkoyun · 2022-01-20T12:57:01Z

cc @bwplotka

beorn7 · 2022-01-20T13:16:53Z

Yeah, that's rather extreme for default metrics. Perhaps those could be opt-in at that granularity.

I would also like to change that before releasing the final Prometheus v2.33.

mknyszek · 2022-01-20T16:11:31Z

Yeah, the histograms in runtime/metrics are large by some standards. The time-related ones have pretty high granularity, but it's fairly straightforward to downsample and make them exponentially bucketed. Maybe some code to check a threshold on bucket granularity and then downsample if necessary?

FWIW I think it would be nice keep the allocs-by-size and frees-by-size as-is -- they're bounded by the number of size classes (currently; that can change but likely won't change drastically), so that's 136 total. Is that too much? Perhaps that's why MemStats.BySize wasn't included initially.

mknyszek · 2022-01-20T16:11:55Z

Having so many extra metrics exposed by default might cause problems for people with lots of Go services.

Out of curiosity, what kind of problems do you expect?

beorn7 · 2022-01-20T16:18:37Z

Having so many extra metrics exposed by default might cause problems for people with lots of Go services.

Out of curiosity, what kind of problems do you expect?

If these metrics came with a newly introduced package, it might be fine. Users could do the resource planning for it or decide to drop them on ingestion. However, the problem is that these default Go metrics are in a lot of existing binaries, and all of them will suddenly expose a significantly increased number of metrics after merely bumping the dependencies to the current minor release. This might easily push many users beyond their cardinality budget.

mknyszek · 2022-01-20T16:21:23Z

This might easily push many users beyond their cardinality budget.

Not to derail this too hard, but just for my understanding: what does this cardinality budget look like? Memory? Throughput?

prymitive · 2022-01-20T16:23:43Z

Each unique time series eats ~4KB on average in our production environments. The diff I posted has more than 1k lines. Adding 1k extra metrics adds ~4MB of Prometheus memory per instance for every Go service scraped by Prometheus (plus some memory for exposing those metrics in those services). 4MB might not be much but it adds up quickly.

mknyszek · 2022-01-20T16:30:04Z

Thanks, that's good to know. I meant to say this earlier, but I'll send a PR to reduce the bucket granularity.

I'm thinking something like a function that iterates over the buckets and merges buckets within either a factor of 2 or a factor of 10 (depending on the metric unit), making the buckets exponential. Does this sound reasonable?

Opt-in to higher granularity can be future work; an exponential distribution is already pretty useful.

prymitive · 2022-01-20T16:44:24Z

FWIW I think it would be nice keep the allocs-by-size and frees-by-size as-is -- they're bounded by the number of size classes (currently; that can change but likely won't change drastically), so that's 136 total. Is that too much? Perhaps that's why MemStats.BySize wasn't included initially.

Is 136 too much?
I see that an example Go service I run with only a few custom metrics exposes ~54 time series on average across all instances. If you add 136 time series to that you're more than doubling metrics footprint of client_golang and that's a big jump. Nowhere near as huge as adding over a 1k time series but still relatively large.
It's probably gonna be fine for some people, might be too much for others. It can be dropped via relabel rules if needed, but that requires everyone to update scrape configs.

mknyszek · 2022-01-20T16:53:57Z

@prymitive Sure. I can also reduce the impact of this further if there's a way to alias metrics (i.e. just say "here's metric name, but it's really just this other metric, so use that data instead") since some existing metrics are re-exported under new names. I can also drop the new names for just those metrics altogether, but that would make for a somewhat worse user experience. Currently, the names of the metrics are generated from the runtime/metrics keys, so there's a clear mapping and it's easy to find a metric from knowing its runtime/metrics key and vice versa.

beorn7 · 2022-01-20T18:50:19Z

I leave it to @bwplotka and @kakkoyun to make the call what increase in metric cardinality is fine for a minor release. As I might have said before, this library is used a lot, so we have to navigate with care and in doubt make potentially disruptive features opt-in.

mknyszek · 2022-01-20T19:43:39Z

For sure. The current state of things is not great. I didn't anticipate that each histogram bucket would be treated as an individual metric, and I also forgot quite how big I made these histograms. 😅

Going back over the list, it looks like at a bare minimum (without removing "duplicate" metrics -- I'd rather not because the new forms of them are more orthogonal and clearer; the divisions and groupings of memory in runtime.MemStats are pretty arbitrary), that's 25 new individual metrics and 4 new histograms.

2 of those new histograms are time histograms. They're high dynamic range histograms which are a combination of exponentially and linearly bucketed. Collapsing those down to be exponentially bucketed, we get 45 buckets each. We can then square the base of the exponential series and cut that down to 23, and then cut out a lot of the higher buckets (the range goes all the way up to O(days), we can just catch everything above 1 second into its own bucket for all of the existing ones). That gets us down to about 15. For the other two histograms, we can just do power-of-two exponential bucketing which collapses the 67 or so buckets down to 13 or 14.

To be clear, this code would be generic, so any new runtime/metrics histograms metrics with unit "bytes" or "seconds" would immediately be minimized. I can add a test that ensures new metrics to be audited via the generator do not accidentally create a whole bunch of bloat.

Another alternative is a Summary but I have a slight preference against that, just because it implicitly assumes a normal distribution of the underlying data where it may not exist, resulting in potentially misleading information.

So, in total, that's 83 additional metrics assuming we just shrink the histograms as I suggested above. According to @prymitive's back-of-envelope math that's an added on-server bloat of about 400 KiB for this release. Like I said earlier we can ~~cut out metrics that already exist in their old form (in some fashion)~~ (EDIT: cut out new metrics that represent the same information as old metrics with a new name and a slightly different form) and get back an additional 23. Personally I'd prefer being more aggressive about scaling down the histograms than getting rid of the "duplicate" metrics, but if this isn't acceptable then so be it. Earlier I asked about an aliasing mechanism because some of these are a 1:1 mapping, and so we could avoid the storage of both by just having one point to the other's data in the backend.

Side-note: the reasoning behind these huge histograms is it doesn't cost much (< 10 KiB I'll guess?) to actually manage these in the runtime, and it's better for the runtime to expose more granular data than less granular data. It's totally reasonable for downstream users to adjust that granularity.

kakkoyun · 2022-01-21T06:36:04Z

So, in total, that's 83 additional metrics assuming we just shrink the histograms as I suggested above. According to @prymitive's back-of-envelope math that's an added on-server bloat of about 400 KiB for this release. Like I said earlier we can cut out metrics that already exist in their old form (in some fashion) and get back an additional 23. Personally I'd prefer being more aggressive about scaling down the histograms than getting rid of the "duplicate" metrics, but if this isn't acceptable then so be it.

First of all, let's not remove old metrics. There are lots of dashboards out there would be broken.

And @mknyszek as you suggested let's try to aggressively reduce the number of the buckets. With the a reason number of cardinality increase and better communication, this should be doable.

To be clear, this code would be generic, so any new runtime/metrics histograms metrics with unit "bytes" or "seconds" would immediately be minimized. I can add a test that ensures new metrics to be audited via the generator do not accidentally create a whole bunch of bloat.

Yes, tests are essential for this case.

And also as @beorn7 suggested, maybe we should go in a direction to make these additional costly metrics opt-in rather than enabling them implicitly until the major version bump.

Or as it already discussed we can start with really conservative bucket distribution for all of them and increase the granularity with configuration rather than having an additional collector.

Personally I would prefer this approach more. Otherwise, it would take longer times to discover these new useful metrics.

What do you think? @bwplotka

mknyszek · 2022-01-21T06:48:18Z

So, in total, that's 83 additional metrics assuming we just shrink the histograms as I suggested above. According to @prymitive's back-of-envelope math that's an added on-server bloat of about 400 KiB for this release. Like I said earlier we can cut out metrics that already exist in their old form (in some fashion) and get back an additional 23. Personally I'd prefer being more aggressive about scaling down the histograms than getting rid of the "duplicate" metrics, but if this isn't acceptable then so be it.

First of all, let's not remove old metrics. There are lots of dashboards out there would be broken.

Er, very sorry. I was definitely not suggesting that. I recognize how important that is. And I can see how what I wrote that reads that way, but I meant that we can cut out new metrics that already exist in some form.

And @mknyszek as you suggested let's try to aggressively reduce the number of the buckets. With the a reason number of cardinality increase and better communication, this should be doable.

I have a WIP implementation of the reduced bucketing that means 10 exponential buckets for 2 of the histograms, and 13 for the other 2. That makes for a total of 25+10+10+13+13=71 new metrics. I can have this up for review with tests tomorrow.

bwplotka · 2022-01-21T09:06:15Z

Thanks, reducing those histograms is definitely the first step.

What if we make it so a user has 3 options:

Have new ones from MemStats (default)
Have new with extended buckets and ones from MemStats
Have only old ones from MemStats for those who don't want extra 71 metrics?

Potentially also 4th one? without MemStats?

beorn7 · 2022-01-21T13:35:40Z

Also a general note: I love high-res histograms, but Prometheus doesn't really do those yet. We are working on it, but it will take months before they are generally available (at which point we should definitely offer those awesome metrics for users of the new high-res histograms at fine granularity).

mknyszek · 2022-01-21T16:59:17Z

@bwplotka Yeah, that works. My apologies for not just doing this up-front, but understanding the constraints now I get where you're coming from.

Looking back over the comment history, if the biggest concern here is that 71 metrics being added for a minor release is too much, then I'm fine with only exporting the old metrics, backed by MemStats, and then flipping to the added 71 metrics on the major release (is that something that's happening relatively soon?).

Finally, how should I expose the options? A new GoCollector API in the collector package that has the options, with the default one just an implementation that chooses one of them? e.g.

func CustomGoCollector(kind GoCollectorKind) Collector
...
func GoCollector() Collector {
    return CustomGoCollector(GoCollectorMemStatsOnly)
}

(This is not the exact code I'd write, just a mock-up to express the idea.)

@beorn7 Cool! Looking forward to that. Honestly though I wasn't necessarily expecting the high granularity histograms to always be used in production anyway; this whole issue is basically just because I forgot how big they were, otherwise I would've done the bucket minimization up-front.

jameshartig · 2022-01-23T03:23:08Z

I found this issue after seeing this:

and we only have this deployed to 1 service (out of 50+) and already 2 of the new histograms are already in the top 3 of our metrics. go_sched_latencies_seconds_bucket and go_gc_pauses_seconds_total_bucket both have over 700 buckets per instance for each of them. With over 3000 go instances, this could easily be millions of metrics if deployed to all of our services.

I definitely agree this is completely unexpected for a minor release.

beorn7 · 2022-01-23T19:24:41Z

Yeah, I think this deserves a timely bugfix release. Together with #969.

mknyszek · 2022-01-24T05:44:00Z

Sent #974 to mitigate the immediate problem. Waiting on advice on the best way to expose multiple options so users may opt out of the additional 79 values introduced in this release (I was off by a few because I forgot about the sum and count dimensions of histograms).

bwplotka · 2022-01-25T22:01:45Z

Sorry for the lag @mknyszek and thanks for the quick reaction! Let's aim to release this week.

Finally, how should I expose the options?

I would propose adding variadic option pattern so from func NewGoCollector() prometheus.Collector do func NewGoCollector(opts...GoOption) prometheus.Collector where GoOption is some func(o *goOptions) and then we can have WithMemStatsEnabled(bool) WithRuntimeMetricsEnabled(bool) and both true by default? 🤔 Then also WithExpandedBuckets some day?

mknyszek · 2022-01-28T00:52:58Z

Given how big #974 has gotten, I'll do the options as a separate change. Sorry for the delay this week, I'm on a rotation.

beorn7 · 2022-01-28T21:41:46Z

Thank you very much for fixing this quickly.

@kakkoyun are you planning to cut 1.12.1 any time soon? It would help me to cut the final Prometheus v2.33.0 release (the rc.1 has client_golang 1.12.0 with the race condition and the excessive amount of Go metrics).

kakkoyun · 2022-01-29T05:04:21Z

Yes, I will do it today Björn. And ping you when it’s out.

On Sat 29. Jan 2022 at 00:41, Björn Rabenstein ***@***.***> wrote: Thank you very much for fixing this quickly. @kakkoyun <https://github.com/kakkoyun> are you planning to cut 1.12.1 any time soon? It would help me to cut the final Prometheus v2.33.0 release (the rc.1 has client_golang 1.12.0 with the race condition and the excessive amount of Go metrics). — Reply to this email directly, view it on GitHub <#967 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAEC7AKHQWBIP3DCVZ7I3NLUYMESLANCNFSM5MJVQSFA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

-- Kemal Akkoyun

prymitive · 2022-02-03T10:40:02Z

Thanks for fixing excessive bucket number count.
I see it still adds 80 extra metrics, it's not a small number but something most of us can live with.
There's only one bamboozle I have after upgrading to v1.12.1:

# HELP go_gc_pauses_seconds_total Distribution individual GC-related stop-the-world pause latencies.
# TYPE go_gc_pauses_seconds_total histogram
go_gc_pauses_seconds_total_bucket{le="-5e-324"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-10"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-09"} 0
go_gc_pauses_seconds_total_bucket{le="1.2799999999999998e-07"} 0
go_gc_pauses_seconds_total_bucket{le="1.2799999999999998e-06"} 0
go_gc_pauses_seconds_total_bucket{le="1.6383999999999998e-05"} 7598
go_gc_pauses_seconds_total_bucket{le="0.00016383999999999998"} 31961
go_gc_pauses_seconds_total_bucket{le="0.0020971519999999997"} 33919
go_gc_pauses_seconds_total_bucket{le="0.020971519999999997"} 34705
go_gc_pauses_seconds_total_bucket{le="0.26843545599999996"} 34708
go_gc_pauses_seconds_total_bucket{le="+Inf"} 34708
go_gc_pauses_seconds_total_sum NaN
go_gc_pauses_seconds_total_count 34

Isn't le="-5e-324" a negative number? Is that a valid le value?

beorn7 · 2022-02-03T12:08:37Z

It is valid (or in other words: Prometheus histograms do support buckets for negative observations). But in this case, the observed GC pauses should never last a negative amount of time (that would mean the pause ended before it started? ;) .

I assume that's just a floating point precision issue, and the bucket boundary should perhaps be adjusted (to zero?).

@mknyszek you will know more about the internals here.

mknyszek · 2022-02-03T18:07:41Z

@beorn7 is correct. It's the math.NextAfter value for zero (toward -Inf). It is unfortunately sometimes possible, due to OS or runtime bugs, that samples end up in this bucket (probably not for GC pauses, but for scheduling latencies for example). This bucket is supposed to capture any of those cases, but I think there's a point to be made that it might not be worth surfacing that to the user. I'm fine with leaving it in or absorbing it into the next bucket.

And also apologies for not getting around to the opt-out API yet, I've gotten distracted with other things.

bwplotka · 2022-02-05T12:48:09Z

No worries @mknyszek and thanks for helping!

I created another issue for opt-out API, (#983), so hopefully, users like @prymitive can fall back to the old behaviour if needed. Also help wanted if anyone else want to contribute this! 🤗

… metrics by default. Fixes #967 Signed-off-by: Bartlomiej Plotka <[email protected]>

bwplotka · 2022-04-13T18:00:19Z

Hi All.

With new optionality, I would propose reverting the addition (by default) of new metrics. Please voice your opinion here: #1033

… metrics by default. (#1033) Fixes #967 Signed-off-by: Bartlomiej Plotka <[email protected]>

v1.12.2 removes some high cardinality metrics introduced in v1.12.0. See prometheus/client_golang#967 for context.

v1.12.2 removes some high cardinality metrics introduced in v1.12.0. See prometheus/client_golang#967 for context. Signed-off-by: Ben Reedy <[email protected]>

* Cut v1.12.0 Signed-off-by: Kemal Akkoyun <[email protected]> * Bump the day Signed-off-by: Kemal Akkoyun <[email protected]> * Make the Go 1.17 collector thread-safe (#969) * Use simpler locking in the Go 1.17 collector (#975) A previous PR made it so that the Go 1.17 collector locked only around uses of rmSampleBuf, but really that means that Metric values may be sent over the channel containing some values from future metrics.Read calls. While generally-speaking this isn't a problem, we lose any consistency guarantees provided by the runtime/metrics package. Also, that optimization to not just lock around all of Collect was premature. Truthfully, Collect is called relatively infrequently, and its critical path is fairly fast (10s of µs). To prove it, this change also adds a benchmark. name old time/op new time/op delta GoCollector-16 43.7µs ± 2% 43.2µs ± 2% ~ (p=0.190 n=9+9) Note that because the benchmark is single-threaded it actually looks like it might be getting *slightly* faster, because all those Collect calls for the Metrics are direct calls instead of interface calls. Signed-off-by: Michael Anthony Knyszek <[email protected]> * API client: make http reads more efficient (#976) Replace `io.ReadAll` with `bytes.Buffer.ReadFrom`. Both need to resize a buffer until they have finished reading; the former increases by 1.25x each time while the latter uses 2x. Also added a benchmark to demonstrate the benefit: name old time/op new time/op delta Client/4KB-8 35.9µs ± 4% 35.3µs ± 3% ~ (p=0.310 n=5+5) Client/50KB-8 83.1µs ± 8% 69.5µs ± 1% -16.37% (p=0.008 n=5+5) Client/1000KB-8 891µs ± 6% 750µs ± 0% -15.83% (p=0.016 n=5+4) Client/2000KB-8 1.74ms ± 2% 1.35ms ± 1% -22.72% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Client/4KB-8 20.2kB ± 0% 20.4kB ± 0% +1.26% (p=0.008 n=5+5) Client/50KB-8 218kB ± 0% 136kB ± 0% -37.65% (p=0.008 n=5+5) Client/1000KB-8 5.88MB ± 0% 2.11MB ± 0% -64.10% (p=0.008 n=5+5) Client/2000KB-8 11.7MB ± 0% 4.2MB ± 0% -63.93% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Client/4KB-8 75.0 ± 0% 72.0 ± 0% -4.00% (p=0.008 n=5+5) Client/50KB-8 109 ± 0% 98 ± 0% -10.09% (p=0.079 n=4+5) Client/1000KB-8 617 ± 0% 593 ± 0% -3.89% (p=0.008 n=5+5) Client/2000KB-8 1.13k ± 0% 1.09k ± 0% -3.27% (p=0.008 n=5+5) Signed-off-by: Bryan Boreham <[email protected]> * Reduce granularity of histogram buckets for Go 1.17 collector (#974) The Go runtime/metrics package currently exports extremely granular histograms. Exponentially bucket any histogram with unit "seconds" or "bytes" instead to dramatically reduce the number of buckets, and thus the number of metrics. This change also adds a test to check for expected cardinality to prevent cardinality surprises in the future. Signed-off-by: Michael Anthony Knyszek <[email protected]> * Cut v1.12.1 (#978) * Cut v1.12.1 Signed-off-by: Kemal Akkoyun <[email protected]> * Apply review suggestions Signed-off-by: Kemal Akkoyun <[email protected]> * Fix deprecated `NewBuildInfoCollector` API Update `examples/random/main.go`: `prometheus.NewBuildInfoCollector` is deprecated. Use `collectors.NewBuildInfoCollector` instead. Signed-off-by: alissa-tung <[email protected]> * gocollector: Added options to Go Collector for changing the (#1031) * Renamed files. Signed-off-by: Bartlomiej Plotka <[email protected]> * gocollector: Added options to Go Collector for diffetent collections. Fixes #983 Also: * fixed TestMemStatsEquivalence, it was noop before (: * Removed gc_cpu_fraction metric completely, since it's not working completely for Go1.17+ Signed-off-by: Bartlomiej Plotka <[email protected]> * gocollector: Reverted client_golang v1.12 addition of runtime/metrics metrics by default. (#1033) Fixes #967 Signed-off-by: Bartlomiej Plotka <[email protected]> * prometheus: Fix convention violating names for generated collector metrics (#1048) * Fix convention violating names for generated collector metrics Signed-off-by: Kemal Akkoyun <[email protected]> * Add new Go collector example Signed-off-by: Kemal Akkoyun <[email protected]> * Remove -Inf buckets from go collector histograms (#1049) * Remove -Inf buckets from go collector histograms Signed-off-by: Kemal Akkoyun <[email protected]> * Update prometheus/collectors/go_collector_latest_test.go Co-authored-by: Bartlomiej Plotka <[email protected]> Signed-off-by: Kemal Akkoyun <[email protected]> * Simplify Signed-off-by: Kemal Akkoyun <[email protected]> Co-authored-by: Bartlomiej Plotka <[email protected]> * Cut v1.12.2 (#1052) * Cut v1.12.2 Signed-off-by: Kemal Akkoyun <[email protected]> * Apply suggestions Signed-off-by: Kemal Akkoyun <[email protected]> * Update CHANGELOG.md Co-authored-by: Bartlomiej Plotka <[email protected]> Signed-off-by: Kemal Akkoyun <[email protected]> Co-authored-by: Bartlomiej Plotka <[email protected]> Co-authored-by: Kemal Akkoyun <[email protected]> Co-authored-by: Michael Knyszek <[email protected]> Co-authored-by: Bryan Boreham <[email protected]> Co-authored-by: Kemal Akkoyun <[email protected]> Co-authored-by: alissa-tung <[email protected]> Co-authored-by: Bartlomiej Plotka <[email protected]>

… metrics by default. (#1033) Fixes prometheus/client_golang#967 Signed-off-by: Bartlomiej Plotka <[email protected]>

* Cut v1.12.0 Signed-off-by: Kemal Akkoyun <[email protected]> * Bump the day Signed-off-by: Kemal Akkoyun <[email protected]> * Make the Go 1.17 collector thread-safe (#969) * Use simpler locking in the Go 1.17 collector (#975) A previous PR made it so that the Go 1.17 collector locked only around uses of rmSampleBuf, but really that means that Metric values may be sent over the channel containing some values from future metrics.Read calls. While generally-speaking this isn't a problem, we lose any consistency guarantees provided by the runtime/metrics package. Also, that optimization to not just lock around all of Collect was premature. Truthfully, Collect is called relatively infrequently, and its critical path is fairly fast (10s of µs). To prove it, this change also adds a benchmark. name old time/op new time/op delta GoCollector-16 43.7µs ± 2% 43.2µs ± 2% ~ (p=0.190 n=9+9) Note that because the benchmark is single-threaded it actually looks like it might be getting *slightly* faster, because all those Collect calls for the Metrics are direct calls instead of interface calls. Signed-off-by: Michael Anthony Knyszek <[email protected]> * API client: make http reads more efficient (#976) Replace `io.ReadAll` with `bytes.Buffer.ReadFrom`. Both need to resize a buffer until they have finished reading; the former increases by 1.25x each time while the latter uses 2x. Also added a benchmark to demonstrate the benefit: name old time/op new time/op delta Client/4KB-8 35.9µs ± 4% 35.3µs ± 3% ~ (p=0.310 n=5+5) Client/50KB-8 83.1µs ± 8% 69.5µs ± 1% -16.37% (p=0.008 n=5+5) Client/1000KB-8 891µs ± 6% 750µs ± 0% -15.83% (p=0.016 n=5+4) Client/2000KB-8 1.74ms ± 2% 1.35ms ± 1% -22.72% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Client/4KB-8 20.2kB ± 0% 20.4kB ± 0% +1.26% (p=0.008 n=5+5) Client/50KB-8 218kB ± 0% 136kB ± 0% -37.65% (p=0.008 n=5+5) Client/1000KB-8 5.88MB ± 0% 2.11MB ± 0% -64.10% (p=0.008 n=5+5) Client/2000KB-8 11.7MB ± 0% 4.2MB ± 0% -63.93% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Client/4KB-8 75.0 ± 0% 72.0 ± 0% -4.00% (p=0.008 n=5+5) Client/50KB-8 109 ± 0% 98 ± 0% -10.09% (p=0.079 n=4+5) Client/1000KB-8 617 ± 0% 593 ± 0% -3.89% (p=0.008 n=5+5) Client/2000KB-8 1.13k ± 0% 1.09k ± 0% -3.27% (p=0.008 n=5+5) Signed-off-by: Bryan Boreham <[email protected]> * Reduce granularity of histogram buckets for Go 1.17 collector (#974) The Go runtime/metrics package currently exports extremely granular histograms. Exponentially bucket any histogram with unit "seconds" or "bytes" instead to dramatically reduce the number of buckets, and thus the number of metrics. This change also adds a test to check for expected cardinality to prevent cardinality surprises in the future. Signed-off-by: Michael Anthony Knyszek <[email protected]> * Cut v1.12.1 (#978) * Cut v1.12.1 Signed-off-by: Kemal Akkoyun <[email protected]> * Apply review suggestions Signed-off-by: Kemal Akkoyun <[email protected]> * Fix deprecated `NewBuildInfoCollector` API Update `examples/random/main.go`: `prometheus.NewBuildInfoCollector` is deprecated. Use `collectors.NewBuildInfoCollector` instead. Signed-off-by: alissa-tung <[email protected]> * gocollector: Added options to Go Collector for changing the (#1031) * Renamed files. Signed-off-by: Bartlomiej Plotka <[email protected]> * gocollector: Added options to Go Collector for diffetent collections. Fixes prometheus/client_golang#983 Also: * fixed TestMemStatsEquivalence, it was noop before (: * Removed gc_cpu_fraction metric completely, since it's not working completely for Go1.17+ Signed-off-by: Bartlomiej Plotka <[email protected]> * gocollector: Reverted client_golang v1.12 addition of runtime/metrics metrics by default. (#1033) Fixes prometheus/client_golang#967 Signed-off-by: Bartlomiej Plotka <[email protected]> * prometheus: Fix convention violating names for generated collector metrics (#1048) * Fix convention violating names for generated collector metrics Signed-off-by: Kemal Akkoyun <[email protected]> * Add new Go collector example Signed-off-by: Kemal Akkoyun <[email protected]> * Remove -Inf buckets from go collector histograms (#1049) * Remove -Inf buckets from go collector histograms Signed-off-by: Kemal Akkoyun <[email protected]> * Update prometheus/collectors/go_collector_latest_test.go Co-authored-by: Bartlomiej Plotka <[email protected]> Signed-off-by: Kemal Akkoyun <[email protected]> * Simplify Signed-off-by: Kemal Akkoyun <[email protected]> Co-authored-by: Bartlomiej Plotka <[email protected]> * Cut v1.12.2 (#1052) * Cut v1.12.2 Signed-off-by: Kemal Akkoyun <[email protected]> * Apply suggestions Signed-off-by: Kemal Akkoyun <[email protected]> * Update CHANGELOG.md Co-authored-by: Bartlomiej Plotka <[email protected]> Signed-off-by: Kemal Akkoyun <[email protected]> Co-authored-by: Bartlomiej Plotka <[email protected]> Co-authored-by: Kemal Akkoyun <[email protected]> Co-authored-by: Michael Knyszek <[email protected]> Co-authored-by: Bryan Boreham <[email protected]> Co-authored-by: Kemal Akkoyun <[email protected]> Co-authored-by: alissa-tung <[email protected]> Co-authored-by: Bartlomiej Plotka <[email protected]>

v1.12.2 removes some high cardinality metrics introduced in v1.12.0. See prometheus/client_golang#967 for context. Signed-off-by: Ben Reedy <[email protected]>

kakkoyun mentioned this issue Jan 25, 2022

Reduce granularity of histogram buckets for Go 1.17 collector #974

Merged

kakkoyun closed this as completed in #974 Jan 28, 2022

bwplotka mentioned this issue Feb 5, 2022

Add option to Go process metrics to fallback to pre v1.12.0 behaviour. #983

Closed

This was referenced Mar 4, 2022

go metrics are tightly coupled with used go version #994

Open

Invalid le label value on go_gc_pauses_seconds_total_bucket #995

Closed

bwplotka added a commit that referenced this issue Apr 13, 2022

gocollector: Reverted client_golang v1.12 addition of runtime/metrics…

0ecdf4c

… metrics by default. Fixes #967 Signed-off-by: Bartlomiej Plotka <[email protected]>

bwplotka mentioned this issue Apr 13, 2022

gocollector: Reverted client_golang v1.12 addition of runtime/metrics metrics by default #1033

Merged

bwplotka reopened this Apr 13, 2022

kakkoyun closed this as completed in #1033 Apr 13, 2022

kakkoyun pushed a commit that referenced this issue Apr 13, 2022

gocollector: Reverted client_golang v1.12 addition of runtime/metrics…

11ee9ad

… metrics by default. (#1033) Fixes #967 Signed-off-by: Bartlomiej Plotka <[email protected]>

kakkoyun pushed a commit that referenced this issue May 13, 2022

gocollector: Reverted client_golang v1.12 addition of runtime/metrics…

7eb9d11

… metrics by default. (#1033) Fixes #967 Signed-off-by: Bartlomiej Plotka <[email protected]>

breed808 added a commit to prometheus-community/windows_exporter that referenced this issue May 16, 2022

Adjust e2e output for client_golang-1.12.2

77c989d

v1.12.2 removes some high cardinality metrics introduced in v1.12.0. See prometheus/client_golang#967 for context.

Cori1109 added a commit to Cori1109/client_golang that referenced this issue Jan 9, 2023

gocollector: Reverted client_golang v1.12 addition of runtime/metrics…

14ee7f9

… metrics by default. (#1033) Fixes prometheus/client_golang#967 Signed-off-by: Bartlomiej Plotka <[email protected]>

Cori1109 added a commit to Cori1109/client_golang that referenced this issue Jan 9, 2023

gocollector: Reverted client_golang v1.12 addition of runtime/metrics…

f834ca6

… metrics by default. (#1033) Fixes prometheus/client_golang#967 Signed-off-by: Bartlomiej Plotka <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.12.0 exposes some high cardinality metrics #967

v1.12.0 exposes some high cardinality metrics #967

prymitive commented Jan 19, 2022

kakkoyun commented Jan 20, 2022

kakkoyun commented Jan 20, 2022

beorn7 commented Jan 20, 2022

mknyszek commented Jan 20, 2022

mknyszek commented Jan 20, 2022

beorn7 commented Jan 20, 2022

mknyszek commented Jan 20, 2022

prymitive commented Jan 20, 2022

mknyszek commented Jan 20, 2022

prymitive commented Jan 20, 2022

mknyszek commented Jan 20, 2022

beorn7 commented Jan 20, 2022

mknyszek commented Jan 20, 2022 •

edited

Loading

kakkoyun commented Jan 21, 2022 •

edited

Loading

mknyszek commented Jan 21, 2022 •

edited

Loading

bwplotka commented Jan 21, 2022

beorn7 commented Jan 21, 2022

mknyszek commented Jan 21, 2022

jameshartig commented Jan 23, 2022 •

edited

Loading

beorn7 commented Jan 23, 2022

mknyszek commented Jan 24, 2022

bwplotka commented Jan 25, 2022 •

edited

Loading

mknyszek commented Jan 28, 2022

beorn7 commented Jan 28, 2022

kakkoyun commented Jan 29, 2022 via email

prymitive commented Feb 3, 2022

beorn7 commented Feb 3, 2022

mknyszek commented Feb 3, 2022

bwplotka commented Feb 5, 2022

bwplotka commented Apr 13, 2022

v1.12.0 exposes some high cardinality metrics #967

v1.12.0 exposes some high cardinality metrics #967

Comments

prymitive commented Jan 19, 2022

kakkoyun commented Jan 20, 2022

kakkoyun commented Jan 20, 2022

beorn7 commented Jan 20, 2022

mknyszek commented Jan 20, 2022

mknyszek commented Jan 20, 2022

beorn7 commented Jan 20, 2022

mknyszek commented Jan 20, 2022

prymitive commented Jan 20, 2022

mknyszek commented Jan 20, 2022

prymitive commented Jan 20, 2022

mknyszek commented Jan 20, 2022

beorn7 commented Jan 20, 2022

mknyszek commented Jan 20, 2022 • edited Loading

kakkoyun commented Jan 21, 2022 • edited Loading

mknyszek commented Jan 21, 2022 • edited Loading

bwplotka commented Jan 21, 2022

beorn7 commented Jan 21, 2022

mknyszek commented Jan 21, 2022

jameshartig commented Jan 23, 2022 • edited Loading

beorn7 commented Jan 23, 2022

mknyszek commented Jan 24, 2022

bwplotka commented Jan 25, 2022 • edited Loading

mknyszek commented Jan 28, 2022

beorn7 commented Jan 28, 2022

kakkoyun commented Jan 29, 2022 via email

prymitive commented Feb 3, 2022

beorn7 commented Feb 3, 2022

mknyszek commented Feb 3, 2022

bwplotka commented Feb 5, 2022

bwplotka commented Apr 13, 2022

mknyszek commented Jan 20, 2022 •

edited

Loading

kakkoyun commented Jan 21, 2022 •

edited

Loading

mknyszek commented Jan 21, 2022 •

edited

Loading

jameshartig commented Jan 23, 2022 •

edited

Loading

bwplotka commented Jan 25, 2022 •

edited

Loading