Compact output metrics #4167

Open
jgaalen opened this issue Jan 9, 2025 · 6 comments

@jgaalen

jgaalen commented Jan 9, 2025

Currently, k6 writes far too many values per request to InfluxDB (or whichever output you use), which causes very high overhead in network bandwidth, CPU and storage. Per request it stores 4 lines (data sent, received, duration and failed).

It would be far more efficient to store a single metric per request that holds all the data, such as data sent, data received, status (failed true/false), duration, URL, etc.

@inancgumus
Member

Hi @jgaalen, this issue sounds similar to #1321. Can you confirm?

@joanlopez
Contributor

> It would be incredibly more efficient to be able to store a single metric per requests, which holds all the data such as the data sent, received, status (failed true/false), duration and url, etc..

Could you explain this in more detail, please? @jgaalen

What do you mean by "a single metric per requests, which holds all the data such as the data"?

I'd love to see some concrete examples of what that data would look like in plaintext.

Thanks! 🙇🏻

@jgaalen
Author

jgaalen commented Jan 10, 2025

OK, I just made a single request to 'www.google.com' to show what is stored:

| metric_name | timestamp | metric_value | check | error | error_code | expected_response | group | method | name | proto | scenario | service | status | subproto | tls_version | url | extra_tags | metadata |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http_reqs | 1736496755038080000 | 1 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_duration | 1736496755038080000 | 64.557 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_blocked | 1736496755038080000 | 0 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_connecting | 1736496755038080000 | 0 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_tls_handshaking | 1736496755038080000 | 0 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_sending | 1736496755038080000 | 0.03 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_waiting | 1736496755038080000 | 63.441 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_receiving | 1736496755038080000 | 1.086 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |
| http_req_failed | 1736496755038080000 | 0 | | | | TRUE | | GET | https://www.google.com | HTTP/2.0 | sample | | 200 | | TLS1.3 | https://www.google.com | | |

For k6/http, these are 9 separate writes (to CSV, InfluxDB, and probably other outputs as well), all at the same timestamp for the same request. It would save considerably more resources if this were just a single line. Keep the same tags, but store some of them as values.

If you output it like this, it would be a big saving in resources (especially for InfluxDB, which can come under heavy load from all the writes):

| type | timestamp | group | scenario | service | name | method | url | http_req_duration | http_req_blocked | http_req_connecting | http_req_tls_handshaking | http_req_sending | http_req_waiting | http_req_receiving | http_req_success |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http_req | 1736496755038080000 | | | | | GET | https://www.google.com | 64.557 | 0 | 0 | 0 | 0.03 | 63.44 | 1.086 | TRUE |

Perhaps some more columns/values, such as: sent bytes, received bytes (per request)
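The proposal above amounts to pivoting narrow per-metric samples into one wide row per request. A minimal sketch of that pivot, assuming samples are plain tuples (this is not k6's internal sample type, and the tag set is abbreviated for readability):

```python
# Hypothetical narrow samples: (metric_name, timestamp_ns, value, tags).
# Values come from the example request above; the tuple layout is an assumption.
TAGS = {"method": "GET", "url": "https://www.google.com", "status": "200"}
TS = 1736496755038080000

narrow = [
    ("http_reqs", TS, 1, TAGS),
    ("http_req_duration", TS, 64.557, TAGS),
    ("http_req_blocked", TS, 0, TAGS),
    ("http_req_connecting", TS, 0, TAGS),
    ("http_req_tls_handshaking", TS, 0, TAGS),
    ("http_req_sending", TS, 0.03, TAGS),
    ("http_req_waiting", TS, 63.441, TAGS),
    ("http_req_receiving", TS, 1.086, TAGS),
    ("http_req_failed", TS, 0, TAGS),
]


def pivot(samples):
    """Merge samples that share a timestamp and tag set into one wide row."""
    rows = {}
    for metric, ts, value, tags in samples:
        key = (ts, tuple(sorted(tags.items())))
        row = rows.setdefault(key, {"timestamp": ts, "tags": dict(tags), "fields": {}})
        row["fields"][metric] = value
    return list(rows.values())


wide = pivot(narrow)
print(len(narrow), "narrow rows ->", len(wide), "wide row")
```

Grouping on timestamp plus tags only works when those two together identify a request; two concurrent requests with identical tags and timestamps would be merged, so a real implementation would likely need an explicit request identifier.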

@joanlopez
Contributor

Hey @jgaalen,

Thanks for your detailed explanation, that makes more sense now.

However, I think what you're suggesting, which is basically having multiple values (not tags/labels) for a single measurement/sample, is not something that can be generalized, because it's very specific to the InfluxDB metrics model.

Please note the importance of distinguishing between a value and a tag/label here, especially in the context of TSDBs: tags/labels are normally used for filtering data, and values for the actual calculations (percentiles, averages, means, etc.).

In comparison, most of the widely used models, like Prometheus, OpenMetrics or OpenTelemetry, only allow a single value for each measurement/sample, so there's no way to apply the pattern you're suggesting. Well, yes, it could be "doable" by storing the other values as tag/label values, but that would make them useless for calculations, which makes no sense. Not to mention the explosion of cardinality that would cause.
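The cardinality point can be illustrated with a small count of distinct label sets. This is a sketch on synthetic data, assuming each distinct label set becomes its own time series (as in label-based models); the field names are hypothetical:

```python
def series_count(samples, label_keys):
    """Count distinct label sets, i.e. how many separate time series would exist."""
    return len({tuple((k, s[k]) for k in label_keys) for s in samples})


# Synthetic data: 1000 requests to the same URL, each with a slightly
# different duration (in microseconds, so every value is distinct).
samples = [
    {"method": "GET", "url": "https://www.google.com", "duration_us": 60000 + i}
    for i in range(1000)
]

as_value = series_count(samples, ["method", "url"])
as_label = series_count(samples, ["method", "url", "duration_us"])
print("duration as value:", as_value, "series")
print("duration as label:", as_label, "series")
```

With the duration kept as a value, all requests land in one series; moved into the label set, nearly every request creates a new series, and the duration can no longer be averaged or turned into a percentile.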

That said, I think what you're suggesting could be interesting to explore for the InfluxDB output. But as we mentioned previously, we're no longer maintaining the InfluxDB output, and will most likely leave it open for the community to maintain.

Finally, there may be a way to use part of your idea to make the model k6 uses to store metrics internally more efficient, but we're still doing some research there. We'll take it into account, but generally speaking our path forward is to be compliant with Prometheus and/or move towards OpenTelemetry.

Anyway, thanks for bringing up your idea, and feel free to pursue that kind of improvement for the InfluxDB extension.

@jgaalen
Author

jgaalen commented Jan 10, 2025

Thank you for your explanation. I understand that some TSDBs only allow a single value per unique tag combination.

Perhaps the output code could somehow bundle all the values in a single object. Then it is up to the output writer to either emit one value per tag combination or combine them. That way CSV, InfluxDB and TimescaleDB outputs can take advantage of combined values to reduce resource usage (imagine how much network, storage, CPU and memory is saved, and how much faster queries become, if you can combine values).
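One way to picture the bundling idea is a single per-request object that each writer renders in its own way. The sketch below is hypothetical: `RequestBundle`, `write_split` and `write_combined` are illustrative names, not part of k6, and the output strings only approximate InfluxDB line protocol (tag escaping is omitted, and the field name `value` for the split form is an assumption):

```python
from dataclasses import dataclass


@dataclass
class RequestBundle:
    """Hypothetical container holding every value measured for one request."""
    timestamp_ns: int
    tags: dict
    values: dict


def _tag_str(tags):
    # Sorted for stable output; escaping of spaces, commas and '=' is omitted.
    return ",".join(f"{k}={v}" for k, v in sorted(tags.items()))


def write_split(bundle):
    """For single-value backends: one line per metric."""
    tags = _tag_str(bundle.tags)
    return [
        f"{name},{tags} value={val} {bundle.timestamp_ns}"
        for name, val in bundle.values.items()
    ]


def write_combined(bundle, measurement="http_req"):
    """For multi-field backends: one line carrying all values as fields."""
    fields = ",".join(f"{name}={val}" for name, val in bundle.values.items())
    return f"{measurement},{_tag_str(bundle.tags)} {fields} {bundle.timestamp_ns}"


bundle = RequestBundle(
    timestamp_ns=1736496755038080000,
    tags={"method": "GET", "status": "200"},
    values={
        "http_req_duration": 64.557,
        "http_req_waiting": 63.441,
        "http_req_receiving": 1.086,
    },
)

split_lines = write_split(bundle)
combined_line = write_combined(bundle)
print(len(split_lines), "lines vs 1 line:")
print(combined_line)
```

The split writer repeats the tag set and timestamp on every line, while the combined writer emits them once; that repetition is where the bandwidth difference described above would come from.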

@joanlopez
Contributor

> imagine how much network, storage, cpu and memory is saved, as well as speed up queries if you can combine values

There's always a trade-off, and speaking of flexibility, having a simpler model is likely more flexible.

Speaking of resource utilization, k6 aggregates most metrics when storing them in memory, except for Trends (i.e. histograms), for which, as I said, we're already exploring more efficient storage, although it has low priority on our backlog. And even in that case, the main problem is the number of values to store rather than the structure.

Finally, if your concern is network, you can do that simple aggregation/transformation on the extension side, which still runs on the k6 side, before metrics are emitted to the backend (e.g. InfluxDB). If all the values come from the same event (e.g. an HTTP request), they will likely arrive together through the samples channel, or very close in time, so you would only need to hold them in memory for a very short period, and you could flush them very frequently.
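The short-lived buffering described above could look roughly like the following. This is a sketch under stated assumptions, not k6's output extension API: `RequestBuffer` is a hypothetical class, samples are passed as plain arguments, and the clock is injected as a `now` value in seconds so the behavior is easy to follow:

```python
class RequestBuffer:
    """Hold samples briefly, grouped by (timestamp, tags), then flush wide rows."""

    def __init__(self, max_age_s=0.1):
        self.max_age_s = max_age_s
        self._pending = {}  # key -> (first_seen, fields)

    def add(self, metric, timestamp_ns, value, tags, now):
        key = (timestamp_ns, tuple(sorted(tags.items())))
        _, fields = self._pending.setdefault(key, (now, {}))
        fields[metric] = value

    def flush(self, now):
        """Emit and drop every group first seen at least max_age_s ago."""
        ready = [
            key for key, (seen, _) in self._pending.items()
            if now - seen >= self.max_age_s
        ]
        rows = []
        for key in ready:
            _, fields = self._pending.pop(key)
            ts, tags = key
            rows.append({"timestamp": ts, "tags": dict(tags), "fields": fields})
        return rows


buf = RequestBuffer(max_age_s=0.1)
tags = {"method": "GET", "url": "https://www.google.com"}
ts = 1736496755038080000

buf.add("http_req_duration", ts, 64.557, tags, now=0.00)
buf.add("http_req_waiting", ts, 63.441, tags, now=0.01)

early = buf.flush(now=0.05)  # group is still too young, nothing is emitted
rows = buf.flush(now=0.20)   # group is old enough, one wide row is emitted
print(early, rows)
```

The `max_age_s` window trades a small delay and a little memory for fewer writes. As with any grouping by timestamp and tags, samples from two different requests that happen to share both would be merged, so a production version would likely need a per-request key.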

@inancgumus inancgumus removed their assignment Jan 10, 2025