
High Latency Metrics Collection on oDAO node #726

Open
mendelskiv93 opened this issue Jan 8, 2025 · 3 comments

Comments

mendelskiv93 commented Jan 8, 2025

A performance issue was observed on an oDAO node: the metrics endpoint takes an excessive amount of time to respond, which suggests metrics are collected on demand at query time rather than maintained continuously.

Evidence:

  • Metrics endpoint response times:

    • from localhost:
      time curl -s 0:9102/metrics  0.00s user 0.01s system 0% cpu 19.347 total
      
    • from prometheus slave:
      time curl http://10.13.0.58:9102/metrics  0.00s user 0.01s system 0% cpu 44.452 total
      
  • Impact visible in monitoring:

    • Significant increase in TCP sockets in the TIME_WAIT state
    • File descriptors for rocketpool process show elevated numbers
    • No corresponding increase in system load

[Monitoring screenshots: TCP TIME_WAIT socket counts and rocketpool process file descriptor counts]

Suggested improvement:
Consider implementing continuous metric collection instead of on-demand gathering during scrape requests to reduce response latency.
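For illustration, a minimal sketch of the background-collection pattern with prometheus/client_golang is below. This is not the Rocket Pool implementation; the metric name, refresh interval, and the queryNodeBalance helper are hypothetical. It only shows the general structure in which the scrape handler serves cached gauge values instead of doing expensive work per request.

```go
package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Example gauge; the real smartnode exposes many more metrics.
var nodeBalance = prometheus.NewGauge(prometheus.GaugeOpts{
	Namespace: "rocketpool",
	Name:      "node_balance_eth",
	Help:      "Node ETH balance (example metric).",
})

// queryNodeBalance is a hypothetical stand-in for whatever slow
// RPC/contract calls are currently made during a scrape.
func queryNodeBalance() float64 {
	return 0 // placeholder
}

func main() {
	prometheus.MustRegister(nodeBalance)

	// Background loop: refresh expensive values on a fixed interval,
	// so /metrics only serializes already-cached values.
	go func() {
		for {
			nodeBalance.Set(queryNodeBalance())
			time.Sleep(5 * time.Minute)
		}
	}()

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":9102", nil)
}
```

With this layout, a scrape only serializes the registry, so response time stays low regardless of how long the background refresh takes.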

jakubgs commented Jan 9, 2025

It is worth mentioning this is happening on an oDAO node.

@mendelskiv93 mendelskiv93 changed the title High Latency Metrics Collection High Latency Metrics Collection on oDAO node Jan 9, 2025
jshufro (Contributor) commented Jan 9, 2025

Thanks for the report.

The metrics collection code is quite old and has always had some less-than-ideal qualities (e.g. #186).

I think we should probably rewrite a lot of it. I'll take a look into the performance regression.

Unfortunately it might have to wait a bit, as we're in the middle of merging a very large refactor.

mendelskiv93 (Author) commented
No worries, we managed to work around this. Thanks for looking into it.
