Check Prometheus metrics of coredns pod and determine health/issues with DNS in EKS cluster #2

Open
joshisumit opened this issue Jul 1, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@joshisumit
Owner

joshisumit commented Jul 1, 2020

CoreDNS exports basic process and Go runtime metrics, as well as CoreDNS-specific metrics, on port 9153 at the /metrics path.

https://coredns.io/plugins/metrics/

Get metrics from CoreDNS in real time to visualize and monitor DNS failures and cache hits/misses.

Important Metrics:

  1. coredns_dns_request_count_total - shows how busy CoreDNS is. Break it down by zone, protocol and family to look deeper at how requests are being handled.
    # HELP coredns_dns_request_count_total Counter of DNS requests made per zone, protocol and family.
    # TYPE coredns_dns_request_count_total counter
    coredns_dns_request_count_total{family="1",proto="udp",server="dns://:53",zone="."} 957
  2. coredns_dns_request_duration_seconds - shows how much DNS latency contributes to overall, user-facing response time.
    # HELP coredns_dns_request_duration_seconds Histogram of the time (in seconds) each request took.
    # TYPE coredns_dns_request_duration_seconds histogram
    coredns_dns_request_duration_seconds_sum{server="dns://:53",zone="."} 0.6624103219999997
    coredns_dns_request_duration_seconds_count{server="dns://:53",zone="."} 957
  3. coredns_dns_response_rcode_count_total - counts DNS responses by response code.
    Errors like NXDomain and FormErr can reveal a problem with the requests CoreDNS is receiving, while a ServFail error could indicate an issue with the CoreDNS server itself.
    # HELP coredns_dns_response_rcode_count_total Counter of response status codes.
    # TYPE coredns_dns_response_rcode_count_total counter
    coredns_dns_response_rcode_count_total{rcode="NOERROR",server="dns://:53",zone="."} 521
    coredns_dns_response_rcode_count_total{rcode="NXDOMAIN",server="dns://:53",zone="."} 436
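
As a rough sketch of how the three metrics above could be turned into a health summary, the Python below parses the sample output quoted in this issue and derives an average request latency and a per-rcode ratio. The function names (parse_metrics, summarize) are made up for illustration and are not part of this project. The metric names are the ones shown in the sample; newer CoreDNS releases (1.7.0 and later, as far as I know) renamed the request and response counters to coredns_dns_requests_total and coredns_dns_responses_total, so the names would need adjusting there.

```python
import re
from collections import defaultdict

# Sample lines copied from the /metrics output quoted in this issue.
SAMPLE = """\
# HELP coredns_dns_request_count_total Counter of DNS requests made per zone, protocol and family.
# TYPE coredns_dns_request_count_total counter
coredns_dns_request_count_total{family="1",proto="udp",server="dns://:53",zone="."} 957
coredns_dns_request_duration_seconds_sum{server="dns://:53",zone="."} 0.6624103219999997
coredns_dns_request_duration_seconds_count{server="dns://:53",zone="."} 957
coredns_dns_response_rcode_count_total{rcode="NOERROR",server="dns://:53",zone="."} 521
coredns_dns_response_rcode_count_total{rcode="NXDOMAIN",server="dns://:53",zone="."} 436
"""

# metric_name{label="value",...} value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def summarize(samples):
    """Derive average latency and the share of each response code."""
    rcodes = defaultdict(float)
    duration_sum = 0.0
    duration_count = 0.0
    for name, labels, value in samples:
        if name == "coredns_dns_response_rcode_count_total":
            rcodes[labels.get("rcode", "UNKNOWN")] += value
        elif name == "coredns_dns_request_duration_seconds_sum":
            duration_sum += value
        elif name == "coredns_dns_request_duration_seconds_count":
            duration_count += value
    total = sum(rcodes.values())
    return {
        "avg_latency_seconds": duration_sum / duration_count if duration_count else None,
        "rcode_ratio": {rc: n / total for rc, n in rcodes.items()} if total else {},
    }


summary = summarize(parse_metrics(SAMPLE))
print(summary)
```

For the sample above this works out to an average latency of roughly 0.69 ms and about 46% NXDOMAIN responses. These counters are cumulative since the CoreDNS process started, so the numbers are lifetime averages; comparing two scrapes taken some seconds apart would give a better picture of current behaviour.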
@github-actions

github-actions bot commented Jul 1, 2020

Hello @joshisumit, thank you for submitting an issue!

@joshisumit joshisumit added the enhancement New feature or request label Jul 1, 2020