From e9ae7770ea685336aa9b2aa0b9f802b028f9aa76 Mon Sep 17 00:00:00 2001 From: Alejandro Pedraza Date: Thu, 30 Sep 2021 11:01:19 -0500 Subject: [PATCH] New content dir for 2.11 docs (#1178) * New content dir for 2.11 docs `/content/2.11` is basically a copy of `/content/2.10`, where all new content pertaining to 2.11 should reside. I tried to remove all direct references to `2.10` and made them relative dir references instead, so this process gets easier as new major versions are released. --- linkerd.io/config.toml | 2 +- linkerd.io/content/1/getting-started/k8s.md | 2 +- .../2.10/features/protocol-detection.md | 104 +- linkerd.io/content/2.11/_index.md | 6 + linkerd.io/content/2.11/checks/index.html | 18 + linkerd.io/content/2.11/features/_index.md | 14 + .../content/2.11/features/automatic-mtls.md | 127 + linkerd.io/content/2.11/features/cni.md | 106 + linkerd.io/content/2.11/features/dashboard.md | 127 + .../2.11/features/distributed-tracing.md | 59 + .../content/2.11/features/fault-injection.md | 12 + linkerd.io/content/2.11/features/ha.md | 162 ++ linkerd.io/content/2.11/features/http-grpc.md | 21 + linkerd.io/content/2.11/features/ingress.md | 14 + .../content/2.11/features/load-balancing.md | 37 + .../content/2.11/features/multicluster.md | 104 + .../2.11/features/protocol-detection.md | 131 + .../content/2.11/features/proxy-injection.md | 64 + .../2.11/features/retries-and-timeouts.md | 77 + .../content/2.11/features/server-policy.md | 105 + .../content/2.11/features/service-profiles.md | 33 + linkerd.io/content/2.11/features/telemetry.md | 79 + .../content/2.11/features/traffic-split.md | 36 + .../content/2.11/getting-started/_index.md | 258 ++ linkerd.io/content/2.11/overview/_index.md | 69 + linkerd.io/content/2.11/reference/_index.md | 6 + .../content/2.11/reference/architecture.md | 155 ++ .../2.11/reference/authorization-policy.md | 256 ++ .../content/2.11/reference/cli/_index.md | 21 + .../content/2.11/reference/cli/check.md | 72 + .../content/2.11/reference/cli/completion.md | 9 + .../content/2.11/reference/cli/diagnostics.md | 48 + .../content/2.11/reference/cli/identity.md | 9 + .../content/2.11/reference/cli/inject.md | 27 + .../content/2.11/reference/cli/install-cni.md | 9 + .../content/2.11/reference/cli/install.md | 33 + .../content/2.11/reference/cli/jaeger.md | 51 + .../2.11/reference/cli/multicluster.md | 67 + .../content/2.11/reference/cli/profile.md | 13 + .../content/2.11/reference/cli/repair.md | 9 + .../content/2.11/reference/cli/uninject.md | 9 + .../content/2.11/reference/cli/uninstall.md | 9 + .../content/2.11/reference/cli/upgrade.md | 30 + .../content/2.11/reference/cli/version.md | 9 + linkerd.io/content/2.11/reference/cli/viz.md | 167 ++ .../2.11/reference/cluster-configuration.md | 77 + .../content/2.11/reference/extension-list.md | 14 + linkerd.io/content/2.11/reference/iptables.md | 198 ++ .../2.11/reference/proxy-configuration.md | 55 + .../content/2.11/reference/proxy-log-level.md | 39 + .../content/2.11/reference/proxy-metrics.md | 206 ++ .../2.11/reference/service-profiles.md | 135 + linkerd.io/content/2.11/tasks/_index.md | 15 + .../content/2.11/tasks/adding-your-service.md | 109 + ...-rotating-control-plane-tls-credentials.md | 234 ++ ...ically-rotating-webhook-tls-credentials.md | 323 +++ linkerd.io/content/2.11/tasks/books.md | 472 ++++ .../content/2.11/tasks/canary-release.md | 298 ++ .../tasks/configuring-proxy-concurrency.md | 131 + .../content/2.11/tasks/configuring-retries.md | 81 + .../2.11/tasks/configuring-timeouts.md | 40 + 
.../content/2.11/tasks/customize-install.md | 145 + .../content/2.11/tasks/debugging-502s.md | 75 + .../2.11/tasks/debugging-your-service.md | 64 + .../content/2.11/tasks/distributed-tracing.md | 273 ++ .../content/2.11/tasks/exporting-metrics.md | 165 ++ .../content/2.11/tasks/exposing-dashboard.md | 248 ++ linkerd.io/content/2.11/tasks/extensions.md | 77 + .../content/2.11/tasks/external-prometheus.md | 160 ++ .../content/2.11/tasks/fault-injection.md | 197 ++ .../2.11/tasks/generate-certificates.md | 86 + .../2.11/tasks/getting-per-route-metrics.md | 93 + linkerd.io/content/2.11/tasks/gitops.md | 534 ++++ .../content/2.11/tasks/graceful-shutdown.md | 61 + linkerd.io/content/2.11/tasks/install-helm.md | 159 ++ linkerd.io/content/2.11/tasks/install.md | 155 ++ .../2.11/tasks/installing-multicluster.md | 363 +++ linkerd.io/content/2.11/tasks/linkerd-smi.md | 218 ++ ...-rotating-control-plane-tls-credentials.md | 335 +++ .../2.11/tasks/modifying-proxy-log-level.md | 38 + .../tasks/multicluster-using-statefulsets.md | 336 +++ linkerd.io/content/2.11/tasks/multicluster.md | 520 ++++ .../tasks/replacing_expired_certificates.md | 123 + .../content/2.11/tasks/restricting-access.md | 157 ++ .../tasks/rotating_webhooks_certificates.md | 103 + .../2.11/tasks/securing-your-cluster.md | 220 ++ .../2.11/tasks/setting-up-service-profiles.md | 148 + .../content/2.11/tasks/troubleshooting.md | 2476 +++++++++++++++++ .../2.11/tasks/uninstall-multicluster.md | 41 + linkerd.io/content/2.11/tasks/uninstall.md | 52 + .../2.11/tasks/upgrade-multicluster.md | 109 + linkerd.io/content/2.11/tasks/upgrade.md | 1030 +++++++ .../upgrading-2.10-ports-and-protocols.md | 121 + .../using-a-private-docker-repository.md | 52 + .../content/2.11/tasks/using-custom-domain.md | 31 + .../2.11/tasks/using-debug-endpoints.md | 59 + .../content/2.11/tasks/using-ingress.md | 540 ++++ linkerd.io/content/2.11/tasks/using-psp.md | 121 + .../2.11/tasks/using-the-debug-container.md | 102 + .../2.11/tasks/validating-your-traffic.md | 150 + linkerd.io/content/_index.md | 2 +- .../content/blog/announcing-linkerd-2-11.md | 4 +- .../content/design-principles/_index.md | 8 +- linkerd.io/content/faq/_index.md | 6 +- linkerd.io/content/gsoc.md | 2 +- linkerd.io/layouts/2.11/list.html | 3 + linkerd.io/layouts/2.11/single.html | 3 + .../choose-your-platform.html | 4 +- linkerd.io/layouts/partials/1/sidebar.html | 3 + linkerd.io/layouts/partials/cyp.html | 2 +- linkerd.io/layouts/partials/docs.html | 6 +- linkerd.io/layouts/partials/faqs.html | 2 +- linkerd.io/layouts/partials/footer_old.html | 9 - linkerd.io/layouts/partials/get-started.html | 2 +- linkerd.io/layouts/partials/nav.html | 2 +- linkerd.io/layouts/partials/sidebar-2.html | 13 +- linkerd.io/static/checks/index.html | 6 +- linkerd.io/static/dns-rebinding/index.html | 6 +- linkerd.io/static/next-steps/index.html | 6 +- linkerd.io/static/tap-rbac/index.html | 6 +- linkerd.io/static/upgrade/index.html | 6 +- 121 files changed, 15142 insertions(+), 99 deletions(-) create mode 100644 linkerd.io/content/2.11/_index.md create mode 100644 linkerd.io/content/2.11/checks/index.html create mode 100644 linkerd.io/content/2.11/features/_index.md create mode 100644 linkerd.io/content/2.11/features/automatic-mtls.md create mode 100644 linkerd.io/content/2.11/features/cni.md create mode 100644 linkerd.io/content/2.11/features/dashboard.md create mode 100644 linkerd.io/content/2.11/features/distributed-tracing.md create mode 100644 linkerd.io/content/2.11/features/fault-injection.md create mode 100644 
linkerd.io/content/2.11/features/ha.md create mode 100644 linkerd.io/content/2.11/features/http-grpc.md create mode 100644 linkerd.io/content/2.11/features/ingress.md create mode 100644 linkerd.io/content/2.11/features/load-balancing.md create mode 100644 linkerd.io/content/2.11/features/multicluster.md create mode 100644 linkerd.io/content/2.11/features/protocol-detection.md create mode 100644 linkerd.io/content/2.11/features/proxy-injection.md create mode 100644 linkerd.io/content/2.11/features/retries-and-timeouts.md create mode 100644 linkerd.io/content/2.11/features/server-policy.md create mode 100644 linkerd.io/content/2.11/features/service-profiles.md create mode 100644 linkerd.io/content/2.11/features/telemetry.md create mode 100644 linkerd.io/content/2.11/features/traffic-split.md create mode 100644 linkerd.io/content/2.11/getting-started/_index.md create mode 100644 linkerd.io/content/2.11/overview/_index.md create mode 100644 linkerd.io/content/2.11/reference/_index.md create mode 100644 linkerd.io/content/2.11/reference/architecture.md create mode 100644 linkerd.io/content/2.11/reference/authorization-policy.md create mode 100644 linkerd.io/content/2.11/reference/cli/_index.md create mode 100644 linkerd.io/content/2.11/reference/cli/check.md create mode 100644 linkerd.io/content/2.11/reference/cli/completion.md create mode 100644 linkerd.io/content/2.11/reference/cli/diagnostics.md create mode 100644 linkerd.io/content/2.11/reference/cli/identity.md create mode 100644 linkerd.io/content/2.11/reference/cli/inject.md create mode 100644 linkerd.io/content/2.11/reference/cli/install-cni.md create mode 100644 linkerd.io/content/2.11/reference/cli/install.md create mode 100644 linkerd.io/content/2.11/reference/cli/jaeger.md create mode 100644 linkerd.io/content/2.11/reference/cli/multicluster.md create mode 100644 linkerd.io/content/2.11/reference/cli/profile.md create mode 100644 linkerd.io/content/2.11/reference/cli/repair.md create mode 100644 linkerd.io/content/2.11/reference/cli/uninject.md create mode 100644 linkerd.io/content/2.11/reference/cli/uninstall.md create mode 100644 linkerd.io/content/2.11/reference/cli/upgrade.md create mode 100644 linkerd.io/content/2.11/reference/cli/version.md create mode 100644 linkerd.io/content/2.11/reference/cli/viz.md create mode 100644 linkerd.io/content/2.11/reference/cluster-configuration.md create mode 100644 linkerd.io/content/2.11/reference/extension-list.md create mode 100644 linkerd.io/content/2.11/reference/iptables.md create mode 100644 linkerd.io/content/2.11/reference/proxy-configuration.md create mode 100644 linkerd.io/content/2.11/reference/proxy-log-level.md create mode 100644 linkerd.io/content/2.11/reference/proxy-metrics.md create mode 100644 linkerd.io/content/2.11/reference/service-profiles.md create mode 100644 linkerd.io/content/2.11/tasks/_index.md create mode 100644 linkerd.io/content/2.11/tasks/adding-your-service.md create mode 100644 linkerd.io/content/2.11/tasks/automatically-rotating-control-plane-tls-credentials.md create mode 100644 linkerd.io/content/2.11/tasks/automatically-rotating-webhook-tls-credentials.md create mode 100644 linkerd.io/content/2.11/tasks/books.md create mode 100644 linkerd.io/content/2.11/tasks/canary-release.md create mode 100644 linkerd.io/content/2.11/tasks/configuring-proxy-concurrency.md create mode 100644 linkerd.io/content/2.11/tasks/configuring-retries.md create mode 100644 linkerd.io/content/2.11/tasks/configuring-timeouts.md create mode 100644 
linkerd.io/content/2.11/tasks/customize-install.md create mode 100644 linkerd.io/content/2.11/tasks/debugging-502s.md create mode 100644 linkerd.io/content/2.11/tasks/debugging-your-service.md create mode 100644 linkerd.io/content/2.11/tasks/distributed-tracing.md create mode 100644 linkerd.io/content/2.11/tasks/exporting-metrics.md create mode 100644 linkerd.io/content/2.11/tasks/exposing-dashboard.md create mode 100644 linkerd.io/content/2.11/tasks/extensions.md create mode 100644 linkerd.io/content/2.11/tasks/external-prometheus.md create mode 100644 linkerd.io/content/2.11/tasks/fault-injection.md create mode 100644 linkerd.io/content/2.11/tasks/generate-certificates.md create mode 100644 linkerd.io/content/2.11/tasks/getting-per-route-metrics.md create mode 100644 linkerd.io/content/2.11/tasks/gitops.md create mode 100644 linkerd.io/content/2.11/tasks/graceful-shutdown.md create mode 100644 linkerd.io/content/2.11/tasks/install-helm.md create mode 100644 linkerd.io/content/2.11/tasks/install.md create mode 100644 linkerd.io/content/2.11/tasks/installing-multicluster.md create mode 100644 linkerd.io/content/2.11/tasks/linkerd-smi.md create mode 100644 linkerd.io/content/2.11/tasks/manually-rotating-control-plane-tls-credentials.md create mode 100644 linkerd.io/content/2.11/tasks/modifying-proxy-log-level.md create mode 100644 linkerd.io/content/2.11/tasks/multicluster-using-statefulsets.md create mode 100644 linkerd.io/content/2.11/tasks/multicluster.md create mode 100644 linkerd.io/content/2.11/tasks/replacing_expired_certificates.md create mode 100644 linkerd.io/content/2.11/tasks/restricting-access.md create mode 100644 linkerd.io/content/2.11/tasks/rotating_webhooks_certificates.md create mode 100644 linkerd.io/content/2.11/tasks/securing-your-cluster.md create mode 100644 linkerd.io/content/2.11/tasks/setting-up-service-profiles.md create mode 100644 linkerd.io/content/2.11/tasks/troubleshooting.md create mode 100644 linkerd.io/content/2.11/tasks/uninstall-multicluster.md create mode 100644 linkerd.io/content/2.11/tasks/uninstall.md create mode 100644 linkerd.io/content/2.11/tasks/upgrade-multicluster.md create mode 100644 linkerd.io/content/2.11/tasks/upgrade.md create mode 100644 linkerd.io/content/2.11/tasks/upgrading-2.10-ports-and-protocols.md create mode 100644 linkerd.io/content/2.11/tasks/using-a-private-docker-repository.md create mode 100644 linkerd.io/content/2.11/tasks/using-custom-domain.md create mode 100644 linkerd.io/content/2.11/tasks/using-debug-endpoints.md create mode 100644 linkerd.io/content/2.11/tasks/using-ingress.md create mode 100644 linkerd.io/content/2.11/tasks/using-psp.md create mode 100644 linkerd.io/content/2.11/tasks/using-the-debug-container.md create mode 100644 linkerd.io/content/2.11/tasks/validating-your-traffic.md create mode 100644 linkerd.io/layouts/2.11/list.html create mode 100644 linkerd.io/layouts/2.11/single.html diff --git a/linkerd.io/config.toml b/linkerd.io/config.toml index e87b20a501..d82ab65b1f 100644 --- a/linkerd.io/config.toml +++ b/linkerd.io/config.toml @@ -102,7 +102,7 @@ identifier = "start" name = "GET STARTED" post = "" pre = "" -url = "/2.10/getting-started/" +url = "/getting-started/" weight = 6 [outputFormats.REDIRECTS] baseName = "_redirects" diff --git a/linkerd.io/content/1/getting-started/k8s.md b/linkerd.io/content/1/getting-started/k8s.md index ff858898bc..ac736f3a05 100644 --- a/linkerd.io/content/1/getting-started/k8s.md +++ b/linkerd.io/content/1/getting-started/k8s.md @@ -10,7 +10,7 @@ weight = 34 +++ {{< 
note >}} This document is specific to Linkerd 1.x. If you're on Kubernetes, you may wish -to consider [Linkerd 2.x](/2.10/getting-started/) instead. +to consider [Linkerd 2.x](/getting-started/) instead. {{< /note >}} If you have a Kubernetes cluster or even just run diff --git a/linkerd.io/content/2.10/features/protocol-detection.md b/linkerd.io/content/2.10/features/protocol-detection.md index 1e4ef03362..b5a5a76081 100644 --- a/linkerd.io/content/2.10/features/protocol-detection.md +++ b/linkerd.io/content/2.10/features/protocol-detection.md @@ -10,64 +10,71 @@ aliases = [ Linkerd is capable of proxying all TCP traffic, including TLS connections, WebSockets, and HTTP tunneling. -In most cases, Linkerd can do this without configuration. To do this, Linkerd -performs *protocol detection* to determine whether traffic is HTTP or HTTP/2 -(including gRPC). If Linkerd detects that a connection is HTTP or HTTP/2, -Linkerd will automatically provide HTTP-level metrics and routing. +In most cases, Linkerd can do this without configuration. To accomplish this, +Linkerd performs *protocol detection* to determine whether traffic is HTTP or +HTTP/2 (including gRPC). If Linkerd detects that a connection is HTTP or +HTTP/2, Linkerd automatically provides HTTP-level metrics and routing. If Linkerd *cannot* determine that a connection is using HTTP or HTTP/2, Linkerd will proxy the connection as a plain TCP connection, applying [mTLS](../automatic-mtls/) and providing byte-level metrics as usual. -{{< note >}} -Client-initiated HTTPS will be treated as TCP, not as HTTP, as Linkerd will not -be able to observe the HTTP transactions on the connection. -{{< /note >}} +(Note that HTTPS calls to or from meshed pods are treated as TCP, not as HTTP. +Because the client initiates the TLS connection, Linkerd is not be able to +decrypt the connection to observe the HTTP transactions.) ## Configuring protocol detection -In some cases, Linkerd's protocol detection cannot function because it is not -provided with enough client data. This can result in a 10-second delay in -creating the connection as the protocol detection code waits for more data. -This situation is often encountered when using "server-speaks-first" protocols, -or protocols where the server sends data before the client does, and can be -avoided by supplying Linkerd with some additional configuration. - {{< note >}} -Regardless of the underlying protocol, client-initiated TLS connections do not -require any additional configuration, as TLS itself is a client-speaks-first -protocol. +If you are experiencing 10-second delays when establishing connections, you are +likely running into a protocol detection timeout. This section will help you +understand how to fix this. {{< /note >}} +In some cases, Linkerd's protocol detection will time out because it doesn't +see any bytes from the client. This situation is commonly encountered when +using "server-speaks-first" protocols where the server sends data before the +client does, such as SMTP, or protocols that proactively establish connections +without sending data, such as Memcache. In this case, the connection will +proceed as a TCP connection after a 10-second protocol detection delay. + +To avoid this delay, you will need to provide some configuration for Linkerd. There are two basic mechanisms for configuring protocol detection: _opaque -ports_ and _skip ports_. Marking a port as _opaque_ instructs Linkerd to proxy -the connection as a TCP stream and not to attempt protocol detection. 
Marking a -port as _skip_ bypasses the proxy entirely. Opaque ports are generally -preferred (as Linkerd can provide mTLS, TCP-level metrics, etc), but crucially, -opaque ports can only be used for services inside the cluster. - -By default, Linkerd automatically marks some ports as opaque, including the -default ports for SMTP, MySQL, PostgresQL, and Memcache. Services that speak -those protocols, use the default ports, and are inside the cluster do not need -further configuration. - -The following table summarizes some common server-speaks-first protocols and -the configuration necessary to handle them. The "on-cluster config" column -refers to the configuration when the destination is *on* the same cluster; the -"off-cluster config" to when the destination is external to the cluster. - -| Protocol | Default port(s) | On-cluster config | Off-cluster config | -|-----------------|-----------------|-------------------|--------------------| -| SMTP | 25, 587 | none\* | skip ports | -| MySQL | 3306 | none\* | skip ports | -| MySQL with Galera replication | 3306, 4444, 4567, 4568 | none\* | skip ports | -| PostgreSQL | 5432 | none\* | skip ports | -| Redis | 6379 | none\* | skip ports | -| ElasticSearch | 9300 | none\* | skip ports | -| Memcache | 11211 | none\* | skip ports | - -_\* No configuration is required if the standard port is used. If a -non-standard port is used, you must mark the port as opaque._ +ports_ and _skip ports_. Marking a port as _opaque_ instructs Linkerd to skip +protocol detection and immediately proxy the connection as a TCP stream; +marking a port as a _skip port_ bypasses the proxy entirely. Opaque ports are +generally preferred (as Linkerd can provide mTLS, TCP-level metrics, etc), but +can only be used for services inside the cluster. + +By default, Linkerd automatically marks the ports for some server-speaks-first +protocol as opaque. Services that speak those protocols over the default ports +to destinations inside the cluster do not need further configuration. +Linkerd's default list of opaque ports in the 2.10 release is 25 (SMTP), 443 +(client-initiated TLS), 587 (SMTP), 3306 (MySQL), 5432 (Postgres), and 11211 +(Memcache). Note that this may change in future releases. + +The following table contains common protocols that may require configuration. + +| Protocol | Default port(s) | Notes | +|-----------------|-----------------|-------| +| SMTP | 25, 587 | | +| MySQL | 3306 | | +| MySQL with Galera | 3306, 4444, 4567, 4568 | Ports 4444, 4567, and 4568 are not in Linkerd's default set of opaque ports | +| PostgreSQL | 5432 | | +| Redis | 6379 | | +| ElasticSearch | 9300 | Not in Linkerd's default set of opaque ports | +| Memcache | 11211 | | + +If you are using one of those protocols, follow this decision tree to determine +which configuration you need to apply. + +* Is the protocol wrapped in TLS? + * Yes: no configuration required. + * No: is the destination on the cluster? + * Yes: is the port in Linkerd's default list of opaque ports? + * Yes: no configuration required. + * No: mark port(s) as opaque. + * No: mark port(s) as skip. ## Marking a port as opaque @@ -84,8 +91,7 @@ workloads in that namespace. {{< note >}} Since this annotation informs the behavior of meshed _clients_, it can be -applied to services that use server-speaks-first protocols even if the service -itself is not meshed. +applied to unmeshed services as well as meshed services. 
{{< /note >}} Setting the opaque-ports annotation can be done by using the `--opaque-ports` @@ -104,7 +110,7 @@ Multiple ports can be provided as a comma-delimited string. The values you provide will replace, not augment, the default list of opaque ports. {{< /note >}} -## Skipping the proxy +## Marking a port as skip Sometimes it is necessary to bypass the proxy altogether. For example, when connecting to a server-speaks-first destination that is outside of the cluster, diff --git a/linkerd.io/content/2.11/_index.md b/linkerd.io/content/2.11/_index.md new file mode 100644 index 0000000000..200fc5d135 --- /dev/null +++ b/linkerd.io/content/2.11/_index.md @@ -0,0 +1,6 @@ +--- +title: "Overview" +--- + + + diff --git a/linkerd.io/content/2.11/checks/index.html b/linkerd.io/content/2.11/checks/index.html new file mode 100644 index 0000000000..01ca8a947c --- /dev/null +++ b/linkerd.io/content/2.11/checks/index.html @@ -0,0 +1,18 @@ + + + + + + + Linkerd Check Redirection + + + If you are not redirected automatically, follow this + link. + + diff --git a/linkerd.io/content/2.11/features/_index.md b/linkerd.io/content/2.11/features/_index.md new file mode 100644 index 0000000000..21a32234d1 --- /dev/null +++ b/linkerd.io/content/2.11/features/_index.md @@ -0,0 +1,14 @@ ++++ +title = "Features" +weight = 3 +[sitemap] + priority = 1.0 ++++ + +Linkerd offers many features, outlined below. For our walkthroughs and guides, +please see the [Linkerd task docs]({{% ref "../tasks" %}}). For a reference, +see the [Linkerd reference docs]({{% ref "../reference" %}}). + +## Linkerd's features + +{{% sectiontoc "features" %}} diff --git a/linkerd.io/content/2.11/features/automatic-mtls.md b/linkerd.io/content/2.11/features/automatic-mtls.md new file mode 100644 index 0000000000..783c295f68 --- /dev/null +++ b/linkerd.io/content/2.11/features/automatic-mtls.md @@ -0,0 +1,127 @@ ++++ +title = "Automatic mTLS" +description = "Linkerd automatically enables mutual Transport Layer Security (TLS) for all communication between meshed applications." +weight = 4 +aliases = [ + "../automatic-tls" +] ++++ + +By default, Linkerd automatically enables mutually-authenticated Transport +Layer Security (mTLS) for all TCP traffic between meshed pods. This means that +Linkerd adds authenticated, encrypted communication to your application with +no extra work on your part. (And because the Linkerd control plane also runs +on the data plane, this means that communication between Linkerd's control +plane components are also automatically secured via mTLS.) + +See [Caveats and future work](#caveats-and-future-work) below for some details. + +## What is mTLS? + +A full definition of mTLS is outside the scope of this doc. For an overview of +what mTLS is and how it works in Kuberentes clusters, we suggest reading +through [A Kubernetes engineer's guide to +mTLS](https://buoyant.io/mtls-guide/). + +## Which traffic can Linkerd automatically mTLS? + +Linkerd transparently applies mTLS to all TCP communication between meshed +pods. However, there are still ways in which you may still have non-mTLS +traffic in your system, including: + +* Traffic to or from non-meshed pods (e.g. Kubernetes healthchecks) +* Traffic on ports that were marked as [skip ports](../protocol-detection/), + which bypass the proxy entirely. + +You can [verify which traffic is mTLS'd](../../tasks/validating-your-traffic/) +in a variety of ways. 
External systems such as [Buoyant +Cloud](https://buoyant.io/cloud) can also automatically generate reports of TLS +traffic patterns on your cluster. + +## Operational concerns + +Linkerd's mTLS requires some preparation for production use, especially for +long-lived clusters or clusters that expect to have cross-cluster traffic. + +The trust anchor generated by the default `linkerd install` CLI command expires +after 365 days. After that, it must be [manually +rotated](../../tasks/manually-rotating-control-plane-tls-credentials/)—a +non-trivial task. Alternatively, you can [provide the trust anchor +yourself](../../tasks/generate-certificates/) and control the expiration date, +e.g. setting it to 10 years rather than one year. + +Kubernetes clusters that make use of Linkerd's [multi-cluster +communication](../multicluster/) must share a trust anchor. Thus, the default +`linkerd install` setup will not work for this situation and you must provide +an explicit trust anchor. + +Similarly, the default cluster issuer certificate and key expire after a year. +These must be [rotated before they +expire](../../tasks/manually-rotating-control-plane-tls-credentials/). +Alternatively, you can [set up automatic rotation with +`cert-manager`](../../tasks/automatically-rotating-control-plane-tls-credentials/). + +External systems such as [Buoyant Cloud](https://buoyant.io/cloud) can be used +to monitor cluster credentials and to send reminders if they are close to +expiration. + +## How does Linkerd's mTLS implementation work? + +The [Linkerd control plane](../../reference/architecture/) contains a certificate +authority (CA) called `identity`. This CA issues TLS certificates to each +Linkerd data plane proxy. Each certificate is bound to the [Kubernetes +ServiceAccount](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/) +identity of the containing pod. These TLS certificates expire after 24 hours +and are automatically rotated. The proxies use these certificates to encrypt +and authenticate TCP traffic to other proxies. + +On the control plane side, Linkerd maintains a set of credentials in the +cluster: a trust anchor, and an issuer certificate and private key. These +credentials can be generated by Linkerd during install time, or optionally +provided by an external source, e.g. [Vault](https://vaultproject.io) or +[cert-manager](https://github.com/jetstack/cert-manager). The issuer +certificate and private key are stored in a [Kubernetes +Secret](https://kubernetes.io/docs/concepts/configuration/secret/); this Secret +is placed in the `linkerd` namespace and can only be read by the service +account used by the [Linkerd control plane](../../reference/architecture/)'s +`identity` component. + +On the data plane side, each proxy is passed the trust anchor in an environment +variable. At startup, the proxy generates a private key, stored in a [tmpfs +emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir) which +stays in memory and never leaves the pod. The proxy connects to the control +plane's `identity` component, validating the connection to `identity` with the +trust anchor, and issues a [certificate signing request +(CSR)](https://en.wikipedia.org/wiki/Certificate_signing_request). 
The CSR +contains an initial certificate with identity set to the pod's [Kubernetes +ServiceAccount](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/), +and the actual service account token, so that `identity` can validate that the +CSR is valid. After validation, the signed trust bundle is returned to the +proxy, which can use it as both a client and server certificate. These +certificates are scoped to 24 hours and dynamically refreshed using the same +mechanism. + +Finally, when a proxy receives an outbound connection from the application +container within its pod, it looks up that destination with the Linkerd control +plane. If it's in the Kubernetes cluster, the control plane provides the proxy +with the destination's endpoint addresses, along with metadata including an +identity name. When the proxy connects to the destination, it initiates a TLS +handshake and verifies that the destination proxy's certificate is signed +by the trust anchor and contains the expected identity. + +## Caveats and future work + +There are a few known gaps in Linkerd's ability to automatically encrypt and +authenticate all communication in the cluster. These gaps will be fixed in +future releases: + +* Linkerd does not currently *enforce* mTLS. Any unencrypted requests inside + the mesh will be opportunistically upgraded to mTLS. Any requests originating + from inside or outside the mesh will not be automatically mTLS'd by Linkerd. + This will be addressed in a future Linkerd release, likely as an opt-in + behavior as it may break some existing applications. + +* Ideally, the ServiceAccount token that Linkerd uses would not be shared with + other potential uses of that token. In future Kubernetes releases, Kubernetes + will support audience/time-bound ServiceAccount tokens, and Linkerd will use + those instead. diff --git a/linkerd.io/content/2.11/features/cni.md b/linkerd.io/content/2.11/features/cni.md new file mode 100644 index 0000000000..4de46e1431 --- /dev/null +++ b/linkerd.io/content/2.11/features/cni.md @@ -0,0 +1,106 @@ ++++ +title = "CNI Plugin" +description = "Linkerd can be configured to run a CNI plugin that rewrites each pod's iptables rules automatically." ++++ + +Linkerd installs can be configured to run a +[CNI plugin](https://github.com/containernetworking/cni) that rewrites each +pod's iptables rules automatically. Rewriting iptables is required for routing +network traffic through the pod's `linkerd-proxy` container. When the CNI plugin +is enabled, individual pods no longer need to include an init container that +requires the `NET_ADMIN` capability to perform rewriting. This can be useful in +clusters where that capability is restricted by cluster administrators. + +## Installation + +Usage of the Linkerd CNI plugin requires that the `linkerd-cni` DaemonSet be +successfully installed on your cluster _first_, before installing the Linkerd +control plane. + +### Using the CLI + +To install the `linkerd-cni` DaemonSet, run: + +```bash +linkerd install-cni | kubectl apply -f - +``` + +Once the DaemonSet is up and running, all subsequent installs that include a +`linkerd-proxy` container (including the Linkerd control plane) no longer need +to include the `linkerd-init` container. Omission of the init container is +controlled by the `--linkerd-cni-enabled` flag at control plane install time. 
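+ +Before installing the control plane, you can optionally confirm that the cluster and the `linkerd-cni` DaemonSet are ready; this is the same pre-check used in the Helm instructions below: + +```bash +# Verify that the cluster is ready for a CNI-enabled control plane install +linkerd check --pre --linkerd-cni-enabled +```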
+ +Install the Linkerd control plane, with: + +```bash +linkerd install --linkerd-cni-enabled | kubectl apply -f - +``` + +This will set a `cniEnabled` flag in the `linkerd-config` ConfigMap. All +subsequent proxy injections will read this field and omit init containers. + +### Using Helm + +First ensure that your Helm local cache is updated: + +```bash +helm repo update + +helm search linkerd2-cni +NAME CHART VERSION APP VERSION DESCRIPTION +linkerd-edge/linkerd2-cni 20.1.1 edge-20.1.1 A helm chart containing the resources needed by the Linke... +linkerd-stable/linkerd2-cni 2.7.0 stable-2.7.0 A helm chart containing the resources needed by the Linke... +``` + +Run the following commands to install the CNI DaemonSet: + +```bash +# install the CNI plugin first +helm install linkerd2-cni linkerd2/linkerd2-cni + +# ensure the plugin is installed and ready +linkerd check --pre --linkerd-cni-enabled +``` + +{{< note >}} +For Helm versions < v3, `--name` flag has to specifically be passed. +In Helm v3, It has been deprecated, and is the first argument as + specified above. +{{< /note >}} + +At that point you are ready to install Linkerd with CNI enabled. +You can follow [Installing Linkerd with Helm](../../tasks/install-helm/) to do so. + +## Additional configuration + +The `linkerd install-cni` command includes additional flags that you can use to +customize the installation. See `linkerd install-cni --help` for more +information. Note that many of the flags are similar to the flags that can be +used to configure the proxy when running `linkerd inject`. If you change a +default when running `linkerd install-cni`, you will want to ensure that you +make a corresponding change when running `linkerd inject`. + +The most important flags are: + +1. `--dest-cni-net-dir`: This is the directory on the node where the CNI + Configuration resides. It defaults to: `/etc/cni/net.d`. +2. `--dest-cni-bin-dir`: This is the directory on the node where the CNI Plugin + binaries reside. It defaults to: `/opt/cni/bin`. +3. `--cni-log-level`: Setting this to `debug` will allow more verbose logging. + In order to view the CNI Plugin logs, you must be able to see the `kubelet` + logs. One way to do this is to log onto the node and use + `journalctl -t kubelet`. The string `linkerd-cni:` can be used as a search to + find the plugin log output. + +## Upgrading the CNI plugin + +Since the CNI plugin is basically stateless, there is no need for a separate +`upgrade` command. If you are using the CLI to upgrade the CNI plugin you can +just do: + +```bash +linkerd install-cni | kubectl apply --prune -l linkerd.io/cni-resource=true -f - +``` + +Keep in mind that if you are upgrading the plugin from an experimental version, +you need to uninstall and install it again. diff --git a/linkerd.io/content/2.11/features/dashboard.md b/linkerd.io/content/2.11/features/dashboard.md new file mode 100644 index 0000000000..fb67947eb9 --- /dev/null +++ b/linkerd.io/content/2.11/features/dashboard.md @@ -0,0 +1,127 @@ ++++ +title = "On-cluster metrics stack" +description = "Linkerd provides a full on-cluster metrics stack, including CLI tools and dashboards." ++++ + +Linkerd provides a full on-cluster metrics stack, including CLI tools, a web +dashboard, and pre-configured Grafana dashboards. 
+ +To access this functionality, you install the viz extension: + +```bash +linkerd viz install | kubectl apply -f - +``` + +This extension installs the following components into your `linkerd-viz` +namespace: + +* A [Prometheus](https://prometheus.io/) instance +* A [Grafana](https://grafana.com/) instance +* metrics-api, tap, tap-injector, and web components + +These components work together to provide an on-cluster metrics stack. + +{{< note >}} +To limit excessive resource usage on the cluster, the metrics stored by this +extension are _transient_. Only the past 6 hours are stored, and metrics do not +persist in the event of pod restart or node outages. +{{< /note >}} + +## Operating notes + +This metrics stack may require significant cluster resources. Prometheus, in +particular, will consume resources as a function of traffic volume within the +cluster. + +Additionally, by default, metrics data is stored in a transient manner that is +not resilient to pod restarts or to node outages. See [Bringing your own +Prometheus](../../tasks/external-prometheus/) for one way to address this. + +## Linkerd dashboard + +The Linkerd dashboard provides a high level view of what is happening with your +services in real time. It can be used to view the "golden" metrics (success +rate, requests/second and latency), visualize service dependencies and +understand the health of specific service routes. One way to pull it up is by +running `linkerd viz dashboard` from the command line. + +{{< fig src="/images/architecture/stat.png" title="Top Line Metrics">}} + +## Grafana + +As a component of the control plane, Grafana provides actionable dashboards for +your services out of the box. It is possible to see high level metrics and dig +down into the details, even for pods. + +The dashboards that are provided out of the box include: + +{{< gallery >}} + +{{< gallery-item src="/images/screenshots/grafana-top.png" + title="Top Line Metrics" >}} + +{{< gallery-item src="/images/screenshots/grafana-deployment.png" + title="Deployment Detail" >}} + +{{< gallery-item src="/images/screenshots/grafana-pod.png" + title="Pod Detail" >}} + +{{< gallery-item src="/images/screenshots/grafana-health.png" + title="Linkerd Health" >}} + +{{< /gallery >}} + +linkerd -n emojivoto check --proxy + +## Examples + +In these examples, we assume you've installed the emojivoto example +application. Please refer to the [Getting Started +Guide](../../getting-started/) for how to do this. + +You can use your dashboard extension and see all the services in the demo app. 
+Since the demo app comes with a load generator, we can see live traffic metrics +by running: + +```bash +linkerd -n emojivoto viz stat deploy +``` + +This will show the "golden" metrics for each deployment: + +* Success rates +* Request rates +* Latency distribution percentiles + +To dig in a little further, it is possible to use `top` to get a real-time +view of which paths are being called: + +```bash +linkerd -n emojivoto viz top deploy +``` + +To go even deeper, we can use `tap` to show the stream of requests across a +single pod, deployment, or even everything in the emojivoto namespace: + +```bash +linkerd -n emojivoto viz tap deploy/web +``` + +All of this functionality is also available in the dashboard, if you would like +to use your browser instead: + +{{< gallery >}} + +{{< gallery-item src="/images/getting-started/stat.png" + title="Top Line Metrics">}} + +{{< gallery-item src="/images/getting-started/inbound-outbound.png" + title="Deployment Detail">}} + +{{< gallery-item src="/images/getting-started/top.png" + title="Top" >}} + +{{< gallery-item src="/images/getting-started/tap.png" + title="Tap" >}} + +{{< /gallery >}} diff --git a/linkerd.io/content/2.11/features/distributed-tracing.md b/linkerd.io/content/2.11/features/distributed-tracing.md new file mode 100644 index 0000000000..7bf2ef5be8 --- /dev/null +++ b/linkerd.io/content/2.11/features/distributed-tracing.md @@ -0,0 +1,59 @@ ++++ +title = "Distributed Tracing" +description = "You can enable distributed tracing support in Linkerd." ++++ + +Tracing can be an invaluable tool in debugging distributed systems performance, +especially for identifying bottlenecks and understanding the latency cost of +each component in your system. Linkerd can be configured to emit trace spans +from the proxies, allowing you to see exactly how much time requests and responses +spend inside. + +Unlike most of the features of Linkerd, distributed tracing requires both code +changes and configuration. (You can read up on [Distributed tracing in the +service mesh: four myths](/2019/08/09/service-mesh-distributed-tracing-myths/) +for why this is.) + +Furthermore, Linkerd provides many of the features that are often associated +with distributed tracing, *without* requiring configuration or application +changes, including: + +* Live service topology and dependency graphs +* Aggregated service health, latencies, and request volumes +* Aggregated path / route health, latencies, and request volumes + +For example, Linkerd can display a live topology of all incoming and outgoing +dependencies for a service, without requiring distributed tracing or any other +such application modification: + +{{< fig src="/images/books/webapp-detail.png" + title="The Linkerd dashboard showing an automatically generated topology graph" +>}} + +Likewise, Linkerd can provide golden metrics per service and per *route*, again +without requiring distributed tracing or any other such application +modification: + +{{< fig src="/images/books/webapp-routes.png" + title="Linkerd dashboard showing automatically generated route metrics" +>}} + +## Using distributed tracing + +That said, distributed tracing certainly has its uses, and Linkerd makes this +as easy as it can. Linkerd's role in distributed tracing is actually quite +simple: when a Linkerd data plane proxy sees a tracing header in a proxied HTTP +request, Linkerd will emit a trace span for that request. This span will +include information about the exact amount of time spent in the Linkerd proxy. 
+When paired with software to collect, store, and analyze this information, this +can provide significant insight into the behavior of the mesh. + +To use this feature, you'll also need to introduce several additional +components in your system., including an ingress layer that kicks off the trace +on particular requests, a client library for your application (or a mechanism +to propagate trace headers), a trace collector to collect span data and turn +them into traces, and a trace backend to store the trace data and allow the +user to view/query it. + +For details, please see our [guide to adding distributed tracing to your +application with Linkerd](../../tasks/distributed-tracing/). diff --git a/linkerd.io/content/2.11/features/fault-injection.md b/linkerd.io/content/2.11/features/fault-injection.md new file mode 100644 index 0000000000..540b977d83 --- /dev/null +++ b/linkerd.io/content/2.11/features/fault-injection.md @@ -0,0 +1,12 @@ ++++ +title = "Fault Injection" +description = "Linkerd provides mechanisms to programmatically inject failures into services." ++++ + +Fault injection is a form of chaos engineering where the error rate of a service +is artificially increased to see what impact there is on the system as a whole. +Traditionally, this would require modifying the service's code to add a fault +injection library that would be doing the actual work. Linkerd can do this +without any service code changes, only requiring a little configuration. + +To inject faults into your own services, follow the [tutorial](../../tasks/fault-injection/). diff --git a/linkerd.io/content/2.11/features/ha.md b/linkerd.io/content/2.11/features/ha.md new file mode 100644 index 0000000000..fe32dadefa --- /dev/null +++ b/linkerd.io/content/2.11/features/ha.md @@ -0,0 +1,162 @@ ++++ +title = "High Availability" +description = "The Linkerd control plane can run in high availability (HA) mode." +aliases = [ + "../ha/" +] ++++ + +For production workloads, Linkerd's control plane can run in high availability +(HA) mode. This mode: + +* Runs three replicas of critical control plane components. +* Sets production-ready CPU and memory resource requests on control plane + components. +* Sets production-ready CPU and memory resource requests on data plane proxies +* *Requires* that the [proxy auto-injector](../proxy-injection/) be + functional for any pods to be scheduled. +* Sets [anti-affinity + policies](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity) + on critical control plane components to ensure, if possible, that they are + scheduled on separate nodes and in separate zones by default. + +## Enabling HA + +You can enable HA mode at control plane installation time with the `--ha` flag: + +```bash +linkerd install --ha | kubectl apply -f - +``` + +Also note the Viz extension also supports an `--ha` flag with similar +characteristics: + +```bash +linkerd viz install --ha | kubectl apply -f - +``` + +You can override certain aspects of the HA behavior at installation time by +passing other flags to the `install` command. For example, you can override the +number of replicas for critical components with the `--controller-replicas` +flag: + +```bash +linkerd install --ha --controller-replicas=2 | kubectl apply -f - +``` + +See the full [`install` CLI documentation](../../reference/cli/install/) for +reference. 
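+ +If you are installing Linkerd with Helm rather than the CLI, the `linkerd2` chart ships a `values-ha.yaml` file that applies the same HA settings. As a rough sketch (the certificate values and other required flags are omitted here; see [Installing Linkerd with Helm](../../tasks/install-helm/) for the full command): + +```bash +# Fetch the chart locally so the bundled HA values file is available, +# then layer values-ha.yaml on top of your other install values. +helm repo add linkerd https://helm.linkerd.io/stable +helm fetch --untar linkerd/linkerd2 +helm install linkerd2 -f linkerd2/values-ha.yaml linkerd2/ +```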
+ +The `linkerd upgrade` command can be used to enable HA mode on an existing +control plane: + +```bash +linkerd upgrade --ha | kubectl apply -f - +``` + +## Proxy injector failure policy + +The HA proxy injector is deployed with a stricter failure policy to enforce +[automatic proxy injection](../proxy-injection/). This setup ensures +that no annotated workloads are accidentally scheduled to run on your cluster +without the Linkerd proxy. (This can happen when the proxy injector is down.) + +If the proxy injection process fails due to unrecognized or timeout errors during +the admission phase, the workload admission will be rejected by the Kubernetes +API server, and the deployment will fail. + +Hence, it is very important that there is always at least one healthy replica +of the proxy injector running on your cluster. + +If you cannot guarantee the number of healthy proxy injector replicas on your cluster, +you can loosen the webhook failure policy by setting its value to `Ignore`, as +seen in the +[Linkerd Helm chart](https://github.com/linkerd/linkerd2/blob/803511d77b33bd9250b4a7fecd36752fcbd715ac/charts/linkerd2/templates/proxy-injector-rbac.yaml#L98). + +{{< note >}} +See the Kubernetes +[documentation](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#failure-policy) +for more information on the admission webhook failure policy. +{{< /note >}} + +## Exclude the kube-system namespace + +Per recommendation from the Kubernetes +[documentation](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#avoiding-operating-on-the-kube-system-namespace), +the proxy injector should be disabled for the `kube-system` namespace. + +This can be done by labeling the `kube-system` namespace with the following +label: + +```bash +kubectl label namespace kube-system config.linkerd.io/admission-webhooks=disabled +``` + +The Kubernetes API server will not call the proxy injector during the admission +phase of workloads in namespaces with this label. + +If your Kubernetes cluster has built-in reconcilers that would revert any changes +made to the `kube-system` namespace, you should loosen the proxy injector +failure policy following these [instructions](#proxy-injector-failure-policy). + +## Pod anti-affinity rules + +All critical control plane components are deployed with pod anti-affinity rules +to ensure redundancy. + +Linkerd uses a `requiredDuringSchedulingIgnoredDuringExecution` pod +anti-affinity rule to ensure that the Kubernetes scheduler does not colocate +replicas of critical components on the same node. A +`preferredDuringSchedulingIgnoredDuringExecution` pod anti-affinity rule is also +added to try to schedule replicas in different zones, where possible. + +In order to satisfy these anti-affinity rules, HA mode assumes that there +are always at least three nodes in the Kubernetes cluster. If this assumption is +violated (e.g. the cluster is scaled down to two or fewer nodes), then the +system may be left in a non-functional state. + +Note that these anti-affinity rules don't apply to add-on components like +Prometheus and Grafana. + +## Scaling Prometheus + +The Linkerd Viz extension provides a pre-configured Prometheus pod, but for +production workloads we recommend setting up your own Prometheus instance. To +scrape the data plane metrics, follow the instructions +[here](../../tasks/external-prometheus/). This will provide you +with more control over resource requirements, backup strategy, and data retention. 
+ +When planning for memory capacity to store Linkerd timeseries data, the usual +guidance is 5MB per meshed pod. + +If your Prometheus is experiencing regular `OOMKilled` events due to the amount +of data coming from the data plane, the two key parameters that can be adjusted +are: + +* `storage.tsdb.retention.time` defines how long to retain samples in storage. + A higher value implies that more memory is required to keep the data around + for a longer period of time. Lowering this value will reduce the number of + `OOMKilled` events as data is retained for a shorter period of time +* `storage.tsdb.retention.size` defines the maximum number of bytes that can be + stored for blocks. A lower value will also help to reduce the number of + `OOMKilled` events + +For more information and other supported storage options, see the Prometheus +documentation +[here](https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects). + +## Working with Cluster AutoScaler + +The Linkerd proxy stores its mTLS private key in a +[tmpfs emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir) +volume to ensure that this information never leaves the pod. This causes the +default setup of Cluster AutoScaler to not be able to scale down nodes with +injected workload replicas. + +The workaround is to annotate the injected workload with the +`cluster-autoscaler.kubernetes.io/safe-to-evict: "true"` annotation. If you +have full control over the Cluster AutoScaler configuration, you can start the +Cluster AutoScaler with the `--skip-nodes-with-local-storage=false` option. + +For more information on this, see the Cluster AutoScaler documentation +[here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node). diff --git a/linkerd.io/content/2.11/features/http-grpc.md b/linkerd.io/content/2.11/features/http-grpc.md new file mode 100644 index 0000000000..74ddd707ce --- /dev/null +++ b/linkerd.io/content/2.11/features/http-grpc.md @@ -0,0 +1,21 @@ ++++ +title = "HTTP, HTTP/2, and gRPC Proxying" +description = "Linkerd will automatically enable advanced features (including metrics, load balancing, retries, and more) for HTTP, HTTP/2, and gRPC connections." +weight = 1 ++++ + +Linkerd can proxy all TCP connections, and will automatically enable advanced +features (including metrics, load balancing, retries, and more) for HTTP, +HTTP/2, and gRPC connections. (See +[TCP Proxying and Protocol Detection](../protocol-detection/) for details of how +this detection happens). + +## Notes + +* gRPC applications that use [grpc-go][grpc-go] must use version 1.3 or later due + to a [bug](https://github.com/grpc/grpc-go/issues/1120) in earlier versions. +* gRPC applications that use [@grpc/grpc-js][grpc-js] must use version 1.1.0 or later + due to a [bug](https://github.com/grpc/grpc-node/issues/1475) in earlier versions. + +[grpc-go]: https://github.com/grpc/grpc-go +[grpc-js]: https://github.com/grpc/grpc-node/tree/master/packages/grpc-js diff --git a/linkerd.io/content/2.11/features/ingress.md b/linkerd.io/content/2.11/features/ingress.md new file mode 100644 index 0000000000..6a3d9308f8 --- /dev/null +++ b/linkerd.io/content/2.11/features/ingress.md @@ -0,0 +1,14 @@ ++++ +title = "Ingress" +description = "Linkerd can work alongside your ingress controller of choice." +weight = 7 +aliases = [ + "../ingress/" +] ++++ + +For reasons of simplicity, Linkerd does not provide its own ingress controller. 
+Instead, Linkerd is designed to work alongside your ingress controller of choice. + +See the [Using Ingress with Linkerd Guide](../../tasks/using-ingress/) for examples +of how to get it all working together. diff --git a/linkerd.io/content/2.11/features/load-balancing.md b/linkerd.io/content/2.11/features/load-balancing.md new file mode 100644 index 0000000000..5ec51ffac6 --- /dev/null +++ b/linkerd.io/content/2.11/features/load-balancing.md @@ -0,0 +1,37 @@ ++++ +title = "Load Balancing" +description = "Linkerd automatically load balances requests across all destination endpoints on HTTP, HTTP/2, and gRPC connections." +weight = 9 ++++ + +For HTTP, HTTP/2, and gRPC connections, Linkerd automatically load balances +requests across all destination endpoints without any configuration required. +(For TCP connections, Linkerd will balance connections.) + +Linkerd uses an algorithm called EWMA, or *exponentially weighted moving average*, +to automatically send requests to the fastest endpoints. This load balancing can +improve end-to-end latencies. + +## Service discovery + +For destinations that are not in Kubernetes, Linkerd will balance across +endpoints provided by DNS. + +For destinations that are in Kubernetes, Linkerd will look up the IP address in +the Kubernetes API. If the IP address corresponds to a Service, Linkerd will +load balance across the endpoints of that Service and apply any policy from that +Service's [Service Profile](../service-profiles/). On the other hand, +if the IP address corresponds to a Pod, Linkerd will not perform any load +balancing or apply any [Service Profiles](../service-profiles/). + +{{< note >}} +If working with headless services, endpoints of the service cannot be retrieved. +Therefore, Linkerd will not perform load balancing and instead route only to the +target IP address. +{{< /note >}} + +## Load balancing gRPC + +Linkerd's load balancing is particularly useful for gRPC (or HTTP/2) services +in Kubernetes, for which [Kubernetes's default load balancing is not +effective](https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears/). diff --git a/linkerd.io/content/2.11/features/multicluster.md b/linkerd.io/content/2.11/features/multicluster.md new file mode 100644 index 0000000000..5ed2a8c4e7 --- /dev/null +++ b/linkerd.io/content/2.11/features/multicluster.md @@ -0,0 +1,104 @@ ++++ +title = "Multi-cluster communication" +description = "Linkerd can transparently and securely connect services that are running in different clusters." +aliases = [ "multicluster_support" ] ++++ + +Linkerd can connect Kubernetes services across cluster boundaries in a way that +is secure, fully transparent to the application, and independent of network +topology. This multi-cluster capability is designed to provide: + +1. **A unified trust domain.** The identity of source and destination workloads + are validated at every step, both in and across cluster boundaries. +2. **Separate failure domains.** Failure of a cluster allows the remaining + clusters to function. +3. **Support for heterogeneous networks.** Since clusters can span clouds, + VPCs, on-premises data centers, and combinations thereof, Linkerd does not + introduce any L3/L4 requirements other than gateway connectivity. +4. **A unified model alongside in-cluster communication.** The same + observability, reliability, and security features that Linkerd provides for + in-cluster communication extend to cross-cluster communication. 
+ +Just as with in-cluster connections, Linkerd’s cross-cluster connections are +transparent to the application code. Regardless of whether that communication +happens within a cluster, across clusters within a datacenter or VPC, or across +the public Internet, Linkerd will establish a connection between clusters +that’s encrypted and authenticated on both sides with mTLS. + +## How it works + +Linkerd's multi-cluster support works by "mirroring" service information +between clusters. Because remote services are represented as Kubernetes +services, the full observability, security and routing features of Linkerd +apply uniformly to both in-cluster and cross-cluster calls, and the application does +not need to distinguish between those situations. + +{{< fig + alt="Overview" + title="Overview" + center="true" + src="/images/multicluster/feature-overview.svg" >}} + +Linkerd's multi-cluster functionality is implemented by two components: +a *service mirror* and a *gateway*. The *service mirror* component watches +a target cluster for updates to services and mirrors those service updates +locally on a source cluster. This provides visibility into the service names of +the target cluster so that applications can address them directly. The +*multi-cluster gateway* component provides target clusters a way to receive +requests from source clusters. (This allows Linkerd to support [hierarchical +networks](/2020/02/17/architecting-for-multicluster-kubernetes/#requirement-i-support-hierarchical-networks).) + +Once these components are installed, Kubernetes `Service` resources that match +a label selector can be exported to other clusters. + +## Headless services + +By default, Linkerd will mirror all exported services from a target cluster as +`clusterIP` services in a source cluster (they will be assigned a virtual IP). +This also extends to [headless +services](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services); +an exported headless service will be mirrored as a `clusterIP` service in the +source cluster. In general, headless services are used when a workload needs a +stable network identifier or to facilitate service discovery without being tied +to Kubernetes' native implementation; this allows clients to either implement +their own load balancing or to address a pod directly through its DNS name. In +certain situations, it is desirable to preserve some of this functionality, +especially when working with Kubernetes objects that require it, such as +[StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/). + +Linkerd's multi-cluster extension can be configured with support for headless +services when linking two clusters together. When the feature is turned on, the +*service mirror* component will export headless services without assigning them +an IP. This allows clients to talk to specific pods (or hosts) across clusters. +To support direct communication, under the hood, the service mirror +component will create an *endpoint mirror* for each host that backs a headless +service. For example, if a target cluster has a StatefulSet deployed +with two replicas, backed by a headless service, then when +the service is exported, the source cluster will create a headless mirror +along with two "endpoint mirrors" representing the hosts in the StatefulSet. + +This approach allows Linkerd to preserve DNS record creation and support direct +communication to pods across clusters. 
Clients may also implement their own
+load balancing based on the DNS records created by the headless service.
+Hostnames are also preserved across clusters, meaning that the only difference
+in the DNS name (or FQDN) is the headless service's mirror name. In order to be
+exported as a headless service, the hosts backing the service need to be named
+(e.g. a StatefulSet is supported since all of its pods have a hostname, but a
+Deployment would not be supported, since it does not allow for arbitrary
+hostnames in the pod spec).
+
+Ready to get started? See the [getting started with multi-cluster
+guide](../../tasks/multicluster/) for a walkthrough.
+
+## Further reading
+
+* [Multi-cluster installation instructions](../../tasks/installing-multicluster/).
+* [Multi-cluster communication with StatefulSets](../../tasks/multicluster-using-statefulsets/).
+* [Architecting for multi-cluster
+  Kubernetes](/2020/02/17/architecting-for-multicluster-kubernetes/), a blog
+  post explaining some of the design rationale behind Linkerd's multi-cluster
+  implementation.
+* [Multi-cluster Kubernetes with service
+  mirroring](/2020/02/25/multicluster-kubernetes-with-service-mirroring/), a
+  deep dive of some of the architectural decisions behind Linkerd's
+  multi-cluster implementation.
diff --git a/linkerd.io/content/2.11/features/protocol-detection.md b/linkerd.io/content/2.11/features/protocol-detection.md
new file mode 100644
index 0000000000..fea8e418cd
--- /dev/null
+++ b/linkerd.io/content/2.11/features/protocol-detection.md
@@ -0,0 +1,131 @@
++++
+title = "TCP Proxying and Protocol Detection"
+description = "Linkerd is capable of proxying all TCP traffic, including TLS'd connections, WebSockets, and HTTP tunneling."
+weight = 2
+aliases = [
+  "/2.11/supported-protocols/"
+]
++++
+
+Linkerd is capable of proxying all TCP traffic, including TLS connections,
+WebSockets, and HTTP tunneling.
+
+In most cases, Linkerd can do this without configuration. To accomplish this,
+Linkerd performs *protocol detection* to determine whether traffic is HTTP or
+HTTP/2 (including gRPC). If Linkerd detects that a connection is HTTP or
+HTTP/2, Linkerd automatically provides HTTP-level metrics and routing.
+
+If Linkerd *cannot* determine that a connection is using HTTP or HTTP/2,
+Linkerd will proxy the connection as a plain TCP connection, applying
+[mTLS](../automatic-mtls/) and providing byte-level metrics as usual.
+
+(Note that HTTPS calls to or from meshed pods are treated as TCP, not as HTTP.
+Because the client initiates the TLS connection, Linkerd is not able to
+decrypt the connection to observe the HTTP transactions.)
+
+## Configuring protocol detection
+
+{{< note >}}
+If you are experiencing 10-second delays when establishing connections, you are
+likely running into a protocol detection timeout. This section will help you
+understand how to fix this.
+{{< /note >}}
+
+In some cases, Linkerd's protocol detection will time out because it doesn't
+see any bytes from the client. This situation is commonly encountered when
+using "server-speaks-first" protocols where the server sends data before the
+client does, such as SMTP, or protocols that proactively establish connections
+without sending data, such as Memcache. In this case, the connection will
+proceed as a TCP connection after a 10-second protocol detection delay.
+
+To avoid this delay, you will need to provide some configuration for Linkerd.
+There are two basic mechanisms for configuring protocol detection: _opaque
+ports_ and _skip ports_.
Marking a port as _opaque_ instructs Linkerd to skip
+protocol detection and immediately proxy the connection as a TCP stream;
+marking a port as a _skip port_ bypasses the proxy entirely. Opaque ports are
+generally preferred (as Linkerd can provide mTLS, TCP-level metrics, etc.), but
+can only be used for services inside the cluster.
+
+By default, Linkerd automatically marks the ports for some server-speaks-first
+protocols as opaque. Services that speak those protocols over the default ports
+to destinations inside the cluster do not need further configuration.
+Linkerd's default list of opaque ports in the 2.11 release is 25 (SMTP), 587
+(SMTP), 3306 (MySQL), 4444 (Galera), 5432 (Postgres), 6379 (Redis), 9300
+(ElasticSearch), and 11211 (Memcache). Note that this may change in future
+releases.
+
+The following table contains common protocols that may require configuration.
+
+| Protocol | Default port(s) | Notes |
+|-----------------|-----------------|-------|
+| SMTP | 25, 587 | |
+| MySQL | 3306 | |
+| MySQL with Galera | 3306, 4444, 4567, 4568 | Ports 4567 and 4568 are not in Linkerd's default set of opaque ports |
+| PostgreSQL | 5432 | |
+| Redis | 6379 | |
+| ElasticSearch | 9300 | |
+| Memcache | 11211 | |
+
+If you are using one of those protocols, follow this decision tree to determine
+which configuration you need to apply.
+
+* Is the protocol wrapped in TLS?
+  * Yes: no configuration required.
+  * No: is the destination on the cluster?
+    * Yes: is the port in Linkerd's default list of opaque ports?
+      * Yes: no configuration required.
+      * No: mark port(s) as opaque.
+    * No: mark port(s) as skip.
+
+## Marking a port as opaque
+
+You can use the `config.linkerd.io/opaque-ports` annotation to mark a port as
+opaque. This instructs Linkerd to skip protocol detection for that port.
+
+This annotation can be set on a workload, service, or namespace. Setting it on
+a workload tells meshed clients of that workload to skip protocol detection for
+connections established to the workload, and tells Linkerd to skip protocol
+detection when reverse-proxying incoming connections. Setting it on a service
+tells meshed clients to skip protocol detection when proxying connections to
+the service. Setting it on a namespace applies this behavior to all services
+and workloads in that namespace.
+
+{{< note >}}
+Since this annotation informs the behavior of meshed _clients_, it can be
+applied to unmeshed services as well as meshed services.
+{{< /note >}}
+
+Setting the opaque-ports annotation can be done by using the `--opaque-ports`
+flag when running `linkerd inject`. For example, for a MySQL database running
+on the cluster using a non-standard port 4406, you can use the commands:
+
+```bash
+linkerd inject mysql-deployment.yml --opaque-ports=4406 \
+  | kubectl apply -f -
+linkerd inject mysql-service.yml --opaque-ports=4406 \
+  | kubectl apply -f -
+```
+
+{{< note >}}
+Multiple ports can be provided as a comma-delimited string. The values you
+provide will replace, not augment, the default list of opaque ports.
+{{< /note >}}
+
+## Marking a port as skip
+
+Sometimes it is necessary to bypass the proxy altogether. For example, when
+connecting to a server-speaks-first destination that is outside of the cluster,
+there is no Service resource on which to set the
+`config.linkerd.io/opaque-ports` annotation.
+
+In this case, you can use the `--skip-outbound-ports` flag when running
+`linkerd inject` to configure resources to bypass the proxy entirely when
+sending to those ports.
(Similarly, the `--skip-inbound-ports` flag will
+configure the resource to bypass the proxy for incoming connections to those
+ports.)
+
+Skipping the proxy can be useful for these situations, as well as for
+diagnosing issues, but otherwise should rarely be necessary.
+
+As with opaque ports, multiple skip ports can be provided as a comma-delimited
+string.
diff --git a/linkerd.io/content/2.11/features/proxy-injection.md b/linkerd.io/content/2.11/features/proxy-injection.md
new file mode 100644
index 0000000000..f15bfcb125
--- /dev/null
+++ b/linkerd.io/content/2.11/features/proxy-injection.md
@@ -0,0 +1,64 @@
++++
+title = "Automatic Proxy Injection"
+description = "Linkerd will automatically inject the data plane proxy into your pods based on annotations."
+aliases = [
+  "../proxy-injection/"
+]
++++
+
+Linkerd automatically adds the data plane proxy to pods when the
+`linkerd.io/inject: enabled` annotation is present on a namespace or any
+workload, such as a deployment or pod. This is known as "proxy injection".
+
+See [Adding Your Service](../../tasks/adding-your-service/) for a walkthrough of
+how to use this feature in practice.
+
+{{< note >}}
+Proxy injection is also where proxy *configuration* happens. While it's rarely
+necessary, you can configure proxy settings by setting additional Kubernetes
+annotations at the resource level prior to injection. See the [full list of
+proxy configuration options](../../reference/proxy-configuration/).
+{{< /note >}}
+
+## Details
+
+Proxy injection is implemented as a [Kubernetes admission
+webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks).
+This means that the proxies are added to pods within the Kubernetes cluster
+itself, regardless of whether the pods are created by `kubectl`, a CI/CD
+system, or any other system.
+
+For each pod, two containers are injected:
+
+1. `linkerd-init`, a Kubernetes [Init
+   Container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/)
+   that configures `iptables` to automatically forward all incoming and
+   outgoing TCP traffic through the proxy. (Note that this container is not
+   present if the [Linkerd CNI Plugin](../cni/) has been enabled.)
+1. `linkerd-proxy`, the Linkerd data plane proxy itself.
+
+Note that simply adding the annotation to a resource with pre-existing pods
+will not automatically inject those pods. You will need to update the pods
+(e.g. with `kubectl rollout restart` etc.) for them to be injected. This is
+because Kubernetes does not call the webhook until it needs to update the
+underlying resources.
+
+## Overriding injection
+
+Automatic injection can be disabled for a pod or deployment for which it would
+otherwise be enabled, by adding the `linkerd.io/inject: disabled` annotation.
+
+## Manual injection
+
+The [`linkerd inject`](../../reference/cli/inject/) CLI command is a text
+transform that, by default, simply adds the inject annotation to a given
+Kubernetes manifest.
+
+Alternatively, this command can also perform the full injection purely on the
+client side with the `--manual` flag. This was the default behavior prior to
+Linkerd 2.4; however, performing injection on the cluster side makes it easier
+to ensure that the data plane is always present and configured correctly,
+regardless of how pods are deployed.
+
+See the [`linkerd inject` reference](../../reference/cli/inject/) for more
+information.
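+
+As a concrete sketch of the annotation-based workflow described above (the
+`emojivoto` namespace is only an example), enabling automatic injection for a
+namespace and then restarting its existing workloads so the webhook can inject
+them might look like this:
+
+```bash
+# Enable automatic proxy injection for every workload created in the namespace.
+kubectl annotate namespace emojivoto linkerd.io/inject=enabled
+
+# Existing pods are not mutated retroactively; trigger a rollout so the
+# admission webhook sees new pod specs and injects the proxy.
+kubectl rollout restart deploy -n emojivoto
+```
+
+You can then verify the data plane with `linkerd -n emojivoto check --proxy`.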
diff --git a/linkerd.io/content/2.11/features/retries-and-timeouts.md b/linkerd.io/content/2.11/features/retries-and-timeouts.md
new file mode 100644
index 0000000000..97b2172fe6
--- /dev/null
+++ b/linkerd.io/content/2.11/features/retries-and-timeouts.md
@@ -0,0 +1,77 @@
++++
+title = "Retries and Timeouts"
+description = "Linkerd can perform service-specific retries and timeouts."
+weight = 3
++++
+
+Automatic retries are one of the most powerful and useful mechanisms a service
+mesh has for gracefully handling partial or transient application failures. If
+implemented incorrectly, retries can amplify small errors into system-wide
+outages. For that reason, we made sure they were implemented in a way that would
+increase the reliability of the system while limiting the risk.
+
+Timeouts work hand in hand with retries. Once requests are retried a certain
+number of times, it becomes important to limit the total amount of time a client
+waits before giving up entirely. Imagine a series of retries forcing a client
+to wait for 10 seconds.
+
+A [service profile](../service-profiles/) may define certain routes as
+retryable or specify timeouts for routes. This will cause the Linkerd proxy to
+perform the appropriate retries or timeouts when calling that service. Retries
+and timeouts are always performed on the *outbound* (client) side.
+
+{{< note >}}
+If working with headless services, service profiles cannot be retrieved. Linkerd
+reads service discovery information based off the target IP address, and if that
+happens to be a pod IP address then it cannot tell which service the pod belongs
+to.
+{{< /note >}}
+
+These can be set up by following the guides:
+
+- [Configuring Retries](../../tasks/configuring-retries/)
+- [Configuring Timeouts](../../tasks/configuring-timeouts/)
+
+## How Retries Can Go Wrong
+
+Traditionally, when performing retries, you must specify a maximum number of
+retry attempts before giving up. Unfortunately, there are two major problems
+with configuring retries this way.
+
+### Choosing a maximum number of retry attempts is a guessing game
+
+You need to pick a number that’s high enough to make a difference; allowing
+more than one retry attempt is usually prudent and, if your service is less
+reliable, you’ll probably want to allow several retry attempts. On the other
+hand, allowing too many retry attempts can generate a lot of extra requests and
+extra load on the system. Performing a lot of retries can also seriously
+increase the latency of requests that need to be retried. In practice, you
+usually pick a maximum retry attempts number out of a hat (3?) and then tweak
+it through trial and error until the system behaves roughly how you want it to.
+
+### Systems configured this way are vulnerable to retry storms
+
+A [retry storm](https://twitter.github.io/finagle/guide/Glossary.html)
+begins when one service starts (for any reason) to experience a larger than
+normal failure rate. This causes its clients to retry those failed requests.
+The extra load from the retries causes the service to slow down further and
+fail more requests, triggering more retries. If each client is configured to
+retry up to 3 times, this can quadruple the number of requests being sent! To
+make matters even worse, if any of the clients’ clients are configured with
+retries, the number of retries compounds multiplicatively and can turn a small
+number of errors into a self-inflicted denial of service attack.
+
+## Retry Budgets to the Rescue
+
+To avoid the problems of retry storms and arbitrary numbers of retry attempts,
+retries are configured using retry budgets. Rather than specifying a fixed
+maximum number of retry attempts per request, Linkerd keeps track of the ratio
+between regular requests and retries and keeps this number below a configurable
+limit. For example, you may specify that you want retries to add at most 20%
+more requests. Linkerd will then retry as much as it can while maintaining that
+ratio.
+
+Configuring retries is always a trade-off between improving success rate and
+not adding too much extra load to the system. Retry budgets make that trade-off
+explicit by letting you specify exactly how much extra load your system is
+willing to accept from retries.
diff --git a/linkerd.io/content/2.11/features/server-policy.md b/linkerd.io/content/2.11/features/server-policy.md
new file mode 100644
index 0000000000..636e260ef0
--- /dev/null
+++ b/linkerd.io/content/2.11/features/server-policy.md
@@ -0,0 +1,105 @@
++++
+title = "Authorization Policy"
+description = "Linkerd can restrict which types of traffic are allowed to reach meshed pods."
++++
+
+Linkerd's server authorization policy allows you to control which types of
+traffic are allowed to meshed pods. For example, you can restrict communication
+to a particular service to only come from certain other services; or you can
+enforce that mTLS must be used on a certain port; and so on.
+
+## Adding traffic policy on your services
+
+{{< note >}}
+Linkerd can only enforce policy on meshed pods, i.e. pods where the Linkerd
+proxy has been injected. If policy is a strict requirement, you should pair the
+usage of these features with [HA mode](../ha/), which enforces that the proxy
+*must* be present when pods start up.
+{{< /note >}}
+
+By default, Linkerd allows all traffic to transit the mesh, and uses a variety
+of mechanisms, including [retries](../retries-and-timeouts/) and [load
+balancing](../load-balancing/), to ensure that requests are delivered
+successfully.
+
+Sometimes, however, we want to restrict which types of traffic are allowed.
+Linkerd's policy features allow you to *deny* traffic under certain conditions.
+Policy is configured with two basic mechanisms:
+
+1. A set of basic _default policies_, which can be set at the cluster,
+   namespace, workload, and pod level through Kubernetes annotations.
+2. `Server` and `ServerAuthorization` CRDs that specify fine-grained policy
+   for specific ports.
+
+These mechanisms work in conjunction. For example, a default cluster-wide
+policy of `deny` would prohibit any traffic to any meshed pod; traffic must
+then be explicitly allowed through the use of `Server` and
+`ServerAuthorization` CRDs.
+
+### Policy annotations
+
+The `config.linkerd.io/default-inbound-policy` annotation can be set at a
+namespace, workload, and pod level, and will determine the default traffic
+policy at that point in the hierarchy (see the example after this list). Valid
+default policies include:
+
+- `all-unauthenticated`: inbound proxies allow all connections.
+- `all-authenticated`: inbound proxies allow only mTLS connections from other
+  meshed pods.
+- `cluster-unauthenticated`: inbound proxies allow all connections from client
+  IPs in the cluster's `clusterNetworks` (must be configured at install-time).
+- `cluster-authenticated`: inbound proxies allow only mTLS connections from
+  other meshed pods with IPs in the cluster's `clusterNetworks`.
+- `deny`: inbound proxies deny all connections that are not explicitly
+  authorized.
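+
+For example, a minimal sketch of overriding the default policy for a single
+namespace via the annotation (the namespace name is illustrative; because the
+default is read when a pod's proxy starts, existing workloads need a restart to
+pick it up):
+
+```bash
+# Require mTLS from meshed clients for pods created in this namespace.
+kubectl annotate namespace emojivoto \
+  config.linkerd.io/default-inbound-policy=all-authenticated
+
+# The default policy is fixed at proxy initialization, so roll existing
+# workloads to apply the new default.
+kubectl rollout restart deploy -n emojivoto
+```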
+ +See the [Policy reference](../../reference/authorization-policy/) for more default +policies. + +Every cluster has a default policy (by default, `all-unauthenticated`), set at +install / upgrade time. Annotations that are present at the workload or +namespace level *at pod creation time* can override that value to determine the +default policy for that pod. Note that the default policy is fixed at proxy +initialization time, and thus, after a pod is created, changing the annotation +will not change the default policy for that pod. + +### Policy CRDs + +The `Server` and `ServerAuthorization` CRDs further configure Linkerd's policy +beyond the default policies. In contrast to annotations, these CRDs can be +changed dynamically and policy behavior will be updated on the fly. + +A `Server` selects a port and a set of pods that is subject to policy. This set +of pods can correspond to a single workload, or to multiple workloads (e.g. +port 4191 for every pod in a namespace). Once created, a `Server` resource +denies all traffic to that port, and traffic to that port can only be enabled +by creating `ServerAuthorization` resources. + +A `ServerAuthorization` defines a set of allowed traffic to a `Server`. A +`ServerAuthorization` can allow traffic based on any number of things, +including IP address; use of mTLS; specific mTLS identities (including +wildcards, to allow for namespace selection); specific Service Accounts; and +more. + +See the [Policy reference](../../reference/authorization-policy/) for more on +the `Server` and `ServerAuthorization` resources. + +{{< note >}} +Currently, `Servers` can only reference ports that are defined as container +ports in the pod's manifest. +{{< /note >}} + +### Policy rejections + +Any traffic that is known to be HTTP (including HTTP/2 and gRPC) that is denied +by policy will result in the proxy returning an HTTP 403. Any non-HTTP traffic +will be denied at the TCP level, i.e. by refusing the connection. + +Note that dynamically changing the policy may result in abrupt termination of +existing TCP connections. + +### Examples + +See +[emojivoto-policy.yml](https://github.com/linkerd/website/blob/main/run.linkerd.io/public/emojivoto-policy.yml) +for an example set of policy definitions for the [Emojivoto sample +application](/getting-started/). diff --git a/linkerd.io/content/2.11/features/service-profiles.md b/linkerd.io/content/2.11/features/service-profiles.md new file mode 100644 index 0000000000..00075e8da8 --- /dev/null +++ b/linkerd.io/content/2.11/features/service-profiles.md @@ -0,0 +1,33 @@ ++++ +title = "Service Profiles" +description = "Linkerd's service profiles enable per-route metrics as well as retries and timeouts." +aliases = [ + "../service-profiles/" +] ++++ + +A service profile is a custom Kubernetes resource ([CRD][crd]) that can provide +Linkerd additional information about a service. In particular, it allows you to +define a list of routes for the service. Each route uses a regular expression +to define which paths should match that route. Defining a service profile +enables Linkerd to report per-route metrics and also allows you to enable +per-route features such as retries and timeouts. + +{{< note >}} +If working with headless services, service profiles cannot be retrieved. Linkerd +reads service discovery information based off the target IP address, and if that +happens to be a pod IP address then it cannot tell which service the pod belongs +to. 
+{{< /note >}}
+
+To get started with service profiles, you can:
+
+- Look into [setting up service profiles](../../tasks/setting-up-service-profiles/)
+  for your own services.
+- Understand what is required to see
+  [per-route metrics](../../tasks/getting-per-route-metrics/).
+- [Configure retries](../../tasks/configuring-retries/) on your own services.
+- [Configure timeouts](../../tasks/configuring-timeouts/) on your own services.
+- Glance at the [reference](../../reference/service-profiles/) documentation.
+
+[crd]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
diff --git a/linkerd.io/content/2.11/features/telemetry.md b/linkerd.io/content/2.11/features/telemetry.md
new file mode 100644
index 0000000000..47bbf055a7
--- /dev/null
+++ b/linkerd.io/content/2.11/features/telemetry.md
@@ -0,0 +1,79 @@
++++
+title = "Telemetry and Monitoring"
+description = "Linkerd automatically collects metrics from all services that send traffic through it."
+weight = 8
+aliases = [
+  "../observability/"
+]
++++
+
+One of Linkerd's most powerful features is its extensive set of tooling around
+*observability*—the measuring and reporting of observed behavior in
+meshed applications. While Linkerd doesn't have insight directly into the
+*internals* of service code, it has tremendous insight into the external
+behavior of service code.
+
+To gain access to Linkerd's observability features, you only need to install
+the Viz extension:
+
+```bash
+linkerd viz install | kubectl apply -f -
+```
+
+Linkerd's telemetry and monitoring features function automatically, without
+requiring any work on the part of the developer. These features include:
+
+* Recording of top-line ("golden") metrics (request volume, success rate, and
+  latency distributions) for HTTP, HTTP/2, and gRPC traffic.
+* Recording of TCP-level metrics (bytes in/out, etc.) for other TCP traffic.
+* Reporting metrics per service, per caller/callee pair, or per route/path
+  (with [Service Profiles](../service-profiles/)).
+* Generating topology graphs that display the runtime relationship between
+  services.
+* Live, on-demand request sampling.
+
+This data can be consumed in several ways:
+
+* Through the [Linkerd CLI](../../reference/cli/), e.g. with `linkerd viz stat` and
+  `linkerd viz routes`.
+* Through the [Linkerd dashboard](../dashboard/), and
+  [pre-built Grafana dashboards](../dashboard/#grafana).
+* Directly from Linkerd's built-in Prometheus instance.
+
+## Golden metrics
+
+### Success Rate
+
+This is the percentage of successful requests during a time window (1 minute by
+default).
+
+In the output of the command `linkerd viz routes -o wide`, this metric is split
+into EFFECTIVE_SUCCESS and ACTUAL_SUCCESS. For routes configured with retries,
+the former calculates the percentage of success after retries (as perceived by
+the client side), and the latter before retries (which can expose potential
+problems with the service).
+
+### Traffic (Requests Per Second)
+
+This gives an overview of how much demand is placed on the service/route. As
+with success rates, `linkerd viz routes -o wide` splits this metric into
+EFFECTIVE_RPS and ACTUAL_RPS, corresponding to rates after and before retries
+respectively.
+
+### Latencies
+
+Times taken to service requests per service/route are split into 50th, 95th and
+99th percentiles. Lower percentiles give you an overview of the average
+performance of the system, while tail percentiles help catch outlier behavior.
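+
+For orientation, here is a sketch of pulling these golden metrics from the CLI
+(the namespace, deployment, and service names are illustrative, and the
+commands assume the viz extension is installed):
+
+```bash
+# Top-line ("golden") metrics for every deployment in a namespace.
+linkerd viz stat deploy -n emojivoto
+
+# Per-route metrics; with --to and -o wide, success rate and RPS are split
+# into EFFECTIVE_* (after retries) and ACTUAL_* (before retries) columns.
+linkerd viz routes deploy/web -n emojivoto --to svc/voting-svc -o wide
+```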
+ +## Lifespan of Linkerd metrics + +Linkerd is not designed as a long-term historical metrics store. While +Linkerd's Viz extension does include a Prometheus instance, this instance +expires metrics at a short, fixed interval (currently 6 hours). + +Rather, Linkerd is designed to *supplement* your existing metrics store. If +Linkerd's metrics are valuable, you should export them into your existing +historical metrics store. + +See [Exporting Metrics](../../tasks/exporting-metrics/) for more. diff --git a/linkerd.io/content/2.11/features/traffic-split.md b/linkerd.io/content/2.11/features/traffic-split.md new file mode 100644 index 0000000000..34e292f5d0 --- /dev/null +++ b/linkerd.io/content/2.11/features/traffic-split.md @@ -0,0 +1,36 @@ ++++ +title = "Traffic Split (canaries, blue/green deploys)" +description = "Linkerd can dynamically send a portion of traffic to different services." ++++ + +Linkerd's traffic split functionality allows you to dynamically shift arbitrary +portions of traffic destined for a Kubernetes service to a different destination +service. This feature can be used to implement sophisticated rollout strategies +such as [canary deployments](https://martinfowler.com/bliki/CanaryRelease.html) +and +[blue/green deployments](https://martinfowler.com/bliki/BlueGreenDeployment.html), +for example, by slowly easing traffic off of an older version of a service and +onto a newer version. + +{{< note >}} +If working with headless services, traffic splits cannot be retrieved. Linkerd +reads service discovery information based off the target IP address, and if that +happens to be a pod IP address then it cannot tell which service the pod belongs +to. +{{< /note >}} + +Linkerd exposes this functionality via the +[Service Mesh Interface](https://smi-spec.io/) (SMI) +[TrafficSplit API](https://github.com/servicemeshinterface/smi-spec/tree/master/apis/traffic-split). +To use this feature, you create a Kubernetes resource as described in the +TrafficSplit spec, and Linkerd takes care of the rest. + +By combining traffic splitting with Linkerd's metrics, it is possible to +accomplish even more powerful deployment techniques that automatically take into +account the success rate and latency of old and new versions. See the +[Flagger](https://flagger.app/) project for one example of this. + +Check out some examples of what you can do with traffic splitting: + +- [Canary Releases](../../tasks/canary-release/) +- [Fault Injection](../../tasks/fault-injection/) diff --git a/linkerd.io/content/2.11/getting-started/_index.md b/linkerd.io/content/2.11/getting-started/_index.md new file mode 100644 index 0000000000..1ced97793a --- /dev/null +++ b/linkerd.io/content/2.11/getting-started/_index.md @@ -0,0 +1,258 @@ ++++ +title = "Getting Started" +aliases = [ + "/getting-started/istio/", + "/choose-your-platform/", + "/../katacoda/", + "/doc/getting-started", + "/getting-started" +] +weight = 2 +[sitemap] + priority = 1.0 ++++ + +Welcome to Linkerd! 🎈 + +In this guide, we'll walk you through how to install Linkerd into your +Kubernetes cluster. Then we'll deploy a sample application to show off what +Linkerd can do. + +Installing Linkerd is easy. First, you will install the CLI (command-line +interface) onto your local machine. Using this CLI, you'll then install the +*control plane* onto your Kubernetes cluster. Finally, you'll "mesh" one or +more of your own services by adding Linkerd's *data plane* to them. 
+
+## Step 0: Setup
+
+Before we can do anything, we need to ensure you have access to a modern
+Kubernetes cluster and a functioning `kubectl` command on your local machine.
+(If you don't already have a Kubernetes cluster, one easy option is to run one
+on your local machine. There are many ways to do this, including
+[kind](https://kind.sigs.k8s.io/), [k3d](https://k3d.io/), [Docker for
+Desktop](https://www.docker.com/products/docker-desktop), [and
+more](https://kubernetes.io/docs/setup/).)
+
+You can validate your setup by running:
+
+```bash
+kubectl version --short
+```
+
+You should see output with both a `Client Version` and `Server Version`
+component.
+
+Now that we have our cluster, we'll install the Linkerd CLI and use it to
+validate that your cluster is capable of hosting the Linkerd control plane.
+
+(Note: if you're using a GKE "private cluster", there are some [extra steps
+required](../reference/cluster-configuration/#private-clusters) before you can
+proceed to the next step.)
+
+## Step 1: Install the CLI
+
+If this is your first time running Linkerd, you will need to download the
+`linkerd` command-line interface (CLI) onto your local machine. The CLI will
+allow you to interact with your Linkerd deployment.
+
+To install the CLI manually, run:
+
+```bash
+curl -sL run.linkerd.io/install | sh
+```
+
+Be sure to follow the instructions to add it to your path.
+
+Alternatively, if you use [Homebrew](https://brew.sh), you can install the CLI
+with `brew install linkerd`. You can also download the CLI directly via the
+[Linkerd releases page](https://github.com/linkerd/linkerd2/releases/).
+
+Once installed, verify the CLI is running correctly with:
+
+```bash
+linkerd version
+```
+
+You should see the CLI version, and also `Server version: unavailable`. This is
+because you haven't installed the control plane on your cluster. Don't
+worry—we'll fix that soon enough.
+
+## Step 2: Validate your Kubernetes cluster
+
+Kubernetes clusters can be configured in many different ways. Before we can
+install the Linkerd control plane, we need to check and validate that
+everything is configured correctly. To check that your cluster is ready to
+install Linkerd, run:
+
+```bash
+linkerd check --pre
+```
+
+If there are any checks that do not pass, make sure to follow the provided links
+and fix those issues before proceeding.
+
+## Step 3: Install the control plane onto your cluster
+
+Now that you have the CLI running locally and a cluster that is ready to go,
+it's time to install the control plane.
+
+The first step is to install the control plane core. To do this, run:
+
+```bash
+linkerd install | kubectl apply -f -
+```
+
+The `linkerd install` command generates a Kubernetes manifest with all the core
+control plane resources. (Feel free to inspect the output.) Piping this
+manifest into `kubectl apply` then instructs Kubernetes to add those resources
+to your cluster.
+
+{{< note >}}
+Some control plane resources require cluster-wide permissions. If you are
+installing on a cluster where these permissions are restricted, you may prefer
+the alternative [multi-stage install](../tasks/install/#multi-stage-install)
+process, which will split these "sensitive" components into a separate,
+self-contained step which can be handed off to another party.
+{{< /note >}}
+
+Now let's wait for the control plane to finish installing. Depending on the
+speed of your cluster's Internet connection, this may take a minute or two.
+Wait for the control plane to be ready (and verify your installation) by
+running:
+
+```bash
+linkerd check
+```
+
+Next, we'll install some *extensions*. Extensions add non-critical but often
+useful functionality to Linkerd. For this guide, we will need:
+
+1. The **viz** extension, which will install an on-cluster metric stack; or
+2. The **buoyant-cloud** extension, which will connect to a hosted metrics stack.
+
+You can install either one, or both. To install the viz extension, run:
+
+```bash
+linkerd viz install | kubectl apply -f - # install the on-cluster metrics stack
+```
+
+To install the buoyant-cloud extension, run:
+
+```bash
+curl -sL buoyant.cloud/install | sh # get the installer
+linkerd buoyant install | kubectl apply -f - # connect to the hosted metrics stack
+```
+
+Once you've installed your extensions, let's validate everything one last time:
+
+```bash
+linkerd check
+```
+
+Assuming everything is green, we're ready for the next step!
+
+## Step 4: Explore Linkerd!
+
+With the control plane and extensions installed and running, we're now ready
+to explore Linkerd! If you installed the viz extension, run:
+
+```bash
+linkerd viz dashboard &
+```
+
+You should see a screen like this:
+
+{{< fig src="/images/getting-started/viz-empty-dashboard.png"
+    title="The Linkerd dashboard in action" >}}
+
+If you installed the buoyant-cloud extension, run:
+
+```bash
+linkerd buoyant dashboard &
+```
+
+You should see a screen like this:
+
+{{< fig src="/images/getting-started/bcloud-empty-dashboard.png"
+    title="The Linkerd dashboard in action" >}}
+
+Click around, explore, and have fun! One thing you'll see is that, even if you
+don't have any applications running on this cluster, you still have traffic!
+This is because Linkerd's control plane components all have the proxy injected
+(i.e. the control plane runs on the data plane), so traffic between control
+plane components is also part of the mesh.
+
+## Step 5: Install the demo app
+
+To get a feel for how Linkerd would work for one of your services, you can
+install a demo application. The *emojivoto* application is a standalone
+Kubernetes application that uses a mix of gRPC and HTTP calls to allow users
+to vote on their favorite emoji.
+
+Install *emojivoto* into the `emojivoto` namespace by running:
+
+```bash
+curl -sL run.linkerd.io/emojivoto.yml | kubectl apply -f -
+```
+
+Before we mesh it, let's take a look at the app. If you're using [Docker
+Desktop](https://www.docker.com/products/docker-desktop), at this point you can
+visit [http://localhost](http://localhost) directly. If you're not using
+Docker Desktop, we'll need to forward the `web-svc` service. To forward
+`web-svc` locally to port 8080, you can run:
+
+```bash
+kubectl -n emojivoto port-forward svc/web-svc 8080:80
+```
+
+Now visit [http://localhost:8080](http://localhost:8080). Voila! The emojivoto
+app in all its glory.
+
+Clicking around, you might notice that some parts of *emojivoto* are broken!
+For example, if you click on a doughnut emoji, you'll get a 404 page. Don't
+worry, these errors are intentional. (And we can use Linkerd to identify the
+problem. Check out the [debugging guide](../debugging-an-app/) if you're
+interested in how to figure out exactly what is wrong.)
+ +Next, let's add Linkerd to *emojivoto* by running: + +```bash +kubectl get -n emojivoto deploy -o yaml \ + | linkerd inject - \ + | kubectl apply -f - +``` + +This command retrieves all of the deployments running in the `emojivoto` +namespace, runs the manifest through `linkerd inject`, and then reapplies it to +the cluster. The `linkerd inject` command adds annotations to the pod spec +instructing Linkerd to "inject" the proxy as a container to the pod spec. + +As with `install`, `inject` is a pure text operation, meaning that you can +inspect the input and output before you use it. Once piped into `kubectl +apply`, Kubernetes will execute a rolling deploy and update each pod with the +data plane's proxies, all without any downtime. + +Congratulations! You've now added Linkerd to existing services! Just as with +the control plane, it is possible to verify that everything worked the way it +should with the data plane. To do this check, run: + +```bash +linkerd -n emojivoto check --proxy +``` + +## That's it! 👏 + +Congratulations, you're now a Linkerd user! Here are some suggested next steps: + +* Use Linkerd to [debug the errors in *emojivoto*](../debugging-an-app/) +* [Add your own service](../adding-your-service/) to Linkerd without downtime +* Set up [automatic control plane mTLS credential + rotation](../tasks/automatically-rotating-control-plane-tls-credentials/) or + set a reminder to [do it + manually](../tasks/manually-rotating-control-plane-tls-credentials/) before + they expire +* Learn more about [Linkerd's architecture](../reference/architecture/) +* Hop into the #linkerd2 channel on [the Linkerd + Slack](https://slack.linkerd.io) + +Welcome to the Linkerd community! diff --git a/linkerd.io/content/2.11/overview/_index.md b/linkerd.io/content/2.11/overview/_index.md new file mode 100644 index 0000000000..5646645168 --- /dev/null +++ b/linkerd.io/content/2.11/overview/_index.md @@ -0,0 +1,69 @@ ++++ +title = "Overview" +aliases = [ + "/docs", + "/documentation", + "/2.11/", + "../docs/", + "/doc/network-performance/", + "/in-depth/network-performance/", + "/in-depth/debugging-guide/", + "/in-depth/concepts/" +] +weight = 1 ++++ + +Linkerd is a _service mesh_ for Kubernetes. It makes running services easier +and safer by giving you runtime debugging, observability, reliability, and +security—all without requiring any changes to your code. + +For a brief introduction to the service mesh model, we recommend reading [The +Service Mesh: What Every Software Engineer Needs to Know about the World's Most +Over-Hyped Technology](https://servicemesh.io/). + +Linkerd is fully open source, licensed under [Apache +v2](https://github.com/linkerd/linkerd2/blob/main/LICENSE), and is a [Cloud +Native Computing Foundation](https://cncf.io) graduated project. Linkerd is +developed in the open in the [Linkerd GitHub organization](https://github.com/linkerd). + +Linkerd has three basic components: a UI, a *data plane*, and a *control +plane*. You run Linkerd by: + +1. [Installing the CLI on your local system](../getting-started/#step-1-install-the-cli); +1. [Installing the control plane into your cluster](../getting-started/#step-3-install-linkerd-onto-the-cluster); +1. [Adding your services to Linkerd's data plane](../tasks/adding-your-service/). + +Once a service is running with Linkerd, you can use [Linkerd's +UI](../getting-started/#step-4-explore-linkerd) to inspect and +manipulate it. + +You can [get started](../getting-started/) in minutes! 
+ +## How it works + +Linkerd works by installing a set of ultralight, transparent proxies next to +each service instance. These proxies automatically handle all traffic to and +from the service. Because they're transparent, these proxies act as highly +instrumented out-of-process network stacks, sending telemetry to, and receiving +control signals from, the control plane. This design allows Linkerd to measure +and manipulate traffic to and from your service without introducing excessive +latency. + +In order to be as small, lightweight, and safe as possible, Linkerd's proxies +are written in [Rust](https://www.rust-lang.org/) and specialized for Linkerd. +You can learn more about the proxies in the [Linkerd proxy +repo](https://github.com/linkerd/linkerd2-proxy). + +## Versions and channels + +Linkerd is currently published in several tracks: + +* [Linkerd 2.x stable releases](/edge/) +* [Linkerd 2.x edge releases.](/edge/) +* [Linkerd 1.x.](/1/overview/) + +## Next steps + +[Get started with Linkerd](../getting-started/) in minutes, or check out the +[architecture](../reference/architecture/) for more details on Linkerd's +components and how they all fit together. diff --git a/linkerd.io/content/2.11/reference/_index.md b/linkerd.io/content/2.11/reference/_index.md new file mode 100644 index 0000000000..192c211e5f --- /dev/null +++ b/linkerd.io/content/2.11/reference/_index.md @@ -0,0 +1,6 @@ ++++ +title = "Reference" +weight = 5 ++++ + +{{% sectiontoc "reference" %}} diff --git a/linkerd.io/content/2.11/reference/architecture.md b/linkerd.io/content/2.11/reference/architecture.md new file mode 100644 index 0000000000..1868c19575 --- /dev/null +++ b/linkerd.io/content/2.11/reference/architecture.md @@ -0,0 +1,155 @@ ++++ +title = "Architecture" +description = "Deep dive into the architecture of Linkerd." +aliases = [ + "../architecture/" +] ++++ + +At a high level, Linkerd consists of a *control plane* and a *data plane*. + +The *control plane* is a set of services that run in a dedicated +namespace. These services accomplish various things---aggregating telemetry +data, providing a user-facing API, providing control data to the data plane +proxies, etc. Together, they drive the behavior of the data plane. + +The *data plane* consists of transparent proxies that are run next +to each service instance. These proxies automatically handle all traffic to and +from the service. Because they're transparent, these proxies act as highly +instrumented out-of-process network stacks, sending telemetry to, and receiving +control signals from, the control plane. + +{{< fig src="/images/architecture/control-plane.png" +title="Linkerd's architecture" >}} + +## CLI + +The Linkerd CLI is typically run outside of the cluster (e.g. on your local +machine) and is used to interact with the Linkerd control planes. + +## Control Plane + +The Linkerd control plane is a set of services that run in a dedicated +Kubernetes namespace (`linkerd` by default). The control plane has several +components, enumerated below. + +### Controller + +The controller component provides an API for the CLI to interface with. + +### Destination + +The destination component is used by data plane proxies to look up where to +send requests. The destination deployment is also used to fetch service profile +information used for per-route metrics, retries and timeouts. 
+ +### Identity + +The identity component acts as a [TLS Certificate +Authority](https://en.wikipedia.org/wiki/Certificate_authority) that accepts +[CSRs](https://en.wikipedia.org/wiki/Certificate_signing_request) from proxies +and returns signed certificates. These certificates are issued at proxy +initialization time and are used for proxy-to-proxy connections to implement +[mTLS](../../features/automatic-mtls/). + +### Proxy Injector + +The proxy injector is a Kubernetes [admission +controller][admission-controller] which receives a webhook request every time a +pod is created. This injector inspects resources for a Linkerd-specific +annotation (`linkerd.io/inject: enabled`). When that annotation exists, the +injector mutates the pod's specification and adds the `proxy-init` and +`linkerd-proxy` containers to the pod. + +### Service Profile Validator (sp-validator) + +The validator is a Kubernetes [admission controller][admission-controller], +which validates new [service profiles](../service-profiles/) before they are +saved. + +[admission-controller]: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/ + +## Data Plane + +The Linkerd data plane comprises ultralight _micro-proxies_, written in Rust, +which are deployed as sidecar containers alongside each instance of your +service code. + +These proxies transparently intercept communication to and from each pod by +utilizing iptables rules that are automatically configured by +[linkerd-init](#linkerd-init-container). These proxies are not designed to be +configured by hand. Rather, their behavior is driven by the control plane. + +You can read more about these micro-proxies here: + +* [Why Linkerd doesn't use Envoy](/2020/12/03/why-linkerd-doesnt-use-envoy/) +* [Under the hood of Linkerd's state-of-the-art Rust proxy, + Linkerd2-proxy](/2020/07/23/under-the-hood-of-linkerds-state-of-the-art-rust-proxy-linkerd2-proxy/) + +### Proxy + +An ultralight transparent _micro-proxy_ written in +[Rust](https://www.rust-lang.org/), the proxy is installed into each pod of a +meshed workload, and handles all incoming and outgoing TCP traffic to/from that +pod. This model (called a "sidecar container" or "sidecar proxy") +allows it to add functionality without requiring code changes. + +The proxy's features include: + +* Transparent, zero-config proxying for HTTP, HTTP/2, and arbitrary TCP + protocols. +* Automatic Prometheus metrics export for HTTP and TCP traffic. +* Transparent, zero-config WebSocket proxying. +* Automatic, latency-aware, layer-7 load balancing. +* Automatic layer-4 load balancing for non-HTTP traffic. +* Automatic TLS. +* An on-demand diagnostic tap API. +* And lots more. + +The proxy supports service discovery via DNS and the +[destination gRPC API](https://github.com/linkerd/linkerd2-proxy-api). + +### Linkerd Init Container + +The `linkerd-init` container is added to each meshed pod as a Kubernetes [init +container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) +that runs before any other containers are started. It [uses +iptables](https://github.com/linkerd/linkerd2-proxy-init) to route all TCP +traffic to and from the pod through the proxy. + +There are two main rules that `iptables` uses: + +* Any traffic being sent to the pod's external IP address (10.0.0.1 for + example) is forwarded to a specific port on the proxy (4143). 
By setting
+  `SO_ORIGINAL_DST` on the socket, the proxy is able to forward the traffic to
+  the original destination port that your application is listening on.
+* Any traffic originating from within the pod and being sent to an external IP
+  address (not 127.0.0.1) is forwarded to a specific port on the proxy (4140).
+  Because `SO_ORIGINAL_DST` was set on the socket, the proxy is able to forward
+  the traffic to the original recipient (unless there is a reason to send it
+  elsewhere). This does not result in a traffic loop because the `iptables`
+  rules explicitly skip the proxy's UID.
+
+Additionally, `iptables` has rules in place for special scenarios, such as when
+traffic is sent over the loopback interface:
+
+* When traffic is sent over the loopback interface by the application, it will
+  be sent directly to the process, instead of being forwarded to the proxy. This
+  allows an application to talk to itself, or to another container in the pod,
+  without being intercepted by the proxy, as long as the destination is a port
+  bound on localhost (such as 127.0.0.1:80, localhost:8080), or the pod's own
+  IP.
+* When traffic is sent by the application to its own cluster IP, it will be
+  forwarded to the proxy. If the proxy chooses its own pod as an endpoint, then
+  traffic will be sent over the loopback interface directly to the application.
+  Consequently, traffic will not be opportunistically upgraded to mTLS or
+  HTTP/2.
+
+A list of all `iptables` rules used by Linkerd can be found [here](../iptables/).
+
+{{< note >}}
+By default, most ports are forwarded through the proxy. This is not always
+desirable, and it is possible to have specific ports skip the proxy entirely for
+both incoming and outgoing traffic. See the [protocol
+detection](../../features/protocol-detection/) documentation.
+{{< /note >}}
diff --git a/linkerd.io/content/2.11/reference/authorization-policy.md b/linkerd.io/content/2.11/reference/authorization-policy.md
new file mode 100644
index 0000000000..811f4e1925
--- /dev/null
+++ b/linkerd.io/content/2.11/reference/authorization-policy.md
@@ -0,0 +1,256 @@
++++
+title = "Authorization Policy"
+description = "Details on the specification and what is possible with policy resources."
++++
+
+[Server](#server) and [ServerAuthorization](#serverauthorization) are the two types
+of policy resources in Linkerd, used to control inbound access to your meshed
+applications.
+
+During `linkerd install`, the `policyController.defaultAllowPolicy` field is used
+to specify the default policy when no [Server](#server) selects a pod.
+This field can be one of the following:
+
+- `all-unauthenticated`: allow all requests. This is the default.
+- `all-authenticated`: allow requests from meshed clients in the same or from
+  a different cluster (with multi-cluster).
+- `cluster-authenticated`: allow requests from meshed clients in the same cluster.
+- `cluster-unauthenticated`: allow requests from both meshed and non-meshed clients
+  in the same cluster.
+- `deny`: all requests are denied. (Policy resources should then be created to
+  allow specific communications between services).
+
+This default can be overridden by setting the annotation
+`config.linkerd.io/default-inbound-policy` on either a pod spec or its namespace.
+
+Once a [Server](#server) is configured for a pod and port, its default behavior
+is to _deny_ traffic, and [ServerAuthorization](#serverauthorization) resources
+must be created to allow traffic on a `Server`.
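+
+For example, a minimal sketch of setting this cluster-wide default at install
+time (the value path is the `policyController.defaultAllowPolicy` field
+mentioned above; the chosen policy is illustrative, and the same value can be
+set via Helm):
+
+```bash
+# Install Linkerd with a cluster-wide default policy of cluster-authenticated.
+linkerd install --set policyController.defaultAllowPolicy=cluster-authenticated \
+  | kubectl apply -f -
+```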
+ +## Server + +A `Server` selects a port on a set of pods in the same namespace as the server. +It typically selects a single port on a pod, though it may select multiple +ports when referring to the port by name (e.g. `admin-http`). While the +`Server` resource is similar to a Kubernetes `Service`, it has the added +restriction that multiple `Server` instances must not overlap: they must not +select the same pod/port pairs. Linkerd ships with an admission controller that +tries to prevent overlapping servers from being created. + +When a Server selects a port, traffic is denied by default and [`ServerAuthorizations`](#serverauthorization) +must be used to authorize traffic on ports selected by the Server. + +### Spec + +A `Server` spec may contain the following top level fields: + +{{< table >}} +| field| value | +|------|-------| +| `podSelector`| A [podSelector](#podselector) selects pods in the same namespace. | +| `port`| A port name or number. Only ports in a pod spec's `ports` are considered. | +| `proxyProtocol`| Configures protocol discovery for inbound connections. Supersedes the `config.linkerd.io/opaque-ports` annotation. Must be one of `unknown`,`HTTP/1`,`HTTP/2`,`gRPC`,`opaque`,`TLS`. Defaults to `unknown` if not set. | +{{< /table >}} + +### podSelector + +This is the [same labelSelector field in Kubernetes](https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/label-selector/#LabelSelector). +All the pods that are part of this selector will be part of the `Server` group. +A podSelector object must contain _exactly one_ of the following fields: + +{{< table >}} +| field | value | +|-------|-------| +| `matchExpressions` | matchExpressions is a list of label selector requirements. The requirements are ANDed. | +| `matchLabels` | matchLabels is a map of {key,value} pairs. | +{{< /table >}} + +See [the Kubernetes LabelSelector reference](https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/label-selector/#LabelSelector) +for more details. + +### Server Examples + +A [Server](#server) that selects over pods with a specific label, with `gRPC` as +the `proxyProtocol`. + +```yaml +apiVersion: policy.linkerd.io/v1beta1 +kind: Server +metadata: + namespace: emojivoto + name: emoji-grpc +spec: + podSelector: + matchLabels: + app: emoji-svc + port: grpc + proxyProtocol: gRPC +``` + +A [Server](#server) that selects over pods with `matchExpressions`, with `HTTP/2` +as the `proxyProtocol`, on port `8080`. + +```yaml +apiVersion: policy.linkerd.io/v1beta1 +kind: Server +metadata: + namespace: emojivoto + name: backend-services +spec: + podSelector: + matchExpressions: + - {key: app, operator: In, values: [voting-svc, emoji-svc]} + - {key: environment, operator: NotIn, values: [dev]} + port: 8080 + proxyProtocol: "HTTP/2" +``` + +## ServerAuthorization + +A [ServerAuthorization](#serverauthorization) provides a way to authorize +traffic to one or more [`Server`](#server)s. + +### Spec + +A ServerAuthorization spec must contain the following top level fields: + +{{< table >}} +| field| value | +|------|-------| +| `client`| A [client](#client) describes clients authorized to access a server. | +| `server`| A [server](#server) identifies `Servers` in the same namespace for which this authorization applies. | +{{< /table >}} + +### Server + +A `Server` object must contain _exactly one_ of the following fields: + +{{< table >}} +| field| value | +|------|-------| +| `name`| References a `Server` instance by name. 
| | `selector`| A [selector](#selector) selects servers on which this authorization applies in the same namespace. |
+{{< /table >}}
+
+### selector
+
+This is the [same labelSelector field in Kubernetes](https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/label-selector/#LabelSelector).
+All the servers that are part of this selector will have this authorization applied.
+A selector object must contain _exactly one_ of the following fields:
+
+{{< table >}}
+| field | value |
+|-------|-------|
+| `matchExpressions` | matchExpressions is a list of label selector requirements. The requirements are ANDed. |
+| `matchLabels` | matchLabels is a map of {key,value} pairs. |
+{{< /table >}}
+
+See [the Kubernetes LabelSelector reference](https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/label-selector/#LabelSelector)
+for more details.
+
+### client
+
+A `client` object must contain _exactly one_ of the following fields:
+
+{{< table >}}
+| field| value |
+|------|-------|
+| `meshTLS`| A [meshTLS](#meshtls) is used to authorize meshed clients to access a server. |
+| `unauthenticated`| A boolean value that authorizes unauthenticated clients to access a server. |
+{{< /table >}}
+
+Optionally, it can also contain the `networks` field:
+
+{{< table >}}
+| field| value |
+|------|-------|
+| `networks`| Limits the client IP addresses to which this authorization applies. If unset, the server chooses a default (typically, all IPs or the cluster's pod network). |
+{{< /table >}}
+
+### meshTLS
+
+A `meshTLS` object must contain _exactly one_ of the following fields:
+
+{{< table >}}
+| field| value |
+|------|-------|
+| `unauthenticatedTLS`| A boolean to indicate that no client identity is required for communication. This is mostly important for the identity controller, which must terminate TLS connections from clients that do not yet have a certificate. |
+| `identities`| A list of proxy identity strings (as provided via mTLS) that are authorized. The `*` prefix can be used to match all identities in a domain. An identity string of `*` indicates that all authenticated clients are authorized. |
+| `serviceAccounts`| A list of authorized client [serviceAccount](#serviceAccount)s (as provided via mTLS). |
+{{< /table >}}
+
+### serviceAccount
+
+A serviceAccount field contains the following top level fields:
+
+{{< table >}}
+| field| value |
+|------|-------|
+| `name`| The ServiceAccount's name. |
+| `namespace`| The ServiceAccount's namespace. If unset, the authorization's namespace is used. |
+{{< /table >}}
+
+### ServerAuthorization Examples
+
+A [ServerAuthorization](#serverauthorization) that allows meshed clients with
+`*.emojivoto.serviceaccount.identity.linkerd.cluster.local` proxy identity, i.e. all
+service accounts in the `emojivoto` namespace.
+
+```yaml
+apiVersion: policy.linkerd.io/v1beta1
+kind: ServerAuthorization
+metadata:
+  namespace: emojivoto
+  name: emoji-grpc
+spec:
+  # Allow all authenticated clients to access the (read-only) emoji service.
+  server:
+    selector:
+      matchLabels:
+        app: emoji-svc
+  client:
+    meshTLS:
+      identities:
+        - "*.emojivoto.serviceaccount.identity.linkerd.cluster.local"
+```
+
+A [ServerAuthorization](#serverauthorization) that allows any unauthenticated
+clients.
+
+```yaml
+apiVersion: policy.linkerd.io/v1beta1
+kind: ServerAuthorization
+metadata:
+  namespace: emojivoto
+  name: web-public
+spec:
+  server:
+    name: web-http
+  # Allow all clients to access the web HTTP port without regard for
+  # authentication.
If unauthenticated connections are permitted, there is no + # need to describe authenticated clients. + client: + unauthenticated: true + networks: + - cidr: 0.0.0.0/0 + - cidr: ::/0 +``` + +A [ServerAuthorization](#serverauthorization) that allows meshed clients with a +specific service account. + +```yaml +apiVersion: policy.linkerd.io/v1beta1 +kind: ServerAuthorization +metadata: + namespace: emojivoto + name: prom-prometheus +spec: + server: + name: prom + client: + meshTLS: + serviceAccounts: + - namespace: linkerd-viz + name: prometheus +``` diff --git a/linkerd.io/content/2.11/reference/cli/_index.md b/linkerd.io/content/2.11/reference/cli/_index.md new file mode 100644 index 0000000000..084ff31ec7 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/_index.md @@ -0,0 +1,21 @@ ++++ +title = "CLI" +description = "Reference documentation for all the CLI commands." +aliases = [ + "../cli/" +] ++++ + +The Linkerd CLI is the primary way to interact with Linkerd. It can install the +control plane to your cluster, add the proxy to your service and provide +detailed metrics for how your service is performing. + +As reference, check out the commands below: + +{{< cli-2-10 >}} + +## Global flags + +The following flags are available for *all* linkerd CLI commands: + +{{< global-flags >}} diff --git a/linkerd.io/content/2.11/reference/cli/check.md b/linkerd.io/content/2.11/reference/cli/check.md new file mode 100644 index 0000000000..d060ca11c9 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/check.md @@ -0,0 +1,72 @@ ++++ +title = "check" +aliases = [ + "../check-reference/" +] ++++ + +{{< cli-2-10/description "check" >}} + +Take a look at the [troubleshooting](../../../tasks/troubleshooting/) documentation +for a full list of all the possible checks, what they do and how to fix them. + +{{< cli-2-10/examples "check" >}} + +## Example output + +```bash +$ linkerd check +kubernetes-api +-------------- +√ can initialize the client +√ can query the Kubernetes API + +kubernetes-version +------------------ +√ is running the minimum Kubernetes API version + +linkerd-existence +----------------- +√ control plane namespace exists +√ controller pod is running +√ can initialize the client +√ can query the control plane API + +linkerd-api +----------- +√ control plane pods are ready +√ control plane self-check +√ [kubernetes] control plane can talk to Kubernetes +√ [prometheus] control plane can talk to Prometheus + +linkerd-service-profile +----------------------- +√ no invalid service profiles + +linkerd-version +--------------- +√ can determine the latest version +√ cli is up-to-date + +control-plane-version +--------------------- +√ control plane is up-to-date +√ control plane and cli versions match + +Status check results are √ +``` + +{{< cli-2-10/flags "check" >}} + +## Subcommands + +Check supports subcommands as part of the +[Multi-stage install](../../../tasks/install/#multi-stage-install) feature. 
+ +### config + +{{< cli-2-10/description "check config" >}} + +{{< cli-2-10/examples "check config" >}} + +{{< cli-2-10/flags "check config" >}} diff --git a/linkerd.io/content/2.11/reference/cli/completion.md b/linkerd.io/content/2.11/reference/cli/completion.md new file mode 100644 index 0000000000..3a7a34320c --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/completion.md @@ -0,0 +1,9 @@ ++++ +title = "completion" ++++ + +{{< cli-2-10/description "completion" >}} + +{{< cli-2-10/examples "completion" >}} + +{{< cli-2-10/flags "completion" >}} diff --git a/linkerd.io/content/2.11/reference/cli/diagnostics.md b/linkerd.io/content/2.11/reference/cli/diagnostics.md new file mode 100644 index 0000000000..5739e78fdd --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/diagnostics.md @@ -0,0 +1,48 @@ ++++ +title = "diagnostics" +aliases = [ + "endpoints", + "install-sp", + "metrics" +] ++++ + +{{< cli-2-10/description "diagnostics" >}} + +{{< cli-2-10/examples "diagnostics" >}} + +{{< cli-2-10/flags "diagnostics" >}} + +## Subcommands + +### controller-metrics + +{{< cli-2-10/description "diagnostics controller-metrics" >}} + +{{< cli-2-10/examples "diagnostics controller-metrics" >}} + +{{< cli-2-10/flags "diagnostics controller-metrics" >}} + +### endpoints + +{{< cli-2-10/description "diagnostics endpoints" >}} + +{{< cli-2-10/examples "diagnostics endpoints" >}} + +{{< cli-2-10/flags "diagnostics endpoints" >}} + +### install-sp + +{{< cli-2-10/description "diagnostics install-sp" >}} + +{{< cli-2-10/examples "diagnostics install-sp" >}} + +{{< cli-2-10/flags "diagnostics install-sp" >}} + +### proxy-metrics + +{{< cli-2-10/description "diagnostics proxy-metrics" >}} + +{{< cli-2-10/examples "diagnostics proxy-metrics" >}} + +{{< cli-2-10/flags "diagnostics proxy-metrics" >}} diff --git a/linkerd.io/content/2.11/reference/cli/identity.md b/linkerd.io/content/2.11/reference/cli/identity.md new file mode 100644 index 0000000000..27302e39ff --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/identity.md @@ -0,0 +1,9 @@ ++++ +title = "identity" ++++ + +{{< cli-2-10/description "identity" >}} + +{{< cli-2-10/examples "identity" >}} + +{{< cli-2-10/flags "identity" >}} diff --git a/linkerd.io/content/2.11/reference/cli/inject.md b/linkerd.io/content/2.11/reference/cli/inject.md new file mode 100644 index 0000000000..ced8ba8861 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/inject.md @@ -0,0 +1,27 @@ ++++ +title = "inject" +aliases = [ + "../inject-reference/" +] ++++ + +The `inject` command is a text transform that modifies Kubernetes manifests +passed to it either as a file or as a stream (`-`) to add a +`linkerd.io/inject: enabled` annotation to eligible resources in the manifest. +When the resulting annotated manifest is applied to the Kubernetes cluster, +Linkerd's [proxy autoinjector](../../../features/proxy-injection/) automatically +adds the Linkerd data plane proxies to the corresponding pods. + +Note that there is no *a priori* reason to use this command. In production, +these annotations may instead be set by a CI/CD system, or any other +deploy-time mechanism. + +## Manual injection + +Alternatively, this command can also perform the full injection purely on the +client side, by passing the `--manual` flag. (Prior to Linkerd 2.4, this +was the default behavior.)
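+ +As a rough sketch (assuming a local `deployment.yml` manifest), client-side injection looks like this: + +```bash +# inject the full proxy spec on the client side rather than just adding the annotation +cat deployment.yml | linkerd inject --manual - | kubectl apply -f - +```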
+ +{{< cli-2-10/examples "inject" >}} + +{{< cli-2-10/flags "inject" >}} diff --git a/linkerd.io/content/2.11/reference/cli/install-cni.md b/linkerd.io/content/2.11/reference/cli/install-cni.md new file mode 100644 index 0000000000..dff89cf4f6 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/install-cni.md @@ -0,0 +1,9 @@ ++++ +title = "install-cni" ++++ + +{{< cli-2-10/description "install-cni" >}} + +{{< cli-2-10/examples "install-cni" >}} + +{{< cli-2-10/flags "install-cni" >}} diff --git a/linkerd.io/content/2.11/reference/cli/install.md b/linkerd.io/content/2.11/reference/cli/install.md new file mode 100644 index 0000000000..8f2fbb7e24 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/install.md @@ -0,0 +1,33 @@ ++++ +title = "install" ++++ + +{{< cli-2-10/description "install" >}} + +For further details on how to install Linkerd onto your own cluster, check out +the [install documentation](../../../tasks/install/). + +{{< cli-2-10/examples "install" >}} + +{{< cli-2-10/flags "install" >}} + +## Subcommands + +Install supports subcommands as part of the +[Multi-stage install](../../../tasks/install/#multi-stage-install) feature. + +### config + +{{< cli-2-10/description "install config" >}} + +{{< cli-2-10/examples "install config" >}} + +{{< cli-2-10/flags "install config" >}} + +### control-plane + +{{< cli-2-10/description "install control-plane" >}} + +{{< cli-2-10/examples "install control-plane" >}} + +{{< cli-2-10/flags "install control-plane" >}} diff --git a/linkerd.io/content/2.11/reference/cli/jaeger.md b/linkerd.io/content/2.11/reference/cli/jaeger.md new file mode 100644 index 0000000000..fc17088bcb --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/jaeger.md @@ -0,0 +1,51 @@ ++++ +title = "jaeger" ++++ + +{{< cli-2-10/description "jaeger" >}} + +{{< cli-2-10/examples "jaeger" >}} + +{{< cli-2-10/flags "jaeger" >}} + +## Subcommands + +### check + +{{< cli-2-10/description "jaeger check" >}} + +{{< cli-2-10/examples "jaeger check" >}} + +{{< cli-2-10/flags "jaeger check" >}} + +### dashboard + +{{< cli-2-10/description "jaeger dashboard" >}} + +{{< cli-2-10/examples "jaeger dashboard" >}} + +{{< cli-2-10/flags "jaeger dashboard" >}} + +### install + +{{< cli-2-10/description "jaeger install" >}} + +{{< cli-2-10/examples "jaeger install" >}} + +{{< cli-2-10/flags "jaeger install" >}} + +### list + +{{< cli-2-10/description "jaeger list" >}} + +{{< cli-2-10/examples "jaeger list" >}} + +{{< cli-2-10/flags "jaeger list" >}} + +### uninstall + +{{< cli-2-10/description "jaeger uninstall" >}} + +{{< cli-2-10/examples "jaeger uninstall" >}} + +{{< cli-2-10/flags "jaeger uninstall" >}} diff --git a/linkerd.io/content/2.11/reference/cli/multicluster.md b/linkerd.io/content/2.11/reference/cli/multicluster.md new file mode 100644 index 0000000000..292bcfa194 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/multicluster.md @@ -0,0 +1,67 @@ ++++ +title = "multicluster" ++++ + +{{< cli-2-10/description "multicluster" >}} + +{{< cli-2-10/examples "multicluster" >}} + +{{< cli-2-10/flags "multicluster" >}} + +## Subcommands + +### allow + +{{< cli-2-10/description "multicluster allow" >}} + +{{< cli-2-10/examples "multicluster allow" >}} + +{{< cli-2-10/flags "multicluster allow" >}} + +### check + +{{< cli-2-10/description "multicluster check" >}} + +{{< cli-2-10/examples "multicluster check" >}} + +{{< cli-2-10/flags "multicluster check" >}} + +### gateways + +{{< cli-2-10/description "multicluster gateways" >}} + +{{< cli-2-10/examples "multicluster 
gateways" >}} + +{{< cli-2-10/flags "multicluster gateways" >}} + +### install + +{{< cli-2-10/description "multicluster install" >}} + +{{< cli-2-10/examples "multicluster install" >}} + +{{< cli-2-10/flags "multicluster install" >}} + +### link + +{{< cli-2-10/description "multicluster link" >}} + +{{< cli-2-10/examples "multicluster link" >}} + +{{< cli-2-10/flags "multicluster link" >}} + +### uninstall + +{{< cli-2-10/description "multicluster uninstall" >}} + +{{< cli-2-10/examples "multicluster uninstall" >}} + +{{< cli-2-10/flags "multicluster uninstall" >}} + +### unlink + +{{< cli-2-10/description "multicluster unlink" >}} + +{{< cli-2-10/examples "multicluster unlink" >}} + +{{< cli-2-10/flags "multicluster unlink" >}} diff --git a/linkerd.io/content/2.11/reference/cli/profile.md b/linkerd.io/content/2.11/reference/cli/profile.md new file mode 100644 index 0000000000..5491277757 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/profile.md @@ -0,0 +1,13 @@ ++++ +title = "profile" ++++ + +{{< cli-2-10/description "profile" >}} + +Check out the [service profile](../../../features/service-profiles/) +documentation for more details on what this command does and what you can do +with service profiles. + +{{< cli-2-10/examples "profile" >}} + +{{< cli-2-10/flags "profile" >}} diff --git a/linkerd.io/content/2.11/reference/cli/repair.md b/linkerd.io/content/2.11/reference/cli/repair.md new file mode 100644 index 0000000000..184fdb88c4 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/repair.md @@ -0,0 +1,9 @@ ++++ +title = "repair" ++++ + +{{< cli-2-10/description "repair" >}} + +{{< cli-2-10/examples "repair" >}} + +{{< cli-2-10/flags "repair" >}} diff --git a/linkerd.io/content/2.11/reference/cli/uninject.md b/linkerd.io/content/2.11/reference/cli/uninject.md new file mode 100644 index 0000000000..d5a8fc384f --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/uninject.md @@ -0,0 +1,9 @@ ++++ +title = "uninject" ++++ + +{{< cli-2-10/description "uninject" >}} + +{{< cli-2-10/examples "uninject" >}} + +{{< cli-2-10/flags "uninject" >}} diff --git a/linkerd.io/content/2.11/reference/cli/uninstall.md b/linkerd.io/content/2.11/reference/cli/uninstall.md new file mode 100644 index 0000000000..6725ca3426 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/uninstall.md @@ -0,0 +1,9 @@ ++++ +title = "uninstall" ++++ + +{{< cli-2-10/description "uninstall" >}} + +{{< cli-2-10/examples "uninstall" >}} + +{{< cli-2-10/flags "uninstall" >}} diff --git a/linkerd.io/content/2.11/reference/cli/upgrade.md b/linkerd.io/content/2.11/reference/cli/upgrade.md new file mode 100644 index 0000000000..6f3bebff1b --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/upgrade.md @@ -0,0 +1,30 @@ ++++ +title = "upgrade" ++++ + +{{< cli-2-10/description "upgrade" >}} + +{{< cli-2-10/examples "upgrade" >}} + +{{< cli-2-10/flags "upgrade" >}} + +## Subcommands + +Upgrade supports subcommands as part of the +[Multi-stage install](../../../tasks/install/#multi-stage-install) feature. 
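+ +For instance, a staged upgrade could look roughly like this (a sketch, assuming the CLI has already been updated to the target version): + +```bash +# render and apply the cluster-level configuration first +linkerd upgrade config | kubectl apply -f - + +# then upgrade the control plane itself +linkerd upgrade control-plane | kubectl apply -f - +```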
+ +### config + +{{< cli-2-10/description "upgrade config" >}} + +{{< cli-2-10/examples "upgrade config" >}} + +{{< cli-2-10/flags "upgrade config" >}} + +### control-plane + +{{< cli-2-10/description "upgrade control-plane" >}} + +{{< cli-2-10/examples "upgrade control-plane" >}} + +{{< cli-2-10/flags "upgrade control-plane" >}} diff --git a/linkerd.io/content/2.11/reference/cli/version.md b/linkerd.io/content/2.11/reference/cli/version.md new file mode 100644 index 0000000000..dc89774b31 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/version.md @@ -0,0 +1,9 @@ ++++ +title = "version" ++++ + +{{< cli-2-10/description "version" >}} + +{{< cli-2-10/examples "version" >}} + +{{< cli-2-10/flags "version" >}} diff --git a/linkerd.io/content/2.11/reference/cli/viz.md b/linkerd.io/content/2.11/reference/cli/viz.md new file mode 100644 index 0000000000..66c5180ef7 --- /dev/null +++ b/linkerd.io/content/2.11/reference/cli/viz.md @@ -0,0 +1,167 @@ ++++ +title = "viz" +aliases = [ + "dashboard", + "edges", + "routes", + "stat", + "tap", + "top" +] ++++ + +{{< cli-2-10/description "viz" >}} + +{{< cli-2-10/examples "viz" >}} + +{{< cli-2-10/flags "viz" >}} + +## Subcommands + +### check + +{{< cli-2-10/description "viz check" >}} + +{{< cli-2-10/examples "viz check" >}} + +{{< cli-2-10/flags "viz check" >}} + +### dashboard + +{{< cli-2-10/description "viz dashboard" >}} + +Check out the [architecture](../../architecture/#dashboard) docs for a +more thorough explanation of what this command does. + +{{< cli-2-10/examples "viz dashboard" >}} + +{{< cli-2-10/flags "viz dashboard" >}} + +(*) You'll need to tweak the dashboard's `enforced-host` parameter with this +value, as explained in [the DNS-rebinding protection +docs](../../../tasks/exposing-dashboard/#tweaking-host-requirement) + +### edges + +{{< cli-2-10/description "viz edges" >}} + +{{< cli-2-10/examples "viz edges" >}} + +{{< cli-2-10/flags "viz edges" >}} + +### install + +{{< cli-2-10/description "viz install" >}} + +{{< cli-2-10/examples "viz install" >}} + +{{< cli-2-10/flags "viz install" >}} + +### list + +{{< cli-2-10/description "viz list" >}} + +{{< cli-2-10/examples "viz list" >}} + +{{< cli-2-10/flags "viz list" >}} + +### profile + +{{< cli-2-10/description "viz profile" >}} + +{{< cli-2-10/examples "viz profile" >}} + +{{< cli-2-10/flags "viz profile" >}} + +### routes + +The `routes` command displays per-route service metrics. In order for +this information to be available, a service profile must be defined for the +service that is receiving the requests. For more information about how to +create a service profile, see [service profiles](../../../features/service-profiles/). +and the [profile](../../cli/profile/) command reference. + +## Inbound Metrics + +By default, `routes` displays *inbound* metrics for a target. In other +words, it shows information about requests which are sent to the target and +responses which are returned by the target. For example, the command: + +```bash +linkerd viz routes deploy/webapp +``` + +Displays the request volume, success rate, and latency of requests to the +`webapp` deployment. These metrics are from the `webapp` deployment's +perspective, which means that, for example, these latencies do not include the +network latency between a client and the `webapp` deployment. + +## Outbound Metrics + +If you specify the `--to` flag then `linkerd viz routes` displays *outbound* metrics +from the target resource to the resource in the `--to` flag. 
In contrast to +the inbound metrics, these metrics are from the perspective of the sender. This +means that these latencies do include the network latency between the client +and the server. For example, the command: + +```bash +linkerd viz routes deploy/traffic --to deploy/webapp +``` + +Displays the request volume, success rate, and latency of requests from +`traffic` to `webapp` from the perspective of the `traffic` deployment. + +## Effective and Actual Metrics + +If you are looking at *outbound* metrics (by specifying the `--to` flag) you +can also supply the `-o wide` flag to differentiate between *effective* and +*actual* metrics. + +Effective requests are requests which are sent by some client to the Linkerd +proxy. Actual requests are requests which the Linkerd proxy sends to some +server. If the Linkerd proxy is performing retries, one effective request can +translate into more than one actual request. If the Linkerd proxy is not +performing retries, effective requests and actual requests will always be equal. +When enabling retries, you should expect to see the actual request rate +increase and the effective success rate increase. See the +[retries and timeouts section](../../../features/retries-and-timeouts/) for more +information. + +Because retries are only performed on the *outbound* (client) side, the +`-o wide` flag can only be used when the `--to` flag is specified. + +{{< cli-2-10/examples "viz routes" >}} + +{{< cli-2-10/flags "viz routes" >}} + +### stat + +{{< cli-2-10/description "viz stat" >}} + +{{< cli-2-10/examples "viz stat" >}} + +{{< cli-2-10/flags "viz stat" >}} + +### tap + +{{< cli-2-10/description "viz tap" >}} + +{{< cli-2-10/examples "viz tap" >}} + +{{< cli-2-10/flags "viz tap" >}} + +### top + +{{< cli-2-10/description "viz top" >}} + +{{< cli-2-10/examples "viz top" >}} + +{{< cli-2-10/flags "viz top" >}} + +### uninstall + +{{< cli-2-10/description "viz uninstall" >}} + +{{< cli-2-10/examples "viz uninstall" >}} + +{{< cli-2-10/flags "viz uninstall" >}} diff --git a/linkerd.io/content/2.11/reference/cluster-configuration.md b/linkerd.io/content/2.11/reference/cluster-configuration.md new file mode 100644 index 0000000000..d488c59adf --- /dev/null +++ b/linkerd.io/content/2.11/reference/cluster-configuration.md @@ -0,0 +1,77 @@ ++++ +title = "Cluster Configuration" +description = "Configuration settings unique to providers and install methods." ++++ + +## GKE + +### Private Clusters + +If you are using a **private GKE cluster**, you are required to create a +firewall rule that allows the GKE operated api-server to communicate with the +Linkerd control plane. This makes it possible for features such as automatic +proxy injection to receive requests directly from the api-server. + +In this example, we will use [gcloud](https://cloud.google.com/sdk/install) to +simplify the creation of the said firewall rule. 
+ +Setup: + +```bash +CLUSTER_NAME=your-cluster-name +gcloud config set compute/zone your-zone-or-region +``` + +Get the cluster `MASTER_IPV4_CIDR`: + +```bash +MASTER_IPV4_CIDR=$(gcloud container clusters describe $CLUSTER_NAME \ + | grep "masterIpv4CidrBlock: " \ + | awk '{print $2}') +``` + +Get the cluster `NETWORK`: + +```bash +NETWORK=$(gcloud container clusters describe $CLUSTER_NAME \ + | grep "^network: " \ + | awk '{print $2}') +``` + +Get the cluster auto-generated `NETWORK_TARGET_TAG`: + +```bash +NETWORK_TARGET_TAG=$(gcloud compute firewall-rules list \ + --filter network=$NETWORK --format json \ + | jq ".[] | select(.name | contains(\"$CLUSTER_NAME\"))" \ + | jq -r '.targetTags[0]' | head -1) +``` + +The format of the network tag should be something like `gke-cluster-name-xxxx-node`. + +Verify the values: + +```bash +echo $MASTER_IPV4_CIDR $NETWORK $NETWORK_TARGET_TAG + +# example output +10.0.0.0/28 foo-network gke-foo-cluster-c1ecba83-node +``` + +Create the firewall rules for `proxy-injector` and `tap`: + +```bash +gcloud compute firewall-rules create gke-to-linkerd-control-plane \ + --network "$NETWORK" \ + --allow "tcp:8443,tcp:8089" \ + --source-ranges "$MASTER_IPV4_CIDR" \ + --target-tags "$NETWORK_TARGET_TAG" \ + --priority 1000 \ + --description "Allow traffic on ports 8443, 8089 for linkerd control-plane components" +``` + +Finally, verify that the firewall is created: + +```bash +gcloud compute firewall-rules describe gke-to-linkerd-control-plane +``` diff --git a/linkerd.io/content/2.11/reference/extension-list.md b/linkerd.io/content/2.11/reference/extension-list.md new file mode 100644 index 0000000000..2069506a52 --- /dev/null +++ b/linkerd.io/content/2.11/reference/extension-list.md @@ -0,0 +1,14 @@ ++++ +title = "Extensions List" +description = "List of Linkerd extensions that can be added to the installation for additional functionality" ++++ + +Linkerd provides a mix of built-in and third-party +[extensions](../../tasks/extensions/) to add additional functionality to the +base installation. The following is the list of known extensions: + +{{< extensions-2-10 >}} + +If you have an extension for Linkerd and it is not on the list, [please edit +this +page!](https://github.com/linkerd/website/edit/main/linkerd.io/data/extension-list.yaml) diff --git a/linkerd.io/content/2.11/reference/iptables.md b/linkerd.io/content/2.11/reference/iptables.md new file mode 100644 index 0000000000..ade6de3148 --- /dev/null +++ b/linkerd.io/content/2.11/reference/iptables.md @@ -0,0 +1,198 @@ ++++ +title = "IPTables Reference" +description = "A table with all of the chains and associated rules" ++++ + +In order to route TCP traffic in a pod to and from the proxy, an [`init +container`](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) +is used to set up `iptables` rules at the start of an injected pod's +lifecycle. + +At first, `linkerd-init` will create two chains in the `nat` table: +`PROXY_INIT_REDIRECT`, and `PROXY_INIT_OUTPUT`. These chains are used to route +inbound and outbound packets through the proxy. Each chain has a set of rules +attached to it, these rules are traversed by a packet in order. + +## Inbound connections + +When a packet arrives in a pod, it will typically be processed by the +`PREROUTING` chain, a default chain attached to the `nat` table. The sidecar +container will create a new chain to process inbound packets, called +`PROXY_INIT_REDIRECT`. 
The sidecar container creates a rule +(`install-proxy-init-prerouting`) to send packets from the `PREROUTING` chain +to our redirect chain. This is the first rule traversed by an inbound packet. + +The redirect chain will be configured with two more rules: + +1. `ignore-port`: will ignore processing packets whose destination ports are + included in the `skip-inbound-ports` install option. +2. `proxy-init-redirect-all`: will redirect all incoming TCP packets through + the proxy, on port `4143`. + +Based on these two rules, there are two possible paths that an inbound packet +can take, both of which are outlined below. + +{{}} + +The packet will arrive on the `PREROUTING` chain and will be immediately routed +to the redirect chain. If its destination port matches any of the inbound ports +to skip, then it will be forwarded directly to the application process, +_bypassing the proxy_. The list of destination ports to check against can be +[configured when installing Linkerd](/2.11/reference/cli/install/#). If the +packet does not match any of the ports in the list, it will be redirected +through the proxy. Redirection is done by changing the incoming packet's +destination header, the target port will be replaced with `4143`, which is the +proxy's inbound port. The proxy will process the packet and produce a new one +that will be forwarded to the service; it will be able to get the original +target (IP:PORT) of the inbound packet by using a special socket option +[`SO_ORIGINAL_DST`](https://linux.die.net/man/3/getsockopt). The new packet +will be routed through the `OUTPUT` chain, from there it will be sent to the +application. The `OUTPUT` chain rules are covered in more detail below. + +## Outbound connections + +When a packet leaves a pod, it will first traverse the `OUTPUT` chain, the +first default chain an outgoing packet traverses in the `nat` table. To +redirect outgoing packets through the outbound side of the proxy, the sidecar +container will again create a new chain. The first outgoing rule is similar to +the inbound counterpart: any packet that traverses the `OUTPUT` chain should be +forwarded to our `PROXY_INIT_OUTPUT` chain to be processed. + +The output redirect chain is slightly harder to understand but follows the same +logical flow as the inbound redirect chain, in total there are 4 rules +configured: + +1. `ignore-proxy-uid`: any packets owned by the proxy (whose user id is + `2102`), will skip processing and return to the previous (`OUTPUT`) chain. + From there, it will be sent on the outbound network interface (either to + the application, in the case of an inbound packet, or outside of the pod, + for an outbound packet). +2. `ignore-loopback`: if the packet is sent over the loopback interface + (`lo`), it will skip processing and return to the previous chain. From + here, the packet will be sent to the destination, much like the first rule + in the chain. +3. `ignore-port`: will ignore processing packets whose destination ports are + included in the `skip-outbound-ports` install option. +4. `redirect-all-outgoing`: the last rule in the chain, it will redirect all + outgoing TCP packets to port `4140`, the proxy's outbound port. If a + packet has made it this far, it is guaranteed its destination is not local + (i.e `lo`) and it has not been produced by the proxy. This means the + packet has been produced by the service, so it should be forwarded to its + destination by the proxy. 
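+ +To see these rules on a live pod (a rough sketch, assuming you have a shell with `iptables` available inside the pod's network namespace, e.g. via a privileged debug container), you can list the two chains directly: + +```bash +# list the chains created by linkerd-init in the nat table +iptables -t nat -L PROXY_INIT_REDIRECT -n -v +iptables -t nat -L PROXY_INIT_OUTPUT -n -v +```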
+ +{{< fig src="/images/iptables/iptables-fig2-2.png" +title="Outbound iptables chain traversal" >}} + +A packet produced by the service will first hit the `OUTPUT` chain; from here, +it will be sent to our own output chain for processing. The first rule it +encounters in `PROXY_INIT_OUTPUT` will be `ignore-proxy-uid`. Since the packet +was generated by the service, this rule will be skipped. If the packet's +destination is not a port bound on localhost (e.g `127.0.0.1:80`), then it will +skip the second rule as well. The third rule, `ignore-port` will be matched if +the packet's destination port is in the outbound ports to skip list, in this +case, it will be sent out on the network interface, bypassing the proxy. If the +rule is not matched, then the packet will reach the final rule in the chain +`redirect-all-outgoing`-- as the name implies, it will be sent to the proxy to +be processed, on its outbound port `4140`. Much like in the inbound case, the +routing happens at the `nat` level, the packet's header will be re-written to +target the outbound port. The proxy will process the packet and then forward it +to its destination. The new packet will take the same path through the `OUTPUT` +chain, however, it will stop at the first rule, since it was produced by the +proxy. + +The substantiated explanation applies to a packet whose destination is another +service, outside of the pod. In practice, an application can also send traffic +locally. As such, there are two other possible scenarios that we will explore: +_when a service talks to itself_ (by sending traffic over localhost or by using +its own endpoint address), and when _a service talks to itself through a +`clusterIP` target_. Both scenarios are somehow related, but the path a packet +takes differs. + +**A service may send requests to itself**. It can also target another container +in the pod. This scenario would typically apply when: + +* The destination is the pod (or endpoint) IP address. +* The destination is a port bound on localhost (regardless of which container +it belongs to). + +{{< fig src="/images/iptables/iptables-fig2-3.png" +title="Outbound iptables chain traversal" >}} + +When the application targets itself through its pod's IP (or loopback address), +the packets will traverse the two output chains. The first rule will be +skipped, since the owner is the application, and not the proxy. Once the second +rule is matched, the packets will return to the first output chain, from here, +they'll be sent directly to the service. + +{{< note >}} +Usually, packets traverse another chain on the outbound side called +`POSTROUTING`. This chain is traversed after the `OUTPUT` chain, but to keep +the explanation simple, it has not been mentioned. Likewise, outbound packets that +are sent over the loopback interface become inbound packets, since they need to +be processed again. The kernel takes shortcuts in this case and bypasses the +`PREROUTING` chain that inbound packets from the outside world traverse when +they first arrive. For this reason, we do not need any special rules on the +inbound side to account for outbound packets that are sent locally. +{{< /note >}} + +**A service may send requests to itself using its clusterIP**. In such cases, +it is not guaranteed that the destination will be local. The packet follows an +unusual path, as depicted in the diagram below. 
+ +{{< fig src="/images/iptables/iptables-fig2-4.png" +title="Outbound iptables chain traversal" >}} + +When the packet first traverses the output chains, it will follow the same path +an outbound packet would normally take. In such a scenario, the packet's +destination will be an address that is not considered to be local by the +kernel-- it is, after all, a virtual IP. The proxy will process the packet, at +a connection level, connections to a `clusterIP` will be load balanced between +endpoints. Chances are that the endpoint selected will be the pod itself, +packets will therefore never leave the pod; the destination will be resolved to +the podIP. The packets produced by the proxy will traverse the output chain and +stop at the first rule, then they will be forwarded to the service. This +constitutes an edge case because at this point, the packet has been processed +by the proxy, unlike the scenario previously discussed where it skips it +altogether. For this reason, at a connection level, the proxy will _not_ mTLS +or opportunistically upgrade the connection to HTTP/2 when the endpoint is +local to the pod. In practice, this is treated as if the destination was +loopback, with the exception that the packet is forwarded through the proxy, +instead of being forwarded from the service directly to itself. + +## Rules table + +For reference, you can find the actual commands used to create the rules below. +Alternatively, if you want to inspect the iptables rules created for a pod, you +can retrieve them through the following command: + +```bash +$ kubectl -n logs linkerd-init +# where is the name of the pod +# you want to see the iptables rules for +``` + +### Inbound + +{{< table >}} +| # | name | iptables rule | description| +|---|------|---------------|------------| +| 1 | redirect-common-chain | `iptables -t nat -N PROXY_INIT_REDIRECT`| creates a new `iptables` chain to add inbound redirect rules to; the chain is attached to the `nat` table | +| 2 | ignore-port | `iptables -t nat -A PROXY_INIT_REDIRECT -p tcp --match multiport --dports -j RETURN` | configures `iptables` to ignore the redirect chain for packets whose dst ports are included in the `--skip-inbound-ports` config option | +| 3 | proxy-init-redirect-all | `iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143` | configures `iptables` to redirect all incoming TCP packets to port `4143`, the proxy's inbound port | +| 4 | install-proxy-init-prerouting | `iptables -t nat -A PREROUTING -j PROXY_INIT_REDIRECT` | the last inbound rule configures the `PREROUTING` chain (first chain a packet traverses inbound) to send packets to the redirect chain for processing | +{{< /table >}} + +### Outbound + +{{< table >}} +| # | name | iptables rule | description | +|---|------|---------------|-------------| +| 1 | redirect-common-chain | `iptables -t nat -N PROXY_INIT_OUTPUT`| creates a new `iptables` chain to add outbound redirect rules to, also attached to the `nat` table | +| 2 | ignore-proxy-uid | `iptables -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -j RETURN` | when a packet is owned by the proxy (`--uid-owner 2102`), skip processing and return to the previous (`OUTPUT`) chain | +| 3 | ignore-loopback | `iptables -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN` | when a packet is sent over the loopback interface (`lo`), skip processing and return to the previous chain | +| 4 | ignore-port | `iptables -t nat -A PROXY_INIT_OUTPUT -p tcp --match multiport --dports -j RETURN` | configures `iptables` to ignore the 
redirect output chain for packets whose dst ports are included in the `--skip-outbound-ports` config option | +| 5 | redirect-all-outgoing | `iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140`| configures `iptables` to redirect all outgoing TCP packets to port `4140`, the proxy's outbound port | +| 6 | install-proxy-init-output | `iptables -t nat -A OUTPUT -j PROXY_INIT_OUTPUT` | the last outbound rule configures the `OUTPUT` chain (the second-to-last chain a packet traverses outbound) to send packets to the redirect output chain for processing | +{{< /table >}} + diff --git a/linkerd.io/content/2.11/reference/proxy-configuration.md b/linkerd.io/content/2.11/reference/proxy-configuration.md new file mode 100644 index 0000000000..63fecda896 --- /dev/null +++ b/linkerd.io/content/2.11/reference/proxy-configuration.md @@ -0,0 +1,55 @@ ++++ +title = "Proxy Configuration" +description = "Linkerd provides a set of annotations that can be used to override the data plane proxy's configuration." ++++ + +Linkerd provides a set of annotations that can be used to **override** the data +plane proxy's configuration. This is useful for **overriding** the default +configurations of [auto-injected proxies](../../features/proxy-injection/). + +The following is the list of supported annotations: + +{{< cli-2-10/annotations "inject" >}} + +For example, to update an auto-injected proxy's CPU and memory resources, we +insert the appropriate annotations into the `spec.template.metadata.annotations` +of the owner's pod spec, using `kubectl edit` like this: + +```yaml +spec: + template: + metadata: + annotations: + config.linkerd.io/proxy-cpu-limit: "1" + config.linkerd.io/proxy-cpu-request: "0.2" + config.linkerd.io/proxy-memory-limit: 2Gi + config.linkerd.io/proxy-memory-request: 128Mi +``` + +See [here](../../tasks/configuring-proxy-concurrency/) for details on tuning the +proxy's resource usage. + +For proxies injected using the `linkerd inject` command, configuration can be +overridden using the [command-line flags](../cli/inject/). + +## Ingress Mode + +Proxy ingress mode is a mode of operation designed to help Linkerd integrate +with certain ingress controllers. Ingress mode is necessary if the ingress +itself cannot be otherwise configured to use the Service port/IP as the +destination. + +When an individual Linkerd proxy is set to `ingress` mode, it will route +requests based on their `:authority`, `Host`, or `l5d-dst-override` headers +instead of their original destination. This will inform Linkerd to override the +endpoint selection of the ingress container and to perform its own endpoint +selection, enabling features such as per-route metrics and traffic splitting. + +The proxy can be made to run in `ingress` mode by using the `linkerd.io/inject: +ingress` annotation rather than the default `linkerd.io/inject: enabled` +annotation. This can also be done with the `--ingress` flag in the `inject` CLI +command: + +```bash +kubectl get deployment <deployment> -n <namespace> -o yaml | linkerd inject --ingress - | kubectl apply -f - +``` diff --git a/linkerd.io/content/2.11/reference/proxy-log-level.md b/linkerd.io/content/2.11/reference/proxy-log-level.md new file mode 100644 index 0000000000..facb9eb161 --- /dev/null +++ b/linkerd.io/content/2.11/reference/proxy-log-level.md @@ -0,0 +1,39 @@ ++++ +title = "Proxy Log Level" +description = "Syntax of the proxy log level."
++++ + +The Linkerd proxy's log level can be configured via the: + +* `LINKERD_PROXY_LOG` environment variable +* `--proxy-log-level` CLI flag of the `install`, `inject` and `upgrade` commands +* `config.linkerd.io/proxy-log-level` annotation + (see [Proxy Configuration](../proxy-configuration/)) + which sets `LINKERD_PROXY_LOG` environment-variable on the injected sidecar +* an [endpoint on the admin port](../../tasks/modifying-proxy-log-level/) + of a running proxy. + +The log level is a comma-separated list of log directives, which is +based on the logging syntax of the [`env_logger` crate](https://docs.rs/env_logger/0.6.1/env_logger/#enabling-logging). + +A log directive consists of either: + +* A level (e.g. `info`), which sets the global log level, or +* A module path (e.g. `foo` or `foo::bar::baz`), or +* A module path followed by an equals sign and a level (e.g. `foo=warn` +or `foo::bar::baz=debug`), which sets the log level for that module + +A level is one of: + +* `trace` +* `debug` +* `info` +* `warn` +* `error` + +A module path represents the path to a Rust module. It consists of one or more +module names, separated by `::`. + +A module name starts with a letter, and consists of alphanumeric characters and `_`. + +The proxy's default log level is set to `warn,linkerd2_proxy=info`. diff --git a/linkerd.io/content/2.11/reference/proxy-metrics.md b/linkerd.io/content/2.11/reference/proxy-metrics.md new file mode 100644 index 0000000000..3dee4fb451 --- /dev/null +++ b/linkerd.io/content/2.11/reference/proxy-metrics.md @@ -0,0 +1,206 @@ ++++ +title = "Proxy Metrics" +description = "The Linkerd proxy natively exports Prometheus metrics for all incoming and outgoing traffic." +aliases = [ + "/proxy-metrics/", + "../proxy-metrics/", + "../observability/proxy-metrics/" +] ++++ + +The Linkerd proxy exposes metrics that describe the traffic flowing through the +proxy. The following metrics are available at `/metrics` on the proxy's metrics +port (default: `:4191`) in the [Prometheus format][prom-format]. + +## Protocol-Level Metrics + +* `request_total`: A counter of the number of requests the proxy has received. + This is incremented when the request stream begins. + +* `response_total`: A counter of the number of responses the proxy has received. + This is incremented when the response stream ends. + +* `response_latency_ms`: A histogram of response latencies. This measurement + reflects the [time-to-first-byte][ttfb] (TTFB) by recording the elapsed time + between the proxy processing a request's headers and the first data frame of the + response. If a response does not include any data, the end-of-stream event is + used. The TTFB measurement is used so that Linkerd accurately reflects + application behavior when a server provides response headers immediately but is + slow to begin serving the response body. + +* `route_request_total`, `route_response_latency_ms`, and `route_response_total`: + These metrics are analogous to `request_total`, `response_latency_ms`, and + `response_total` except that they are collected at the route level. This + means that they do not have `authority`, `tls`, `grpc_status_code` or any + outbound labels but instead they have: + * `dst`: The authority of this request. + * `rt_route`: The name of the route for this request. 
+ +* `control_request_total`, `control_response_latency_ms`, and `control_response_total`: + These metrics are analogous to `request_total`, `response_latency_ms`, and + `response_total` but for requests that the proxy makes to the Linkerd control + plane. Instead of `authority`, `direction`, or any outbound labels, instead + they have: + * `addr`: The address used to connect to the control plane. + +Note that latency measurements are not exported to Prometheus until the stream +_completes_. This is necessary so that latencies can be labeled with the appropriate +[response classification](#response-labels). + +### Labels + +Each of these metrics has the following labels: + +* `authority`: The value of the `:authority` (HTTP/2) or `Host` (HTTP/1.1) + header of the request. +* `direction`: `inbound` if the request originated from outside of the pod, + `outbound` if the request originated from inside of the pod. +* `tls`: `true` if the request's connection was secured with TLS. + +#### Response Labels + +The following labels are only applicable on `response_*` metrics. + +* `classification`: `success` if the response was successful, or `failure` if + a server error occurred. This classification is based on + the gRPC status code if one is present, and on the HTTP + status code otherwise. Only applicable to response metrics. +* `grpc_status_code`: The value of the `grpc-status` trailer. Only applicable + for gRPC responses. +* `status_code`: The HTTP status code of the response. + +#### Outbound labels + +The following labels are only applicable if `direction=outbound`. + +* `dst_deployment`: The deployment to which this request is being sent. +* `dst_k8s_job`: The job to which this request is being sent. +* `dst_replicaset`: The replica set to which this request is being sent. +* `dst_daemonset`: The daemon set to which this request is being sent. +* `dst_statefulset`: The stateful set to which this request is being sent. +* `dst_replicationcontroller`: The replication controller to which this request + is being sent. +* `dst_namespace`: The namespace to which this request is being sent. +* `dst_service`: The service to which this request is being sent. +* `dst_pod_template_hash`: The [pod-template-hash][pod-template-hash] of the pod + to which this request is being sent. This label + selector roughly approximates a pod's `ReplicaSet` or + `ReplicationController`. + +#### Prometheus Collector labels + +The following labels are added by the Prometheus collector. + +* `instance`: ip:port of the pod. +* `job`: The Prometheus job responsible for the collection, typically + `linkerd-proxy`. + +##### Kubernetes labels added at collection time + +Kubernetes namespace, pod name, and all labels are mapped to corresponding +Prometheus labels. + +* `namespace`: Kubernetes namespace that the pod belongs to. +* `pod`: Kubernetes pod name. +* `pod_template_hash`: Corresponds to the [pod-template-hash][pod-template-hash] + Kubernetes label. This value changes during redeploys and + rolling restarts. This label selector roughly + approximates a pod's `ReplicaSet` or + `ReplicationController`. + +##### Linkerd labels added at collection time + +Kubernetes labels prefixed with `linkerd.io/` are added to your application at +`linkerd inject` time. More specifically, Kubernetes labels prefixed with +`linkerd.io/proxy-*` will correspond to these Prometheus labels: + +* `daemonset`: The daemon set that the pod belongs to (if applicable). +* `deployment`: The deployment that the pod belongs to (if applicable). 
+* `k8s_job`: The job that the pod belongs to (if applicable). +* `replicaset`: The replica set that the pod belongs to (if applicable). +* `replicationcontroller`: The replication controller that the pod belongs to + (if applicable). +* `statefulset`: The stateful set that the pod belongs to (if applicable). + +### Example + +Here's a concrete example, given the following pod snippet: + +```yaml +name: vote-bot-5b7f5657f6-xbjjw +namespace: emojivoto +labels: + app: vote-bot + linkerd.io/control-plane-ns: linkerd + linkerd.io/proxy-deployment: vote-bot + pod-template-hash: "3957278789" + test: vote-bot-test +``` + +The resulting Prometheus labels will look like this: + +```bash +request_total{ + pod="vote-bot-5b7f5657f6-xbjjw", + namespace="emojivoto", + app="vote-bot", + control_plane_ns="linkerd", + deployment="vote-bot", + pod_template_hash="3957278789", + test="vote-bot-test", + instance="10.1.3.93:4191", + job="linkerd-proxy" +} +``` + +## Transport-Level Metrics + +The following metrics are collected at the level of the underlying transport +layer. + +* `tcp_open_total`: A counter of the total number of opened transport + connections. +* `tcp_close_total`: A counter of the total number of transport connections + which have closed. +* `tcp_open_connections`: A gauge of the number of transport connections + currently open. +* `tcp_write_bytes_total`: A counter of the total number of sent bytes. This is + updated when the connection closes. +* `tcp_read_bytes_total`: A counter of the total number of received bytes. This + is updated when the connection closes. +* `tcp_connection_duration_ms`: A histogram of the duration of the lifetime of a + connection, in milliseconds. This is updated when the connection closes. + +### Labels + +Each of these metrics has the following labels: + +* `direction`: `inbound` if the connection was established either from outside the + pod to the proxy, or from the proxy to the application, + `outbound` if the connection was established either from the + application to the proxy, or from the proxy to outside the pod. +* `peer`: `src` if the connection was accepted by the proxy from the source, + `dst` if the connection was opened by the proxy to the destination. + +Note that the labels described above under the heading "Prometheus Collector labels" +are also added to transport-level metrics, when applicable. + +#### Connection Close Labels + +The following labels are added only to metrics which are updated when a +connection closes (`tcp_close_total` and `tcp_connection_duration_ms`): + +* `classification`: `success` if the connection terminated cleanly, `failure` if + the connection closed due to a connection failure. + +[prom-format]: https://prometheus.io/docs/instrumenting/exposition_formats/#format-version-0.0.4 +[pod-template-hash]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#pod-template-hash-label +[ttfb]: https://en.wikipedia.org/wiki/Time_to_first_byte + +## Identity Metrics + +* `identity_cert_expiration_timestamp_seconds`: A gauge of the time when the + proxy's current mTLS identity certificate will expire (in seconds since the UNIX + epoch). +* `identity_cert_refresh_count`: A counter of the total number of times the + proxy's mTLS identity certificate has been refreshed by the Identity service. 
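+ +As a quick way to see these metrics for a running workload (a sketch, assuming a meshed `web` deployment in the `emojivoto` namespace), you can port-forward to the proxy's metrics port and scrape it directly: + +```bash +# forward the proxy's metrics port of a meshed pod to localhost +kubectl -n emojivoto port-forward deploy/web 4191:4191 & + +# fetch the Prometheus-formatted metrics and look at the request counters +curl -s http://localhost:4191/metrics | grep -E '^(request_total|response_total)' +```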
diff --git a/linkerd.io/content/2.11/reference/service-profiles.md b/linkerd.io/content/2.11/reference/service-profiles.md new file mode 100644 index 0000000000..4cf32098df --- /dev/null +++ b/linkerd.io/content/2.11/reference/service-profiles.md @@ -0,0 +1,135 @@ ++++ +title = "Service Profiles" +description = "Details on the specification and what is possible with service profiles." ++++ + +[Service profiles](../../features/service-profiles/) provide Linkerd additional +information about a service. This is a reference for everything that can be done +with service profiles. + +## Spec + +A service profile spec must contain the following top level fields: + +{{< table >}} +| field| value | +|------|-------| +| `routes`| a list of [route](#route) objects | +| `retryBudget`| a [retry budget](#retry-budget) object that defines the maximum retry rate to this service | +{{< /table >}} + +## Route + +A route object must contain the following fields: + +{{< table >}} +| field | value | +|-------|-------| +| `name` | the name of this route as it will appear in the route label | +| `condition` | a [request match](#request-match) object that defines if a request matches this route | +| `responseClasses` | (optional) a list of [response class](#response-class) objects | +| `isRetryable` | indicates that requests to this route are always safe to retry and will cause the proxy to retry failed requests on this route whenever possible | +| `timeout` | the maximum amount of time to wait for a response (including retries) to complete after the request is sent | +{{< /table >}} + +## Request Match + +A request match object must contain _exactly one_ of the following fields: + +{{< table >}} +| field | value | +|-------|-------| +| `pathRegex` | a regular expression to match the request path against | +| `method` | one of GET, POST, PUT, DELETE, OPTION, HEAD, TRACE | +| `all` | a list of [request match](#request-match) objects which must _all_ match | +| `any` | a list of [request match](#request-match) objects, at least one of which must match | +| `not` | a [request match](#request-match) object which must _not_ match | +{{< /table >}} + +### Request Match Usage Examples + +The simplest condition is a path regular expression: + +```yaml +pathRegex: '/authors/\d+' +``` + +This is a condition that checks the request method: + +```yaml +method: POST +``` + +If more than one condition field is set, all of them must be satisfied. 
This is +equivalent to using the 'all' condition: + +```yaml +all: +- pathRegex: '/authors/\d+' +- method: POST +``` + +Conditions can be combined using 'all', 'any', and 'not': + +```yaml +any: +- all: + - method: POST + - pathRegex: '/authors/\d+' +- all: + - not: + method: DELETE + - pathRegex: /info.txt +``` + +## Response Class + +A response class object must contain the following fields: + +{{< table >}} +| field | value | +|-------|-------| +| `condition` | a [response match](#response-match) object that defines if a response matches this response class | +| `isFailure` | a boolean that defines if these responses should be classified as failed | +{{< /table >}} + +## Response Match + +A response match object must contain _exactly one_ of the following fields: + +{{< table >}} +| field | value | +|-------|-------| +| `status` | a [status range](#status-range) object to match the response status code against | +| `all` | a list of [response match](#response-match) objects which must _all_ match | +| `any` | a list of [response match](#response-match) objects, at least one of which must match | +| `not` | a [response match](#response-match) object which must _not_ match | +{{< /table >}} + +Response Match conditions can be combined in a similar way as shown above for +[Request Match Usage Examples](#request-match-usage-examples) + +## Status Range + +A status range object must contain _at least one_ of the following fields. +Specifying only one of min or max matches just that one status code. + +{{< table >}} +| field | value | +|-------|-------| +| `min` | the status code must be greater than or equal to this value | +| `max` | the status code must be less than or equal to this value | +{{< /table >}} + +## Retry Budget + +A retry budget specifies the maximum total number of retries that should be sent +to this service as a ratio of the original request volume. + +{{< table >}} +| field | value | +|-------|-------| +| `retryRatio` | the maximum ratio of retries requests to original requests | +| `minRetriesPerSecond` | allowance of retries per second in addition to those allowed by the retryRatio | +| `ttl` | indicates for how long requests should be considered for the purposes of calculating the retryRatio | +{{< /table >}} diff --git a/linkerd.io/content/2.11/tasks/_index.md b/linkerd.io/content/2.11/tasks/_index.md new file mode 100644 index 0000000000..8b03a48e00 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/_index.md @@ -0,0 +1,15 @@ ++++ +title = "Tasks" +weight = 4 +aliases = [ + "./next-steps/" +] ++++ + +As a complement to the [Linkerd feature docs]({{% ref "../features" %}}) and +the [Linkerd reference docs]({{% ref "../reference" %}}), we've provided guides +and examples of common tasks that you may need to perform when using Linkerd. + +## Common Linkerd tasks + +{{% sectiontoc "tasks" %}} diff --git a/linkerd.io/content/2.11/tasks/adding-your-service.md b/linkerd.io/content/2.11/tasks/adding-your-service.md new file mode 100644 index 0000000000..0d3e5f51ea --- /dev/null +++ b/linkerd.io/content/2.11/tasks/adding-your-service.md @@ -0,0 +1,109 @@ ++++ +title = "Adding Your Services to Linkerd" +description = "In order for your services to take advantage of Linkerd, they also need to be *meshed* by injecting Linkerd's data plane proxy into their pods." +aliases = [ + "../adding-your-service/", + "../automating-injection/" +] ++++ + +Adding Linkerd's control plane to your cluster doesn't change anything about +your application. 
In order for your services to take advantage of Linkerd, they +need to be *meshed*, by injecting Linkerd's data plane proxy into their pods. + +For most applications, meshing a service is as simple as adding a Kubernetes +annotation. However, services that make network calls immediately on startup +may need to [handle startup race +conditions](#a-note-on-startup-race-conditions), and services that use MySQL, +SMTP, Memcache, and similar protocols may need to [handle server-speaks-first +protocols](#a-note-on-server-speaks-first-protocols). + +Read on for more! + +## Meshing a service with annotations + +Meshing a Kubernetes resource is typically done by annotating the resource, or +its namespace, with the `linkerd.io/inject: enabled` Kubernetes annotation. +This annotation triggers automatic proxy injection when the resources are +created or updated. (See the [proxy injection +page](../../features/proxy-injection/) for more on how this works.) + +For convenience, Linkerd provides a [`linkerd +inject`](../../reference/cli/inject/) text transform command that will add this +annotation to a given Kubernetes manifest. Of course, these annotations can be +set by any other mechanism. + +{{< note >}} +Simply adding the annotation will not automatically mesh existing pods. After +setting the annotation, you will need to recreate or update any resources (e.g. +with `kubectl rollout restart`) to trigger proxy injection. (Often, a +[rolling +update](https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/) +can be performed to inject the proxy into a live service without interruption.) +{{< /note >}} + +## Example + +To add Linkerd's data plane proxies to a service defined in a Kubernetes +manifest, you can use `linkerd inject` to add the annotations before applying +the manifest to Kubernetes: + +```bash +cat deployment.yml | linkerd inject - | kubectl apply -f - +``` + +This example transforms the `deployment.yml` file to add injection annotations +in the correct places, then applies it to the cluster. + +## Verifying the data plane pods have been injected + +To verify that your services have been added to the mesh, you can query +Kubernetes for the list of containers in the pods and ensure that the proxy is +listed: + +```bash +kubectl -n MYNAMESPACE get po -o jsonpath='{.items[0].spec.containers[*].name}' +``` + +If everything was successful, you'll see `linkerd-proxy` in the output, e.g.: + +```bash +MYCONTAINER linkerd-proxy +``` + +## A note on startup race conditions + +While the proxy starts very quickly, Kubernetes doesn't provide any guarantees +about container startup ordering, so the application container may start before +the proxy is ready. This means that any connections made immediately at app +startup time may fail until the proxy is active. + +In many cases, this can be ignored: the application will ideally retry the +connection, or Kubernetes will restart the container after it fails, and +eventually the proxy will be ready. Alternatively, you can use +[linkerd-await](https://github.com/linkerd/linkerd-await) to delay the +application container until the proxy is ready, or set a +[`skip-outbound-ports` +annotation](../../features/protocol-detection/#skipping-the-proxy) +to bypass the proxy for these connections. + +## A note on server-speaks-first protocols + +Linkerd's [protocol +detection](../../features/protocol-detection/) works by +looking at the first few bytes of client data to determine the protocol of the +connection. Some protocols such as MySQL, SMTP, and other server-speaks-first +protocols don't send these bytes. In some cases, this may require additional +configuration to avoid a 10-second delay in establishing the first connection. +See [Configuring protocol +detection](../../features/protocol-detection/#configuring-protocol-detection) +for details. + +## More reading + +For more information on how the inject command works and all of the parameters +that can be set, see the [`linkerd inject` reference +page](../../reference/cli/inject/). + +For details on how autoinjection works, see the [proxy injection +page](../../features/proxy-injection/). diff --git a/linkerd.io/content/2.11/tasks/automatically-rotating-control-plane-tls-credentials.md b/linkerd.io/content/2.11/tasks/automatically-rotating-control-plane-tls-credentials.md new file mode 100644 index 0000000000..68c8a15bfd --- /dev/null +++ b/linkerd.io/content/2.11/tasks/automatically-rotating-control-plane-tls-credentials.md @@ -0,0 +1,234 @@ ++++ +title = "Automatically Rotating Control Plane TLS Credentials" +description = "Use cert-manager to automatically rotate control plane TLS credentials." +aliases = [ "use_external_certs" ] ++++ + +Linkerd's [automatic mTLS](../../features/automatic-mtls/) feature uses a set of +TLS credentials to generate TLS certificates for proxies: a trust anchor, and +an issuer certificate and private key. While Linkerd automatically rotates the +TLS certificates for data plane proxies every 24 hours, it does not rotate the +TLS credentials used to issue these certificates. In this doc, we'll describe +how to automatically rotate the issuer certificate and private key, by using +an external solution. + +(Note that Linkerd's trust anchor [must still be manually +rotated](../manually-rotating-control-plane-tls-credentials/) on +long-lived clusters.) + +## Cert manager + +[Cert-manager](https://github.com/jetstack/cert-manager) is a popular project +for making TLS credentials from external sources available to Kubernetes +clusters. + +As a first step, [install cert-manager on your +cluster](https://docs.cert-manager.io/en/latest/getting-started/install/kubernetes.html). + +{{< note >}} +If you are installing cert-manager `>= 1.0`, +you will need to have Kubernetes `>= 1.16`. +Legacy custom resource definitions in cert-manager for Kubernetes `<= 1.15` +do not have a keyAlgorithm option, +so the certificates will be generated using RSA and be incompatible with Linkerd. + +See [v0.16 to v1.0 upgrade notes](https://cert-manager.io/docs/installation/upgrading/upgrading-0.16-1.0/) +for more details on version requirements. +{{< /note >}} + +### Cert manager as an on-cluster CA + +In this case, rather than pulling credentials from an external +source, we'll configure it to act as an on-cluster +[CA](https://en.wikipedia.org/wiki/Certificate_authority) and have it re-issue +Linkerd's issuer certificate and private key on a periodic basis. + +First, create the namespace that cert-manager will use to store its +Linkerd-related resources.
For simplicity, we suggest the default Linkerd +control plane namespace: + +```bash +kubectl create namespace linkerd +``` + +#### Save the signing key pair as a Secret + +Next, using the [`step`](https://smallstep.com/cli/) tool, create a signing key +pair and store it in a Kubernetes Secret in the namespace created above: + +```bash +step certificate create root.linkerd.cluster.local ca.crt ca.key \ + --profile root-ca --no-password --insecure && + kubectl create secret tls \ + linkerd-trust-anchor \ + --cert=ca.crt \ + --key=ca.key \ + --namespace=linkerd +``` + +For a longer-lived trust anchor certificate, pass the `--not-after` argument +to the step command with the desired value (e.g. `--not-after=87600h`). + +#### Create an Issuer referencing the secret + +With the Secret in place, we can create a cert-manager "Issuer" resource that +references it: + +```bash +cat <}} +Due to a [bug](https://github.com/jetstack/cert-manager/issues/2942) in +cert-manager, if you are using cert-manager version `0.15` with experimental +controllers, the certificate it issues are not compatible with with Linkerd +versions `<= stable-2.8.1`. + +Your `linkerd-identity` pods will likely crash with the following log output: + +```log +"Failed to initialize identity service: failed to read CA from disk: +unsupported block type: 'PRIVATE KEY'" +``` + +Some possible ways to resolve this issue are: + +- Upgrade Linkerd to the edge versions `>= edge-20.6.4` which contains +a [fix](https://github.com/linkerd/linkerd2/pull/4597/). +- Upgrade cert-manager to versions `>= 0.16`. + [(how to upgrade)](https://cert-manager.io/docs/installation/upgrading/upgrading-0.15-0.16/) +- Turn off cert-manager experimental controllers. + [(docs)](https://cert-manager.io/docs/release-notes/release-notes-0.15/#using-the-experimental-controllers) + +{{< /note >}} + +### Alternative CA providers + +Instead of using Cert Manager as CA, you can configure it to rely on a number +of other solutions such as [Vault](https://www.vaultproject.io). More detail on +how to setup the existing Cert Manager to use different type of issuers +can be found [here](https://cert-manager.io/docs/configuration/vault/). + +## Third party cert management solutions + +It is important to note that the mechanism that Linkerd provides is also +usable outside of cert-manager. Linkerd will read the `linkerd-identity-issuer` +Secret, and if it's of type `kubernetes.io/tls`, will use the contents as its +TLS credentials. This means that any solution that is able to rotate TLS +certificates by writing them to this secret can be used to provide dynamic +TLS certificate management. + +You could generate that secret with a command such as: + +```bash +kubectl create secret tls linkerd-identity-issuer --cert=issuer.crt --key=issuer.key --namespace=linkerd +``` + +Where `issuer.crt` and `issuer.key` would be the cert and private key of an +intermediary cert rooted at the trust root (`ca.crt`) referred above (check this +[guide](../generate-certificates/) to see how to generate them). + +Note that the root cert (`ca.crt`) needs to be included in that Secret as well. +You can just edit the generated Secret and include the `ca.crt` field with the +contents of the file base64-encoded. + +After setting up the `linkerd-identity-issuer` Secret, continue with the +following instructions to install and configure Linkerd to use it. 
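+ +Rather than editing the generated Secret to add `ca.crt` afterwards, the whole credential set can also be created in one step (a sketch, assuming `ca.crt`, `issuer.crt`, and `issuer.key` are in the current directory): + +```bash +# create a kubernetes.io/tls Secret that also carries the trust anchor +kubectl create secret generic linkerd-identity-issuer \ + --namespace=linkerd \ + --type=kubernetes.io/tls \ + --from-file=tls.crt=issuer.crt \ + --from-file=tls.key=issuer.key \ + --from-file=ca.crt=ca.crt +```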
+
+## Using these credentials with CLI installation
+
+For CLI installation, the Linkerd control plane should be installed with the
+`--identity-external-issuer` flag, which instructs Linkerd to read certificates
+from the `linkerd-identity-issuer` secret. Whenever the certificate and key stored
+in the secret are updated, the `identity` service will automatically detect
+this change and reload the new credentials.
+
+Voila! We have set up automatic rotation of Linkerd's control plane TLS
+credentials. If you want to monitor the update process, you can check the
+`IssuerUpdated` events emitted by the service:
+
+```bash
+kubectl get events --field-selector reason=IssuerUpdated -n linkerd
+```
+
+## Installing with Helm
+
+For Helm installation, rather than running `linkerd install`, set the
+`identityTrustAnchorsPEM` to the value of `ca.crt` in the
+`linkerd-identity-issuer` Secret:
+
+```bash
+helm install linkerd2 \
+  --set-file identityTrustAnchorsPEM=ca.crt \
+  --set identity.issuer.scheme=kubernetes.io/tls \
+  --set installNamespace=false \
+  linkerd/linkerd2 \
+  -n linkerd
+```
+
+{{< note >}}
+For Helm versions before v3, the `--name` flag must be passed explicitly.
+In Helm v3 it has been deprecated and is instead the first positional argument,
+as specified above.
+{{< /note >}}
+
+See [Automatically Rotating Webhook TLS
+Credentials](../automatically-rotating-webhook-tls-credentials/) for how
+to do something similar for webhook TLS credentials.
diff --git a/linkerd.io/content/2.11/tasks/automatically-rotating-webhook-tls-credentials.md b/linkerd.io/content/2.11/tasks/automatically-rotating-webhook-tls-credentials.md
new file mode 100644
index 0000000000..2a1e4b9b62
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/automatically-rotating-webhook-tls-credentials.md
@@ -0,0 +1,323 @@
++++
+title = "Automatically Rotating Webhook TLS Credentials"
+description = "Use cert-manager to automatically rotate webhook TLS credentials."
++++
+
+The Linkerd control plane contains several components, called webhooks, which
+are called directly by Kubernetes itself. The traffic from Kubernetes to the
+Linkerd webhooks is secured with TLS and therefore each of the webhooks requires
+a secret containing TLS credentials. These certificates are different from the
+ones that the Linkerd proxies use to secure pod-to-pod communication and use a
+completely separate trust chain. For more information on rotating the TLS
+credentials used by the Linkerd proxies, see
+[Automatically Rotating Control Plane TLS Credentials](../use_external_certs/).
+
+By default, when Linkerd is installed
+with the Linkerd CLI or with the Linkerd Helm chart, TLS credentials are
+automatically generated for all of the webhooks. If these certificates expire
+or need to be regenerated for any reason, performing a
+[Linkerd upgrade](../upgrade/) (using the Linkerd CLI or using Helm) will
+regenerate them.
+
+This workflow is suitable for most users. However, if you need these webhook
+certificates to be rotated automatically on a regular basis, it is possible to
+use cert-manager to automatically manage them.
+
+## Install Cert manager
+
+As a first step, [install cert-manager on your
+cluster](https://docs.cert-manager.io/en/latest/getting-started/install/kubernetes.html)
+and create the namespaces that cert-manager will use to store its
+webhook-related resources.
For simplicity, we suggest using the default
+namespaces that Linkerd and its extensions use:
+
+```bash
+# control plane core
+kubectl create namespace linkerd
+
+# viz (ignore if not using the viz extension)
+kubectl create namespace linkerd-viz
+
+# jaeger (ignore if not using the jaeger extension)
+kubectl create namespace linkerd-jaeger
+```
+
+## Save the signing key pair as a Secret
+
+Next, we will use the [`step`](https://smallstep.com/cli/) tool to create a
+signing key pair which will be used to sign each of the webhook certificates:
+
+```bash
+step certificate create webhook.linkerd.cluster.local ca.crt ca.key \
+--profile root-ca --no-password --insecure --san webhook.linkerd.cluster.local
+
+kubectl create secret tls webhook-issuer-tls --cert=ca.crt --key=ca.key --namespace=linkerd
+
+# ignore if not using the viz extension
+kubectl create secret tls webhook-issuer-tls --cert=ca.crt --key=ca.key --namespace=linkerd-viz
+
+# ignore if not using the jaeger extension
+kubectl create secret tls webhook-issuer-tls --cert=ca.crt --key=ca.key --namespace=linkerd-jaeger
+```
+
+## Create Issuers referencing the secrets
+
+With the Secrets in place, we can create cert-manager "Issuer" resources that
+reference them:
+
+```bash
+cat < config.yml < config-viz.yml < config-jaeger.yml <}}
+When installing Linkerd with Helm, you must also provide the issuer trust root
+and issuer credentials as described in [Installing Linkerd with Helm](../install-helm/).
+{{< /note >}}
+
+{{< note >}}
+For Helm versions before v3, the `--name` flag must be passed explicitly.
+In Helm v3 it has been deprecated and is instead the first positional argument,
+as specified above.
+{{< /note >}}
+
+See [Automatically Rotating Control Plane TLS
+Credentials](../automatically-rotating-control-plane-tls-credentials/)
+for details on how to do something similar for control plane credentials.
diff --git a/linkerd.io/content/2.11/tasks/books.md b/linkerd.io/content/2.11/tasks/books.md
new file mode 100644
index 0000000000..ad407545f4
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/books.md
@@ -0,0 +1,472 @@
++++
+title = "Debugging HTTP applications with per-route metrics"
+description = "Follow a long-form example of debugging a failing HTTP application using per-route metrics."
++++
+
+This demo is of a Ruby application that helps you manage your bookshelf. It
+consists of multiple microservices and uses JSON over HTTP to communicate with
+the other services. There are three services:
+
+- [webapp](https://github.com/BuoyantIO/booksapp/blob/master/webapp.rb): the
+  frontend
+
+- [authors](https://github.com/BuoyantIO/booksapp/blob/master/authors.rb): an
+  API to manage the authors in the system
+
+- [books](https://github.com/BuoyantIO/booksapp/blob/master/books.rb): an API
+  to manage the books in the system
+
+For demo purposes, the app comes with a simple traffic generator. The overall
+topology looks like this:
+
+{{< fig src="/images/books/topology.png" title="Topology" >}}
+
+## Prerequisites
+
+To use this guide, you'll need to have Linkerd and its Viz extension installed
+on your cluster. Follow the [Installing Linkerd Guide](../install/) if
+you haven't already done this.
+
+## Install the app
+
+To get started, let's install the books app onto your cluster.
In your local +terminal, run: + +```bash +kubectl create ns booksapp && \ + curl -sL https://run.linkerd.io/booksapp.yml \ + | kubectl -n booksapp apply -f - +``` + +This command creates a namespace for the demo, downloads its Kubernetes +resource manifest and uses `kubectl` to apply it to your cluster. The app +comprises the Kubernetes deployments and services that run in the `booksapp` +namespace. + +Downloading a bunch of containers for the first time takes a little while. +Kubernetes can tell you when all the services are running and ready for +traffic. Wait for that to happen by running: + +```bash +kubectl -n booksapp rollout status deploy webapp +``` + +You can also take a quick look at all the components that were added to your +cluster by running: + +```bash +kubectl -n booksapp get all +``` + +Once the rollout has completed successfully, you can access the app itself by +port-forwarding `webapp` locally: + +```bash +kubectl -n booksapp port-forward svc/webapp 7000 & +``` + +Open [http://localhost:7000/](http://localhost:7000/) in your browser to see the +frontend. + +{{< fig src="/images/books/frontend.png" title="Frontend" >}} + +Unfortunately, there is an error in the app: if you click *Add Book*, it will +fail 50% of the time. This is a classic case of non-obvious, intermittent +failure---the type that drives service owners mad because it is so difficult to +debug. Kubernetes itself cannot detect or surface this error. From Kubernetes's +perspective, it looks like everything's fine, but you know the application is +returning errors. + +{{< fig src="/images/books/failure.png" title="Failure" >}} + +## Add Linkerd to the service + +Now we need to add the Linkerd data plane proxies to the service. The easiest +option is to do something like this: + +```bash +kubectl get -n booksapp deploy -o yaml \ + | linkerd inject - \ + | kubectl apply -f - +``` + +This command retrieves the manifest of all deployments in the `booksapp` +namespace, runs them through `linkerd inject`, and then re-applies with +`kubectl apply`. The `linkerd inject` command annotates each resource to +specify that they should have the Linkerd data plane proxies added, and +Kubernetes does this when the manifest is reapplied to the cluster. Best of +all, since Kubernetes does a rolling deploy, the application stays running the +entire time. (See [Automatic Proxy Injection](../../features/proxy-injection/) for +more details on how this works.) + +## Debugging + +Let's use Linkerd to discover the root cause of this app's failures. To check +out the Linkerd dashboard, run: + +```bash +linkerd viz dashboard & +``` + +{{< fig src="/images/books/dashboard.png" title="Dashboard" >}} + +Select `booksapp` from the namespace dropdown and click on the +[Deployments](http://localhost:50750/namespaces/booksapp/deployments) workload. +You should see all the deployments in the `booksapp` namespace show up. There +will be success rate, requests per second, and latency percentiles. + +That’s cool, but you’ll notice that the success rate for `webapp` is not 100%. +This is because the traffic generator is submitting new books. You can do the +same thing yourself and push that success rate even lower. Click on `webapp` in +the Linkerd dashboard for a live debugging session. + +You should now be looking at the detail view for the `webapp` service. You’ll +see that `webapp` is taking traffic from `traffic` (the load generator), and it +has two outgoing dependencies: `authors` and `book`. 
One is the service for
+pulling in author information and the other is the service for pulling in book
+information.
+
+{{< fig src="/images/books/webapp-detail.png" title="Detail" >}}
+
+A failure in a dependent service may be exactly what’s causing the errors that
+`webapp` is returning (and the errors you as a user can see when you click). We
+can see that the `books` service is also failing. Let’s scroll a little further
+down the page, where we’ll see a live list of all traffic endpoints that `webapp` is
+receiving. This is interesting:
+
+{{< fig src="/images/books/top.png" title="Top" >}}
+
+Aha! We can see that traffic going from the `webapp` service to
+the `books` service is failing a significant percentage of the time. That could
+explain why `webapp` was throwing intermittent failures. Let’s click on the tap
+(🔬) icon and then on the Start button to look at the actual request and
+response stream.
+
+{{< fig src="/images/books/tap.png" title="Tap" >}}
+
+Indeed, many of these requests are returning 500’s.
+
+It was surprisingly easy to diagnose an intermittent issue that affected only a
+single route. You now have everything you need to open a detailed bug report
+explaining exactly what the root cause is. If the `books` service were your own,
+you'd know exactly where to look in the code.
+
+## Service Profiles
+
+To understand the root cause, we used live traffic. For some issues this is
+great, but what happens if the issue is intermittent and happens in the middle of
+the night? [Service profiles](../../features/service-profiles/) provide Linkerd
+with some additional information about your services. These define the routes
+that you're serving and, among other things, allow for the collection of metrics
+on a per-route basis. With Prometheus storing these metrics, you'll be able to
+sleep soundly and look up intermittent issues in the morning.
+
+One of the easiest ways to get service profiles set up is by using existing
+[OpenAPI (Swagger)](https://swagger.io/docs/specification/about/) specs. This
+demo has published specs for each of its services. You can create a service
+profile for `webapp` by running:
+
+```bash
+curl -sL https://run.linkerd.io/booksapp/webapp.swagger \
+  | linkerd -n booksapp profile --open-api - webapp \
+  | kubectl -n booksapp apply -f -
+```
+
+This command will do three things:
+
+1. Fetch the swagger specification for `webapp`.
+1. Take the spec and convert it into a service profile by using the `profile`
+   command.
+1. Apply this configuration to the cluster.
+
+Alongside `install` and `inject`, `profile` is also a pure text operation.
Check +out the profile that is generated: + +```yaml +apiVersion: linkerd.io/v1alpha2 +kind: ServiceProfile +metadata: + creationTimestamp: null + name: webapp.booksapp.svc.cluster.local + namespace: booksapp +spec: + routes: + - condition: + method: GET + pathRegex: / + name: GET / + - condition: + method: POST + pathRegex: /authors + name: POST /authors + - condition: + method: GET + pathRegex: /authors/[^/]* + name: GET /authors/{id} + - condition: + method: POST + pathRegex: /authors/[^/]*/delete + name: POST /authors/{id}/delete + - condition: + method: POST + pathRegex: /authors/[^/]*/edit + name: POST /authors/{id}/edit + - condition: + method: POST + pathRegex: /books + name: POST /books + - condition: + method: GET + pathRegex: /books/[^/]* + name: GET /books/{id} + - condition: + method: POST + pathRegex: /books/[^/]*/delete + name: POST /books/{id}/delete + - condition: + method: POST + pathRegex: /books/[^/]*/edit + name: POST /books/{id}/edit +``` + +The `name` refers to the FQDN of your Kubernetes service, +`webapp.booksapp.svc.cluster.local` in this instance. Linkerd uses the `Host` +header of requests to associate service profiles with requests. When the proxy +sees a `Host` header of `webapp.booksapp.svc.cluster.local`, it will use that to +look up the service profile's configuration. + +Routes are simple conditions that contain the method (`GET` for example) and a +regex to match the path. This allows you to group REST style resources together +instead of seeing a huge list. The names for routes can be whatever you'd like. +For this demo, the method is appended to the route regex. + +To get profiles for `authors` and `books`, you can run: + +```bash +curl -sL https://run.linkerd.io/booksapp/authors.swagger \ + | linkerd -n booksapp profile --open-api - authors \ + | kubectl -n booksapp apply -f - +curl -sL https://run.linkerd.io/booksapp/books.swagger \ + | linkerd -n booksapp profile --open-api - books \ + | kubectl -n booksapp apply -f - +``` + +Verifying that this all works is easy when you use `linkerd viz tap`. Each live +request will show up with what `:authority` or `Host` header is being seen as +well as the `:path` and `rt_route` being used. Run: + +```bash +linkerd viz tap -n booksapp deploy/webapp -o wide | grep req +``` + +This will watch all the live requests flowing through `webapp` and look +something like: + +```bash +req id=0:1 proxy=in src=10.1.3.76:57152 dst=10.1.3.74:7000 tls=true :method=POST :authority=webapp.default:7000 :path=/books/2878/edit src_res=deploy/traffic src_ns=booksapp dst_res=deploy/webapp dst_ns=booksapp rt_route=POST /books/{id}/edit +``` + +As you can see: + +- `:authority` is the correct host +- `:path` correctly matches +- `rt_route` contains the name of the route + +These metrics are part of the [`linkerd viz routes`](../../reference/cli/viz/#routes) +command instead of [`linkerd viz stat`](../../reference/cli/viz/#stat). To see the +metrics that have accumulated so far, run: + +```bash +linkerd viz -n booksapp routes svc/webapp +``` + +This will output a table of all the routes observed and their golden metrics. +The `[DEFAULT]` route is a catch all for anything that does not match the +service profile. + +Profiles can be used to observe *outgoing* requests as well as *incoming* +requests. To do that, run: + +```bash +linkerd viz -n booksapp routes deploy/webapp --to svc/books +``` + +This will show all requests and routes that originate in the `webapp` deployment +and are destined to the `books` service. 
As with the `tap` and `top`
+views in the [debugging](#debugging) section, the root cause of errors in this
+demo is immediately apparent:
+
+```bash
+ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
+DELETE /books/{id}.json books 100.00% 0.5rps 18ms 29ms 30ms
+GET /books.json books 100.00% 1.1rps 7ms 12ms 18ms
+GET /books/{id}.json books 100.00% 2.5rps 6ms 10ms 10ms
+POST /books.json books 52.24% 2.2rps 23ms 34ms 39ms
+PUT /books/{id}.json books 41.98% 1.4rps 73ms 97ms 99ms
+[DEFAULT] books - - - - -
+```
+
+## Retries
+
+As it can take a while to update code and roll out a new version, let's
+tell Linkerd that it can retry requests to the failing endpoint. This will
+increase request latencies, as requests will be retried multiple times, but not
+require rolling out a new version.
+
+In this application, the success rate of requests from the `books` deployment to
+the `authors` service is poor. To see these metrics, run:
+
+```bash
+linkerd viz -n booksapp routes deploy/books --to svc/authors
+```
+
+The output should look like:
+
+```bash
+ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
+DELETE /authors/{id}.json authors - - - - -
+GET /authors.json authors - - - - -
+GET /authors/{id}.json authors - - - - -
+HEAD /authors/{id}.json authors 50.85% 3.9rps 5ms 10ms 17ms
+POST /authors.json authors - - - - -
+[DEFAULT] authors - - - - -
+```
+
+One thing that’s clear is that all requests from books to authors are to the
+`HEAD /authors/{id}.json` route and those requests are failing about 50% of the
+time.
+
+To correct this, let’s edit the authors service profile and make those
+requests retryable by running:
+
+```bash
+kubectl -n booksapp edit sp/authors.booksapp.svc.cluster.local
+```
+
+You'll want to add `isRetryable` to a specific route. It should look like:
+
+```yaml
+spec:
+  routes:
+  - condition:
+      method: HEAD
+      pathRegex: /authors/[^/]*\.json
+    name: HEAD /authors/{id}.json
+    isRetryable: true ### ADD THIS LINE ###
+```
+
+After editing the service profile, Linkerd will begin to retry requests to
+this route automatically. We see a nearly immediate improvement in success rate
+by running:
+
+```bash
+linkerd viz -n booksapp routes deploy/books --to svc/authors -o wide
+```
+
+This should look like:
+
+```bash
+ROUTE SERVICE EFFECTIVE_SUCCESS EFFECTIVE_RPS ACTUAL_SUCCESS ACTUAL_RPS LATENCY_P50 LATENCY_P95 LATENCY_P99
+DELETE /authors/{id}.json authors - - - - - 0ms
+GET /authors.json authors - - - - - 0ms
+GET /authors/{id}.json authors - - - - - 0ms
+HEAD /authors/{id}.json authors 100.00% 2.8rps 58.45% 4.7rps 7ms 25ms 37ms
+POST /authors.json authors - - - - - 0ms
+[DEFAULT] authors - - - - - 0ms
+```
+
+You'll notice that the `-o wide` flag has added some columns to the `routes`
+view. These show the difference between `EFFECTIVE_SUCCESS` and
+`ACTUAL_SUCCESS`. The difference between these two shows how well retries are
+working. `EFFECTIVE_RPS` and `ACTUAL_RPS` show how many requests are being sent
+to the destination service and how many are being received by the client's
+Linkerd proxy.
+
+With retries automatically happening now, success rate looks great but the p95
+and p99 latencies have increased. This is to be expected because doing retries
+takes time.
+
+## Timeouts
+
+Linkerd can limit how long to wait before failing outgoing requests to another
+service. These timeouts work by adding another key to a service profile's routes
+configuration, as sketched below.
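+
+In its most general form, a per-route timeout is a single additional `timeout`
+field on a route. The route below is only illustrative; the walkthrough that
+follows sets a real value on the books demo:
+
+```yaml
+spec:
+  routes:
+  - condition:
+      method: GET
+      pathRegex: /books/[^/]*\.json
+    name: GET /books/{id}.json
+    # Maximum time to wait for a response, including retries.
+    timeout: 25ms
+```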
+ +To get started, let's take a look at the current latency for requests from +`webapp` to the `books` service: + +```bash +linkerd viz -n booksapp routes deploy/webapp --to svc/books +``` + +This should look something like: + +```bash +ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +DELETE /books/{id}.json books 100.00% 0.7rps 10ms 27ms 29ms +GET /books.json books 100.00% 1.3rps 9ms 34ms 39ms +GET /books/{id}.json books 100.00% 2.0rps 9ms 52ms 91ms +POST /books.json books 100.00% 1.3rps 45ms 140ms 188ms +PUT /books/{id}.json books 100.00% 0.7rps 80ms 170ms 194ms +[DEFAULT] books - - - - - +``` + +Requests to the `books` service's `PUT /books/{id}.json` route include retries +for when that service calls the `authors` service as part of serving those +requests, as described in the previous section. This improves success rate, at +the cost of additional latency. For the purposes of this demo, let's set a 25ms +timeout for calls to that route. Your latency numbers will vary depending on the +characteristics of your cluster. To edit the `books` service profile, run: + +```bash +kubectl -n booksapp edit sp/books.booksapp.svc.cluster.local +``` + +Update the `PUT /books/{id}.json` route to have a timeout: + +```yaml +spec: + routes: + - condition: + method: PUT + pathRegex: /books/[^/]*\.json + name: PUT /books/{id}.json + timeout: 25ms ### ADD THIS LINE ### +``` + +Linkerd will now return errors to the `webapp` REST client when the timeout is +reached. This timeout includes retried requests and is the maximum amount of +time a REST client would wait for a response. + +Run `routes` to see what has changed: + +```bash +linkerd viz -n booksapp routes deploy/webapp --to svc/books -o wide +``` + +With timeouts happening now, the metrics will change: + +```bash +ROUTE SERVICE EFFECTIVE_SUCCESS EFFECTIVE_RPS ACTUAL_SUCCESS ACTUAL_RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +DELETE /books/{id}.json books 100.00% 0.7rps 100.00% 0.7rps 8ms 46ms 49ms +GET /books.json books 100.00% 1.3rps 100.00% 1.3rps 9ms 33ms 39ms +GET /books/{id}.json books 100.00% 2.2rps 100.00% 2.2rps 8ms 19ms 28ms +POST /books.json books 100.00% 1.3rps 100.00% 1.3rps 27ms 81ms 96ms +PUT /books/{id}.json books 86.96% 0.8rps 100.00% 0.7rps 75ms 98ms 100ms +[DEFAULT] books - - - - - +``` + +The latency numbers include time spent in the `webapp` application itself, so +it's expected that they exceed the 25ms timeout that we set for requests from +`webapp` to `books`. We can see that the timeouts are working by observing that +the effective success rate for our route has dropped below 100%. + +## Clean Up + +To remove the books app and the booksapp namespace from your cluster, run: + +```bash +curl -sL https://run.linkerd.io/booksapp.yml \ + | kubectl -n booksapp delete -f - \ + && kubectl delete ns booksapp +``` diff --git a/linkerd.io/content/2.11/tasks/canary-release.md b/linkerd.io/content/2.11/tasks/canary-release.md new file mode 100644 index 0000000000..92bcf045f9 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/canary-release.md @@ -0,0 +1,298 @@ ++++ +title = "Automated Canary Releases" +description = "Reduce deployment risk by combining Linkerd and Flagger to automate canary releases based on service metrics." ++++ + +Linkerd's [traffic split](../../features/traffic-split/) feature allows you to +dynamically shift traffic between services. This can be used to implement +lower-risk deployment strategies like blue-green deploys and canaries. 
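+
+Under the hood, a split is expressed as an SMI `TrafficSplit` resource. The
+sketch below uses hypothetical `web-primary` and `web-canary` Services to show
+the shape of such a resource; Flagger, introduced below, creates and manages an
+equivalent resource for you:
+
+```yaml
+apiVersion: split.smi-spec.io/v1alpha1
+kind: TrafficSplit
+metadata:
+  name: web
+spec:
+  # The apex Service that clients address.
+  service: web
+  backends:
+  - service: web-primary
+    weight: 900m   # ~90% of traffic
+  - service: web-canary
+    weight: 100m   # ~10% of traffic
+```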
+ +But simply shifting traffic from one version of a service to the next is just +the beginning. We can combine traffic splitting with [Linkerd's automatic +*golden metrics* telemetry](../../features/telemetry/) and drive traffic decisions +based on the observed metrics. For example, we can gradually shift traffic from +an old deployment to a new one while continually monitoring its success rate. If +at any point the success rate drops, we can shift traffic back to the original +deployment and back out of the release. Ideally, our users remain happy +throughout, not noticing a thing! + +In this tutorial, we'll walk you through how to combine Linkerd with +[Flagger](https://flagger.app/), a progressive delivery tool that ties +Linkerd's metrics and traffic splitting together in a control loop, +allowing for fully-automated, metrics-aware canary deployments. + +## Prerequisites + +- To use this guide, you'll need to have Linkerd installed on your cluster, + along with its Viz extension. + Follow the [Installing Linkerd Guide](../install/) if you haven't + already done this. +- The installation of Flagger depends on `kubectl` 1.14 or newer. + +## Install Flagger + +While Linkerd will be managing the actual traffic routing, Flagger automates +the process of creating new Kubernetes resources, watching metrics and +incrementally sending users over to the new version. To add Flagger to your +cluster and have it configured to work with Linkerd, run: + +```bash +kubectl apply -k github.com/fluxcd/flagger/kustomize/linkerd +``` + +This command adds: + +- The canary + [CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) + that enables configuring how a rollout should occur. +- RBAC which grants Flagger permissions to modify all the resources that it + needs to, such as deployments and services. +- A controller configured to interact with the Linkerd control plane. + +To watch until everything is up and running, you can use `kubectl`: + +```bash +kubectl -n linkerd rollout status deploy/flagger +``` + +## Set up the demo + +This demo consists of three components: a load generator, a deployment and a +frontend. The deployment creates a pod that returns some information such as +name. You can use the responses to watch the incremental rollout as Flagger +orchestrates it. A load generator simply makes it easier to execute the rollout +as there needs to be some kind of active traffic to complete the operation. +Together, these components have a topology that looks like: + +{{< fig src="/images/canary/simple-topology.svg" + title="Topology" >}} + +To add these components to your cluster and include them in the Linkerd +[data plane](../../reference/architecture/#data-plane), run: + +```bash +kubectl create ns test && \ + kubectl apply -f https://run.linkerd.io/flagger.yml +``` + +Verify that everything has started up successfully by running: + +```bash +kubectl -n test rollout status deploy podinfo +``` + +Check it out by forwarding the frontend service locally and opening +[http://localhost:8080](http://localhost:8080) locally by running: + +```bash +kubectl -n test port-forward svc/frontend 8080 +``` + +{{< note >}} +Traffic shifting occurs on the *client* side of the connection and not the +server side. Any requests coming from outside the mesh will not be shifted and +will always be directed to the primary backend. A service of type `LoadBalancer` +will exhibit this behavior as the source is not part of the mesh. 
To shift +external traffic, add your ingress controller to the mesh. +{{< /note>}} + +## Configure the release + +Before changing anything, you need to configure how a release should be rolled +out on the cluster. The configuration is contained in a +[Canary](https://docs.flagger.app/tutorials/linkerd-progressive-delivery) +definition. To apply to your cluster, run: + +```bash +cat < 8080/TCP 96m +podinfo ClusterIP 10.7.252.86 9898/TCP 96m +podinfo-canary ClusterIP 10.7.245.17 9898/TCP 23m +podinfo-primary ClusterIP 10.7.249.63 9898/TCP 23m +``` + +At this point, the topology looks a little like: + +{{< fig src="/images/canary/initialized.svg" + title="Initialized" >}} + +{{< note >}} +This guide barely touches all the functionality provided by Flagger. Make sure +to read the [documentation](https://docs.flagger.app/) if you're interested in +combining canary releases with HPA, working off custom metrics or doing other +types of releases such as A/B testing. +{{< /note >}} + +## Start the rollout + +As a system, Kubernetes resources have two major sections: the spec and status. +When a controller sees a spec, it tries as hard as it can to make the status of +the current system match the spec. With a deployment, if any of the pod spec +configuration is changed, a controller will kick off a rollout. By default, the +deployment controller will orchestrate a [rolling +update](https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/). + +In this example, Flagger will notice that a deployment's spec changed and start +orchestrating the canary rollout. To kick this process off, you can update the +image to a new version by running: + +```bash +kubectl -n test set image deployment/podinfo \ + podinfod=quay.io/stefanprodan/podinfo:1.7.1 +``` + +Any kind of modification to the pod's spec such as updating an environment +variable or annotation would result in the same behavior as updating the image. + +On update, the canary deployment (`podinfo`) will be scaled up. Once ready, +Flagger will begin to update the [TrafficSplit CRD](../../features/traffic-split/) +incrementally. With a configured stepWeight of 10, each increment will increase +the weight of `podinfo` by 10. For each period, the success rate will be +observed and as long as it is over the threshold of 99%, Flagger will continue +the rollout. To watch this entire process, run: + +```bash +kubectl -n test get ev --watch +``` + +While an update is occurring, the resources and traffic will look like this at a +high level: + +{{< fig src="/images/canary/ongoing.svg" + title="Ongoing" >}} + +After the update is complete, this picture will go back to looking just like the +figure from the previous section. + +{{< note >}} +You can toggle the image tag between `1.7.1` and `1.7.0` to start the rollout +again. +{{< /note >}} + +### Resource + +The canary resource updates with the current status and progress. You can watch +by running: + +```bash +watch kubectl -n test get canary +``` + +Behind the scenes, Flagger is splitting traffic between the primary and canary +backends by updating the traffic split resource. To watch how this configuration +changes over the rollout, run: + +```bash +kubectl -n test get trafficsplit podinfo -o yaml +``` + +Each increment will increase the weight of `podinfo-canary` and decrease the +weight of `podinfo-primary`. Once the rollout is successful, the weight of +`podinfo-primary` will be set back to 100 and the underlying canary deployment +(`podinfo`) will be scaled down. 
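+
+To watch just the weights instead of the full YAML, a jsonpath query such as
+the following narrows the output (the field names assume the standard SMI
+`TrafficSplit` schema):
+
+```bash
+kubectl -n test get trafficsplit podinfo \
+  -o jsonpath='{range .spec.backends[*]}{.service}{"\t"}{.weight}{"\n"}{end}'
+```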
+ +### Metrics + +As traffic shifts from the primary deployment to the canary one, Linkerd +provides visibility into what is happening to the destination of requests. The +metrics show the backends receiving traffic in real time and measure the success +rate, latencies and throughput. From the CLI, you can watch this by running: + +```bash +watch linkerd viz -n test stat deploy --from deploy/load +``` + +For something a little more visual, you can use the dashboard. Start it by +running `linkerd viz dashboard` and then look at the detail page for the +[podinfo traffic +split](http://localhost:50750/namespaces/test/trafficsplits/podinfo). + +{{< fig src="/images/canary/traffic-split.png" + title="Dashboard" >}} + +### Browser + +Visit again [http://localhost:8080](http://localhost:8080). Refreshing the page +will show toggling between the new version and a different header color. +Alternatively, running `curl http://localhost:8080` will return a JSON response +that looks something like: + +```bash +{ + "hostname": "podinfo-primary-74459c7db8-lbtxf", + "version": "1.7.0", + "revision": "4fc593f42c7cd2e7319c83f6bfd3743c05523883", + "color": "blue", + "message": "greetings from podinfo v1.7.0", + "goos": "linux", + "goarch": "amd64", + "runtime": "go1.11.2", + "num_goroutine": "6", + "num_cpu": "8" +} +``` + +This response will slowly change as the rollout continues. + +## Cleanup + +To cleanup, remove the Flagger controller from your cluster and delete the +`test` namespace by running: + +```bash +kubectl delete -k github.com/fluxcd/flagger/kustomize/linkerd && \ + kubectl delete ns test +``` diff --git a/linkerd.io/content/2.11/tasks/configuring-proxy-concurrency.md b/linkerd.io/content/2.11/tasks/configuring-proxy-concurrency.md new file mode 100644 index 0000000000..b88f6ba5e8 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/configuring-proxy-concurrency.md @@ -0,0 +1,131 @@ ++++ +title = "Configuring Proxy Concurrency" +description = "Limit the Linkerd proxy's CPU usage." ++++ + +The Linkerd data plane's proxies are multithreaded, and are capable of running a +variable number of worker threads so that their resource usage matches the +application workload. + +In a vacuum, of course, proxies will exhibit the best throughput and lowest +latency when allowed to use as many CPU cores as possible. However, in practice, +there are other considerations to take into account. + +A real world deployment is _not_ a load test where clients and servers perform +no other work beyond saturating the proxy with requests. Instead, the service +mesh model has proxy instances deployed as sidecars to application containers. +Each proxy only handles traffic to and from the pod it is injected into. This +means that throughput and latency are limited by the application workload. If an +application container instance can only handle so many requests per second, it +may not actually matter that the proxy could handle more. In fact, giving the +proxy more CPU cores than it requires to keep up with the application may _harm_ +overall performance, as the application may have to compete with the proxy for +finite system resources. + +Therefore, it is more important for individual proxies to handle their traffic +efficiently than to configure all proxies to handle the maximum possible load. +The primary method of tuning proxy resource usage is limiting the number of +worker threads used by the proxy to forward traffic. There are multiple methods +for doing this. 
+
+## Using the `proxy-cpu-limit` Annotation
+
+The simplest way to configure the proxy's thread pool is using the
+`config.linkerd.io/proxy-cpu-limit` annotation. This annotation configures the
+proxy injector to set an environment variable that controls the number of CPU
+cores the proxy will use.
+
+When installing Linkerd using the [`linkerd install` CLI
+command](../install/), the `--proxy-cpu-limit` argument sets this
+annotation globally for all proxies injected by the Linkerd installation. For
+example,
+
+```bash
+linkerd install --proxy-cpu-limit 2 | kubectl apply -f -
+```
+
+For more fine-grained configuration, the annotation may be added to any
+[injectable Kubernetes resource](../../features/proxy-injection/), such as a namespace, pod,
+or deployment.
+
+For example, the following will configure any proxies in the `my-deployment`
+deployment to use one CPU core:
+
+```yaml
+kind: Deployment
+apiVersion: apps/v1
+metadata:
+  name: my-deployment
+  # ...
+spec:
+  template:
+    metadata:
+      annotations:
+        config.linkerd.io/proxy-cpu-limit: '1'
+    # ...
+```
+
+{{< note >}} Unlike Kubernetes CPU limits and requests, which can be expressed
+in milliCPUs, the `proxy-cpu-limit` annotation should be expressed in whole
+numbers of CPU cores. Fractional values will be rounded up to the nearest whole
+number. {{< /note >}}
+
+## Using Kubernetes CPU Limits and Requests
+
+Kubernetes provides
+[CPU limits and CPU requests](https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit)
+to configure the resources assigned to any pod or container. These may also be
+used to configure the Linkerd proxy's CPU usage. However, depending on how the
+kubelet is configured, using Kubernetes resource limits rather than the
+`proxy-cpu-limit` annotation may not be ideal.
+
+The kubelet uses one of two mechanisms for enforcing pod CPU limits. This is
+determined by the
+[`--cpu-manager-policy` kubelet option](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#configuration).
+With the default CPU manager policy,
+[`none`](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#none-policy),
+the kubelet uses
+[CFS quotas](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler) to enforce
+CPU limits. This means that the Linux kernel is configured to limit the amount
+of time threads belonging to a given process are scheduled. Alternatively, the
+CPU manager policy may be set to
+[`static`](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy).
+In this case, the kubelet will use Linux `cgroup`s to enforce CPU limits for
+containers which meet certain criteria.
+
+When the environment variable configured by the `proxy-cpu-limit` annotation is
+unset, the proxy will run a number of worker threads equal to the number of CPU
+cores available. This means that with the default `none` CPU manager policy, the
+proxy may spawn a large number of worker threads, but the Linux kernel will
+limit how often they are scheduled. This is less efficient than simply reducing
+the number of worker threads, as `proxy-cpu-limit` does: more time is spent on
+context switches, and each worker thread will run less frequently, potentially
+impacting latency.
+
+On the other hand, using
+[cgroup cpusets](https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt)
+will limit the number of CPU cores available to the process.
In essence, it will +appear to the proxy that the system has fewer CPU cores than it actually does. +This will result in similar behavior to the `proxy-cpu-limit` annotation. + +However, it's worth noting that in order for this mechanism to be used, certain +criteria must be met: + +- The kubelet must be configured with the `static` CPU manager policy +- The pod must be in the + [Guaranteed QoS class](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed). + This means that all containers in the pod must have both a limit and a request + for memory and CPU, and the limit for each must have the same value as the + request. +- The CPU limit and CPU request must be an integer greater than or equal to 1. + +If you're not sure whether these criteria will all be met, it's best to use the +`proxy-cpu-limit` annotation in addition to any Kubernetes CPU limits and +requests. + +## Using Helm + +When using [Helm](../install-helm/), users must take care to set the +`proxy.cores` Helm variable in addition to `proxy.cpu.limit`, if +the criteria for cgroup-based CPU limits +[described above](#using-kubernetes-cpu-limits-and-requests) are not met. diff --git a/linkerd.io/content/2.11/tasks/configuring-retries.md b/linkerd.io/content/2.11/tasks/configuring-retries.md new file mode 100644 index 0000000000..942d500ddd --- /dev/null +++ b/linkerd.io/content/2.11/tasks/configuring-retries.md @@ -0,0 +1,81 @@ ++++ +title = "Configuring Retries" +description = "Configure Linkerd to automatically retry failing requests." ++++ + +In order for Linkerd to do automatic retries of failures, there are two +questions that need to be answered: + +- Which requests should be retried? +- How many times should the requests be retried? + +Both of these questions can be answered by specifying a bit of extra information +in the [service profile](../../features/service-profiles/) for the service you're +sending requests to. + +The reason why these pieces of configuration are required is because retries can +potentially be dangerous. Automatically retrying a request that changes state +(e.g. a request that submits a financial transaction) could potentially impact +your user's experience negatively. In addition, retries increase the load on +your system. A set of services that have requests being constantly retried +could potentially get taken down by the retries instead of being allowed time +to recover. + +Check out the [retries section](../books/#retries) of the books demo +for a tutorial of how to configure retries. + +## Retries + +For routes that are idempotent and don't have bodies, you can edit the service +profile and add `isRetryable` to the retryable route: + +```yaml +spec: + routes: + - name: GET /api/annotations + condition: + method: GET + pathRegex: /api/annotations + isRetryable: true ### ADD THIS LINE ### +``` + +## Retry Budgets + +A retry budget is a mechanism that limits the number of retries that can be +performed against a service as a percentage of original requests. This +prevents retries from overwhelming your system. By default, retries may add at +most an additional 20% to the request load (plus an additional 10 "free" +retries per second). These settings can be adjusted by setting a `retryBudget` +on your service profile. 
+ +```yaml +spec: + retryBudget: + retryRatio: 0.2 + minRetriesPerSecond: 10 + ttl: 10s +``` + +## Monitoring Retries + +Retries can be monitored by using the `linkerd viz routes` command with the `--to` +flag and the `-o wide` flag. Since retries are performed on the client-side, +we need to use the `--to` flag to see metrics for requests that one resource +is sending to another (from the server's point of view, retries are just +regular requests). When both of these flags are specified, the `linkerd routes` +command will differentiate between "effective" and "actual" traffic. + +```bash +ROUTE SERVICE EFFECTIVE_SUCCESS EFFECTIVE_RPS ACTUAL_SUCCESS ACTUAL_RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +HEAD /authors/{id}.json authors 100.00% 2.8rps 58.45% 4.7rps 7ms 25ms 37ms +[DEFAULT] authors 0.00% 0.0rps 0.00% 0.0rps 0ms 0ms 0ms +``` + +Actual requests represent all requests that the client actually sends, including +original requests and retries. Effective requests only count the original +requests. Since an original request may trigger one or more retries, the actual +request volume is usually higher than the effective request volume when retries +are enabled. Since an original request may fail the first time, but a retry of +that request might succeed, the effective success rate is usually ([but not +always](../configuring-timeouts/#monitoring-timeouts)) higher than the +actual success rate. diff --git a/linkerd.io/content/2.11/tasks/configuring-timeouts.md b/linkerd.io/content/2.11/tasks/configuring-timeouts.md new file mode 100644 index 0000000000..69804b9ca1 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/configuring-timeouts.md @@ -0,0 +1,40 @@ ++++ +title = "Configuring Timeouts" +description = "Configure Linkerd to automatically fail requests that take too long." ++++ + +To limit how long Linkerd will wait before failing an outgoing request to +another service, you can configure timeouts. These work by adding a little bit +of extra information to the [service profile](../../features/service-profiles/) for +the service you're sending requests to. + +Each route may define a timeout which specifies the maximum amount of time to +wait for a response (including retries) to complete after the request is sent. +If this timeout is reached, Linkerd will cancel the request, and return a 504 +response. If unspecified, the default timeout is 10 seconds. + +```yaml +spec: + routes: + - condition: + method: HEAD + pathRegex: /authors/[^/]*\.json + name: HEAD /authors/{id}.json + timeout: 300ms +``` + +Check out the [timeouts section](../books/#timeouts) of the books demo for +a tutorial of how to configure timeouts. + +## Monitoring Timeouts + +Requests which reach the timeout will be canceled, return a 504 Gateway Timeout +response, and count as a failure for the purposes of [effective success +rate](../configuring-retries/#monitoring-retries). Since the request was +canceled before any actual response was received, a timeout will not count +towards the actual request volume at all. This means that effective request +rate can be higher than actual request rate when timeouts are configured. +Furthermore, if a response is received just as the timeout is exceeded, it is +possible for the request to be counted as an actual success but an effective +failure. This can result in effective success rate being lower than actual +success rate. 
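+
+To see both rates side by side, the same `-o wide` flag used for monitoring
+retries applies here as well. For example, using the books demo services from
+the tutorial linked above:
+
+```bash
+linkerd viz -n booksapp routes deploy/webapp --to svc/books -o wide
+```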
diff --git a/linkerd.io/content/2.11/tasks/customize-install.md b/linkerd.io/content/2.11/tasks/customize-install.md
new file mode 100644
index 0000000000..5060ec8c9f
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/customize-install.md
@@ -0,0 +1,145 @@
++++
+title = "Customizing Linkerd's Configuration with Kustomize"
+description = "Use Kustomize to modify Linkerd's configuration in a programmatic way."
++++
+
+Instead of forking the Linkerd install and upgrade process,
+[Kustomize](https://kustomize.io/) can be used to patch the output of `linkerd
+install` in a consistent way. This allows customization of the install to add
+functionality specific to installations.
+
+To get started, save the output of `install` to a YAML file. This will be the
+base resource that Kustomize uses to patch and generate what is added to your
+cluster.
+
+```bash
+linkerd install > linkerd.yaml
+```
+
+{{< note >}}
+When upgrading, make sure you populate this file with the content from `linkerd
+upgrade`. Using the latest `kustomize` releases, it would be possible to
+automate this with an [exec
+plugin](https://github.com/kubernetes-sigs/kustomize/tree/master/docs/plugins#exec-plugins).
+{{< /note >}}
+
+Next, create a `kustomization.yaml` file. This file will contain the
+instructions for Kustomize, listing the base resources and the transformations to
+do on those resources. Right now, this looks pretty empty:
+
+```yaml
+resources:
+- linkerd.yaml
+```
+
+Now, let's look at how to do some example customizations.
+
+{{< note >}}
+Kustomize allows as many patches, transforms and generators as you'd like. These
+examples show modifications one at a time, but it is possible to do as many as
+required in a single `kustomization.yaml` file.
+{{< /note >}}
+
+## Add PriorityClass
+
+There are a couple of components in the control plane that can benefit from being
+associated with a critical `PriorityClass`. While this configuration isn't
+currently supported as a flag to `linkerd install`, it is not hard to add by
+using Kustomize.
+
+First, create a file named `priority-class.yaml` that will define a
+`PriorityClass` resource:
+
+```yaml
+apiVersion: scheduling.k8s.io/v1
+description: Used for critical linkerd pods that must run in the cluster, but
+  can be moved to another node if necessary.
+kind: PriorityClass
+metadata:
+  name: linkerd-critical
+value: 1000000000
+```
+
+{{< note >}}
+`1000000000` is the maximum allowed user-defined priority; adjust
+accordingly.
+{{< /note >}}
+
+Next, create a file named `patch-priority-class.yaml` that will contain the
+overlay. This overlay will explain what needs to be modified.
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: linkerd-identity
+spec:
+  template:
+    spec:
+      priorityClassName: linkerd-critical
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: linkerd-controller
+spec:
+  template:
+    spec:
+      priorityClassName: linkerd-critical
+```
+
+Then, add this as a strategic merge option to `kustomization.yaml`:
+
+```yaml
+resources:
+- priority-class.yaml
+- linkerd.yaml
+patchesStrategicMerge:
+- patch-priority-class.yaml
+```
+
+Applying this to your cluster requires rendering the overlay with Kustomize
+and piping the output to `kubectl apply`. For example, you can run:
+
+```bash
+kubectl kustomize . | kubectl apply -f -
+```
+
+## Modify Grafana Configuration
+
+Interested in enabling authentication for Grafana? It is possible to
+modify the `ConfigMap` as a one-off to do this.
Unfortunately, the changes will
+end up being reverted every time `linkerd upgrade` happens. Instead, create a
+file named `grafana.yaml` and add your modifications:
+
+```yaml
+kind: ConfigMap
+apiVersion: v1
+metadata:
+  name: grafana-config
+data:
+  grafana.ini: |-
+    instance_name = grafana
+
+    [server]
+    root_url = %(protocol)s://%(domain)s:/grafana/
+
+    [analytics]
+    check_for_updates = false
+```
+
+Then, add this as a strategic merge option to `kustomization.yaml`:
+
+```yaml
+resources:
+- linkerd.yaml
+patchesStrategicMerge:
+- grafana.yaml
+```
+
+Finally, apply this to your cluster by rendering the YAML with Kustomize
+and piping the output to `kubectl apply`:
+
+```bash
+kubectl kustomize . | kubectl apply -f -
+```
diff --git a/linkerd.io/content/2.11/tasks/debugging-502s.md b/linkerd.io/content/2.11/tasks/debugging-502s.md
new file mode 100644
index 0000000000..c92ae14906
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/debugging-502s.md
@@ -0,0 +1,75 @@
++++
+title = "Debugging 502s"
+description = "Determine why Linkerd is returning 502 responses."
++++
+
+When the Linkerd proxy encounters connection errors while processing a
+request, it will typically return an HTTP 502 (Bad Gateway) response. It can be
+very difficult to figure out why these errors are happening because of the lack
+of information available.
+
+## Why do these errors only occur when Linkerd is injected?
+
+Linkerd turns connection errors into HTTP 502 responses. This can make issues
+which were previously undetected suddenly visible. This is a good thing.
+Linkerd also changes the way that connections to your application are managed:
+it re-uses persistent connections and establishes an additional layer of
+connection tracking. Managing connections in this way can sometimes expose
+underlying application or infrastructure issues such as misconfigured connection
+timeouts which can manifest as connection errors.
+
+## Why can't Linkerd give a more informative error message?
+
+From the Linkerd proxy's perspective, it just sees its connections to the
+application refused or closed without explanation. This makes it nearly
+impossible for Linkerd to report any error message in the 502 response. However,
+if these errors coincide with the introduction of Linkerd, it does suggest that
+the problem is related to connection re-use or connection tracking. Here are
+some common reasons why the application may be refusing or terminating
+connections.
+
+## Common Causes of Connection Errors
+
+### Connection Idle Timeouts
+
+Some servers are configured with a connection idle timeout (for example, [this
+timeout in the Go HTTP
+server](https://golang.org/src/net/http/server.go#L2535)). This means that the
+server will close any connections which do not receive any traffic in the
+specified time period. If any requests are already in transit when the
+connection shutdown is initiated, those requests will fail. This scenario is
+likely to occur if you have traffic with a regular period (such as liveness
+checks, for example) and an idle timeout equal to that period.
+
+To remedy this, ensure that your server's idle timeouts are sufficiently long so
+that they do not close connections which are actively in use.
+
+### Half-closed Connection Timeouts
+
+During the shutdown of a TCP connection, each side of the connection must be
+closed independently. When one side is closed but the other is not, the
+connection is said to be "half-closed".
It is valid for the connection to be in +this state, however, the operating system's connection tracker can lose track of +connections which remain half-closed for long periods of time. This can lead to +responses not being delivered and to port conflicts when establishing new +connections which manifest as 502 responses. + +You can use a [script to detect half-closed +connections](https://gist.github.com/adleong/0203b0864af2c29ddb821dd48f339f49) +on your Kubernetes cluster. If you detect a large number of half-closed +connections, you have a couple of ways to remedy the situation. + +One solution would be to update your application to not leave connections +half-closed for long periods of time or to stop using software that does this. +Unfortunately, this is not always an option. + +Another option is to increase the connection tracker's timeout for half-closed +connections. The default value of this timeout is platform dependent but is +typically 1 minute or 1 hour. You can view the current value by looking at the +file `/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait` in any +injected container. To increase this value, you can use the +`--close-wait-timeout` flag with `linkerd inject`. Note, however, that setting +this flag will also set the `privileged` field of the proxy init container to +true. Setting this timeout to 1 hour is usually sufficient and matches the +[value used by +kube-proxy](https://github.com/kubernetes/kubernetes/issues/32551). diff --git a/linkerd.io/content/2.11/tasks/debugging-your-service.md b/linkerd.io/content/2.11/tasks/debugging-your-service.md new file mode 100644 index 0000000000..19a4297f7c --- /dev/null +++ b/linkerd.io/content/2.11/tasks/debugging-your-service.md @@ -0,0 +1,64 @@ ++++ +title = "Debugging gRPC applications with request tracing" +description = "Follow a long-form example of debugging a failing gRPC application using live request tracing." +aliases = [ + "/debugging-an-app/", + "../debugging-an-app/" +] ++++ + +The demo application emojivoto has some issues. Let's use that and Linkerd to +diagnose an application that fails in ways which are a little more subtle than +the entire service crashing. This guide assumes that you've followed the steps +in the [Getting Started](../../getting-started/) guide and have Linkerd and the +demo application running in a Kubernetes cluster. If you've not done that yet, +go get started and come back when you're done! + +If you glance at the Linkerd dashboard (by running the `linkerd viz dashboard` +command), you should see all the resources in the `emojivoto` namespace, +including the deployments. Each deployment running Linkerd shows success rate, +requests per second and latency percentiles. + +{{< fig src="/images/debugging/stat.png" title="Top Level Metrics" >}} + +That's pretty neat, but the first thing you might notice is that the success +rate is well below 100%! Click on `web` and let's dig in. + +{{< fig src="/images/debugging/octopus.png" title="Deployment Detail" >}} + +You should now be looking at the Deployment page for the web deployment. The first +thing you'll see here is that the web deployment is taking traffic from `vote-bot` +(a deployment included with emojivoto to continually generate a low level of +live traffic). The web deployment also has two outgoing dependencies, `emoji` +and `voting`. + +While the emoji deployment is handling every request from web successfully, it +looks like the voting deployment is failing some requests! 
A failure in a dependent +deployment may be exactly what is causing the errors that web is returning. + +Let's scroll a little further down the page, we'll see a live list of all +traffic that is incoming to *and* outgoing from `web`. This is interesting: + +{{< fig src="/images/debugging/web-top.png" title="Top" >}} + +There are two calls that are not at 100%: the first is vote-bot's call to the +`/api/vote` endpoint. The second is the `VoteDoughnut` call from the web +deployment to its dependent deployment, `voting`. Very interesting! Since +`/api/vote` is an incoming call, and `VoteDoughnut` is an outgoing call, this is +a good clue that this endpoint is what's causing the problem! + +Finally, to dig a little deeper, we can click on the `tap` icon in the far right +column. This will take us to the live list of requests that match only this +endpoint. You'll see `Unknown` under the `GRPC status` column. This is because +the requests are failing with a +[gRPC status code 2](https://godoc.org/google.golang.org/grpc/codes#Code), +which is a common error response as you can see from +[the code][code]. Linkerd is aware of gRPC's response classification without any +other configuration! + +{{< fig src="/images/debugging/web-tap.png" title="Tap" >}} + +At this point, we have everything required to get the endpoint fixed and restore +the overall health of our applications. + +[code]: https://github.com/BuoyantIO/emojivoto/blob/67faa83af33db647927946a672fc63ab7ce869aa/emojivoto-voting-svc/api/api.go#L21 diff --git a/linkerd.io/content/2.11/tasks/distributed-tracing.md b/linkerd.io/content/2.11/tasks/distributed-tracing.md new file mode 100644 index 0000000000..81092a3865 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/distributed-tracing.md @@ -0,0 +1,273 @@ ++++ +title = "Distributed tracing with Linkerd" +description = "Use Linkerd to help instrument your application with distributed tracing." ++++ + +Using distributed tracing in practice can be complex, for a high level +explanation of what you get and how it is done, we've assembled a [list of +myths](https://linkerd.io/2019/08/09/service-mesh-distributed-tracing-myths/). + +This guide will walk you through configuring and enabling tracing for +[emojivoto](../../getting-started/#step-5-install-the-demo-app). Jump to the end +for some recommendations on the best way to make use of distributed tracing with +Linkerd. + +To use distributed tracing, you'll need to: + +- Install the Linkerd-Jaeger extension. +- Modify your application to emit spans. + +In the case of emojivoto, once all these steps are complete there will be a +topology that looks like: + +{{< fig src="/images/tracing/tracing-topology.svg" + title="Topology" >}} + +## Prerequisites + +- To use this guide, you'll need to have Linkerd installed on your cluster. + Follow the [Installing Linkerd Guide](../install/) if you haven't + already done this. + +## Install the Linkerd-Jaeger extension + +The first step of getting distributed tracing setup is installing the +Linkerd-Jaeger extension onto your cluster. This extension consists of a +collector, a Jaeger backend, and a Jaeger-injector. The collector consumes spans +emitted from the mesh and your applications and sends them to the Jaeger backend +which stores them and serves a dashboard to view them. The Jaeger-injector is +responsible for configuring the Linkerd proxies to emit spans. 
+ +To install the Linkerd-Jaeger extension, run the command: + +```bash +linkerd jaeger install | kubectl apply -f - +``` + +You can verify that the Linkerd-Jaeger extension was installed correctly by +running: + +```bash +linkerd jaeger check +``` + +## Install Emojivoto + + Add emojivoto to your cluster and inject it with the Linkerd proxy: + + ```bash + linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f - + ``` + +Before moving onto the next step, make sure everything is up and running with +`kubectl`: + +```bash +kubectl -n emojivoto rollout status deploy/web +``` + +## Modify the application + +Unlike most features of a service mesh, distributed tracing requires modifying +the source of your application. Tracing needs some way to tie incoming requests +to your application together with outgoing requests to dependent services. To do +this, some headers are added to each request that contain a unique ID for the +trace. Linkerd uses the [b3 +propagation](https://github.com/openzipkin/b3-propagation) format to tie these +things together. + +We've already modified emojivoto to instrument its requests with this +information, this +[commit](https://github.com/BuoyantIO/emojivoto/commit/47a026c2e4085f4e536c2735f3ff3788b0870072) +shows how this was done. For most programming languages, it simply requires the +addition of a client library to take care of this. Emojivoto uses the OpenCensus +client, but others can be used. + +To enable tracing in emojivoto, run: + +```bash +kubectl -n emojivoto set env --all deploy OC_AGENT_HOST=collector.linkerd-jaeger:55678 +``` + +This command will add an environment variable that enables the applications to +propagate context and emit spans. + +## Explore Jaeger + +With `vote-bot` starting traces for every request, spans should now be showing +up in Jaeger. To get to the UI, run: + +```bash +linkerd jaeger dashboard +``` + +{{< fig src="/images/tracing/jaeger-empty.png" + title="Jaeger" >}} + +You can search for any service in the dropdown and click Find Traces. `vote-bot` +is a great way to get started. + +{{< fig src="/images/tracing/jaeger-search.png" + title="Search" >}} + +Clicking on a specific trace will provide all the details, you'll be able to see +the spans for every proxy! + +{{< fig src="/images/tracing/example-trace.png" + title="Search" >}} + +There sure are a lot of `linkerd-proxy` spans in that output. Internally, the +proxy has a server and client side. When a request goes through the proxy, it is +received by the server and then issued by the client. For a single request that +goes between two meshed pods, there will be a total of 4 spans. Two will be on +the source side as the request traverses that proxy and two will be on the +destination side as the request is received by the remote proxy. + +## Integration with the Dashboard + +After having set up the Linkerd-Jaeger extension, as the proxy adds application +meta-data as trace attributes, users can directly jump into related resources +traces directly from the linkerd-web dashboard by clicking the Jaeger icon in +the Metrics Table, as shown below: + +{{< fig src="/images/tracing/linkerd-jaeger-ui.png" + title="Linkerd-Jaeger" >}} + +To obtain that functionality you need to install (or upgrade) the Linkerd-Viz +extension specifying the service exposing the Jaeger UI. 
By default, this would +be something like this: + +```bash +linkerd viz install --set jaegerUrl=jaeger.linkerd-jaeger:16686 +``` + +## Cleanup + +To cleanup, uninstall the Linkerd-Jaeger extension along with emojivoto by running: + +```bash +linkerd jaeger uninstall | kubectl delete -f - +kubectl delete ns emojivoto +``` + +## Bring your own Jaeger + +If you have an existing Jaeger installation, you can configure the OpenCensus +collector to send traces to it instead of the Jaeger instance built into the +Linkerd-Jaeger extension. + +```bash +linkerd jaeger install --set collector.jaegerAddr='http://my-jaeger-collector.my-jaeger-ns:14268/api/traces' | kubectl apply -f - +``` + +It is also possible to manually edit the OpenCensus configuration to have it +export to any backend which it supports. See the +[OpenCensus documentation](https://opencensus.io/service/exporters/) for a full +list. + +## Troubleshooting + +### I don't see any spans for the proxies + +The Linkerd proxy uses the [b3 +propagation](https://github.com/openzipkin/b3-propagation) format. Some client +libraries, such as Jaeger, use different formats by default. You'll want to +configure your client library to use the b3 format to have the proxies +participate in traces. + +## Recommendations + +### Ingress + +The ingress is an especially important component for distributed tracing because +it creates the root span of each trace and is responsible for deciding if that +trace should be sampled or not. Having the ingress make all sampling decisions +ensures that either an entire trace is sampled or none of it is, and avoids +creating "partial traces". + +Distributed tracing systems all rely on services to propagate metadata about the +current trace from requests that they receive to requests that they send. This +metadata, called the trace context, is usually encoded in one or more request +headers. There are many different trace context header formats and while we hope +that the ecosystem will eventually converge on open standards like [W3C +tracecontext](https://www.w3.org/TR/trace-context/), we only use the [b3 +format](https://github.com/openzipkin/b3-propagation) today. Being one of the +earliest widely used formats, it has the widest support, especially among +ingresses like Nginx. + +This reference architecture includes a simple Nginx config that samples 50% of +traces and emits trace data to the collector (using the Zipkin protocol). Any +ingress controller can be used here in place of Nginx as long as it: + +- Supports probabilistic sampling +- Encodes trace context in the b3 format +- Emits spans in a protocol supported by the OpenCensus collector + +If using helm to install ingress-nginx, you can configure tracing by using: + +```yaml +controller: + config: + enable-opentracing: "true" + zipkin-collector-host: linkerd-collector.linkerd +``` + +### Client Library + +While it is possible for services to manually propagate trace propagation +headers, it's usually much easier to use a library which does three things: + +- Propagates the trace context from incoming request headers to outgoing request + headers +- Modifies the trace context (i.e. 
starts a new span) +- Transmits this data to a trace collector + +We recommend using OpenCensus in your service and configuring it with: + +- [b3 propagation](https://github.com/openzipkin/b3-propagation) (this is the + default) +- [the OpenCensus agent + exporter](https://opencensus.io/exporters/supported-exporters/go/ocagent/) + +The OpenCensus agent exporter will export trace data to the OpenCensus collector +over a gRPC API. The details of how to configure OpenCensus will vary language +by language, but there are [guides for many popular +languages](https://opencensus.io/quickstart/). You can also see an end-to-end +example of this in Go with our example application, +[Emojivoto](https://github.com/adleong/emojivoto). + +You may notice that the OpenCensus project is in maintenance mode and will +become part of [OpenTelemetry](https://opentelemetry.io/). Unfortunately, +OpenTelemetry is not yet production ready and so OpenCensus remains our +recommendation for the moment. + +It is possible to use many other tracing client libraries as well. Just make +sure the b3 propagation format is being used and the client library can export +its spans in a format the collector has been configured to receive. + +## Collector: OpenCensus + +The OpenCensus collector receives trace data from the OpenCensus agent exporter +and potentially does translation and filtering before sending that data to +Jaeger. Having the OpenCensus exporter send to the OpenCensus collector gives us +a lot of flexibility: we can switch to any backend that OpenCensus supports +without needing to interrupt the application. + +## Backend: Jaeger + +Jaeger is one of the most widely used tracing backends and for good reason: it +is easy to use and does a great job of visualizing traces. However, [any backend +supported by OpenCensus](https://opencensus.io/service/exporters/) can be used +instead. + +## Linkerd + +If your application is injected with Linkerd, the Linkerd proxy will participate +in the traces and will also emit trace data to the OpenCensus collector. This +enriches the trace data and allows you to see exactly how much time requests are +spending in the proxy and on the wire. + +While Linkerd can only actively participate in traces that use the b3 +propagation format, Linkerd will always forward unknown request headers +transparently, which means it will never interfere with traces that use other +propagation formats. diff --git a/linkerd.io/content/2.11/tasks/exporting-metrics.md b/linkerd.io/content/2.11/tasks/exporting-metrics.md new file mode 100644 index 0000000000..42e6b2db87 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/exporting-metrics.md @@ -0,0 +1,165 @@ ++++ +title = "Exporting Metrics" +description = "Integrate Linkerd's Prometheus with your existing metrics infrastructure." +aliases = [ + "../prometheus/", + "../observability/prometheus/", + "../observability/exporting-metrics/" +] ++++ + +By design, Linkerd only keeps metrics data for a short, fixed window of time +(currently, 6 hours). This means that if Linkerd's metrics data is valuable to +you, you will probably want to export it into a full-fledged metrics store. + +Internally, Linkerd stores its metrics in a Prometheus instance that runs as +part of the Viz extension. The following tutorial requires the viz extension +to be installed with prometheus enabled. 
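+
+Before going further, you can confirm that the extension and its bundled
+Prometheus are healthy. This sketch assumes the default `linkerd-viz` namespace
+and the `component=prometheus` label used by the extension; if that label does
+not match on your cluster, `linkerd viz check` alone is enough:
+
+```bash
+linkerd viz check
+kubectl -n linkerd-viz get pods -l component=prometheus
+```
+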
There are several basic approaches +to exporting metrics data from Linkerd: + +- [Federating data to your own Prometheus cluster](#federation) +- [Using a Prometheus integration](#integration) +- [Extracting data via Prometheus's APIs](#api) +- [Gather data from the proxies directly](#proxy) + +## Using the Prometheus federation API {#federation} + +If you are using Prometheus as your own metrics store, we recommend taking +advantage of Prometheus's *federation* API, which is designed exactly for the +use case of copying data from one Prometheus to another. + +Simply add the following item to your `scrape_configs` in your Prometheus config +file (replace `{{.Namespace}}` with the namespace where the Linkerd Viz +extension is running): + +```yaml +- job_name: 'linkerd' + kubernetes_sd_configs: + - role: pod + namespaces: + names: ['{{.Namespace}}'] + + relabel_configs: + - source_labels: + - __meta_kubernetes_pod_container_name + action: keep + regex: ^prometheus$ + + honor_labels: true + metrics_path: '/federate' + + params: + 'match[]': + - '{job="linkerd-proxy"}' + - '{job="linkerd-controller"}' +``` + +Alternatively, if you prefer to use Prometheus' ServiceMonitors to configure +your Prometheus, you can use this ServiceMonitor YAML (replace `{{.Namespace}}` +with the namespace where Linkerd Viz extension is running): + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + labels: + k8s-app: linkerd-prometheus + release: monitoring + name: linkerd-federate + namespace: {{.Namespace}} +spec: + endpoints: + - interval: 30s + scrapeTimeout: 30s + params: + match[]: + - '{job="linkerd-proxy"}' + - '{job="linkerd-controller"}' + path: /federate + port: admin-http + honorLabels: true + relabelings: + - action: keep + regex: '^prometheus$' + sourceLabels: + - '__meta_kubernetes_pod_container_name' + jobLabel: app + namespaceSelector: + matchNames: + - {{.Namespace}} + selector: + matchLabels: + component: prometheus +``` + +That's it! Your Prometheus cluster is now configured to federate Linkerd's +metrics from Linkerd's internal Prometheus instance. + +Once the metrics are in your Prometheus, Linkerd's proxy metrics will have the +label `job="linkerd-proxy"` and Linkerd's control plane metrics will have the +label `job="linkerd-controller"`. For more information on specific metric and +label definitions, have a look at [Proxy Metrics](../../reference/proxy-metrics/). + +For more information on Prometheus' `/federate` endpoint, have a look at the +[Prometheus federation docs](https://prometheus.io/docs/prometheus/latest/federation/). + +## Using a Prometheus integration {#integration} + +If you are not using Prometheus as your own long-term data store, you may be +able to leverage one of Prometheus's [many +integrations](https://prometheus.io/docs/operating/integrations/) to +automatically extract data from Linkerd's Prometheus instance into the data +store of your choice. Please refer to the Prometheus documentation for details. + +## Extracting data via Prometheus's APIs {#api} + +If neither Prometheus federation nor Prometheus integrations are options for +you, it is possible to call Prometheus's APIs to extract data from Linkerd. 
+ +For example, you can call the federation API directly via a command like: + +```bash +curl -G \ + --data-urlencode 'match[]={job="linkerd-proxy"}' \ + --data-urlencode 'match[]={job="linkerd-controller"}' \ + http://prometheus.linkerd-viz.svc.cluster.local:9090/federate +``` + +{{< note >}} +If your data store is outside the Kubernetes cluster, it is likely that +you'll want to set up +[ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) +at a domain name of your choice with authentication. +{{< /note >}} + +Similar to the `/federate` API, Prometheus provides a JSON query API to +retrieve all metrics: + +```bash +curl http://prometheus.linkerd-viz.svc.cluster.local:9090/api/v1/query?query=request_total +``` + +## Gathering data from the Linkerd proxies directly {#proxy} + +Finally, if you want to avoid Linkerd's Prometheus entirely, you can query the +Linkerd proxies directly on their `/metrics` endpoint. + +For example, to view `/metrics` from a single Linkerd proxy, running in the +`linkerd` namespace: + +```bash +kubectl -n linkerd port-forward \ + $(kubectl -n linkerd get pods \ + -l linkerd.io/control-plane-ns=linkerd \ + -o jsonpath='{.items[0].metadata.name}') \ + 4191:4191 +``` + +and then: + +```bash +curl localhost:4191/metrics +``` + +Alternatively, `linkerd diagnostics proxy-metrics` can be used to retrieve +proxy metrics for a given workload. diff --git a/linkerd.io/content/2.11/tasks/exposing-dashboard.md b/linkerd.io/content/2.11/tasks/exposing-dashboard.md new file mode 100644 index 0000000000..e126110282 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/exposing-dashboard.md @@ -0,0 +1,248 @@ ++++ +title = "Exposing the Dashboard" +description = "Make it easy for others to access Linkerd and Grafana dashboards without the CLI." +aliases = [ + "../dns-rebinding/", +] ++++ + +Instead of using `linkerd viz dashboard` every time you'd like to see what's +going on, you can expose the dashboard via an ingress. This will also expose +Grafana. + +{{< pagetoc >}} + +## Nginx + +### Nginx with basic auth + +A sample ingress definition is: + +```yaml +apiVersion: v1 +kind: Secret +type: Opaque +metadata: + name: web-ingress-auth + namespace: linkerd-viz +data: + auth: YWRtaW46JGFwcjEkbjdDdTZnSGwkRTQ3b2dmN0NPOE5SWWpFakJPa1dNLgoK +--- +# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19 +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: web-ingress + namespace: linkerd-viz + annotations: + nginx.ingress.kubernetes.io/upstream-vhost: $service_name.$namespace.svc.cluster.local:8084 + nginx.ingress.kubernetes.io/configuration-snippet: | + proxy_set_header Origin ""; + proxy_hide_header l5d-remote-ip; + proxy_hide_header l5d-server-id; + nginx.ingress.kubernetes.io/auth-type: basic + nginx.ingress.kubernetes.io/auth-secret: web-ingress-auth + nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required' +spec: + ingressClassName: nginx + rules: + - host: dashboard.example.com + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: web + port: + number: 8084 +``` + +This exposes the dashboard at `dashboard.example.com` and protects it with basic +auth using admin/admin. Take a look at the [ingress-nginx][nginx-auth] +documentation for details on how to change the username and password. + +### Nginx with oauth2-proxy + +A more secure alternative to basic auth is using an authentication proxy, such +as [oauth2-proxy](https://oauth2-proxy.github.io/oauth2-proxy/). 
+ +For reference on how to deploy and configure oauth2-proxy in kubernetes, see +this [blog post by Don +Bowman](https://blog.donbowman.ca/2019/02/14/using-single-sign-on-oauth2-across-many-sites-in-kubernetes/). + +tl;dr: If you deploy oauth2-proxy via the [helm +chart](https://github.com/helm/charts/tree/master/stable/oauth2-proxy), the +following values are required: + +```yaml +config: + existingSecret: oauth2-proxy + configFile: |- + email_domains = [ "example.com" ] + upstreams = [ "file:///dev/null" ] + +ingress: + enabled: true + annotations: + kubernetes.io/ingress.class: nginx + path: /oauth2 +ingress: + hosts: + - linkerd.example.com +``` + +Where the `oauth2-proxy` secret would contain the required [oauth2 +config](https://oauth2-proxy.github.io/oauth2-proxy/docs/configuration/oauth_provider) +such as, `client-id` `client-secret` and `cookie-secret`. + +Once setup, a sample ingress would be: + +```yaml +# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19 +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: web + namespace: linkerd-viz + annotations: + nginx.ingress.kubernetes.io/upstream-vhost: $service_name.$namespace.svc.cluster.local:8084 + nginx.ingress.kubernetes.io/configuration-snippet: | + proxy_set_header Origin ""; + proxy_hide_header l5d-remote-ip; + proxy_hide_header l5d-server-id; + nginx.ingress.kubernetes.io/auth-signin: https://$host/oauth2/start?rd=$escaped_request_uri + nginx.ingress.kubernetes.io/auth-url: https://$host/oauth2/auth +spec: + ingressClassName: nginx + rules: + - host: linkerd.example.com + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: web + port: + number: 8084 +``` + +## Traefik + +A sample ingress definition is: + +```yaml +apiVersion: v1 +kind: Secret +type: Opaque +metadata: + name: web-ingress-auth + namespace: linkerd-viz +data: + auth: YWRtaW46JGFwcjEkbjdDdTZnSGwkRTQ3b2dmN0NPOE5SWWpFakJPa1dNLgoK +--- +# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19 +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: web-ingress + namespace: linkerd-viz + annotations: + ingress.kubernetes.io/custom-request-headers: l5d-dst-override:web.linkerd-viz.svc.cluster.local:8084 + traefik.ingress.kubernetes.io/auth-type: basic + traefik.ingress.kubernetes.io/auth-secret: web-ingress-auth +spec: + ingressClassName: traefik + rules: + - host: dashboard.example.com + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: web + port: + number: 8084 +``` + +This exposes the dashboard at `dashboard.example.com` and protects it with basic +auth using admin/admin. Take a look at the [Traefik][traefik-auth] documentation +for details on how to change the username and password. + +## Ambassador + +Ambassador works by defining a [mapping +](https://www.getambassador.io/docs/latest/topics/using/intro-mappings/) as an +annotation on a service. + +The below annotation exposes the dashboard at `dashboard.example.com`. + +```yaml + annotations: + getambassador.io/config: |- + --- + apiVersion: getambassador.io/v2 + kind: Mapping + name: web-mapping + host: dashboard.example.com + prefix: / + host_rewrite: web.linkerd-viz.svc.cluster.local:8084 + service: web.linkerd-viz.svc.cluster.local:8084 +``` + +## DNS Rebinding Protection + +To prevent [DNS-rebinding](https://en.wikipedia.org/wiki/DNS_rebinding) attacks, +the dashboard rejects any request whose `Host` header is not `localhost`, +`127.0.0.1` or the service name `web.linkerd-viz.svc`. 
+ +Note that this protection also covers the [Grafana +dashboard](../../reference/architecture/#grafana). + +The ingress-nginx config above uses the +`nginx.ingress.kubernetes.io/upstream-vhost` annotation to properly set the +upstream `Host` header. Traefik on the other hand doesn't offer that option, so +you'll have to manually set the required `Host` as explained below. + +### Tweaking Host Requirement + +If your HTTP client (Ingress or otherwise) doesn't allow to rewrite the `Host` +header, you can change the validation regexp that the dashboard server uses, +which is fed into the `web` deployment via the `enforced-host` container +argument. + +If you're managing Linkerd with Helm, then you can set the host using the +`enforcedHostRegexp` value. + +Another way of doing that is through Kustomize, as explained in [Customizing +Installation](../customize-install/), using an overlay like this one: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: web +spec: + template: + spec: + containers: + - name: web + args: + - -linkerd-controller-api-addr=linkerd-controller-api.linkerd.svc.cluster.local:8085 + - -linkerd-metrics-api-addr=metrics-api.linkerd-viz.svc.cluster.local:8085 + - -cluster-domain=cluster.local + - -grafana-addr=grafana.linkerd-viz.svc.cluster.local:3000 + - -controller-namespace=linkerd + - -viz-namespace=linkerd-viz + - -log-level=info + - -enforced-host=^dashboard\.example\.com$ +``` + +If you want to completely disable the `Host` header check, simply use a +catch-all regexp `.*` for `-enforced-host`. + +[nginx-auth]: +https://github.com/kubernetes/ingress-nginx/blob/master/docs/examples/auth/basic/README.md +[traefik-auth]: https://docs.traefik.io/middlewares/basicauth/ diff --git a/linkerd.io/content/2.11/tasks/extensions.md b/linkerd.io/content/2.11/tasks/extensions.md new file mode 100644 index 0000000000..3f83c06152 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/extensions.md @@ -0,0 +1,77 @@ ++++ +title = "Using extensions" +description = "Add functionality to Linkerd with optional extensions." ++++ + +Linkerd extensions are components which can be added to a Linkerd installation +to enable additional functionality. By default, the following extensions are +available: + +* [viz](../../features/dashboard/): Metrics and visibility features +* [jaeger](../distributed-tracing/): Distributed tracing +* [multicluster](../multicluster/): Cross-cluster routing + +But other extensions are also possible. Read on for more! + +## Installing extensions + +Before installing any extensions, make sure that you have already [installed +Linkerd](../install/) and validated your cluster with `linkerd check`. + +Then, you can install the extension with the extension's `install` command. For +example, to install the `viz` extension, you can use: + +```bash +linkerd viz install | kubectl apply -f - +``` + +For built-in extensions, such as `viz`, `jaeger`, and `multicluster`, that's +all you need to do. Of course, these extensions can also be installed by with +Helm by installing that extension's Helm chart. + +## Installing third-party extensions + +Third-party extensions are also possible, with one additional step: you must +download the extension's CLI and put it in your path. This will allow you to +invoke the extension CLI through the Linkerd CLI: any invocation of `linkerd +foo` will automatically invoke the `linkerd-foo` binary, if it is found on your +path. 
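+
+In other words, for a hypothetical extension named `foo`:
+
+```bash
+# `linkerd-foo` is a hypothetical extension CLI somewhere on your PATH
+which linkerd-foo
+
+# the Linkerd CLI forwards this invocation to the `linkerd-foo` binary
+linkerd foo check
+```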
+ +For example, [Buoyant Cloud](https://buoyant.io/cloud) is a free, hosted +metrics dashboard for Linkerd that can be installed alongside the `viz` +extension, but doesn't require it. To install this extension, run: + +```bash +## optional +curl -sL buoyant.cloud/install | sh +linkerd buoyant install | kubectl apply -f - # hosted metrics dashboard +``` + +Once the extension is installed, run `linkerd check` to ensure Linkerd and all +installed extensions are healthy or run `linkerd foo check` to perform health +checks for that extension only. + +## Listing extensions + +Every extension creates a Kubernetes namespace with the `linkerd.io/extension` +label. Thus, you can list all extensions installed on your cluster by running: + +```bash +kubectl get ns -l linkerd.io/extension +``` + +## Upgrading extensions + +Unless otherwise stated, extensions do not persist any configuration in the +cluster. To upgrade an extension, run the install again with a newer version +of the extension CLI or with a different set of configuration flags. + +## Uninstalling extensions + +All extensions have an `uninstall` command that should be used to gracefully +clean up all resources owned by an extension. For example, to uninstall the +foo extension, run: + +```bash +linkerd foo uninstall | kubectl delete -f - +``` diff --git a/linkerd.io/content/2.11/tasks/external-prometheus.md b/linkerd.io/content/2.11/tasks/external-prometheus.md new file mode 100644 index 0000000000..d64c5d4e87 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/external-prometheus.md @@ -0,0 +1,160 @@ ++++ +title = "Bringing your own Prometheus" +description = "Use an existing Prometheus instance with Linkerd." ++++ + +Even though [the linkerd-viz extension](../../features/dashboard/) comes with +its own Prometheus instance, there can be cases where using an external +instance makes more sense for various reasons. + +This tutorial shows how to configure an external Prometheus instance to scrape both +the control plane as well as the proxy's metrics in a format that is consumable +both by a user as well as Linkerd control plane components like web, etc. + +There are two important points to tackle here. + +- Configuring external Prometheus instance to get the Linkerd metrics. +- Configuring the linkerd-viz extension to use that Prometheus. + +## Prometheus Scrape Configuration + +The following scrape configuration has to be applied to the external +Prometheus instance. + +{{< note >}} +The below scrape configuration is a [subset of the full `linkerd-prometheus` +scrape +configuration](https://github.com/linkerd/linkerd2/blob/main/viz/charts/linkerd-viz/templates/prometheus.yaml#L47-L151). +{{< /note >}} + +Before applying, it is important to replace templated values (present in `{{}}`) +with direct values for the below configuration to work. 
+ +```yaml + - job_name: 'linkerd-controller' + kubernetes_sd_configs: + - role: pod + namespaces: + names: + - '{{.Values.linkerdNamespace}}' + - '{{.Values.namespace}}' + relabel_configs: + - source_labels: + - __meta_kubernetes_pod_container_port_name + action: keep + regex: admin-http + - source_labels: [__meta_kubernetes_pod_container_name] + action: replace + target_label: component + + - job_name: 'linkerd-service-mirror' + kubernetes_sd_configs: + - role: pod + relabel_configs: + - source_labels: + - __meta_kubernetes_pod_label_linkerd_io_control_plane_component + - __meta_kubernetes_pod_container_port_name + action: keep + regex: linkerd-service-mirror;admin-http$ + - source_labels: [__meta_kubernetes_pod_container_name] + action: replace + target_label: component + + - job_name: 'linkerd-proxy' + kubernetes_sd_configs: + - role: pod + relabel_configs: + - source_labels: + - __meta_kubernetes_pod_container_name + - __meta_kubernetes_pod_container_port_name + - __meta_kubernetes_pod_label_linkerd_io_control_plane_ns + action: keep + regex: ^{{default .Values.proxyContainerName "linkerd-proxy" .Values.proxyContainerName}};linkerd-admin;{{.Values.linkerdNamespace}}$ + - source_labels: [__meta_kubernetes_namespace] + action: replace + target_label: namespace + - source_labels: [__meta_kubernetes_pod_name] + action: replace + target_label: pod + # special case k8s' "job" label, to not interfere with prometheus' "job" + # label + # __meta_kubernetes_pod_label_linkerd_io_proxy_job=foo => + # k8s_job=foo + - source_labels: [__meta_kubernetes_pod_label_linkerd_io_proxy_job] + action: replace + target_label: k8s_job + # drop __meta_kubernetes_pod_label_linkerd_io_proxy_job + - action: labeldrop + regex: __meta_kubernetes_pod_label_linkerd_io_proxy_job + # __meta_kubernetes_pod_label_linkerd_io_proxy_deployment=foo => + # deployment=foo + - action: labelmap + regex: __meta_kubernetes_pod_label_linkerd_io_proxy_(.+) + # drop all labels that we just made copies of in the previous labelmap + - action: labeldrop + regex: __meta_kubernetes_pod_label_linkerd_io_proxy_(.+) + # __meta_kubernetes_pod_label_linkerd_io_foo=bar => + # foo=bar + - action: labelmap + regex: __meta_kubernetes_pod_label_linkerd_io_(.+) + # Copy all pod labels to tmp labels + - action: labelmap + regex: __meta_kubernetes_pod_label_(.+) + replacement: __tmp_pod_label_$1 + # Take `linkerd_io_` prefixed labels and copy them without the prefix + - action: labelmap + regex: __tmp_pod_label_linkerd_io_(.+) + replacement: __tmp_pod_label_$1 + # Drop the `linkerd_io_` originals + - action: labeldrop + regex: __tmp_pod_label_linkerd_io_(.+) + # Copy tmp labels into real labels + - action: labelmap + regex: __tmp_pod_label_(.+) +``` + +The running configuration of the builtin prometheus can be used as a reference. + +```bash +kubectl -n linkerd-viz get configmap prometheus-config -o yaml +``` + +## Linkerd-Viz Extension Configuration + +Linkerd's viz extension components like `metrics-api`, etc depend +on the Prometheus instance to power the dashboard and CLI. + +The `prometheusUrl` field gives you a single place through +which all these components can be configured to an external Prometheus URL. +This is allowed both through the CLI and Helm. + +### CLI + +This can be done by passing a file with the above field to the `values` flag, +which is available through `linkerd viz install` command. + +```yaml +prometheusUrl: existing-prometheus.xyz:9090 +``` + +Once applied, this configuration is not persistent across installs. 
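+
+For example, if the snippet above is saved in a file named `prometheus.yaml`
+(the filename is only illustrative), the install would look something like:
+
+```bash
+linkerd viz install --values prometheus.yaml | kubectl apply -f -
+```
+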
+The same has to be passed again by the user during re-installs, upgrades, etc. + +When using an external Prometheus and configuring the `prometheusUrl` +field, Linkerd's Prometheus will still be included in installation. +If you wish to disable it, be sure to include the +following configuration as well: + +```yaml +prometheus: + enabled: false +``` + +### Helm + +The same configuration can be applied through `values.yaml` when using Helm. +Once applied, Helm makes sure that the configuration is +persistent across upgrades. + +More information on installation through Helm can be found +[here](../install-helm/) diff --git a/linkerd.io/content/2.11/tasks/fault-injection.md b/linkerd.io/content/2.11/tasks/fault-injection.md new file mode 100644 index 0000000000..5bc7726084 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/fault-injection.md @@ -0,0 +1,197 @@ ++++ +title = "Injecting Faults" +description = "Practice chaos engineering by injecting faults into services with Linkerd." ++++ + +It is easy to inject failures into applications by using the [Traffic Split +API](https://github.com/deislabs/smi-spec/blob/master/traffic-split.md) of the +[Service Mesh Interface](https://smi-spec.io/). TrafficSplit allows you to +redirect a percentage of traffic to a specific backend. This backend is +completely flexible and can return whatever responses you want - 500s, timeouts +or even crazy payloads. + +The [books demo](../books/) is a great way to show off this behavior. The +overall topology looks like: + +{{< fig src="/images/books/topology.png" title="Topology" >}} + +In this guide, you will split some of the requests from `webapp` to `books`. +Most requests will end up at the correct `books` destination, however some of +them will be redirected to a faulty backend. This backend will return 500s for +every request and inject faults into the `webapp` service. No code changes are +required and as this method is configuration driven, it is a process that can be +added to integration tests and CI pipelines. If you are really living the chaos +engineering lifestyle, fault injection could even be used in production. + +## Prerequisites + +To use this guide, you'll need to have Linkerd installed on your cluster, along +with its Viz extension. Follow the [Installing Linkerd Guide](../install/) +if you haven't already done this. + +## Setup the service + +First, add the [books](../books/) sample application to your cluster: + +```bash +kubectl create ns booksapp && \ + linkerd inject https://run.linkerd.io/booksapp.yml | \ + kubectl -n booksapp apply -f - +``` + +As this manifest is used as a demo elsewhere, it has been configured with an +error rate. To show how fault injection works, the error rate needs to be +removed so that there is a reliable baseline. To increase success rate for +booksapp to 100%, run: + +```bash +kubectl -n booksapp patch deploy authors \ + --type='json' \ + -p='[{"op":"remove", "path":"/spec/template/spec/containers/0/env/2"}]' +``` + +After a little while, the stats will show 100% success rate. 
You can verify this +by running: + +```bash +linkerd viz -n booksapp stat deploy +``` + +The output will end up looking at little like: + +```bash +NAME MESHED SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 TCP_CONN +authors 1/1 100.00% 7.1rps 4ms 26ms 33ms 6 +books 1/1 100.00% 8.6rps 6ms 73ms 95ms 6 +traffic 1/1 - - - - - - +webapp 3/3 100.00% 7.9rps 20ms 76ms 95ms 9 +``` + +## Create the faulty backend + +Injecting faults into booksapp requires a service that is configured to return +errors. To do this, you can start NGINX and configure it to return 500s by +running: + +```bash +cat <}} +In this instance, you are looking at the *service* instead of the deployment. If +you were to run this command and look at `deploy/books`, the success rate would +still be 100%. The reason for this is that `error-injector` is a completely +separate deployment and traffic is being shifted at the service level. The +requests never reach the `books` pods and are instead rerouted to the error +injector's pods. +{{< /note >}} + +## Cleanup + +To remove everything in this guide from your cluster, run: + +```bash +kubectl delete ns booksapp +``` diff --git a/linkerd.io/content/2.11/tasks/generate-certificates.md b/linkerd.io/content/2.11/tasks/generate-certificates.md new file mode 100644 index 0000000000..a1d8aa01f0 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/generate-certificates.md @@ -0,0 +1,86 @@ ++++ +title = "Generating your own mTLS root certificates" +description = "Generate your own mTLS root certificate instead of letting Linkerd do it for you." ++++ + +In order to support [mTLS connections between meshed +pods](../../features/automatic-mtls/), Linkerd needs a trust anchor certificate and +an issuer certificate with its corresponding key. + +When installing with `linkerd install`, these certificates are automatically +generated. Alternatively, you can specify your own with the `--identity-*` flags +(see the [linkerd install reference](../../reference/cli/install/)). + +On the other hand when using Helm to install Linkerd, it's not possible to +automatically generate them and you're required to provide them. + +You can generate these certificates using a tool like openssl or +[step](https://smallstep.com/cli/). All certificates must use the ECDSA P-256 +algorithm which is the default for `step`. To generate ECDSA P-256 certificates +with openssl, you can use the `openssl ecparam -name prime256v1` command. In +this tutorial, we'll walk you through how to to use the `step` CLI to do this. + +## Generating the certificates with `step` + +### Trust anchor certificate + +First generate the root certificate with its private key (using `step` version +0.10.1): + +```bash +step certificate create root.linkerd.cluster.local ca.crt ca.key \ +--profile root-ca --no-password --insecure +``` + +This generates the `ca.crt` and `ca.key` files. The `ca.crt` file is what you +need to pass to the `--identity-trust-anchors-file` option when installing +Linkerd with the CLI, and the `identityTrustAnchorsPEM` value when installing +Linkerd with Helm. + +Note we use `--no-password --insecure` to avoid encrypting those files with a +passphrase. + +For a longer-lived trust anchor certificate, pass the `--not-after` argument +to the step command with the desired value (e.g. `--not-after=87600h`). + +### Issuer certificate and key + +Then generate the intermediate certificate and key pair that will be used to +sign the Linkerd proxies' CSR. 
+ +```bash +step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \ +--profile intermediate-ca --not-after 8760h --no-password --insecure \ +--ca ca.crt --ca-key ca.key +``` + +This will generate the `issuer.crt` and `issuer.key` files. + +## Passing the certificates to Linkerd + +You can finally provide these files when installing Linkerd with the CLI: + +```bash +linkerd install \ + --identity-trust-anchors-file ca.crt \ + --identity-issuer-certificate-file issuer.crt \ + --identity-issuer-key-file issuer.key \ + | kubectl apply -f - +``` + +Or when installing with Helm: + +```bash +helm install linkerd2 \ + --set-file identityTrustAnchorsPEM=ca.crt \ + --set-file identity.issuer.tls.crtPEM=issuer.crt \ + --set-file identity.issuer.tls.keyPEM=issuer.key \ + --set identity.issuer.crtExpiry=$(date -d '+8760 hour' +"%Y-%m-%dT%H:%M:%SZ") \ + linkerd/linkerd2 +``` + +{{< note >}} +For Helm versions < v3, `--name` flag has to specifically be passed. +In Helm v3, It has been deprecated, and is the first argument as + specified above. +{{< /note >}} diff --git a/linkerd.io/content/2.11/tasks/getting-per-route-metrics.md b/linkerd.io/content/2.11/tasks/getting-per-route-metrics.md new file mode 100644 index 0000000000..b5f92885b9 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/getting-per-route-metrics.md @@ -0,0 +1,93 @@ ++++ +title = "Getting Per-Route Metrics" +description = "Configure per-route metrics for your application." ++++ + +To get per-route metrics, you must first create a +[service profile](../../features/service-profiles/). Once a service +profile has been created, Linkerd will add labels to the Prometheus metrics that +associate a specific request to a specific route. + +For a tutorial that shows this functionality off, check out the +[books demo](../books/#service-profiles). + +You can view per-route metrics in the CLI by running `linkerd viz routes`: + +```bash +$ linkerd viz routes svc/webapp +ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +GET / webapp 100.00% 0.6rps 25ms 30ms 30ms +GET /authors/{id} webapp 100.00% 0.6rps 22ms 29ms 30ms +GET /books/{id} webapp 100.00% 1.2rps 18ms 29ms 30ms +POST /authors webapp 100.00% 0.6rps 32ms 46ms 49ms +POST /authors/{id}/delete webapp 100.00% 0.6rps 45ms 87ms 98ms +POST /authors/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms +POST /books webapp 50.76% 2.2rps 26ms 38ms 40ms +POST /books/{id}/delete webapp 100.00% 0.6rps 24ms 29ms 30ms +POST /books/{id}/edit webapp 60.71% 0.9rps 75ms 98ms 100ms +[DEFAULT] webapp 0.00% 0.0rps 0ms 0ms 0ms +``` + +The `[DEFAULT]` route is a catch-all, anything that does not match the regexes +specified in your service profile will end up there. 
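+
+For context, the route names above come from the `routes` section of the
+service profile; a minimal sketch for the books demo's `webapp` service (the
+route shown is illustrative) looks like:
+
+```yaml
+apiVersion: linkerd.io/v1alpha2
+kind: ServiceProfile
+metadata:
+  name: webapp.booksapp.svc.cluster.local
+  namespace: booksapp
+spec:
+  routes:
+  - name: GET /books/{id}
+    condition:
+      method: GET
+      pathRegex: /books/[^/]*
+```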
+ +It is also possible to look the metrics up by other resource types, such as: + +```bash +$ linkerd viz routes deploy/webapp +ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +[DEFAULT] kubernetes 0.00% 0.0rps 0ms 0ms 0ms +GET / webapp 100.00% 0.5rps 27ms 38ms 40ms +GET /authors/{id} webapp 100.00% 0.6rps 18ms 29ms 30ms +GET /books/{id} webapp 100.00% 1.1rps 17ms 28ms 30ms +POST /authors webapp 100.00% 0.5rps 25ms 30ms 30ms +POST /authors/{id}/delete webapp 100.00% 0.5rps 58ms 96ms 99ms +POST /authors/{id}/edit webapp 0.00% 0.0rps 0ms 0ms 0ms +POST /books webapp 45.58% 2.5rps 33ms 82ms 97ms +POST /books/{id}/delete webapp 100.00% 0.6rps 33ms 48ms 50ms +POST /books/{id}/edit webapp 55.36% 0.9rps 79ms 160ms 192ms +[DEFAULT] webapp 0.00% 0.0rps 0ms 0ms 0ms +``` + +Then, it is possible to filter all the way down to requests going from a +specific resource to other services: + +```bash +$ linkerd viz routes deploy/webapp --to svc/books +ROUTE SERVICE SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +DELETE /books/{id}.json books 100.00% 0.5rps 18ms 29ms 30ms +GET /books.json books 100.00% 1.1rps 7ms 12ms 18ms +GET /books/{id}.json books 100.00% 2.5rps 6ms 10ms 10ms +POST /books.json books 52.24% 2.2rps 23ms 34ms 39ms +PUT /books/{id}.json books 41.98% 1.4rps 73ms 97ms 99ms +[DEFAULT] books 0.00% 0.0rps 0ms 0ms 0ms +``` + +## Troubleshooting + +If you're not seeing any metrics, there are two likely culprits. In both cases, +`linkerd viz tap` can be used to understand the problem. For the resource that +the service points to, run: + +```bash +linkerd viz tap deploy/webapp -o wide | grep req +``` + +A sample output is: + +```bash +req id=3:1 proxy=in src=10.4.0.14:58562 dst=10.4.1.4:7000 tls=disabled :method=POST :authority=webapp:7000 :path=/books/24783/edit src_res=deploy/traffic src_ns=default dst_res=deploy/webapp dst_ns=default rt_route=POST /books/{id}/edit +``` + +This will select only the requests observed and show the `:authority` and +`rt_route` that was used for each request. + +- Linkerd discovers the right service profile to use via `:authority` or + `Host` headers. The name of your service profile must match these headers. + There are many reasons why these would not match, see + [ingress](../../features/ingress/) for one reason. Another would be clients that + use IPs directly such as Prometheus. +- Getting regexes to match can be tough and the ordering is important. Pay + attention to `rt_route`. If it is missing entirely, compare the `:path` to + the regex you'd like for it to match, and use a + [tester](https://regex101.com/) with the Golang flavor of regex. diff --git a/linkerd.io/content/2.11/tasks/gitops.md b/linkerd.io/content/2.11/tasks/gitops.md new file mode 100644 index 0000000000..2e5548168a --- /dev/null +++ b/linkerd.io/content/2.11/tasks/gitops.md @@ -0,0 +1,534 @@ ++++ +title = "Using GitOps with Linkerd with Argo CD" +description = "Use Argo CD to manage Linkerd installation and upgrade lifecycle." ++++ + +GitOps is an approach to automate the management and delivery of your Kubernetes +infrastructure and applications using Git as a single source of truth. It +usually utilizes some software agents to detect and reconcile any divergence +between version-controlled artifacts in Git with what's running in a cluster. + +This guide will show you how to set up +[Argo CD](https://argoproj.github.io/argo-cd/) to manage the installation and +upgrade of Linkerd using a GitOps workflow. 
+ +Specifically, this guide provides instructions on how to securely generate and +manage Linkerd's mTLS private keys and certificates using +[Sealed Secrets](https://github.com/bitnami-labs/sealed-secrets) and +[cert-manager](https://cert-manager.io). It will also show you how to integrate +the [auto proxy injection](../../features/proxy-injection/) feature into your +workflow. Finally, this guide conclude with steps to upgrade Linkerd to a newer +version following a GitOps workflow. + +{{< fig alt="Linkerd GitOps workflow" + title="Linkerd GitOps workflow" + src="/images/gitops/architecture.png" >}} + +The software and tools used in this guide are selected for demonstration +purposes only. Feel free to choose others that are most suited for your +requirements. + +You will need to clone this +[example repository](https://github.com/linkerd/linkerd-examples) to your local +machine and replicate it in your Kubernetes cluster following the steps defined +in the next section. + +## Set up the repositories + +Clone the example repository to your local machine: + +```sh +git clone https://github.com/linkerd/linkerd-examples.git +``` + +This repository will be used to demonstrate Git operations like `add`, `commit` +and `push` later in this guide. + +Add a new remote endpoint to the repository to point to the in-cluster Git +server, which will be set up in the next section: + +```sh +cd linkerd-examples + +git remote add git-server git://localhost/linkerd-examples.git +``` + +{{< note >}} +To simplify the steps in this guide, we will be interacting with the in-cluster +Git server via port-forwarding. Hence, the remote endpoint that we just created +targets your localhost. +{{< /note >}} + +Deploy the Git server to the `scm` namespace in your cluster: + +```sh +kubectl apply -f gitops/resources/git-server.yaml +``` + +Later in this guide, Argo CD will be configured to watch the repositories hosted +by this Git server. + +{{< note >}} +This Git server is configured to run as a +[daemon](https://git-scm.com/book/en/v2/Git-on-the-Server-Git-Daemon) over the +`git` protocol, with unauthenticated access to the Git data. This setup is not +recommended for production use. 
+{{< /note >}} + +Confirm that the Git server is healthy: + +```sh +kubectl -n scm rollout status deploy/git-server +``` + +Clone the example repository to your in-cluster Git server: + +```sh +git_server=`kubectl -n scm get po -l app=git-server -oname | awk -F/ '{ print $2 }'` + +kubectl -n scm exec "${git_server}" -- \ + git clone --bare https://github.com/linkerd/linkerd-examples.git +``` + +Confirm that the remote repository is successfully cloned: + +```sh +kubectl -n scm exec "${git_server}" -- ls -al /git/linkerd-examples.git +``` + +Confirm that you can push from the local repository to the remote repository +via port-forwarding: + +```sh +kubectl -n scm port-forward "${git_server}" 9418 & + +git push git-server master +``` + +## Deploy Argo CD + +Install Argo CD: + +```sh +kubectl create ns argocd + +kubectl -n argocd apply -f \ + https://raw.githubusercontent.com/argoproj/argo-cd/v1.6.1/manifests/install.yaml +``` + +Confirm that all the pods are ready: + +```sh +for deploy in "application-controller" "dex-server" "redis" "repo-server" "server"; \ + do kubectl -n argocd rollout status deploy/argocd-${deploy}; \ +done +``` + +Use port-forward to access the Argo CD dashboard: + +```sh +kubectl -n argocd port-forward svc/argocd-server 8080:443 \ + > /dev/null 2>&1 & +``` + +The Argo CD dashboard is now accessible at +[https://localhost:8080](https://localhost:8080/), using the default `admin` +username and +[password](https://argoproj.github.io/argo-cd/getting_started/#4-login-using-the-cli). + +{{< note >}} +The default admin password is the auto-generated name of the Argo CD API server +pod. You can use the `argocd account update-password` command to change it. +{{< /note >}} + +Authenticate the Argo CD CLI: + +```sh +argocd_server=`kubectl -n argocd get pods -l app.kubernetes.io/name=argocd-server -o name | cut -d'/' -f 2` + +argocd login 127.0.0.1:8080 \ + --username=admin \ + --password="${argocd_server}" \ + --insecure +``` + +## Configure project access and permissions + +Set up the `demo` +[project](https://argoproj.github.io/argo-cd/user-guide/projects/) to group our +[applications](https://argoproj.github.io/argo-cd/operator-manual/declarative-setup/#applications): + +```sh +kubectl apply -f gitops/project.yaml +``` + +This project defines the list of permitted resource kinds and target clusters +that our applications can work with. + +Confirm that the project is deployed correctly: + +```sh +argocd proj get demo +``` + +On the dashboard: + +{{< fig alt="New project in Argo CD dashboard" + title="New project in Argo CD dashboard" + src="/images/gitops/dashboard-project.png" >}} + +### Deploy the applications + +Deploy the `main` application which serves as the "parent" application that of +all the other applications: + +```sh +kubectl apply -f gitops/main.yaml +``` + +{{< note >}} +The "app of apps" pattern is commonly used in Argo CD workflows to bootstrap +applications. See the Argo CD documentation for more +[information](https://argoproj.github.io/argo-cd/operator-manual/cluster-bootstrapping/#app-of-apps-pattern). +{{< /note >}} + +Confirm that the `main` application is deployed successfully: + +```sh +argocd app get main +``` + +Sync the `main` application: + +```sh +argocd app sync main +``` + +{{< fig alt="Synchronize the main application" + title="Synchronize the main application" + src="/images/gitops/dashboard-applications-main-sync.png" >}} + +Notice that only the `main` application is synchronized. 
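+
+To see the child applications it registered (they will typically show as
+`OutOfSync` until each one is synchronized), you can run:
+
+```sh
+argocd app list
+```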
+ +Next, we will synchronize the remaining applications individually. + +### Deploy cert-manager + +Synchronize the `cert-manager` application: + +```sh +argocd app sync cert-manager +``` + +{{< note >}} +This guide uses cert-manager 0.15.0 due to an issue with cert-manager 0.16.0 +and kubectl <1.19 and Helm 3.2, which Argo CD uses. See the upgrade notes +[here](https://cert-manager.io/docs/installation/upgrading/upgrading-0.15-0.16/#helm). +{{< /note >}} + +Confirm that cert-manager is running: + +```sh +for deploy in "cert-manager" "cert-manager-cainjector" "cert-manager-webhook"; \ + do kubectl -n cert-manager rollout status deploy/${deploy}; \ +done +``` + +{{< fig alt="Synchronize the cert-manager application" + title="Synchronize the cert-manager application" + center="true" + src="/images/gitops/dashboard-cert-manager-sync.png" >}} + +### Deploy Sealed Secrets + +Synchronize the `sealed-secrets` application: + +```sh +argocd app sync sealed-secrets +``` + +Confirm that sealed-secrets is running: + +```sh +kubectl -n kube-system rollout status deploy/sealed-secrets +``` + +{{< fig alt="Synchronize the sealed-secrets application" + title="Synchronize the sealed-secrets application" + center="true" + src="/images/gitops/dashboard-sealed-secrets-sync.png" >}} + +### Create mTLS trust anchor + +Before proceeding with deploying Linkerd, we will need to create the mTLS trust +anchor. Then we will also set up the `linkerd-bootstrap` application to manage +the trust anchor certificate. + +Create a new mTLS trust anchor private key and certificate: + +```sh +step certificate create root.linkerd.cluster.local sample-trust.crt sample-trust.key \ + --profile root-ca \ + --no-password \ + --not-after 43800h \ + --insecure +``` + +Confirm the details (encryption algorithm, expiry date, SAN etc.) of the new +trust anchor: + +```sh +step certificate inspect sample-trust.crt +``` + +Create a `SealedSecret` resource to store the encrypted trust anchor: + +```sh +kubectl -n linkerd create secret tls linkerd-trust-anchor \ + --cert sample-trust.crt \ + --key sample-trust.key \ + --dry-run=client -oyaml | \ +kubeseal --controller-name=sealed-secrets -oyaml - | \ +kubectl patch -f - \ + -p '{"spec": {"template": {"type":"kubernetes.io/tls", "metadata": {"labels": {"linkerd.io/control-plane-component":"identity", "linkerd.io/control-plane-ns":"linkerd"}, "annotations": {"linkerd.io/created-by":"linkerd/cli stable-2.8.1", "linkerd.io/identity-issuer-expiry":"2021-07-19T20:51:01Z"}}}}}' \ + --dry-run=client \ + --type=merge \ + --local -oyaml > gitops/resources/linkerd/trust-anchor.yaml +``` + +This will overwrite the existing `SealedSecret` resource in your local +`gitops/resources/linkerd/trust-anchor.yaml` file. We will push this change to +the in-cluster Git server. 
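+
+For reference, the generated resource has roughly the following shape; the
+ciphertext is abbreviated here and will differ in your file:
+
+```yaml
+apiVersion: bitnami.com/v1alpha1
+kind: SealedSecret
+metadata:
+  name: linkerd-trust-anchor
+  namespace: linkerd
+spec:
+  encryptedData:
+    tls.crt: AgBy3i...   # encrypted certificate
+    tls.key: AgBQDa...   # encrypted private key
+  template:
+    type: kubernetes.io/tls
+    metadata:
+      labels:
+        linkerd.io/control-plane-component: identity
+        linkerd.io/control-plane-ns: linkerd
+```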
+ +Confirm that only the `spec.encryptedData` is changed: + +```sh +git diff gitops/resources/linkerd/trust-anchor.yaml +``` + +Commit and push the new trust anchor secret to your in-cluster Git server: + +```sh +git add gitops/resources/linkerd/trust-anchor.yaml + +git commit -m "update encrypted trust anchor" + +git push git-server master +``` + +Confirm the commit is successfully pushed: + +```sh +kubectl -n scm exec "${git_server}" -- git --git-dir linkerd-examples.git log -1 +``` + +## Deploy linkerd-bootstrap + +Synchronize the `linkerd-bootstrap` application: + +```sh +argocd app sync linkerd-bootstrap +``` + +{{< note >}} +If the issuer and certificate resources appear in a degraded state, it's likely +that the SealedSecrets controller failed to decrypt the sealed +`linkerd-trust-anchor` secret. Check the SealedSecrets controller for error logs. + +For debugging purposes, the sealed resource can be retrieved using the +`kubectl -n linkerd get sealedsecrets linkerd-trust-anchor -oyaml` command. +Ensure that this resource matches the +`gitops/resources/linkerd/trust-anchor.yaml` file you pushed to the in-cluster +Git server earlier. +{{< /note >}} + +{{< fig alt="Synchronize the linkerd-bootstrap application" + title="Synchronize the linkerd-bootstrap application" + src="/images/gitops/dashboard-linkerd-bootstrap-sync.png" >}} + +SealedSecrets should have created a secret containing the decrypted trust +anchor. Retrieve the decrypted trust anchor from the secret: + +```sh +trust_anchor=`kubectl -n linkerd get secret linkerd-trust-anchor -ojsonpath="{.data['tls\.crt']}" | base64 -d -w 0 -` +``` + +Confirm that it matches the decrypted trust anchor certificate you created +earlier in your local `sample-trust.crt` file: + +```sh +diff -b \ + <(echo "${trust_anchor}" | step certificate inspect -) \ + <(step certificate inspect sample-trust.crt) +``` + +### Deploy Linkerd + +Now we are ready to install Linkerd. The decrypted trust anchor we just +retrieved will be passed to the installation process using the +`identityTrustAnchorsPEM` parameter. + +Prior to installing Linkerd, note that the `global.identityTrustAnchorsPEM` +parameter is set to an "empty" certificate string: + +```sh +argocd app get linkerd -ojson | \ + jq -r '.spec.source.helm.parameters[] | select(.name == "identityTrustAnchorsPEM") | .value' +``` + +{{< fig alt="Empty default trust anchor" + title="Empty default trust anchor" + src="/images/gitops/dashboard-trust-anchor-empty.png" >}} + +We will override this parameter in the `linkerd` application with the value of +`${trust_anchor}`. + +Locate the `identityTrustAnchorsPEM` variable in your local +`gitops/argo-apps/linkerd.yaml` file, and set its `value` to that of +`${trust_anchor}`. + +Ensure that the multi-line string is indented correctly. 
E.g., + +```yaml + source: + chart: linkerd2 + repoURL: https://helm.linkerd.io/stable + targetRevision: 2.8.0 + helm: + parameters: + - name: identityTrustAnchorsPEM + value: | + -----BEGIN CERTIFICATE----- + MIIBlTCCATygAwIBAgIRAKQr9ASqULvXDeyWpY1LJUQwCgYIKoZIzj0EAwIwKTEn + MCUGA1UEAxMeaWRlbnRpdHkubGlua2VyZC5jbHVzdGVyLmxvY2FsMB4XDTIwMDkx + ODIwMTAxMFoXDTI1MDkxNzIwMTAxMFowKTEnMCUGA1UEAxMeaWRlbnRpdHkubGlu + a2VyZC5jbHVzdGVyLmxvY2FsMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE+PUp + IR74PsU+geheoyseycyquYyes5eeksIb5FDm8ptOXQ2xPcBpvesZkj6uIyS3k4qV + E0S9VtMmHNeycL7446NFMEMwDgYDVR0PAQH/BAQDAgEGMBIGA1UdEwEB/wQIMAYB + Af8CAQEwHQYDVR0OBBYEFHypCh7hiSLNxsKhMylQgqD9t7NNMAoGCCqGSM49BAMC + A0cAMEQCIEWhI86bXWEd4wKTnG07hBfBuVCT0bxopaYnn3wRFx7UAiAwXyh5uaVg + MwCC5xL+PM+bm3PRqtrmI6TocWH07GbMxg== + -----END CERTIFICATE----- +``` + +Confirm that only one `spec.source.helm.parameters.value` field is changed: + +```sh +git diff gitops/argo-apps/linkerd.yaml +``` + +Commit and push the changes to the Git server: + +```sh +git add gitops/argo-apps/linkerd.yaml + +git commit -m "set identityTrustAnchorsPEM parameter" + +git push git-server master +``` + +Synchronize the `main` application: + +```sh +argocd app sync main +``` + +Confirm that the new trust anchor is picked up by the `linkerd` application: + +```sh +argocd app get linkerd -ojson | \ + jq -r '.spec.source.helm.parameters[] | select(.name == "identityTrustAnchorsPEM") | .value' +``` + +{{< fig alt="Override mTLS trust anchor" + title="Override mTLS trust anchor" + src="/images/gitops/dashboard-trust-anchor-override.png" >}} + +Synchronize the `linkerd` application: + +```sh +argocd app sync linkerd +``` + +Check that Linkerd is ready: + +```sh +linkerd check +``` + +{{< fig alt="Synchronize Linkerd" + title="Synchronize Linkerd" + src="/images/gitops/dashboard-linkerd-sync.png" >}} + +### Test with emojivoto + +Deploy emojivoto to test auto proxy injection: + +```sh +argocd app sync emojivoto +``` + +Check that the applications are healthy: + +```sh +for deploy in "emoji" "vote-bot" "voting" "web" ; \ + do kubectl -n emojivoto rollout status deploy/${deploy}; \ +done +``` + +{{< fig alt="Synchronize emojivoto" + title="Synchronize emojivoto" + src="/images/gitops/dashboard-emojivoto-sync.png" >}} + +### Upgrade Linkerd to 2.8.1 + +Use your editor to change the `spec.source.targetRevision` field to `2.8.1` in +the `gitops/argo-apps/linkerd.yaml` file: + +Confirm that only the `targetRevision` field is changed: + +```sh +git diff gitops/argo-apps/linkerd.yaml +``` + +Commit and push this change to the Git server: + +```sh +git add gitops/argo-apps/linkerd.yaml + +git commit -m "upgrade Linkerd to 2.8.1" + +git push git-server master +``` + +Synchronize the `main` application: + +```sh +argocd app sync main +``` + +Synchronize the `linkerd` application: + +```sh +argocd app sync linkerd +``` + +Confirm that the upgrade completed successfully: + +```sh +linkerd check +``` + +Confirm the new version of the control plane: + +```sh +linkerd version +``` + +### Clean up + +All the applications can be removed by removing the `main` application: + +```sh +argocd app delete main --cascade=true +``` diff --git a/linkerd.io/content/2.11/tasks/graceful-shutdown.md b/linkerd.io/content/2.11/tasks/graceful-shutdown.md new file mode 100644 index 0000000000..5ecb53dda7 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/graceful-shutdown.md @@ -0,0 +1,61 @@ ++++ +title = "Graceful Pod Shutdown" +description = "Gracefully handle pod shutdown signal." 
++++ + +When Kubernetes begins to terminate a pod, it starts by sending all containers +in that pod a TERM signal. When the Linkerd proxy sidecar receives this signal, +it will immediately begin a graceful shutdown where it refuses all new requests +and allows existing requests to complete before shutting down. + +This means that if the pod's main container attempts to make any new network +calls after the proxy has received the TERM signal, those network calls will +fail. This also has implications for clients of the terminating pod and for +job resources. + +## Slow Updating Clients + +Before Kubernetes terminates a pod, it first removes that pod from the endpoints +resource of any services that pod is a member of. This means that clients of +that service should stop sending traffic to the pod before it is terminated. +However, certain clients can be slow to receive the endpoints update and may +attempt to send requests to the terminating pod after that pod's proxy has +already received the TERM signal and begun graceful shutdown. Those requests +will fail. + +To mitigate this, use the `--wait-before-exit-seconds` flag with +`linkerd inject` to delay the Linkerd proxy's handling of the TERM signal for +a given number of seconds using a `preStop` hook. This delay gives slow clients +additional time to receive the endpoints update before beginning graceful +shutdown. To achieve max benefit from the option, the main container should have +its own `preStop` hook with the sleep command inside which has a smaller period +than is set for the proxy sidecar. And none of them must be bigger than +`terminationGracePeriodSeconds` configured for the entire pod. + +For example, + +```yaml + # application container + lifecycle: + preStop: + exec: + command: + - /bin/bash + - -c + - sleep 20 + + # for entire pod + terminationGracePeriodSeconds: 160 +``` + +## Job Resources + +Pods which are part of a job resource run until all of the containers in the +pod complete. However, the Linkerd proxy container runs continuously until it +receives a TERM signal. This means that job pods which have been injected will +continue to run, even once the main container has completed. + +Better support for +[sidecar containers in Kubernetes](https://github.com/kubernetes/kubernetes/issues/25908) +has been proposed and Linkerd will take advantage of this support when it +becomes available. diff --git a/linkerd.io/content/2.11/tasks/install-helm.md b/linkerd.io/content/2.11/tasks/install-helm.md new file mode 100644 index 0000000000..50788d1a43 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/install-helm.md @@ -0,0 +1,159 @@ ++++ +title = "Installing Linkerd with Helm" +description = "Install Linkerd onto your own Kubernetes cluster using Helm." ++++ + +Linkerd can optionally be installed via Helm rather than with the `linkerd +install` command. + +## Prerequisite: identity certificates + +The identity component of Linkerd requires setting up a trust anchor +certificate, and an issuer certificate with its key. These must use the ECDSA +P-256 algorithm and need to be provided to Helm by the user (unlike when using +the `linkerd install` CLI which can generate these automatically). You can +provide your own, or follow [these instructions](../generate-certificates/) +to generate new ones. 
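+
+For example, one way to produce a compatible trust anchor and issuer pair is
+with the [step](https://smallstep.com/cli/) CLI. This is only a sketch: the
+file names are chosen to match the flags used in the install command below,
+and the linked guide covers the details:
+
+```bash
+# Trust anchor (root CA); step uses ECDSA P-256 by default.
+step certificate create root.linkerd.cluster.local ca.crt ca.key \
+  --profile root-ca --no-password --insecure
+
+# Issuer certificate and key, signed by the trust anchor above.
+step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
+  --profile intermediate-ca --not-after 8760h --no-password --insecure \
+  --ca ca.crt --ca-key ca.key
+```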
+ +## Adding Linkerd's Helm repository + +```bash +# To add the repo for Linkerd2 stable releases: +helm repo add linkerd https://helm.linkerd.io/stable + +# To add the repo for Linkerd2 edge releases: +helm repo add linkerd-edge https://helm.linkerd.io/edge +``` + +The following instructions use the `linkerd` repo. For installing an edge +release, just replace with `linkerd-edge`. + +## Helm install procedure + +```bash +# set expiry date one year from now, in Mac: +exp=$(date -v+8760H +"%Y-%m-%dT%H:%M:%SZ") +# in Linux: +exp=$(date -d '+8760 hour' +"%Y-%m-%dT%H:%M:%SZ") + +helm install linkerd2 \ + --set-file identityTrustAnchorsPEM=ca.crt \ + --set-file identity.issuer.tls.crtPEM=issuer.crt \ + --set-file identity.issuer.tls.keyPEM=issuer.key \ + --set identity.issuer.crtExpiry=$exp \ + linkerd/linkerd2 +``` + +{{< note >}} +For Helm versions < v3, `--name` flag has to specifically be passed. +In Helm v3, It has been deprecated, and is the first argument as + specified above. +{{< /note >}} + +The chart values will be picked from the chart's `values.yaml` file. + +You can override the values in that file by providing your own `values.yaml` +file passed with a `-f` option, or overriding specific values using the family of +`--set` flags like we did above for certificates. + +## Disabling The Proxy Init Container + +If installing with CNI, make sure that you add the `--set +cniEnabled=true` flag to your `helm install` command. + +## Setting High-Availability + +The chart contains a file `values-ha.yaml` that overrides some +default values as to set things up under a high-availability scenario, analogous +to the `--ha` option in `linkerd install`. Values such as higher number of +replicas, higher memory/cpu limits and affinities are specified in that file. + +You can get ahold of `values-ha.yaml` by fetching the chart files: + +```bash +helm fetch --untar linkerd/linkerd2 +``` + +Then use the `-f` flag to provide the override file, for example: + +```bash +## see above on how to set $exp +helm install linkerd2 \ + --set-file identityTrustAnchorsPEM=ca.crt \ + --set-file identity.issuer.tls.crtPEM=issuer.crt \ + --set-file identity.issuer.tls.keyPEM=issuer.key \ + --set identity.issuer.crtExpiry=$exp \ + -f linkerd2/values-ha.yaml \ + linkerd/linkerd2 +``` + +{{< note >}} +For Helm versions < v3, `--name` flag has to specifically be passed. +In Helm v3, It has been deprecated, and is the first argument as + specified above. +{{< /note >}} + +## Customizing the Namespace + +To install Linkerd to a different namespace than the default `linkerd`, +override the `Namespace` variable. + +By default, the chart creates the control plane namespace with the +`config.linkerd.io/admission-webhooks: disabled` label. It is required for the +control plane to work correctly. This means that the chart won't work with +Helm v2's `--namespace` option. If you're relying on a separate tool to create +the control plane namespace, make sure that: + +1. The namespace is labeled with `config.linkerd.io/admission-webhooks: disabled` +1. The `installNamespace` is set to `false` +1. The `namespace` variable is overridden with the name of your namespace + +{{< note >}} +In Helm v3 the `--namespace` option must be used with an existing namespace. 
+{{< /note >}}
+
+## Helm upgrade procedure
+
+Make sure your local Helm repos are updated:
+
+```bash
+helm repo update
+
+helm search linkerd2 -v {{% latestversion %}}
+NAME                CHART VERSION          APP VERSION    DESCRIPTION
+linkerd/linkerd2    {{% latestversion %}}                 Linkerd gives you observability, reliability, and securit...
+```
+
+The `helm upgrade` command has a number of flags that allow you to customize
+its behaviour. Pay special attention to `--reuse-values` and `--reset-values`
+and how they behave when charts change from version to version and/or when
+overrides are applied through `--set` and `--set-file`. In summary:
+
+- `--reuse-values` with no overrides - all values are reused
+- `--reuse-values` with overrides - all except the values that are overridden
+are reused
+- `--reset-values` with no overrides - no values are reused and all changes
+from the provided release are applied during the upgrade
+- `--reset-values` with overrides - no values are reused and changes from the
+provided release are applied together with the overrides
+- no flag and no overrides - `--reuse-values` will be used by default
+- no flag and overrides - `--reset-values` will be used by default
+
+Bearing all that in mind, you have to decide whether you want to reuse the
+values in the chart or move to the values specified in the newer chart.
+The advised practice is to use a `values.yaml` file that stores all custom
+overrides that you have for your chart. Before upgrading, check whether there
+are breaking changes to the chart (e.g. renamed or moved keys). You can
+consult the [edge](https://hub.helm.sh/charts/linkerd2-edge/linkerd2) or the
+[stable](https://hub.helm.sh/charts/linkerd2/linkerd2) chart docs, depending on
+which one you are upgrading to. If there are, make the corresponding changes to
+your `values.yaml` file. Then you can use:
+
+```bash
+helm upgrade linkerd2 linkerd/linkerd2 --reset-values -f values.yaml --atomic
+```
+
+The `--atomic` flag will ensure that all changes are rolled back in case the
+upgrade operation fails.
diff --git a/linkerd.io/content/2.11/tasks/install.md b/linkerd.io/content/2.11/tasks/install.md
new file mode 100644
index 0000000000..8605f9c15c
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/install.md
@@ -0,0 +1,155 @@
++++
+title = "Installing Linkerd"
+description = "Install Linkerd to your own Kubernetes cluster."
+aliases = [
+  "../upgrading/",
+  "../installing/",
+  "../rbac/"
+]
++++
+
+Before you can use Linkerd, you'll need to install the
+[core control plane](../../reference/architecture/#control-plane). This page
+covers how to accomplish that, as well as common problems that you may
+encounter.
+
+Note that the control plane is typically installed by using Linkerd's CLI. See
+[Getting Started](../../getting-started/) for how to install the CLI onto your local
+environment.
+
+Linkerd also comes with first-party extensions that add additional features,
+namely `viz`, `multicluster` and `jaeger`. See [Extensions](../extensions/)
+to understand how to install them.
+
+Note also that, once the control plane is installed, you'll need to "mesh" any
+services you want Linkerd active for. See
+[Adding Your Service](../../adding-your-service/) for how to add Linkerd's data
+plane to your services.
+
+## Requirements
+
+Linkerd 2.x requires a functioning Kubernetes cluster on which to run.
This +cluster may be hosted on a cloud provider or may be running locally via +Minikube or Docker for Desktop. + +You can validate that this Kubernetes cluster is configured appropriately for +Linkerd by running + +```bash +linkerd check --pre +``` + +### GKE + +If installing Linkerd on GKE, there are some extra steps required depending on +how your cluster has been configured. If you are using any of these features, +check out the additional instructions. + +- [Private clusters](../../reference/cluster-configuration/#private-clusters) + +## Installing + +Once you have a cluster ready, generally speaking, installing Linkerd is as +easy as running `linkerd install` to generate a Kubernetes manifest, and +applying that to your cluster, for example, via + +```bash +linkerd install | kubectl apply -f - +``` + +See [Getting Started](../../getting-started/) for an example. + +{{< note >}} +Most common configuration options are provided as flags for `install`. See the +[reference documentation](../../reference/cli/install/) for a complete list of +options. To do configuration that is not part of the `install` command, see how +you can create a [customized install](../customize-install/). +{{< /note >}} + +{{< note >}} +For organizations that distinguish cluster privileges by role, jump to the +[Multi-stage install](#multi-stage-install) section. +{{< /note >}} + +## Verification + +After installation, you can validate that the installation was successful by +running: + +```bash +linkerd check +``` + +## Uninstalling + +See [Uninstalling Linkerd](../uninstall/). + +## Multi-stage install + +If your organization assigns Kubernetes cluster privileges based on role +(typically cluster owner and service owner), Linkerd provides a "multi-stage" +installation to accommodate these two roles. The two installation stages are +`config` (for the cluster owner) and `control-plane` (for the service owner). +The cluster owner has privileges necessary to create namespaces, as well as +global resources including cluster roles, bindings, and custom resource +definitions. The service owner has privileges within a namespace necessary to +create deployments, configmaps, services, and secrets. + +### Stage 1: config + +The `config` stage is intended to be run by the cluster owner, the role with +more privileges. 
It is also the cluster owner's responsibility to run the +initial pre-install check: + +```bash +linkerd check --pre +``` + +Once the pre-install check passes, install the config stage with: + +```bash +linkerd install config | kubectl apply -f - +``` + +In addition to creating the `linkerd` namespace, this command installs the +following resources onto your Kubernetes cluster: + +- ClusterRole +- ClusterRoleBinding +- CustomResourceDefinition +- MutatingWebhookConfiguration +- PodSecurityPolicy +- Role +- RoleBinding +- Secret +- ServiceAccount +- ValidatingWebhookConfiguration + +To validate the `config` stage succeeded, run: + +```bash +linkerd check config +``` + +### Stage 2: control-plane + +Following successful installation of the `config` stage, the service owner may +install the `control-plane` with: + +```bash +linkerd install control-plane | kubectl apply -f - +``` + +This command installs the following resources onto your Kubernetes cluster, all +within the `linkerd` namespace: + +- ConfigMap +- Deployment +- Secret +- Service + +To validate the `control-plane` stage succeeded, run: + +```bash +linkerd check +``` diff --git a/linkerd.io/content/2.11/tasks/installing-multicluster.md b/linkerd.io/content/2.11/tasks/installing-multicluster.md new file mode 100644 index 0000000000..e317107d94 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/installing-multicluster.md @@ -0,0 +1,363 @@ ++++ +title = "Installing Multi-cluster Components" +description = "Allow Linkerd to manage cross-cluster communication." ++++ + +Multicluster support in Linkerd requires extra installation and configuration on +top of the default [control plane installation](../install/). This guide +walks through this installation and configuration as well as common problems +that you may encounter. For a detailed walkthrough and explanation of what's +going on, check out [getting started](../multicluster/). + +If you'd like to use an existing [Ambassador](https://www.getambassador.io/) +installation, check out the +[leverage](../installing-multicluster/#leverage-ambassador) instructions. +Alternatively, check out the Ambassador +[documentation](https://www.getambassador.io/docs/latest/howtos/linkerd2/#multicluster-operation) +for a more detailed explanation of the configuration and what's going on. + +## Requirements + +- Two clusters. +- A [control plane installation](../install/) in each cluster that shares + a common + [trust anchor](../generate-certificates/#trust-anchor-certificate). + If you have an existing installation, see the + [trust anchor bundle](../installing-multicluster/#trust-anchor-bundle) + documentation to understand what is required. +- Each of these clusters should be configured as `kubectl` + [contexts](https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/). +- Elevated privileges on both clusters. We'll be creating service accounts and + granting extended privileges, so you'll need to be able to do that on your + test clusters. +- Support for services of type `LoadBalancer` in the `east` cluster. Check out + the documentation for your cluster provider or take a look at + [inlets](https://blog.alexellis.io/ingress-for-your-local-kubernetes-cluster/). + This is what the `west` cluster will use to communicate with `east` via the + gateway. 
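+
+Since a mismatched trust anchor is one of the most common multicluster setup
+problems, it can be worth confirming it before going further. Here is a hedged
+sketch that compares the trust anchors of the two clusters, assuming your
+contexts are named `west` and `east` as in this guide and reusing the
+`linkerd-config`/`yq` extraction used elsewhere in these docs:
+
+```bash
+# The two outputs should be identical; any difference means the clusters do
+# not share a trust anchor and cross-cluster mTLS will fail.
+diff \
+  <(kubectl --context=west -n linkerd get cm linkerd-config -o=jsonpath='{.data.values}' | yq e .identityTrustAnchorsPEM -) \
+  <(kubectl --context=east -n linkerd get cm linkerd-config -o=jsonpath='{.data.values}' | yq e .identityTrustAnchorsPEM -)
+```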
+ +## Step 1: Install the multicluster control plane + +On each cluster, run: + +```bash +linkerd multicluster install | \ + kubectl apply -f - +``` + +To verify that everything has started up successfully, run: + +```bash +linkerd multicluster check +``` + +For a deep dive into what components are being added to your cluster and how all +the pieces fit together, check out the +[getting started documentation](../multicluster/#preparing-your-cluster). + +## Step 2: Link the clusters + +Each cluster must be linked. This consists of installing several resources in +the source cluster including a secret containing a kubeconfig that allows access +to the target cluster Kubernetes API, a service mirror control for mirroring +services, and a Link custom resource for holding configuration. To link cluster +`west` to cluster `east`, you would run: + +```bash +linkerd --context=east multicluster link --cluster-name east | + kubectl --context=west apply -f - +``` + +To verify that the credentials were created successfully and the clusters are +able to reach each other, run: + +```bash +linkerd --context=west multicluster check +``` + +You should also see the list of gateways show up by running. Note that you'll +need Linkerd's Viz extension to be installed in the source cluster to get the +list of gateways: + +```bash +linkerd --context=west multicluster gateways +``` + +For a detailed explanation of what this step does, check out the +[linking the clusters section](../multicluster/#linking-the-clusters). + +## Step 3: Export services + +Services are not automatically mirrored in linked clusters. By default, only +services with the `mirror.linkerd.io/exported` label will be mirrored. For each +service you would like mirrored to linked clusters, run: + +```bash +kubectl label svc foobar mirror.linkerd.io/exported=true +``` + +{{< note >}} You can configure a different label selector by using the +`--selector` flag on the `linkerd multicluster link` command or by editing +the Link resource created by the `linkerd multicluster link` command. +{{< /note >}} + +## Leverage Ambassador + +The bundled Linkerd gateway is not required. In fact, if you have an existing +Ambassador installation, it is easy to use it instead! By using your existing +Ambassador installation, you avoid needing to manage multiple ingress gateways +and pay for extra cloud load balancers. This guide assumes that Ambassador has +been installed into the `ambassador` namespace. + +First, you'll want to inject the `ambassador` deployment with Linkerd: + +```bash +kubectl -n ambassador get deploy ambassador -o yaml | \ + linkerd inject \ + --skip-inbound-ports 80,443 \ + --require-identity-on-inbound-ports 4183 - | \ + kubectl apply -f - +``` + +This will add the Linkerd proxy, skip the ports that Ambassador is handling for +public traffic and require identity on the gateway port. Check out the +[docs](../multicluster/#security) to understand why it is important to +require identity on the gateway port. + +Next, you'll want to add some configuration so that Ambassador knows how to +handle requests: + +```bash +cat < trustAnchor.crt +``` + +{{< note >}} This command requires [yq](https://github.com/mikefarah/yq). If you +don't have yq, feel free to extract the certificate from the `identityTrustAnchorsPEM` +field with your tool of choice. 
+{{< /note >}} + +Now, you'll want to create a new trust anchor and issuer for the new cluster: + +```bash +step certificate create root.linkerd.cluster.local root.crt root.key \ + --profile root-ca --no-password --insecure +step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \ + --profile intermediate-ca --not-after 8760h --no-password --insecure \ + --ca root.crt --ca-key root.key +``` + +{{< note >}} We use the [step cli](https://smallstep.com/cli/) to generate +certificates. `openssl` works just as well! {{< /note >}} + +With the old cluster's trust anchor and the new cluster's trust anchor, you can +create a bundle by running: + +```bash +cat trustAnchor.crt root.crt > bundle.crt +``` + +You'll want to upgrade your existing cluster with the new bundle. Make sure +every pod you'd like to have talk to the new cluster is restarted so that it can +use this bundle. To upgrade the existing cluster with this new trust anchor +bundle, run: + +```bash +linkerd upgrade --identity-trust-anchors-file=./bundle.crt | \ + kubectl apply -f - +``` + +Finally, you'll be able to install Linkerd on the new cluster by using the trust +anchor bundle that you just created along with the issuer certificate and key. + +```bash +linkerd install \ + --identity-trust-anchors-file bundle.crt \ + --identity-issuer-certificate-file issuer.crt \ + --identity-issuer-key-file issuer.key | \ + kubectl apply -f - +``` + +Make sure to verify that the cluster's have started up successfully by running +`check` on each one. + +```bash +linkerd check +``` + +## Installing the multicluster control plane components through Helm + +Linkerd's multicluster components i.e Gateway and Service Mirror can +be installed via Helm rather than the `linkerd multicluster install` command. + +This not only allows advanced configuration, but also allows users to bundle the +multicluster installation as part of their existing Helm based installation +pipeline. + +### Adding Linkerd's Helm repository + +First, let's add the Linkerd's Helm repository by running + +```bash +# To add the repo for Linkerd2 stable releases: +helm repo add linkerd https://helm.linkerd.io/stable +``` + +### Helm multicluster install procedure + +```bash +helm install linkerd2-multicluster linkerd/linkerd2-multicluster +``` + +The chart values will be picked from the chart's `values.yaml` file. + +You can override the values in that file by providing your own `values.yaml` +file passed with a `-f` option, or overriding specific values using the family of +`--set` flags. + +Full set of configuration options can be found [here](https://github.com/linkerd/linkerd2/tree/main/charts/linkerd2-multicluster#configuration) + +The installation can be verified by running + +```bash +linkerd multicluster check +``` + +Installation of the gateway can be disabled with the `gateway` setting. By +default this value is true. + +### Installing additional access credentials + +When the multicluster components are installed onto a target cluster with +`linkerd multicluster install`, a service account is created which source clusters +will use to mirror services. Using a distinct service account for each source +cluster can be benefitial since it gives you the ability to revoke service mirroring +access from specific source clusters. Generating additional service accounts +and associated RBAC can be done using the `linkerd multicluster allow` command +through the CLI. 
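+
+For example, a sketch of the CLI route (the `--service-account-name` flag and
+the generate-then-apply pattern are assumptions here; check
+`linkerd multicluster allow --help` for the exact interface):
+
+```bash
+# Create a dedicated service-mirror service account named "source1" on the
+# target cluster, so access for that source cluster can be revoked later.
+linkerd --context=target multicluster allow \
+  --service-account-name source1 | \
+  kubectl --context=target apply -f -
+```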
+ +The same functionality can also be done through Helm setting the +`remoteMirrorServiceAccountName` value to a list. + +```bash + helm install linkerd2-mc-source linkerd/linkerd2-multicluster --set remoteMirrorServiceAccountName={source1\,source2\,source3} --kube-context target +``` + +Now that the multicluster components are installed, operations like linking, etc +can be performed by using the linkerd CLI's multicluster sub-command as per the +[multicluster task](../../features/multicluster/). diff --git a/linkerd.io/content/2.11/tasks/linkerd-smi.md b/linkerd.io/content/2.11/tasks/linkerd-smi.md new file mode 100644 index 0000000000..b8a977446c --- /dev/null +++ b/linkerd.io/content/2.11/tasks/linkerd-smi.md @@ -0,0 +1,218 @@ ++++ +title = "Getting started with Linkerd SMI extension" +description = "Use Linkerd SMI extension to work with Service Mesh Interface(SMI) resources." ++++ + +[Service Mesh Interface](https://smi-spec.io/) is a standard interface for +service meshes on Kubernetes. It defines a set of resources that could be +used across service meshes that implement it. +You can read more about it in the [specification](https://github.com/servicemeshinterface/smi-spec) + +Currently, Linkerd supports SMI's `TrafficSplit` specification which can be +used to perform traffic splitting across services natively. This means that +you can apply the SMI resources without any additional +components/configuration but this obviously has some downsides, as +Linkerd may not be able to add extra specific configurations specific to it, +as SMI is more like a lowest common denominator of service mesh functionality. + +To get around these problems, Linkerd can instead have an adaptor that converts +SMI specifications into native Linkerd configurations that it can understand +and perform the operation. This also removes the extra native coupling with SMI +resources with the control-plane, and the adaptor can move independently and +have it's own release cycle. [Linkerd SMI](https://www.github.com/linkerd/linkerd-smi) +is an extension that does just that. + +This guide will walk you through installing the SMI extension and configuring +a `TrafficSplit` specification, to perform Traffic Splitting across services. + +## Prerequisites + +- To use this guide, you'll need to have Linkerd installed on your cluster. + Follow the [Installing Linkerd Guide](../install/) if you haven't + already done this. + +## Install the Linkerd-SMI extension + +### CLI + +Install the SMI extension CLI binary by running: + +```bash +curl -sL https://linkerd.github.io/linkerd-smi/install | sh +``` + +Alternatively, you can download the CLI directly via the [releases page](https://github.com/linkerd/linkerd-smi/releases). + +The first step is installing the Linkerd-SMI extension onto your cluster. +This extension consists of a SMI-Adaptor which converts SMI resources into +native Linkerd resources. + +To install the Linkerd-SMI extension, run the command: + +```bash +linkerd smi install | kubectl apply -f - +``` + +You can verify that the Linkerd-SMI extension was installed correctly by +running: + +```bash +linkerd smi check +``` + +### Helm + +To install the `linkerd-smi` Helm chart, run: + +```bash +helm repo add l5d-smi https://linkerd.github.io/linkerd-smi +helm install l5d-smi/linkerd-smi --generate-name +``` + +## Install Sample Application + +First, let's install the sample application. 
+ +```bash +# create a namespace for the sample application +kubectl create namespace trafficsplit-sample + +# install the sample application +linkerd inject https://raw.githubusercontent.com/linkerd/linkerd2/main/test/integration/trafficsplit/testdata/application.yaml | kubectl -n trafficsplit-sample apply -f - +``` + +This installs a simple client, and two server deployments. +One of the server deployments i.e `faling-svc` always returns a 500 error, +and the other one i.e `backend-svc` always returns a 200. + +```bash +kubectl get deployments -n trafficsplit-sample +NAME READY UP-TO-DATE AVAILABLE AGE +backend 1/1 1 1 2m29s +failing 1/1 1 1 2m29s +slow-cooker 1/1 1 1 2m29s +``` + +By default, the client will hit the `backend-svc`service. This is evident by +the `edges` sub command. + +```bash +linkerd viz edges deploy -n trafficsplit-sample +SRC DST SRC_NS DST_NS SECURED +prometheus backend linkerd-viz trafficsplit-sample √ +prometheus failing linkerd-viz trafficsplit-sample √ +prometheus slow-cooker linkerd-viz trafficsplit-sample √ +slow-cooker backend trafficsplit-sample trafficsplit-sample √ +``` + +## Configuring a TrafficSplit + +Now, Let's apply a `TrafficSplit` resource to perform Traffic Splitting on the +`backend-svc` to distribute load between it and the `failing-svc`. + +```bash +cat < +Annotations: +API Version: linkerd.io/v1alpha2 +Kind: ServiceProfile +Metadata: + Creation Timestamp: 2021-08-02T12:42:52Z + Generation: 1 + Managed Fields: + API Version: linkerd.io/v1alpha2 + Fields Type: FieldsV1 + fieldsV1: + f:spec: + .: + f:dstOverrides: + Manager: smi-adaptor + Operation: Update + Time: 2021-08-02T12:42:52Z + Resource Version: 3542 + UID: cbcdb74f-07e0-42f0-a7a8-9bbcf5e0e54e +Spec: + Dst Overrides: + Authority: backend-svc.trafficsplit-sample.svc.cluster.local + Weight: 500 + Authority: failing-svc.trafficsplit-sample.svc.cluster.local + Weight: 500 +Events: +``` + +As we can see, A relevant `ServiceProfile` with `DstOverrides` has +been created to perform the TrafficSplit. + +The Traffic Splitting can be verified by running the `edges` command. + +```bash +linkerd viz edges deploy -n trafficsplit-sample +SRC DST SRC_NS DST_NS SECURED +prometheus backend linkerd-viz trafficsplit-sample √ +prometheus failing linkerd-viz trafficsplit-sample √ +prometheus slow-cooker linkerd-viz trafficsplit-sample √ +slow-cooker backend trafficsplit-sample trafficsplit-sample √ +slow-cooker failing trafficsplit-sample trafficsplit-sample √ +``` + +This can also be verified by running `stat` sub command on the `TrafficSplit` +resource. + +```bash +linkerd viz stat ts/backend-split -n traffic-sample +NAME APEX LEAF WEIGHT SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +backend-split backend-svc backend-svc 500 100.00% 0.5rps 1ms 1ms 1ms +backend-split backend-svc failing-svc 500 0.00% 0.5rps 1ms 1ms 1ms +``` + +This can also be verified by checking the `smi-adaptor` logs. 
+
+```bash
+kubectl -n linkerd-smi logs deploy/smi-adaptor smi-adaptor
+time="2021-08-04T11:04:35Z" level=info msg="Using cluster domain: cluster.local"
+time="2021-08-04T11:04:35Z" level=info msg="Starting SMI Controller"
+time="2021-08-04T11:04:35Z" level=info msg="Waiting for informer caches to sync"
+time="2021-08-04T11:04:35Z" level=info msg="starting admin server on :9995"
+time="2021-08-04T11:04:35Z" level=info msg="Starting workers"
+time="2021-08-04T11:04:35Z" level=info msg="Started workers"
+time="2021-08-04T11:05:17Z" level=info msg="created serviceprofile/backend-svc.trafficsplit-sample.svc.cluster.local for trafficsplit/backend-split"
+time="2021-08-04T11:05:17Z" level=info msg="Successfully synced 'trafficsplit-sample/backend-split'"
+```
+
+## Cleanup
+
+Delete the `trafficsplit-sample` namespace by running:
+
+```bash
+kubectl delete namespace/trafficsplit-sample
+```
+
+## Conclusion
+
+Though Linkerd currently supports reading `TrafficSplit` resources directly,
+`ServiceProfiles` always take precedence over `TrafficSplit` resources. Support
+for the `TrafficSplit` resource will be removed in a future release, at which
+point the `linkerd-smi` extension will be necessary to use SMI resources with
+Linkerd.
diff --git a/linkerd.io/content/2.11/tasks/manually-rotating-control-plane-tls-credentials.md b/linkerd.io/content/2.11/tasks/manually-rotating-control-plane-tls-credentials.md
new file mode 100644
index 0000000000..e2b2842a7f
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/manually-rotating-control-plane-tls-credentials.md
@@ -0,0 +1,335 @@
++++
+title = "Manually Rotating Control Plane TLS Credentials"
+description = "Update Linkerd's TLS trust anchor and issuer certificate."
+aliases = [ "rotating_identity_certificates" ]
++++
+
+Linkerd's [automatic mTLS](../../features/automatic-mtls/) feature uses a set of
+TLS credentials to generate TLS certificates for proxies: a trust anchor, and
+an issuer certificate and private key. The trust anchor has a limited period of
+validity: 365 days if generated by `linkerd install`, or a customized value if
+[generated manually](../generate-certificates/).
+
+Thus, for clusters that are expected to outlive this lifetime, you must
+manually rotate the trust anchor. In this document, we describe how to
+accomplish this without downtime.
+
+Independent of the trust anchor, the issuer certificate and key pair can also
+expire (though it is possible to [use `cert-manager` to set up automatic
+rotation](../automatically-rotating-control-plane-tls-credentials/)). This
+document also covers how to rotate the issuer certificate and key pair without
+downtime.
+
+## Prerequisites
+
+These instructions use the [step](https://smallstep.com/cli/) and
+[jq](https://stedolan.github.io/jq/) CLI tools.
+ +## Understanding the current state of your system + +Begin by running: + +```bash +linkerd check --proxy +``` + +If your configuration is valid and your credentials are not expiring soon, you +should see output similar to: + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +√ issuer cert is valid for at least 60 days +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +√ data plane proxies certificate match CA +``` + +However, if you see a message warning you that your trust anchor ("trust root") +or issuer certificates are expiring soon, then you must rotate them. + +Note that this document only applies if the trust root and issuer certificate +are currently valid. If your trust anchor or issuer certificate have expired, +please follow the [Replacing Expired +Certificates Guide](../replacing_expired_certificates/) instead. + +For example, if your issuer certificate has expired, you will see a message +similar to: + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +× issuer cert is within its validity period +issuer certificate is not valid anymore. Expired on 2019-12-19T09:02:01Z +see https://linkerd.io/checks/#l5d-identity-issuer-cert-is-time-valid for hints +``` + +If your trust anchor has expired, you will see a message similar to: + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +× trust roots are within their validity period +Invalid roots: +* 79461543992952791393769540277800684467 identity.linkerd.cluster.local not valid anymore. Expired on 2019-12-19T09:11:30Z +see https://linkerd.io/checks/#l5d-identity-roots-are-time-valid for hints +``` + +## Rotating the trust anchor + +Rotating the trust anchor without downtime is a multi-step process: you must +generate a new trust anchor, bundle it with the old one, rotate the issuer +certificate and key pair, and finally remove the old trust anchor from the +bundle. If you simply need to rotate the issuer certificate and key pair, you +can skip directly to [Rotating the identity issuer +certificate](#rotating-the-identity-issuer-certificate) and ignore the trust +anchor rotation steps. + +## Generate a new trust anchor + +First, generate a new trust anchor certificate and private key: + +```bash +step certificate create root.linkerd.cluster.local ca-new.crt ca-new.key --profile root-ca --no-password --insecure +``` + +Note that we use `--no-password --insecure` to avoid encrypting these files +with a passphrase. Store the private key somewhere secure so that it can be +used in the future to [generate new issuer +certificates](../generate-certificates/). + +## Bundle your original trust anchor with the new one + +Next, we need to bundle the trust anchor currently used by Linkerd together with +the new anchor. 
The following command uses `kubectl` to fetch the Linkerd config, +`jq`/[`yq`](https://github.com/mikefarah/yq) to extract the current trust anchor, +and `step` to combine it with the newly generated trust anchor: + +```bash +kubectl -n linkerd get cm linkerd-config -o=jsonpath='{.data.values}' \ + | yq e .identityTrustAnchorsPEM - > original-trust.crt + +step certificate bundle ca-new.crt original-trust.crt bundle.crt +rm original-trust.crt +``` + +## Deploying the new bundle to Linkerd + +At this point you can use the `linkerd upgrade` command to instruct Linkerd to +work with the new trust bundle: + +```bash +linkerd upgrade --identity-trust-anchors-file=./bundle.crt | kubectl apply -f - +``` + +or you can also use the `helm upgrade` command: + +```bash +helm upgrade linkerd2 --set-file identityTrustAnchorsPEM=./bundle.crt +``` + +This will restart the proxies in the Linkerd control plane, and they will be +reconfigured with the new trust anchor. + +Finally, you must restart the proxy for all injected workloads in your cluster. +For example, doing that for the `emojivoto` namespace would look like: + +```bash +kubectl -n emojivoto rollout restart deploy +``` + +Now you can run the `check` command to ensure that everything is ok: + +```bash +linkerd check --proxy +``` + +You might have to wait a few moments until all the pods have been restarted and +are configured with the correct trust anchor. Meanwhile you might observe warnings: + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +‼ issuer cert is valid for at least 60 days + issuer certificate will expire on 2019-12-19T09:51:19Z + see https://linkerd.io/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +‼ data plane proxies certificate match CA + Some pods do not have the current trust bundle and must be restarted: + * emojivoto/emoji-d8d7d9c6b-8qwfx + * emojivoto/vote-bot-588499c9f6-zpwz6 + * emojivoto/voting-8599548fdc-6v64k + * emojivoto/web-67c7599f6d-xx98n + * linkerd/linkerd-sp-validator-75f9d96dc-rch4x + * linkerd/linkerd-tap-68d8bbf64-mpzgb + * linkerd/linkerd-web-849f74b7c6-qlhwc + see https://linkerd.io/checks/#l5d-identity-data-plane-proxies-certs-match-ca for hints +``` + +When the rollout completes, your `check` command should stop warning you that +pods need to be restarted. 
It may still warn you, however, that your issuer +certificate is about to expire soon: + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +‼ issuer cert is valid for at least 60 days + issuer certificate will expire on 2019-12-19T09:51:19Z + see https://linkerd.io/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +√ data plane proxies certificate match CA +``` + +## Rotating the identity issuer certificate + +To rotate the issuer certificate and key pair, first generate a new pair: + +```bash +step certificate create identity.linkerd.cluster.local issuer-new.crt issuer-new.key \ +--profile intermediate-ca --not-after 8760h --no-password --insecure \ +--ca ca-new.crt --ca-key ca-new.key +``` + +Provided that the trust anchor has not expired and that, if recently rotated, +all proxies have been updated to include a working trust anchor (as outlined in +the previous section) it is now safe to rotate the identity issuer certificate +by using the `upgrade` command again: + +```bash +linkerd upgrade --identity-issuer-certificate-file=./issuer-new.crt --identity-issuer-key-file=./issuer-new.key | kubectl apply -f - +``` + +or + +```bash +exp=$(cat ca-new.crt | openssl x509 -noout -dates | grep "notAfter" | sed -e 's/notAfter=\(.*\)$/"\1"/' | TZ='GMT' xargs -I{} date -d {} +"%Y-%m-%dT%H:%M:%SZ") + +helm upgrade linkerd2 + --set-file identity.issuer.tls.crtPEM=./issuer-new.crt + --set-file identity.issuer.tls.keyPEM=./issuer-new.key + --set identity.issuer.crtExpiry=$exp +``` + +At this point Linkerd's `identity` control plane service should detect the +change of the secret and automatically update its issuer certificates. + +To ensure this has happened, you can check for the specific Kubernetes event: + +```bash +kubectl get events --field-selector reason=IssuerUpdated -n linkerd + +LAST SEEN TYPE REASON OBJECT MESSAGE +9s Normal IssuerUpdated deployment/linkerd-identity Updated identity issuer +``` + +Restart the proxy for all injected workloads in your cluster to ensure that +their proxies pick up certificates issued by the new issuer: + +```bash +kubectl -n emojivoto rollout restart deploy +``` + +Run the `check` command to make sure that everything is going as expected: + +```bash +linkerd check --proxy +``` + +You should see output without any certificate expiration warnings (unless an +expired trust anchor still needs to be removed): + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +√ issuer cert is valid for at least 60 days +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +√ data plane proxies certificate match CA +``` + +## Removing the old trust anchor + +We can now remove the old trust anchor from the trust bundle we created earlier. 
+The `upgrade` command can do that for the Linkerd components: + +```bash +linkerd upgrade --identity-trust-anchors-file=./ca-new.crt | kubectl apply -f - +``` + +or + +```bash +helm upgrade linkerd2 --set-file --set-file identityTrustAnchorsPEM=./ca-new.crt +``` + +Note that the ./ca-new.crt file is the same trust anchor you created at the start +of this process. Additionally, you can use the `rollout restart` command to +bring the configuration of your other injected resources up to date: + +```bash +kubectl -n emojivoto rollout restart deploy +linkerd check --proxy +``` + +Finally the output of the `check` command should not produce any warnings or +errors: + +```text +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +√ issuer cert is valid for at least 60 days +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +√ data plane proxies certificate match CA +``` + +Congratulations, you have rotated your trust anchor! 🎉 diff --git a/linkerd.io/content/2.11/tasks/modifying-proxy-log-level.md b/linkerd.io/content/2.11/tasks/modifying-proxy-log-level.md new file mode 100644 index 0000000000..1b67e23139 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/modifying-proxy-log-level.md @@ -0,0 +1,38 @@ ++++ +title = "Modifying the Proxy Log Level" +description = "Linkerd proxy log levels can be modified dynamically to assist with debugging." ++++ + +Emitting logs is an expensive operation for a network proxy, and by default, +the Linkerd data plane proxies are configured to only log exceptional events. +However, sometimes it is useful to increase the verbosity of proxy logs to +assist with diagnosing proxy behavior. Happily, Linkerd allows you to modify +these logs dynamically. + +The log level of a Linkerd proxy can be modified on the fly by using the proxy's +`/proxy-log-level` endpoint on the admin-port. + +For example, to change the proxy log-level of a pod to +`debug`, run +(replace `${POD:?}` or set the environment-variable `POD` with the pod name): + +```sh +kubectl port-forward ${POD:?} linkerd-admin +curl -v --data 'linkerd=debug' -X PUT localhost:4191/proxy-log-level +``` + +whereby `linkerd-admin` is the name of the admin-port (`4191` by default) +of the injected sidecar-proxy. + +The resulting logs can be viewed with `kubectl logs ${POD:?}`. + +If changes to the proxy log level should be retained beyond the lifetime of a +pod, add the `config.linkerd.io/proxy-log-level` annotation to the pod template +(or other options, see reference). + +The syntax of the proxy log level can be found in the +[proxy log level reference](../../reference/proxy-log-level/). + +Note that logging has a noticeable, negative impact on proxy throughput. If the +pod will continue to serve production traffic, you may wish to reset the log +level once you are done. diff --git a/linkerd.io/content/2.11/tasks/multicluster-using-statefulsets.md b/linkerd.io/content/2.11/tasks/multicluster-using-statefulsets.md new file mode 100644 index 0000000000..feabd1e6b0 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/multicluster-using-statefulsets.md @@ -0,0 +1,336 @@ ++++ +title = "Multi-cluster communication with StatefulSets" +description = "cross-cluster communication to and from headless services." 
++++ + +Linkerd's multi-cluster extension works by "mirroring" service information +between clusters. Exported services in a target cluster will be mirrored as +`clusterIP` replicas. By default, every exported service will be mirrored as +`clusterIP`. When running workloads that require a headless service, such as +[StatefulSets](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/), +Linkerd's multi-cluster extension can be configured with support for headless +services to preserve the service type. Exported services that are headless will +be mirrored in a source cluster as headless, preserving functionality such as +DNS record creation and the ability to address an individual pod. + +This guide will walk you through installing and configuring Linkerd and the +multi-cluster extension with support for headless services and will exemplify +how a StatefulSet can be deployed in a target cluster. After deploying, we will +also look at how to communicate with an arbitrary pod from the target cluster's +StatefulSet from a client in the source cluster. For a more detailed overview +on how multi-cluster support for headless services work, check out +[multi-cluster communication](../../features/multicluster/). + +## Prerequisites + +- Two Kubernetes clusters. They will be referred to as `east` and `west` with + east being the "source" cluster and "west" the target cluster respectively. + These can be in any cloud or local environment, this guide will make use of + [k3d](https://github.com/rancher/k3d/releases/tag/v4.1.1) to configure two + local clusters. +- [`smallstep/CLI`](https://github.com/smallstep/cli/releases) to generate + certificates for Linkerd installation. +- [`linkerd:stable-2.11.0`](https://github.com/linkerd/linkerd2/releases) to + install Linkerd. + +To help with cluster creation and installation, there is a demo repository +available. Throughout the guide, we will be using the scripts from the +repository, but you can follow along without cloning or using the scripts. + +## Install Linkerd multi-cluster with headless support + +To start our demo and see everything in practice, we will go through a +multi-cluster scenario where a pod in an `east` cluster will try to communicate +to an arbitrary pod from a `west` cluster. + +The first step is to clone the demo +repository on your local machine. + +```sh +# clone example repository +$ git clone git@github.com:mateiidavid/l2d-k3d-statefulset.git +$ cd l2d-k3d-statefulset +``` + +The second step consists of creating two `k3d` clusters named `east` and +`west`, where the `east` cluster is the source and the `west` cluster is the +target. When creating our clusters, we need a shared trust root. Luckily, the +repository you have just cloned includes a handful of scripts that will greatly +simplify everything. + +```sh +# create k3d clusters +$ ./create.sh + +# list the clusters +$ k3d cluster list +NAME SERVERS AGENTS LOADBALANCER +east 1/1 0/0 true +west 1/1 0/0 true +``` + +Once our clusters are created, we will install Linkerd and the multi-cluster +extension. Finally, once both are installed, we need to link the two clusters +together so their services may be mirrored. To enable support for headless +services, we will pass an additional `--set "enableHeadlessServices=true` flag +to `linkerd multicluster link`. As before, these steps are automated through +the provided scripts, but feel free to have a look! 
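+
+For reference, the link step that `./link.sh` performs boils down to something
+like the following (the context and cluster names assume the k3d setup above,
+and the exact flags used by the script may differ):
+
+```sh
+# Generate link resources against the target (west) cluster with headless
+# service support enabled, and apply them to the source (east) cluster.
+linkerd --context=k3d-west multicluster link --cluster-name west \
+  --set "enableHeadlessServices=true" | \
+  kubectl --context=k3d-east apply -f -
+```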
+ +```sh +# Install Linkerd and multicluster, output to check should be a success +$ ./install.sh + +# Next, link the two clusters together +$ ./link.sh +``` + +Perfect! If you've made it this far with no errors, then it's a good sign. In +the next chapter, we'll deploy some services and look at how communication +works. + +## Pod-to-Pod: from east, to west + +With our install steps out of the way, we can now focus on our pod-to-pod +communication. First, we will deploy our pods and services: + +- We will mesh the default namespaces in `east` and `west`. +- In `west`, we will deploy an nginx StatefulSet with its own headless + service, `nginx-svc`. +- In `east`, our script will deploy a `curl` pod that will then be used to + curl the nginx service. + +```sh +# deploy services and mesh namespaces +$ ./deploy.sh + +# verify both clusters +# +# verify east +$ kubectl --context=k3d-east get pods +NAME READY STATUS RESTARTS AGE +curl-56dc7d945d-96r6p 2/2 Running 0 7s + +# verify west has headless service +$ kubectl --context=k3d-west get services +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kubernetes ClusterIP 10.43.0.1 443/TCP 10m +nginx-svc ClusterIP None 80/TCP 8s + +# verify west has statefulset +# +# this may take a while to come up +$ kubectl --context=k3d-west get pods +NAME READY STATUS RESTARTS AGE +nginx-set-0 2/2 Running 0 53s +nginx-set-1 2/2 Running 0 43s +nginx-set-2 2/2 Running 0 36s +``` + +Before we go further, let's have a look at the endpoints object for the +`nginx-svc`: + +```sh +$ kubectl --context=k3d-west get endpoints nginx-svc -o yaml +... +subsets: +- addresses: + - hostname: nginx-set-0 + ip: 10.42.0.31 + nodeName: k3d-west-server-0 + targetRef: + kind: Pod + name: nginx-set-0 + namespace: default + resourceVersion: "114743" + uid: 7049f1c1-55dc-4b7b-a598-27003409d274 + - hostname: nginx-set-1 + ip: 10.42.0.32 + nodeName: k3d-west-server-0 + targetRef: + kind: Pod + name: nginx-set-1 + namespace: default + resourceVersion: "114775" + uid: 60df15fd-9db0-4830-9c8f-e682f3000800 + - hostname: nginx-set-2 + ip: 10.42.0.33 + nodeName: k3d-west-server-0 + targetRef: + kind: Pod + name: nginx-set-2 + namespace: default + resourceVersion: "114808" + uid: 3873bc34-26c4-454d-bd3d-7c783de16304 +``` + +We can see, based on the endpoints object that the service has three endpoints, +with each endpoint having an address (or IP) whose hostname corresponds to a +StatefulSet pod. If we were to do a curl to any of these endpoints directly, we +would get an answer back. We can test this out by applying the curl pod to the +`west` cluster: + +```sh +$ kubectl --context=k3d-west apply -f east/curl.yml +$ kubectl --context=k3d-west get pods +NAME READY STATUS RESTARTS AGE +nginx-set-0 2/2 Running 0 5m8s +nginx-set-1 2/2 Running 0 4m58s +nginx-set-2 2/2 Running 0 4m51s +curl-56dc7d945d-s4n8j 0/2 PodInitializing 0 4s + +$ kubectl --context=k3d-west exec -it curl-56dc7d945d-s4n8j -c curl -- bin/sh +/$ # prompt for curl pod +``` + +If we now curl one of these instances, we will get back a response. + +```sh +# exec'd on the pod +/ $ curl nginx-set-0.nginx-svc.default.svc.west.cluster.local +" + + +Welcome to nginx! + + + +

+If you see this page, the nginx web server is successfully installed and
+working. Further configuration is required.
+
+For online documentation and support please refer to
+nginx.org.
+Commercial support is available at
+nginx.com.
+
+Thank you for using nginx.
+ +" +``` + +Now, let's do the same, but this time from the `east` cluster. We will first +export the service. + +```sh +$ kubectl --context=k3d-west label service nginx-svc mirror.linkerd.io/exported="true" +service/nginx-svc labeled + +$ kubectl --context=k3d-east get services +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kubernetes ClusterIP 10.43.0.1 443/TCP 20h +nginx-svc-west ClusterIP None 80/TCP 29s +nginx-set-0-west ClusterIP 10.43.179.60 80/TCP 29s +nginx-set-1-west ClusterIP 10.43.218.18 80/TCP 29s +nginx-set-2-west ClusterIP 10.43.245.244 80/TCP 29s +``` + +If we take a look at the endpoints object, we will notice something odd, the +endpoints for `nginx-svc-west` will have the same hostnames, but each hostname +will point to one of the services we see above: + +```sh +$ kubectl --context=k3d-east get endpoints nginx-svc-west -o yaml +subsets: +- addresses: + - hostname: nginx-set-0 + ip: 10.43.179.60 + - hostname: nginx-set-1 + ip: 10.43.218.18 + - hostname: nginx-set-2 + ip: 10.43.245.244 +``` + +This is what we outlined at the start of the tutorial. Each pod from the target +cluster (`west`), will be mirrored as a clusterIP service. We will see in a +second why this matters. + +```sh +$ kubectl --context=k3d-east get pods +NAME READY STATUS RESTARTS AGE +curl-56dc7d945d-96r6p 2/2 Running 0 23m + +# exec and curl +$ kubectl --context=k3d-east exec pod curl-56dc7d945d-96r6p -it -c curl -- bin/sh +# we want to curl the same hostname we see in the endpoints object above. +# however, the service and cluster domain will now be different, since we +# are in a different cluster. +# +/ $ curl nginx-set-0.nginx-svc-west.default.svc.east.cluster.local + + + +Welcome to nginx! + + + +

+If you see this page, the nginx web server is successfully installed and
+working. Further configuration is required.
+
+For online documentation and support please refer to
+nginx.org.
+Commercial support is available at
+nginx.com.
+
+Thank you for using nginx.
+ + +``` + +As you can see, we get the same response back! But, nginx is in a different +cluster. So, what happened behind the scenes? + + 1. When we mirrored the headless service, we created a clusterIP service for + each pod. Since services create DNS records, naming each endpoint with the + hostname from the target gave us these pod FQDNs + (`nginx-set-0.(...).cluster.local`). + 2. Curl resolved the pod DNS name to an IP address. In our case, this IP + would be `10.43.179.60`. + 3. Once the request is in-flight, the linkerd2-proxy intercepts it. It looks + at the IP address and associates it with our `clusterIP` service. The + service itself points to the gateway, so the proxy forwards the request to + the target cluster gateway. This is the usual multi-cluster scenario. + 4. The gateway in the target cluster looks at the request and looks-up the + original destination address. In our case, since this is an "endpoint + mirror", it knows it has to go to `nginx-set-0.nginx-svc` in the same + cluster. + 5. The request is again forwarded by the gateway to the pod, and the response + comes back. + +And that's it! You can now send requests to pods across clusters. Querying any +of the 3 StatefulSet pods should have the same results. + +{{< note >}} + +To mirror a headless service as headless, the service's endpoints +must also have at least one named address (e.g a hostname for an IP), +otherwise, there will be no endpoints to mirror so the service will be mirrored +as `clusterIP`. A headless service may under normal conditions also be created +without exposing a port; the mulit-cluster service-mirror does not support +this, however, since the lack of ports means we cannot create a service that +passes Kubernetes validation. + +{{< /note >}} + +## Cleanup + +To clean-up, you can remove both clusters entirely using the k3d CLI: + +```sh +$ k3d cluster delete east +cluster east deleted +$ k3d cluster delete west +cluster west deleted +``` diff --git a/linkerd.io/content/2.11/tasks/multicluster.md b/linkerd.io/content/2.11/tasks/multicluster.md new file mode 100644 index 0000000000..ac1fc9bcb6 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/multicluster.md @@ -0,0 +1,520 @@ ++++ +title = "Multi-cluster communication" +description = "Allow Linkerd to manage cross-cluster communication." ++++ + +This guide will walk you through installing and configuring Linkerd so that two +clusters can talk to services hosted on both. There are a lot of moving parts +and concepts here, so it is valuable to read through our +[introduction](../../features/multicluster/) that explains how this works beneath +the hood. By the end of this guide, you will understand how to split traffic +between services that live on different clusters. + +At a high level, you will: + +1. [Install Linkerd](#install-linkerd) on two clusters with a shared trust + anchor. +1. [Prepare](#preparing-your-cluster) the clusters. +1. [Link](#linking-the-clusters) the clusters. +1. [Install](#installing-the-test-services) the demo. +1. [Export](#exporting-the-services) the demo services, to control visibility. +1. [Verify](#security) the security of your clusters. +1. [Split traffic](#traffic-splitting) from pods on the source cluster (`west`) + to the target cluster (`east`) + +## Prerequisites + +- Two clusters. We will refer to them as `east` and `west` in this guide. Follow + along with the + [blog post](/2020/02/25/multicluster-kubernetes-with-service-mirroring/) as + you walk through this guide! 
The easiest way to do this for development is + running a [kind](https://kind.sigs.k8s.io/docs/user/quick-start/) or + [k3d](https://github.com/rancher/k3d#usage) cluster locally on your laptop and + one remotely on a cloud provider, such as + [AKS](https://azure.microsoft.com/en-us/services/kubernetes-service/). +- Each of these clusters should be configured as `kubectl` + [contexts](https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/). + We'd recommend you use the names `east` and `west` so that you can follow + along with this guide. It is easy to + [rename contexts](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-rename-context-em-) + with `kubectl`, so don't feel like you need to keep it all named this way + forever. +- Elevated privileges on both clusters. We'll be creating service accounts and + granting extended privileges, so you'll need to be able to do that on your + test clusters. +- Linkerd's `viz` extension should be installed in order to run `stat` commands, + view the Grafana or Linkerd dashboard and run the `linkerd multicluster gateways` + command. +- Support for services of type `LoadBalancer` in the `east` cluster. Check out + the documentation for your cluster provider or take a look at + [inlets](https://blog.alexellis.io/ingress-for-your-local-kubernetes-cluster/). + This is what the `west` cluster will use to communicate with `east` via the + gateway. + +## Install Linkerd + +{{< fig + alt="install" + title="Two Clusters" + center="true" + src="/images/multicluster/install.svg" >}} + +Linkerd requires a shared +[trust anchor](https://linkerd.io../generate-certificates/#trust-anchor-certificate) +to exist between the installations in all clusters that communicate with each +other. This is used to encrypt the traffic between clusters and authorize +requests that reach the gateway so that your cluster is not open to the public +internet. Instead of letting `linkerd` generate everything, we'll need to +generate the credentials and use them as configuration for the `install` +command. + +We like to use the [step](https://smallstep.com/cli/) CLI to generate these +certificates. If you prefer `openssl` instead, feel free to use that! To +generate the trust anchor with step, you can run: + +```bash +step certificate create root.linkerd.cluster.local root.crt root.key \ + --profile root-ca --no-password --insecure +``` + +This certificate will form the common base of trust between all your clusters. +Each proxy will get a copy of this certificate and use it to validate the +certificates that it receives from peers as part of the mTLS handshake. With a +common base of trust, we now need to generate a certificate that can be used in +each cluster to issue certificates to the proxies. If you'd like to get a deeper +picture into how this all works, check out the +[deep dive](../../features/automatic-mtls/#how-does-it-work). + +The trust anchor that we've generated is a self-signed certificate which can be +used to create new certificates (a certificate authority). 
To generate the +[issuer credentials](../generate-certificates/#issuer-certificate-and-key) +using the trust anchor, run: + +```bash +step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \ + --profile intermediate-ca --not-after 8760h --no-password --insecure \ + --ca root.crt --ca-key root.key +``` + +An `identity` service in your cluster will use the certificate and key that you +generated here to generate the certificates that each individual proxy uses. +While we will be using the same issuer credentials on each cluster for this +guide, it is a good idea to have separate ones for each cluster. Read through +the [certificate documentation](../generate-certificates/) for more +details. + +With a valid trust anchor and issuer credentials, we can install Linkerd on your +`west` and `east` clusters now. + +```bash +linkerd install \ + --identity-trust-anchors-file root.crt \ + --identity-issuer-certificate-file issuer.crt \ + --identity-issuer-key-file issuer.key \ + | tee \ + >(kubectl --context=west apply -f -) \ + >(kubectl --context=east apply -f -) +``` + +The output from `install` will get applied to each cluster and come up! You can +verify that everything has come up successfully with `check`. + +```bash +for ctx in west east; do + echo "Checking cluster: ${ctx} .........\n" + linkerd --context=${ctx} check || break + echo "-------------\n" +done +``` + +## Preparing your cluster + +{{< fig + alt="preparation" + title="Preparation" + center="true" + src="/images/multicluster/prep-overview.svg" >}} + +In order to route traffic between clusters, Linkerd leverages Kubernetes +services so that your application code does not need to change and there is +nothing new to learn. This requires a gateway component that routes incoming +requests to the correct internal service. The gateway will be exposed to the +public internet via a `Service` of type `LoadBalancer`. Only requests verified +through Linkerd's mTLS (with a shared trust anchor) will be allowed through this +gateway. If you're interested, we go into more detail as to why this is +important in [architecting for multicluster Kubernetes](/2020/02/17/architecting-for-multicluster-kubernetes/#requirement-i-support-hierarchical-networks). + +To install the multicluster components on both `west` and `east`, you can run: + +```bash +for ctx in west east; do + echo "Installing on cluster: ${ctx} ........." + linkerd --context=${ctx} multicluster install | \ + kubectl --context=${ctx} apply -f - || break + echo "-------------\n" +done +``` + +{{< fig + alt="install" + title="Components" + center="true" + src="/images/multicluster/components.svg" >}} + +Installed into the `linkerd-multicluster` namespace, the gateway is a simple +[pause container](https://github.com/linkerd/linkerd2/blob/main/multicluster/charts/linkerd-multicluster/templates/gateway.yaml#L3) +which has been injected with the Linkerd proxy. On the inbound side, Linkerd +takes care of validating that the connection uses a TLS certificate that is part +of the trust anchor, then handles the outbound connection. At this point, the +Linkerd proxy is operating like any other in the data plane and forwards the +requests to the correct service. Make sure the gateway comes up successfully by +running: + +```bash +for ctx in west east; do + echo "Checking gateway on cluster: ${ctx} ........." 
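+  # Wait for the gateway deployment in this cluster to finish rolling out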
+ kubectl --context=${ctx} -n linkerd-multicluster \ + rollout status deploy/linkerd-gateway || break + echo "-------------\n" +done +``` + +Double check that the load balancer was able to allocate a public IP address by +running: + +```bash +for ctx in west east; do + printf "Checking cluster: ${ctx} ........." + while [ "$(kubectl --context=${ctx} -n linkerd-multicluster get service \ + -o 'custom-columns=:.status.loadBalancer.ingress[0].ip' \ + --no-headers)" = "" ]; do + printf '.' + sleep 1 + done + printf "\n" +done +``` + +Every cluster is now running the multicluster control plane and ready to start +mirroring services. We'll want to link the clusters together now! + +## Linking the clusters + +{{< fig + alt="link-clusters" + title="Link" + center="true" + src="/images/multicluster/link-flow.svg" >}} + +For `west` to mirror services from `east`, the `west` cluster needs to have +credentials so that it can watch for services in `east` to be exported. You'd +not want anyone to be able to introspect what's running on your cluster after +all! The credentials consist of a service account to authenticate the service +mirror as well as a `ClusterRole` and `ClusterRoleBinding` to allow watching +services. In total, the service mirror component uses these credentials to watch +services on `east` or the target cluster and add/remove them from itself +(`west`). There is a default set added as part of +`linkerd multicluster install`, but if you would like to have separate +credentials for every cluster you can run `linkerd multicluster allow`. + +The next step is to link `west` to `east`. This will create a credentials +secret, a Link resource, and a service-mirror controller. The credentials secret +contains a kubeconfig which can be used to access the target (`east`) cluster's +Kubernetes API. The Link resource is custom resource that configures service +mirroring and contains things such as the gateway address, gateway identity, +and the label selector to use when determining which services to mirror. The +service-mirror controller uses the Link and the secret to find services on +the target cluster that match the given label selector and copy them into +the source (local) cluster. + + To link the `west` cluster to the `east` one, run: + +```bash +linkerd --context=east multicluster link --cluster-name east | + kubectl --context=west apply -f - +``` + +Linkerd will look at your current `east` context, extract the `cluster` +configuration which contains the server location as well as the CA bundle. It +will then fetch the `ServiceAccount` token and merge these pieces of +configuration into a kubeconfig that is a secret. + +Running `check` again will make sure that the service mirror has discovered this +secret and can reach `east`. + +```bash +linkerd --context=west multicluster check +``` + +Additionally, the `east` gateway should now show up in the list: + +```bash +linkerd --context=west multicluster gateways +``` + +{{< note >}} `link` assumes that the two clusters will connect to each other +with the same configuration as you're using locally. If this is not the case, +you'll want to use the `--api-server-address` flag for `link`.{{< /note >}} + +## Installing the test services + +{{< fig + alt="test-services" + title="Topology" + center="true" + src="/images/multicluster/example-topology.svg" >}} + +It is time to test this all out! The first step is to add some services that we +can mirror. 
To add these to both clusters, you can run: + +```bash +for ctx in west east; do + echo "Adding test services on cluster: ${ctx} ........." + kubectl --context=${ctx} apply \ + -k "github.com/linkerd/website/multicluster/${ctx}/" + kubectl --context=${ctx} -n test \ + rollout status deploy/podinfo || break + echo "-------------\n" +done +``` + +You'll now have a `test` namespace running two deployments in each cluster - +frontend and podinfo. `podinfo` has been configured slightly differently in each +cluster with a different name and color so that we can tell where requests are +going. + +To see what it looks like from the `west` cluster right now, you can run: + +```bash +kubectl --context=west -n test port-forward svc/frontend 8080 +``` + +{{< fig + alt="west-podinfo" + title="West Podinfo" + center="true" + src="/images/multicluster/west-podinfo.gif" >}} + +With the podinfo landing page available at +[http://localhost:8080](http://localhost:8080), you can see how it looks in the +`west` cluster right now. Alternatively, running `curl http://localhost:8080` +will return a JSON response that looks something like: + +```json +{ + "hostname": "podinfo-5c8cf55777-zbfls", + "version": "4.0.2", + "revision": "b4138fdb4dce7b34b6fc46069f70bb295aa8963c", + "color": "#6c757d", + "logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif", + "message": "greetings from west", + "goos": "linux", + "goarch": "amd64", + "runtime": "go1.14.3", + "num_goroutine": "8", + "num_cpu": "4" +} +``` + +Notice that the `message` references the `west` cluster name. + +## Exporting the services + +To make sure sensitive services are not mirrored and cluster performance is +impacted by the creation or deletion of services, we require that services be +explicitly exported. For the purposes of this guide, we will be exporting the +`podinfo` service from the `east` cluster to the `west` cluster. To do this, we +must first export the `podinfo` service in the `east` cluster. You can do this +by adding the `mirror.linkerd.io/exported` label: + +```bash +kubectl --context=east label svc -n test podinfo mirror.linkerd.io/exported=true +``` + +{{< note >}} You can configure a different label selector by using the +`--selector` flag on the `linkerd multicluster link` command or by editting +the Link resource created by the `linkerd multicluster link` command. +{{< /note >}} + +Check out the service that was just created by the service mirror controller! + +```bash +kubectl --context=west -n test get svc podinfo-east +``` + +From the +[architecture](https://linkerd.io/2020/02/25/multicluster-kubernetes-with-service-mirroring/#step-2-endpoint-juggling), +you'll remember that the service mirror component is doing more than just moving +services over. It is also managing the endpoints on the mirrored service. To +verify that is setup correctly, you can check the endpoints in `west` and verify +that they match the gateway's public IP address in `east`. + +```bash +kubectl --context=west -n test get endpoints podinfo-east \ + -o 'custom-columns=ENDPOINT_IP:.subsets[*].addresses[*].ip' +kubectl --context=east -n linkerd-multicluster get svc linkerd-gateway \ + -o "custom-columns=GATEWAY_IP:.status.loadBalancer.ingress[*].ip" +``` + +At this point, we can hit the `podinfo` service in `east` from the `west` +cluster. 
This requires the client to be meshed, so let's run `curl` from within +the frontend pod: + +```bash +kubectl --context=west -n test exec -c nginx -it \ + $(kubectl --context=west -n test get po -l app=frontend \ + --no-headers -o custom-columns=:.metadata.name) \ + -- /bin/sh -c "apk add curl && curl http://podinfo-east:9898" +``` + +You'll see the `greeting from east` message! Requests from the `frontend` pod +running in `west` are being transparently forwarded to `east`. Assuming that +you're still port forwarding from the previous step, you can also reach this +from your browser at [http://localhost:8080/east](http://localhost:8080/east). +Refresh a couple times and you'll be able to get metrics from `linkerd viz stat` +as well. + +```bash +linkerd --context=west -n test viz stat --from deploy/frontend svc +``` + +We also provide a grafana dashboard to get a feel for what's going on here. You +can get to it by running `linkerd --context=west viz dashboard` and going to +[http://localhost:50750/grafana/](http://localhost:50750/grafana/d/linkerd-multicluster/linkerd-multicluster?orgId=1&refresh=1m) + +{{< fig + alt="grafana-dashboard" + title="Grafana" + center="true" + src="/images/multicluster/grafana-dashboard.png" >}} + +## Security + +By default, requests will be going across the public internet. Linkerd extends +its [automatic mTLS](../../features/automatic-mtls/) across clusters to make sure +that the communication going across the public internet is encrypted. If you'd +like to have a deep dive on how to validate this, check out the +[docs](../securing-your-service/). To quickly check, however, you can run: + +```bash +linkerd --context=west -n test viz tap deploy/frontend | \ + grep "$(kubectl --context=east -n linkerd-multicluster get svc linkerd-gateway \ + -o "custom-columns=GATEWAY_IP:.status.loadBalancer.ingress[*].ip")" +``` + +`tls=true` tells you that the requests are being encrypted! + +{{< note >}} As `linkerd edges` works on concrete resources and cannot see two +clusters at once, it is not currently able to show the edges between pods in +`east` and `west`. This is the reason we're using `tap` to validate mTLS here. +{{< /note >}} + +In addition to making sure all your requests are encrypted, it is important to +block arbitrary requests coming into your cluster. We do this by validating that +requests are coming from clients in the mesh. To do this validation, we rely on +a shared trust anchor between clusters. To see what happens when a client is +outside the mesh, you can run: + +```bash +kubectl --context=west -n test run -it --rm --image=alpine:3 test -- \ + /bin/sh -c "apk add curl && curl -vv http://podinfo-east:9898" +``` + +## Traffic Splitting + +{{< fig + alt="with-split" + title="Traffic Split" + center="true" + src="/images/multicluster/with-split.svg" >}} + +It is pretty useful to have services automatically show up in clusters and be +able to explicitly address them, however that only covers one use case for +operating multiple clusters. Another scenario for multicluster is failover. In a +failover scenario, you don't have time to update the configuration. Instead, you +need to be able to leave the application alone and just change the routing. If +this sounds a lot like how we do [canary](../canary-release/) deployments, +you'd be correct! + +`TrafficSplit` allows us to define weights between multiple services and split +traffic between them. 
In a failover scenario, you want to do this slowly as to +make sure you don't overload the other cluster or trip any SLOs because of the +added latency. To get this all working with our scenario, let's split between +the `podinfo` service in `west` and `east`. To configure this, you'll run: + +```bash +cat <}} + +You can also watch what's happening with metrics. To see the source side of +things (`west`), you can run: + +```bash +linkerd --context=west -n test viz stat trafficsplit +``` + +It is also possible to watch this from the target (`east`) side by running: + +```bash +linkerd --context=east -n test viz stat \ + --from deploy/linkerd-gateway \ + --from-namespace linkerd-multicluster \ + deploy/podinfo +``` + +There's even a dashboard! Run `linkerd viz dashboard` and send your browser to +[localhost:50750](http://localhost:50750/namespaces/test/trafficsplits/podinfo). + +{{< fig + alt="podinfo-split" + title="Cross Cluster Podinfo" + center="true" + src="/images/multicluster/ts-dashboard.png" >}} + +## Cleanup + +To cleanup the multicluster control plane, you can run: + +```bash +for ctx in west east; do + linkerd --context=${ctx} multicluster uninstall | kubectl --context=${ctx} delete -f - +done +``` + +If you'd also like to remove your Linkerd installation, run: + +```bash +for ctx in west east; do + linkerd --context=${ctx} uninstall | kubectl --context=${ctx} delete -f - +done +``` diff --git a/linkerd.io/content/2.11/tasks/replacing_expired_certificates.md b/linkerd.io/content/2.11/tasks/replacing_expired_certificates.md new file mode 100644 index 0000000000..cdbaee9664 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/replacing_expired_certificates.md @@ -0,0 +1,123 @@ ++++ +title = "Replacing expired certificates" +description = "Follow this workflow if any of your TLS certs have expired." ++++ + +If any of your TLS certs are approaching expiry and you are not relying on an +external certificate management solution such as `cert-manager`, you can follow +[Rotating your identity certificates](../rotating_identity_certificates/) +to update them without incurring downtime. In case you are in a situation where +any of your certs are expired however, you are already in an invalid state and +any measures to avoid downtime are not guaranteed to give results. Therefore it +is best to proceed with replacing the certificates with valid ones. + +## Replacing only the issuer certificate + +It might be the case that your issuer certificate is expired. If this it true +running `linkerd check --proxy` will produce output similar to: + +```bash +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +× issuer cert is within its validity period + issuer certificate is not valid anymore. Expired on 2019-12-19T09:21:08Z + see https://linkerd.io/checks/#l5d-identity-issuer-cert-is-time-valid for hints +``` + +In this situation, if you have installed Linkerd with a manually supplied trust +root and you have its key, you can follow +[Updating the identity issuer certificate](../manually-rotating-control-plane-tls-credentials/#rotating-the-identity-issuer-certificate) +to update your expired cert. + +## Replacing the root and issuer certificates + +If your root certificate is expired or you do not have its key, you need to +replace both your root and issuer certificates at the same time. 
If your root +has expired `linkerd check` will indicate that by outputting an error similar +to: + +```bash +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +× trust roots are within their validity period + Invalid roots: + * 272080721524060688352608293567629376512 identity.linkerd.cluster.local not valid anymore. Expired on 2019-12-19T10:05:31Z + see https://linkerd.io/checks/#l5d-identity-roots-are-time-valid for hints +``` + +You can follow [Generating your own mTLS root certificates](../generate-certificates/#generating-the-certificates-with-step) +to create new root and issuer certificates. Then use the `linkerd upgrade` +command: + +```bash +linkerd upgrade \ + --identity-issuer-certificate-file=./issuer-new.crt \ + --identity-issuer-key-file=./issuer-new.key \ + --identity-trust-anchors-file=./ca-new.crt \ + --force \ + | kubectl apply -f - +``` + +Usually `upgrade` will prevent you from using an issuer certificate that +will not work with the roots your meshed pods are using. At that point we +do not need this check as we are updating both the root and issuer certs at +the same time. Therefore we use the `--force` flag to ignore this error. + +If you run `linkerd check --proxy` you might see some warning, while the +upgrade process is being performed: + +```bash +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +√ issuer cert is valid for at least 60 days +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +‼ data plane proxies certificate match CA + Some pods do not have the current trust bundle and must be restarted: + * linkerd/linkerd-controller-5b69fd4fcc-7skqb + * linkerd/linkerd-destination-749df5c74-brchg + * linkerd/linkerd-grafana-6dcf86b74b-vvxjq + * linkerd/linkerd-prometheus-74cb4f4b69-kqtss + * linkerd/linkerd-proxy-injector-cbd5545bd-rblq5 + * linkerd/linkerd-sp-validator-6ff949649f-gjgfl + * linkerd/linkerd-tap-7b5bb954b6-zl9w6 + * linkerd/linkerd-web-84c555f78-v7t44 + see https://linkerd.io/checks/#l5d-identity-data-plane-proxies-certs-match-ca for hints + +``` + +Additionally you can use the `kubectl rollout restart` command to bring the +configuration of your other injected resources up to date, and then the `check` +command should stop producing warning or errors: + +```bash +linkerd-identity +---------------- +√ certificate config is valid +√ trust roots are using supported crypto algorithm +√ trust roots are within their validity period +√ trust roots are valid for at least 60 days +√ issuer cert is using supported crypto algorithm +√ issuer cert is within its validity period +√ issuer cert is valid for at least 60 days +√ issuer cert is issued by the trust root + +linkerd-identity-data-plane +--------------------------- +√ data plane proxies certificate match CA +``` diff --git a/linkerd.io/content/2.11/tasks/restricting-access.md b/linkerd.io/content/2.11/tasks/restricting-access.md new file mode 100644 index 0000000000..2a691bcb9e --- /dev/null +++ b/linkerd.io/content/2.11/tasks/restricting-access.md @@ -0,0 +1,157 @@ ++++ +title = "Restricting Access To Services" +description = "Use Linkerd policy to restrict access to a service." 
++++ + +Linkerd policy resources can be used to restrict which clients may access a +service. In this example, we'll use Emojivoto to show how to restrict access +to the Voting service so that it may only be called from the Web service. + +For a more comprehensive description of the policy resources, see the +[Policy reference docs](../../reference/authorization-policy/). + +## Setup + +Ensure that you have Linkerd version stable-2.11.0 or later installed, and that +it is healthy: + +```console +$ linkerd install | kubectl apply -f - +... +$ linkerd check -o short +... +``` + +Inject and install the Emojivoto application: + +```console +$ linkerd inject https://run.linkerd.io/emojivoto.yml | kubectl apply -f - +... +$ linkerd check -n emojivoto --proxy -o short +... +``` + +In order to observe what's going on, we'll also install the Viz extension: + +```console +$ linkerd viz install | kubectl apply -f - +... +$ linkerd viz check +... +``` + +## Creating a Server resource + +We start by creating a `Server` resource for the Voting service. A `Server` +is a Linkerd custom resource which describes a specific port of a workload. +Once the `Server` resource has been created, only clients which have been +authorized may access it (we'll see how to authorize clients in a moment). + +```console +cat << EOF | kubectl apply -f - +--- +apiVersion: policy.linkerd.io/v1beta1 +kind: Server +metadata: + namespace: emojivoto + name: voting-grpc + labels: + app: voting-svc +spec: + podSelector: + matchLabels: + app: voting-svc + port: grpc + proxyProtocol: gRPC +EOF +``` + +We see that this `Server` uses a `podSelector` to select the pods that it +describes: in this case the voting service pods. It also specifies the named +port (grpc) that it applies to. Finally, it specifies the protocol that is +served on this port. This ensures that the proxy treats traffic correctly and +allows it skip protocol detection. + +At this point, no clients have been authorized to access this service and you +will likely see a drop in success rate as requests from the Web service to +Voting start to get rejected. + +We can use the `linkerd viz authz` command to look at the authorization status +of requests coming to the voting service and see that all incoming requests +are currently unauthorized: + +```console +> linkerd viz authz -n emojivoto deploy/voting +SERVER AUTHZ SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +voting-grpc [UNAUTHORIZED] - 0.9rps - - - +``` + +## Creating a ServerAuthorization resource + +A `ServerAuthorization` grants a set of clients access to a set of `Servers`. +Here we will create a `ServerAuthorization` which grants the Web service access +to the Voting `Server` we created above. Note that meshed mTLS uses +`ServiceAccounts` as the basis for identity, thus our authorization will also +be based on `ServiceAccounts`. + +```console +cat << EOF | kubectl apply -f - +--- +apiVersion: policy.linkerd.io/v1beta1 +kind: ServerAuthorization +metadata: + namespace: emojivoto + name: voting-grpc + labels: + app.kubernetes.io/part-of: emojivoto + app.kubernetes.io/name: voting + app.kubernetes.io/version: v11 +spec: + server: + name: voting-grpc + # The voting service only allows requests from the web service. + client: + meshTLS: + serviceAccounts: + - name: web +EOF +``` + +With this in place, we can now see that all of the requests to the Voting +service are authorized by the `voting-grpc` ServerAuthorization. 
Note that since +the `linkerd viz auth` command queries over a time-window, you may see some +UNAUTHORIZED requests displayed for a short amount of time. + +```console +> linkerd viz authz -n emojivoto deploy/voting +SERVER AUTHZ SUCCESS RPS LATENCY_P50 LATENCY_P95 LATENCY_P99 +voting-grpc voting-grpc 70.00% 1.0rps 1ms 1ms 1ms +``` + +We can also test that request from other pods will be rejected by creating a +`grpcurl` pod and attempting to access the Voting service from it: + +```console +> kubectl run grpcurl --rm -it --image=networld/grpcurl --restart=Never --command -- ./grpcurl -plaintext voting-svc.emojivoto:8080 emojivoto.v1.VotingService/VoteDog +Error invoking method "emojivoto.v1.VotingService/VoteDog": failed to query for service descriptor "emojivoto.v1.VotingService": rpc error: code = PermissionDenied desc = +pod "grpcurl" deleted +pod default/grpcurl terminated (Error) +``` + +Because this client has not been authorized, this request gets rejected with a +`PermissionDenied` error. + +You can create as many `ServerAuthorization` resources as you like to authorize +many different clients. You can also specify whether to authorize +unauthenticated (i.e. unmeshed) client, any authenticated client, or only +authenticated clients with a particular identity. For more details, please see +the [Policy reference docs](../../reference/authorization-policy/). + +## Further Considerations + +You may have noticed that there was a period of time after we created the +`Server` resource but before we created the `ServerAuthorization` where all +requests were being rejected. To avoid this situation in live systems, we +recommend you either create the policy resources before deploying your services +or to create the `ServiceAuthorizations` BEFORE creating the `Server` so that +clients will be authorized immediately. diff --git a/linkerd.io/content/2.11/tasks/rotating_webhooks_certificates.md b/linkerd.io/content/2.11/tasks/rotating_webhooks_certificates.md new file mode 100644 index 0000000000..53ab89ca0a --- /dev/null +++ b/linkerd.io/content/2.11/tasks/rotating_webhooks_certificates.md @@ -0,0 +1,103 @@ ++++ +title = "Rotating webhooks certificates" +description = "Follow these steps to rotate your Linkerd webhooks certificates." ++++ + +Linkerd uses the +[Kubernetes admission webhooks](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks) +and +[extension API server](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/) +to implement some of its core features like +[automatic proxy injection](../../features/proxy-injection/) and +[service profiles validation](../../features/service-profiles/). + +Also, the viz extension uses a webhook to make pods tappable, as does the jaeger +extension to turn on tracing on pods. + +To secure the connections between the Kubernetes API server and the +webhooks, all the webhooks are TLS-enabled. The x509 certificates used by these +webhooks are issued by the self-signed CA certificates embedded in the webhooks +configuration. + +By default, these certificates have a validity period of 365 days. They are +stored in the following secrets: + +- In the `linkerd` namespace: `linkerd-proxy-injector-k8s-tls` and `linkerd-sp-validator-k8s-tls` +- In the `linkerd-viz` namespace: `tap-injector-k8s-tls` +- In the `linkerd-jaeger` namespace: `jaeger-injector-k8s-tls` + +The rest of this documentation provides instructions on how to renew these +certificates. 
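+
+Before renewing, you can take a quick look at a single certificate with plain
+`openssl` (the next section shows how to check all of them at once with
+`step`). A minimal sketch for the proxy injector secret, assuming the default
+`linkerd` namespace:
+
+```bash
+# Print the subject and validity window of the proxy injector's webhook cert
+kubectl -n linkerd get secret linkerd-proxy-injector-k8s-tls \
+  -o jsonpath='{.data.tls\.crt}' | base64 --decode | \
+  openssl x509 -noout -subject -dates
+```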
+ +## Renewing the webhook certificates + +To check the validity of all the TLS secrets +(using [`step`](https://smallstep.com/cli/)): + +```bash +# assuming you have viz and jaeger installed, otherwise trim down these arrays +# accordingly +SECRETS=("linkerd-proxy-injector-k8s-tls" "linkerd-sp-validator-k8s-tls" "tap-injector-k8s-tls" "jaeger-injector-k8s-tls") +NS=("linkerd" "linkerd" "linkerd-viz" "linkerd-jaeger") +for idx in "${!SECRETS[@]}"; do \ + kubectl -n "${NS[$idx]}" get secret "${SECRETS[$idx]}" -ojsonpath='{.data.tls\.crt}' | \ + base64 --decode - | \ + step certificate inspect - | \ + grep -iA2 validity; \ +done +``` + +Manually delete these secrets and use `upgrade`/`install` to recreate them: + +```bash +for idx in "${!SECRETS[@]}"; do \ + kubectl -n "${NS[$idx]}" delete secret "${SECRETS[$idx]}"; \ +done + +linkerd upgrade | kubectl apply -f - +linkerd viz install | kubectl apply -f - +linkerd jaeger install | kubectl apply -f - +``` + +The above command will recreate the secrets without restarting Linkerd. + +{{< note >}} +For Helm users, use the `helm upgrade` command to recreate the deleted secrets. + +If you render the helm charts externally and apply them with `kubectl apply` +(e.g. in a CI/CD pipeline), you do not need to delete the secrets manually, +as they wil be overwritten by a new cert and key generated by the helm chart. +{{< /note >}} + +Confirm that the secrets are recreated with new certificates: + +```bash +for idx in "${!SECRETS[@]}"; do \ + kubectl -n "${NS[$idx]}" get secret "${SECRETS[$idx]}" -ojsonpath='{.data.crt\.pem}' | \ + base64 --decode - | \ + step certificate inspect - | \ + grep -iA2 validity; \ +done +``` + +Ensure that Linkerd remains healthy: + +```bash +linkerd check +``` + +Restarting the pods that implement the webhooks and API services is usually not +necessary. But if the cluster is large, or has a high pod churn, it may be +advisable to restart the pods manually, to avoid cascading failures. + +If you observe certificate expiry errors or mismatched CA certs, restart their +pods with: + +```sh +kubectl -n linkerd rollout restart deploy \ + linkerd-proxy-injector \ + linkerd-sp-validator \ + +kubectl -n linkerd-viz rollout restart deploy tap tap-injector +kubectl -n linkerd-jaeger rollout restart deploy jaeger-injector +``` diff --git a/linkerd.io/content/2.11/tasks/securing-your-cluster.md b/linkerd.io/content/2.11/tasks/securing-your-cluster.md new file mode 100644 index 0000000000..1816a3b1b7 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/securing-your-cluster.md @@ -0,0 +1,220 @@ ++++ +title = "Securing Your Cluster" +description = "Best practices for securing your Linkerd installation." +aliases = [ + "../tap-rbac/", +] ++++ + +Linkerd provides powerful introspection into your Kubernetes cluster and +services. Linkerd installations are secure by default. This page illustrates +best practices to enable this introspection in a secure way. + +## Tap + +Linkerd's Viz extension includes Tap support. This feature is available via the +following commands: + +- [`linkerd viz tap`](../../reference/cli/viz/#tap) +- [`linkerd viz top`](../../reference/cli/viz/#top) +- [`linkerd viz profile --tap`](../../reference/cli/viz/#profile) +- [`linkerd viz dashboard`](../../reference/cli/viz/#dashboard) + +Depending on your RBAC setup, you may need to perform additional steps to enable +your user(s) to perform Tap actions. + +{{< note >}} +If you are on GKE, skip to the [GKE section below](#gke). 
+{{< /note >}} + +### Check for Tap access + +Use `kubectl` to determine whether your user is authorized to perform tap +actions. For more information, see the +[Kubernetes docs on authorization](https://kubernetes.io/docs/reference/access-authn-authz/authorization/#checking-api-access). + +To determine if you can watch pods in all namespaces: + +```bash +kubectl auth can-i watch pods.tap.linkerd.io --all-namespaces +``` + +To determine if you can watch deployments in the emojivoto namespace: + +```bash +kubectl auth can-i watch deployments.tap.linkerd.io -n emojivoto +``` + +To determine if a specific user can watch deployments in the emojivoto namespace: + +```bash +kubectl auth can-i watch deployments.tap.linkerd.io -n emojivoto --as $(whoami) +``` + +You can also use the Linkerd CLI's `--as` flag to confirm: + +```bash +$ linkerd viz tap -n linkerd deploy/linkerd-controller --as $(whoami) +Cannot connect to Linkerd Viz: namespaces is forbidden: User "XXXX" cannot list resource "namespaces" in API group "" at the cluster scope +Validate the install with: linkerd viz check +... +``` + +### Enabling Tap access + +If the above commands indicate you need additional access, you can enable access +with as much granularity as you choose. + +#### Granular Tap access + +To enable tap access to all resources in all namespaces, you may bind your user +to the `linkerd-linkerd-tap-admin` ClusterRole, installed by default: + +```bash +$ kubectl describe clusterroles/linkerd-linkerd-viz-tap-admin +Name: linkerd-linkerd-viz-tap-admin +Labels: component=tap + linkerd.io/extension=viz +Annotations: kubectl.kubernetes.io/last-applied-configuration: + {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"labels":{"component=tap... +PolicyRule: + Resources Non-Resource URLs Resource Names Verbs + --------- ----------------- -------------- ----- + *.tap.linkerd.io [] [] [watch] +``` + +{{< note >}} +This ClusterRole name includes the Linkerd Viz namespace, so it may vary if you +installed Viz into a non-default namespace: +`linkerd-[LINKERD_VIZ_NAMESPACE]-tap-admin` +{{< /note >}} + +To bind the `linkerd-linkerd-viz-tap-admin` ClusterRole to a particular user: + +```bash +kubectl create clusterrolebinding \ + $(whoami)-tap-admin \ + --clusterrole=linkerd-linkerd-viz-tap-admin \ + --user=$(whoami) +``` + +You can verify you now have tap access with: + +```bash +$ linkerd viz tap -n linkerd deploy/linkerd-controller --as $(whoami) +req id=3:0 proxy=in src=10.244.0.1:37392 dst=10.244.0.13:9996 tls=not_provided_by_remote :method=GET :authority=10.244.0.13:9996 :path=/ping +... +``` + +#### Cluster admin access + +To simply give your user cluster-admin access: + +```bash +kubectl create clusterrolebinding \ + $(whoami)-cluster-admin \ + --clusterrole=cluster-admin \ + --user=$(whoami) +``` + +{{< note >}} +Not recommended for production, only do this for testing/development. +{{< /note >}} + +### GKE + +Google Kubernetes Engine (GKE) provides access to your Kubernetes cluster via +Google Cloud IAM. See the +[GKE IAM Docs](https://cloud.google.com/kubernetes-engine/docs/how-to/iam) for +more information. + +Because GCloud provides this additional level of access, there are cases where +`kubectl auth can-i` will report you have Tap access when your RBAC user may +not. 
To validate this, check whether your GCloud user has Tap access: + +```bash +$ kubectl auth can-i watch pods.tap.linkerd.io --all-namespaces +yes +``` + +And then validate whether your RBAC user has Tap access: + +```bash +$ kubectl auth can-i watch pods.tap.linkerd.io --all-namespaces --as $(gcloud config get-value account) +no - no RBAC policy matched +``` + +If the second command reported you do not have access, you may enable access +with: + +```bash +kubectl create clusterrolebinding \ + $(whoami)-tap-admin \ + --clusterrole=linkerd-linkerd-viz-tap-admin \ + --user=$(gcloud config get-value account) +``` + +To simply give your user cluster-admin access: + +```bash +kubectl create clusterrolebinding \ + $(whoami)-cluster-admin \ + --clusterrole=cluster-admin \ + --user=$(gcloud config get-value account) +``` + +{{< note >}} +Not recommended for production, only do this for testing/development. +{{< /note >}} + +### Linkerd Dashboard tap access + +By default, the [Linkerd dashboard](../../features/dashboard/) has the RBAC +privileges necessary to tap resources. + +To confirm: + +```bash +$ kubectl auth can-i watch pods.tap.linkerd.io --all-namespaces --as system:serviceaccount:linkerd-viz:web +yes +``` + +This access is enabled via a `linkerd-linkerd-viz-web-admin` ClusterRoleBinding: + +```bash +$ kubectl describe clusterrolebindings/linkerd-linkerd-viz-web-admin +Name: linkerd-linkerd-viz-web-admin +Labels: component=web + linkerd.io/extensions=viz +Annotations: kubectl.kubernetes.io/last-applied-configuration: + {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{},"labels":{"component=web... +Role: + Kind: ClusterRole + Name: linkerd-linkerd-viz-tap-admin +Subjects: + Kind Name Namespace + ---- ---- --------- + ServiceAccount web linkerd-viz +``` + +If you would like to restrict the Linkerd dashboard's tap access. You may +install Linkerd viz with the `--set dashboard.restrictPrivileges` flag: + +```bash +linkerd viz install --set dashboard.restrictPrivileges +``` + +This will omit the `linkerd-linkerd-web-admin` ClusterRoleBinding. If you have +already installed Linkerd, you may simply delete the ClusterRoleBinding +manually: + +```bash +kubectl delete clusterrolebindings/linkerd-linkerd-viz-web-admin +``` + +To confirm: + +```bash +$ kubectl auth can-i watch pods.tap.linkerd.io --all-namespaces --as system:serviceaccount:linkerd-viz:web +no +``` diff --git a/linkerd.io/content/2.11/tasks/setting-up-service-profiles.md b/linkerd.io/content/2.11/tasks/setting-up-service-profiles.md new file mode 100644 index 0000000000..e2b8364bec --- /dev/null +++ b/linkerd.io/content/2.11/tasks/setting-up-service-profiles.md @@ -0,0 +1,148 @@ ++++ +title = "Setting Up Service Profiles" +description = "Create a service profile that provides more details for Linkerd to build on." ++++ + +[Service profiles](../../features/service-profiles/) provide Linkerd additional +information about a service and how to handle requests for a service. + +When an HTTP (not HTTPS) request is received by a Linkerd proxy, +the `destination service` of that request is identified. If a +service profile for that destination service exists, then that +service profile is used to +to provide [per-route metrics](../getting-per-route-metrics/), +[retries](../configuring-retries/) and +[timeouts](../configuring-timeouts/). + +The `destination service` for a request is computed by selecting +the value of the first header to exist of, `l5d-dst-override`, +`:authority`, and `Host`. 
The port component, if included and +including the colon, is stripped. That value is mapped to the fully +qualified DNS name. When the `destination service` matches the +name of a service profile in the namespace of the sender or the +receiver, Linkerd will use that to provide [per-route +metrics](../getting-per-route-metrics/), +[retries](../configuring-retries/) and +[timeouts](../configuring-timeouts/). + +There are times when you may need to define a service profile for +a service which resides in a namespace that you do not control. To +accomplish this, simply create a service profile as before, but +edit the namespace of the service profile to the namespace of the +pod which is calling the service. When Linkerd proxies a request +to a service, a service profile in the source namespace will take +priority over a service profile in the destination namespace. + +Your `destination service` may be a [ExternalName +service](https://kubernetes.io/docs/concepts/services-networking/service/#externalname). +In that case, use the `spec.metadata.name` and the +`spec.metadata.namespace' values to name your ServiceProfile. For +example, + +```yaml +apiVersion: v1 +kind: Service +metadata: + name: my-service + namespace: prod +spec: + type: ExternalName + externalName: my.database.example.com +``` + +use the name `my-service.prod.svc.cluster.local` for the ServiceProfile. + +Note that at present, you cannot view statistics gathered for routes +in this ServiceProfile in the web dashboard. You can get the +statistics using the CLI. + +For a complete demo walkthrough, check out the +[books](../books/#service-profiles) demo. + +There are a couple different ways to use `linkerd profile` to create service +profiles. + +{{< pagetoc >}} + +Requests which have been associated with a route will have a `rt_route` +annotation. To manually verify if the requests are being associated correctly, +run `tap` on your own deployment: + +```bash +linkerd viz tap -o wide | grep req +``` + +The output will stream the requests that `deploy/webapp` is receiving in real +time. A sample is: + +```bash +req id=0:1 proxy=in src=10.1.3.76:57152 dst=10.1.3.74:7000 tls=disabled :method=POST :authority=webapp.default:7000 :path=/books/2878/edit src_res=deploy/traffic src_ns=foobar dst_res=deploy/webapp dst_ns=default rt_route=POST /books/{id}/edit +``` + +Conversely, if `rt_route` is not present, a request has *not* been associated +with any route. Try running: + +```bash +linkerd viz tap -o wide | grep req | grep -v rt_route +``` + +## Swagger + +If you have an [OpenAPI (Swagger)](https://swagger.io/docs/specification/about/) +spec for your service, you can use the `--open-api` flag to generate a service +profile from the OpenAPI spec file. + +```bash +linkerd profile --open-api webapp.swagger webapp +``` + +This generates a service profile from the `webapp.swagger` OpenAPI spec file +for the `webapp` service. The resulting service profile can be piped directly +to `kubectl apply` and will be installed into the service's namespace. + +```bash +linkerd profile --open-api webapp.swagger webapp | kubectl apply -f - +``` + +## Protobuf + +If you have a [protobuf](https://developers.google.com/protocol-buffers/) format +for your service, you can use the `--proto` flag to generate a service profile. + +```bash +linkerd profile --proto web.proto web-svc +``` + +This generates a service profile from the `web.proto` format file for the +`web-svc` service. 
The resulting service profile can be piped directly to +`kubectl apply` and will be installed into the service's namespace. + +## Auto-Creation + +It is common to not have an OpenAPI spec or a protobuf format. You can also +generate service profiles from watching live traffic. This is based off tap data +and is a great way to understand what service profiles can do for you. To start +this generation process, you can use the `--tap` flag: + +```bash +linkerd viz profile -n emojivoto web-svc --tap deploy/web --tap-duration 10s +``` + +This generates a service profile from the traffic observed to +`deploy/web` over the 10 seconds that this command is running. The resulting service +profile can be piped directly to `kubectl apply` and will be installed into the +service's namespace. + +## Template + +Alongside all the methods for automatically creating service profiles, you can +get a template that allows you to add routes manually. To generate the template, +run: + +```bash +linkerd profile -n emojivoto web-svc --template +``` + +This generates a service profile template with examples that can be manually +updated. Once you've updated the service profile, use `kubectl apply` to get it +installed into the service's namespace on your cluster. diff --git a/linkerd.io/content/2.11/tasks/troubleshooting.md b/linkerd.io/content/2.11/tasks/troubleshooting.md new file mode 100644 index 0000000000..68e4f5f80f --- /dev/null +++ b/linkerd.io/content/2.11/tasks/troubleshooting.md @@ -0,0 +1,2476 @@ ++++ +title = "Troubleshooting" +description = "Troubleshoot issues with your Linkerd installation." ++++ + +This section provides resolution steps for common problems reported with the +`linkerd check` command. + +## The "pre-kubernetes-cluster-setup" checks {#pre-k8s-cluster} + +These checks only run when the `--pre` flag is set. This flag is intended for +use prior to running `linkerd install`, to verify your cluster is prepared for +installation. + +### √ control plane namespace does not already exist {#pre-ns} + +Example failure: + +```bash +× control plane namespace does not already exist + The "linkerd" namespace already exists +``` + +By default `linkerd install` will create a `linkerd` namespace. Prior to +installation, that namespace should not exist. To check with a different +namespace, run: + +```bash +linkerd check --pre --linkerd-namespace linkerd-test +``` + +### √ can create Kubernetes resources {#pre-k8s-cluster-k8s} + +The subsequent checks in this section validate whether you have permission to +create the Kubernetes resources required for Linkerd installation, specifically: + +```bash +√ can create Namespaces +√ can create ClusterRoles +√ can create ClusterRoleBindings +√ can create CustomResourceDefinitions +``` + +## The "pre-kubernetes-setup" checks {#pre-k8s} + +These checks only run when the `--pre` flag is set This flag is intended for use +prior to running `linkerd install`, to verify you have the correct RBAC +permissions to install Linkerd. + +```bash +√ can create Namespaces +√ can create ClusterRoles +√ can create ClusterRoleBindings +√ can create CustomResourceDefinitions +√ can create PodSecurityPolicies +√ can create ServiceAccounts +√ can create Services +√ can create Deployments +√ can create ConfigMaps +``` + +### √ no clock skew detected {#pre-k8s-clock-skew} + +This check detects any differences between the system running the +`linkerd install` command and the Kubernetes nodes (known as clock skew). 
Having +a substantial clock skew can cause TLS validation problems because a node may +determine that a TLS certificate is expired when it should not be, or vice +versa. + +Linkerd version edge-20.3.4 and later check for a difference of at most 5 +minutes and older versions of Linkerd (including stable-2.7) check for a +difference of at most 1 minute. If your Kubernetes node heartbeat interval is +longer than this difference, you may experience false positives of this check. +The default node heartbeat interval was increased to 5 minutes in Kubernetes +1.17 meaning that users running Linkerd versions prior to edge-20.3.4 on +Kubernetes 1.17 or later are likely to experience these false positives. If this +is the case, you can upgrade to Linkerd edge-20.3.4 or later. If you choose to +ignore this error, we strongly recommend that you verify that your system clocks +are consistent. + +## The "pre-kubernetes-capability" checks {#pre-k8s-capability} + +These checks only run when the `--pre` flag is set. This flag is intended for +use prior to running `linkerd install`, to verify you have the correct +Kubernetes capability permissions to install Linkerd. + +### √ has NET_ADMIN capability {#pre-k8s-cluster-net-admin} + +Example failure: + +```bash +× has NET_ADMIN capability + found 3 PodSecurityPolicies, but none provide NET_ADMIN + see https://linkerd.io/checks/#pre-k8s-cluster-net-admin for hints +``` + +Linkerd installation requires the `NET_ADMIN` Kubernetes capability, to allow +for modification of iptables. + +For more information, see the Kubernetes documentation on +[Pod Security Policies](https://kubernetes.io/docs/concepts/policy/pod-security-policy/), +[Security Contexts](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/), +and the +[man page on Linux Capabilities](https://www.man7.org/linux/man-pages/man7/capabilities.7.html). + +### √ has NET_RAW capability {#pre-k8s-cluster-net-raw} + +Example failure: + +```bash +× has NET_RAW capability + found 3 PodSecurityPolicies, but none provide NET_RAW + see https://linkerd.io/checks/#pre-k8s-cluster-net-raw for hints +``` + +Linkerd installation requires the `NET_RAW` Kubernetes capability, to allow for +modification of iptables. + +For more information, see the Kubernetes documentation on +[Pod Security Policies](https://kubernetes.io/docs/concepts/policy/pod-security-policy/), +[Security Contexts](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/), +and the +[man page on Linux Capabilities](https://www.man7.org/linux/man-pages/man7/capabilities.7.html). + +## The "pre-linkerd-global-resources" checks {#pre-l5d-existence} + +These checks only run when the `--pre` flag is set. This flag is intended for +use prior to running `linkerd install`, to verify you have not already installed +the Linkerd control plane. 
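+
+If any check in this group fails, a previous (possibly partial) Linkerd
+installation has most likely left cluster-wide resources behind. A quick way
+to look for such leftovers is sketched below; it assumes the
+`linkerd.io/control-plane-ns` label that Linkerd normally sets on the
+resources it creates:
+
+```bash
+# Any output here points at resources left over from an earlier installation
+kubectl get clusterroles,clusterrolebindings,customresourcedefinitions \
+  -l linkerd.io/control-plane-ns
+kubectl get mutatingwebhookconfigurations,validatingwebhookconfigurations \
+  -l linkerd.io/control-plane-ns
+```
+
+The checks in this group are: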
+ +```bash +√ no ClusterRoles exist +√ no ClusterRoleBindings exist +√ no CustomResourceDefinitions exist +√ no MutatingWebhookConfigurations exist +√ no ValidatingWebhookConfigurations exist +√ no PodSecurityPolicies exist +``` + +## The "pre-kubernetes-single-namespace-setup" checks {#pre-single} + +If you do not expect to have the permission for a full cluster install, try the +`--single-namespace` flag, which validates if Linkerd can be installed in a +single namespace, with limited cluster access: + +```bash +linkerd check --pre --single-namespace +``` + +### √ control plane namespace exists {#pre-single-ns} + +```bash +× control plane namespace exists + The "linkerd" namespace does not exist +``` + +In `--single-namespace` mode, `linkerd check` assumes that the installer does +not have permission to create a namespace, so the installation namespace must +already exist. + +By default the `linkerd` namespace is used. To use a different namespace run: + +```bash +linkerd check --pre --single-namespace --linkerd-namespace linkerd-test +``` + +### √ can create Kubernetes resources {#pre-single-k8s} + +The subsequent checks in this section validate whether you have permission to +create the Kubernetes resources required for Linkerd `--single-namespace` +installation, specifically: + +```bash +√ can create Roles +√ can create RoleBindings +``` + +For more information on cluster access, see the +[GKE Setup](../install/#gke) section above. + +## The "kubernetes-api" checks {#k8s-api} + +Example failures: + +```bash +× can initialize the client + error configuring Kubernetes API client: stat badconfig: no such file or directory +× can query the Kubernetes API + Get https://8.8.8.8/version: dial tcp 8.8.8.8:443: i/o timeout +``` + +Ensure that your system is configured to connect to a Kubernetes cluster. +Validate that the `KUBECONFIG` environment variable is set properly, and/or +`~/.kube/config` points to a valid cluster. + +For more information see these pages in the Kubernetes Documentation: + +- [Accessing Clusters](https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster/) +- [Configure Access to Multiple Clusters](https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/) + +Also verify that these command works: + +```bash +kubectl config view +kubectl cluster-info +kubectl version +``` + +Another example failure: + +```bash +✘ can query the Kubernetes API + Get REDACTED/version: x509: certificate signed by unknown authority +``` + +As an (unsafe) workaround to this, you may try: + +```bash +kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=true \ + --server=${KUBE_CONTEXT} +``` + +## The "kubernetes-version" checks {#k8s-version} + +### √ is running the minimum Kubernetes API version {#k8s-version-api} + +Example failure: + +```bash +× is running the minimum Kubernetes API version + Kubernetes is on version [1.7.16], but version [1.13.0] or more recent is required +``` + +Linkerd requires at least version `1.13.0`. Verify your cluster version with: + +```bash +kubectl version +``` + +### √ is running the minimum kubectl version {#kubectl-version} + +Example failure: + +```bash +× is running the minimum kubectl version + kubectl is on version [1.9.1], but version [1.13.0] or more recent is required + see https://linkerd.io/checks/#kubectl-version for hints +``` + +Linkerd requires at least version `1.13.0`. 
Verify your kubectl version with: + +```bash +kubectl version --client --short +``` + +To fix please update kubectl version. + +For more information on upgrading Kubernetes, see the page in the Kubernetes +Documentation. + +## The "linkerd-config" checks {#l5d-config} + +This category of checks validates that Linkerd's cluster-wide RBAC and related +resources have been installed. These checks run via a default `linkerd check`, +and also in the context of a multi-stage setup, for example: + +```bash +# install cluster-wide resources (first stage) +linkerd install config | kubectl apply -f - + +# validate successful cluster-wide resources installation +linkerd check config + +# install Linkerd control plane +linkerd install control-plane | kubectl apply -f - + +# validate successful control-plane installation +linkerd check +``` + +### √ control plane Namespace exists {#l5d-existence-ns} + +Example failure: + +```bash +× control plane Namespace exists + The "foo" namespace does not exist + see https://linkerd.io/checks/#l5d-existence-ns for hints +``` + +Ensure the Linkerd control plane namespace exists: + +```bash +kubectl get ns +``` + +The default control plane namespace is `linkerd`. If you installed Linkerd into +a different namespace, specify that in your check command: + +```bash +linkerd check --linkerd-namespace linkerdtest +``` + +### √ control plane ClusterRoles exist {#l5d-existence-cr} + +Example failure: + +```bash +× control plane ClusterRoles exist + missing ClusterRoles: linkerd-linkerd-controller + see https://linkerd.io/checks/#l5d-existence-cr for hints +``` + +Ensure the Linkerd ClusterRoles exist: + +```bash +$ kubectl get clusterroles | grep linkerd +linkerd-linkerd-controller 9d +linkerd-linkerd-identity 9d +linkerd-linkerd-proxy-injector 20d +linkerd-linkerd-sp-validator 9d +``` + +Also ensure you have permission to create ClusterRoles: + +```bash +$ kubectl auth can-i create clusterroles +yes +``` + +### √ control plane ClusterRoleBindings exist {#l5d-existence-crb} + +Example failure: + +```bash +× control plane ClusterRoleBindings exist + missing ClusterRoleBindings: linkerd-linkerd-controller + see https://linkerd.io/checks/#l5d-existence-crb for hints +``` + +Ensure the Linkerd ClusterRoleBindings exist: + +```bash +$ kubectl get clusterrolebindings | grep linkerd +linkerd-linkerd-controller 9d +linkerd-linkerd-identity 9d +linkerd-linkerd-proxy-injector 20d +linkerd-linkerd-sp-validator 9d +``` + +Also ensure you have permission to create ClusterRoleBindings: + +```bash +$ kubectl auth can-i create clusterrolebindings +yes +``` + +### √ control plane ServiceAccounts exist {#l5d-existence-sa} + +Example failure: + +```bash +× control plane ServiceAccounts exist + missing ServiceAccounts: linkerd-controller + see https://linkerd.io/checks/#l5d-existence-sa for hints +``` + +Ensure the Linkerd ServiceAccounts exist: + +```bash +$ kubectl -n linkerd get serviceaccounts +NAME SECRETS AGE +default 1 14m +linkerd-controller 1 14m +linkerd-destination 1 14m +linkerd-heartbeat 1 14m +linkerd-identity 1 14m +linkerd-proxy-injector 1 14m +linkerd-sp-validator 1 13m +``` + +Also ensure you have permission to create ServiceAccounts in the Linkerd +namespace: + +```bash +$ kubectl -n linkerd auth can-i create serviceaccounts +yes +``` + +### √ control plane CustomResourceDefinitions exist {#l5d-existence-crd} + +Example failure: + +```bash +× control plane CustomResourceDefinitions exist + missing CustomResourceDefinitions: serviceprofiles.linkerd.io + see 
https://linkerd.io/checks/#l5d-existence-crd for hints +``` + +Ensure the Linkerd CRD exists: + +```bash +$ kubectl get customresourcedefinitions +NAME CREATED AT +serviceprofiles.linkerd.io 2019-04-25T21:47:31Z +``` + +Also ensure you have permission to create CRDs: + +```bash +$ kubectl auth can-i create customresourcedefinitions +yes +``` + +### √ control plane MutatingWebhookConfigurations exist {#l5d-existence-mwc} + +Example failure: + +```bash +× control plane MutatingWebhookConfigurations exist + missing MutatingWebhookConfigurations: linkerd-proxy-injector-webhook-config + see https://linkerd.io/checks/#l5d-existence-mwc for hints +``` + +Ensure the Linkerd MutatingWebhookConfigurations exists: + +```bash +$ kubectl get mutatingwebhookconfigurations | grep linkerd +linkerd-proxy-injector-webhook-config 2019-07-01T13:13:26Z +``` + +Also ensure you have permission to create MutatingWebhookConfigurations: + +```bash +$ kubectl auth can-i create mutatingwebhookconfigurations +yes +``` + +### √ control plane ValidatingWebhookConfigurations exist {#l5d-existence-vwc} + +Example failure: + +```bash +× control plane ValidatingWebhookConfigurations exist + missing ValidatingWebhookConfigurations: linkerd-sp-validator-webhook-config + see https://linkerd.io/checks/#l5d-existence-vwc for hints +``` + +Ensure the Linkerd ValidatingWebhookConfiguration exists: + +```bash +$ kubectl get validatingwebhookconfigurations | grep linkerd +linkerd-sp-validator-webhook-config 2019-07-01T13:13:26Z +``` + +Also ensure you have permission to create ValidatingWebhookConfigurations: + +```bash +$ kubectl auth can-i create validatingwebhookconfigurations +yes +``` + +### √ control plane PodSecurityPolicies exist {#l5d-existence-psp} + +Example failure: + +```bash +× control plane PodSecurityPolicies exist + missing PodSecurityPolicies: linkerd-linkerd-control-plane + see https://linkerd.io/checks/#l5d-existence-psp for hints +``` + +Ensure the Linkerd PodSecurityPolicy exists: + +```bash +$ kubectl get podsecuritypolicies | grep linkerd +linkerd-linkerd-control-plane false NET_ADMIN,NET_RAW RunAsAny RunAsAny MustRunAs MustRunAs true configMap,emptyDir,secret,projected,downwardAPI,persistentVolumeClaim +``` + +Also ensure you have permission to create PodSecurityPolicies: + +```bash +$ kubectl auth can-i create podsecuritypolicies +yes +``` + +## The "linkerd-existence" checks {#l5d-existence} + +### √ 'linkerd-config' config map exists {#l5d-existence-linkerd-config} + +Example failure: + +```bash +× 'linkerd-config' config map exists + missing ConfigMaps: linkerd-config + see https://linkerd.io/checks/#l5d-existence-linkerd-config for hints +``` + +Ensure the Linkerd ConfigMap exists: + +```bash +$ kubectl -n linkerd get configmap/linkerd-config +NAME DATA AGE +linkerd-config 3 61m +``` + +Also ensure you have permission to create ConfigMaps: + +```bash +$ kubectl -n linkerd auth can-i create configmap +yes +``` + +### √ control plane replica sets are ready {#l5d-existence-replicasets} + +This failure occurs when one of Linkerd's ReplicaSets fails to schedule a pod. + +For more information, see the Kubernetes documentation on +[Failed Deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#failed-deployment). + +### √ no unschedulable pods {#l5d-existence-unschedulable-pods} + +Example failure: + +```bash +× no unschedulable pods + linkerd-prometheus-6b668f774d-j8ncr: 0/1 nodes are available: 1 Insufficient cpu. 
+ see https://linkerd.io/checks/#l5d-existence-unschedulable-pods for hints +``` + +For more information, see the Kubernetes documentation on the +[Unschedulable Pod Condition](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-conditions). + +### √ controller pod is running {#l5d-existence-controller} + +Example failure: + +```bash +× controller pod is running + No running pods for "linkerd-controller" +``` + +Note, it takes a little bit for pods to be scheduled, images to be pulled and +everything to start up. If this is a permanent error, you'll want to validate +the state of the controller pod with: + +```bash +$ kubectl -n linkerd get po --selector linkerd.io/control-plane-component=controller +NAME READY STATUS RESTARTS AGE +linkerd-controller-7bb8ff5967-zg265 4/4 Running 0 40m +``` + +Check the controller's logs with: + +```bash +kubectl logs -n linkerd linkerd-controller-7bb8ff5967-zg265 public-api +``` + +## The "linkerd-identity" checks {#l5d-identity} + +### √ certificate config is valid {#l5d-identity-cert-config-valid} + +Example failures: + +```bash +× certificate config is valid + key ca.crt containing the trust anchors needs to exist in secret linkerd-identity-issuer if --identity-external-issuer=true + see https://linkerd.io/checks/#l5d-identity-cert-config-valid +``` + +```bash +× certificate config is valid + key crt.pem containing the issuer certificate needs to exist in secret linkerd-identity-issuer if --identity-external-issuer=false + see https://linkerd.io/checks/#l5d-identity-cert-config-valid +``` + +Ensure that your `linkerd-identity-issuer` secret contains the correct keys for +the `scheme` that Linkerd is configured with. If the scheme is +`kubernetes.io/tls` your secret should contain the `tls.crt`, `tls.key` and +`ca.crt` keys. Alternatively if your scheme is `linkerd.io/tls`, the required +keys are `crt.pem` and `key.pem`. + +### √ trust roots are using supported crypto algorithm {#l5d-identity-trustAnchors-use-supported-crypto} + +Example failure: + +```bash +× trust roots are using supported crypto algorithm + Invalid roots: + * 165223702412626077778653586125774349756 identity.linkerd.cluster.local must use P-256 curve for public key, instead P-521 was used + see https://linkerd.io/checks/#l5d-identity-trustAnchors-use-supported-crypto +``` + +You need to ensure that all of your roots use ECDSA P-256 for their public key +algorithm. + +### √ trust roots are within their validity period {#l5d-identity-trustAnchors-are-time-valid} + +Example failure: + +```bash +× trust roots are within their validity period + Invalid roots: + * 199607941798581518463476688845828639279 identity.linkerd.cluster.local not valid anymore. Expired on 2019-12-19T13:08:18Z + see https://linkerd.io/checks/#l5d-identity-trustAnchors-are-time-valid for hints +``` + +Failures of such nature indicate that your roots have expired. If that is the +case you will have to update both the root and issuer certificates at once. You +can follow the process outlined in +[Replacing Expired Certificates](../replacing_expired_certificates/) to +get your cluster back to a stable state. 
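+
+To confirm the expiry dates yourself before rotating, you can inspect the
+certificates directly. The sketch below assumes the default `linkerd.io/tls`
+scheme, where the issuer certificate lives under the `crt.pem` key of the
+`linkerd-identity-issuer` secret, and that you still have the trust anchor PEM
+file (here called `ca.crt`) that was supplied at install time:
+
+```bash
+# Print the validity window of the issuer certificate stored in the
+# linkerd-identity-issuer secret.
+kubectl -n linkerd get secret linkerd-identity-issuer \
+    -o jsonpath='{.data.crt\.pem}' \
+    | base64 -d \
+    | openssl x509 -noout -subject -startdate -enddate
+
+# The same openssl invocation works against a saved trust anchor file.
+openssl x509 -in ca.crt -noout -subject -startdate -enddate
+```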
+ +### √ trust roots are valid for at least 60 days {#l5d-identity-trustAnchors-not-expiring-soon} + +Example warnings: + +```bash +‼ trust roots are valid for at least 60 days + Roots expiring soon: + * 66509928892441932260491975092256847205 identity.linkerd.cluster.local will expire on 2019-12-19T13:30:57Z + see https://linkerd.io/checks/#l5d-identity-trustAnchors-not-expiring-soon for hints +``` + +This warning indicates that the expiry of some of your roots is approaching. In +order to address this problem without incurring downtime, you can follow the +process outlined in +[Rotating your identity certificates](../rotating_identity_certificates/). + +### √ issuer cert is using supported crypto algorithm {#l5d-identity-issuer-cert-uses-supported-crypto} + +Example failure: + +```bash +× issuer cert is using supported crypto algorithm + issuer certificate must use P-256 curve for public key, instead P-521 was used + see https://linkerd.io/checks/#5d-identity-issuer-cert-uses-supported-crypto for hints +``` + +You need to ensure that your issuer certificate uses ECDSA P-256 for its public +key algorithm. You can refer to +[Generating your own mTLS root certificates](../generate-certificates/#generating-the-certificates-with-step) +to see how you can generate certificates that will work with Linkerd. + +### √ issuer cert is within its validity period {#l5d-identity-issuer-cert-is-time-valid} + +Example failure: + +```bash +× issuer cert is within its validity period + issuer certificate is not valid anymore. Expired on 2019-12-19T13:35:49Z + see https://linkerd.io/checks/#l5d-identity-issuer-cert-is-time-valid +``` + +This failure indicates that your issuer certificate has expired. In order to +bring your cluster back to a valid state, follow the process outlined in +[Replacing Expired Certificates](../replacing_expired_certificates/). + +### √ issuer cert is valid for at least 60 days {#l5d-identity-issuer-cert-not-expiring-soon} + +Example warning: + +```bash +‼ issuer cert is valid for at least 60 days + issuer certificate will expire on 2019-12-19T13:35:49Z + see https://linkerd.io/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints +``` + +This warning means that your issuer certificate is expiring soon. If you do not +rely on external certificate management solution such as `cert-manager`, you can +follow the process outlined in +[Rotating your identity certificates](../rotating_identity_certificates/) + +### √ issuer cert is issued by the trust root {#l5d-identity-issuer-cert-issued-by-trust-anchor} + +Example error: + +```bash +× issuer cert is issued by the trust root + x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "identity.linkerd.cluster.local") + see https://linkerd.io/checks/#l5d-identity-issuer-cert-issued-by-trust-anchor for hints +``` + +This error indicates that the issuer certificate that is in the +`linkerd-identity-issuer` secret cannot be verified with any of the roots that +Linkerd has been configured with. Using the CLI install process, this should +never happen. If Helm was used for installation or the issuer certificates are +managed by a malfunctioning certificate management solution, it is possible for +the cluster to end up in such an invalid state. 
At that point, the best thing to do is
+to use the upgrade command to update your certificates:
+
+```bash
+linkerd upgrade \
+    --identity-issuer-certificate-file=./your-new-issuer.crt \
+    --identity-issuer-key-file=./your-new-issuer.key \
+    --identity-trust-anchors-file=./your-new-roots.crt \
+    --force | kubectl apply -f -
+```
+
+Once the upgrade process is over, the output of `linkerd check --proxy` should
+be:
+
+```bash
+linkerd-identity
+----------------
+√ certificate config is valid
+√ trust roots are using supported crypto algorithm
+√ trust roots are within their validity period
+√ trust roots are valid for at least 60 days
+√ issuer cert is using supported crypto algorithm
+√ issuer cert is within its validity period
+√ issuer cert is valid for at least 60 days
+√ issuer cert is issued by the trust root
+
+linkerd-identity-data-plane
+---------------------------
+√ data plane proxies certificate match CA
+```
+
+## The "linkerd-webhooks-and-apisvc-tls" checks {#l5d-webhook}
+
+### √ proxy-injector webhook has valid cert {#l5d-proxy-injector-webhook-cert-valid}
+
+Example failure:
+
+```bash
+× proxy-injector webhook has valid cert
+    secrets "linkerd-proxy-injector-tls" not found
+    see https://linkerd.io/checks/#l5d-proxy-injector-webhook-cert-valid for hints
+```
+
+Ensure that the `linkerd-proxy-injector-k8s-tls` secret exists and contains the
+appropriate `tls.crt` and `tls.key` data entries. For versions before 2.9, the
+secret is named `linkerd-proxy-injector-tls` and it should contain the `crt.pem`
+and `key.pem` data entries.
+
+```bash
+× proxy-injector webhook has valid cert
+    cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not linkerd-proxy-injector.linkerd.svc
+    see https://linkerd.io/checks/#l5d-proxy-injector-webhook-cert-valid for hints
+```
+
+Here you need to make sure the certificate was issued specifically for
+`linkerd-proxy-injector.linkerd.svc`.
+
+### √ proxy-injector cert is valid for at least 60 days {#l5d-proxy-injector-webhook-cert-not-expiring-soon}
+
+Example failure:
+
+```bash
+‼ proxy-injector cert is valid for at least 60 days
+    certificate will expire on 2020-11-07T17:00:07Z
+    see https://linkerd.io/checks/#l5d-proxy-injector-webhook-cert-not-expiring-soon for hints
+```
+
+This warning indicates that the expiry of the proxy-injector webhook
+cert is approaching. In order to address this
+problem without incurring downtime, you can follow the process outlined in
+[Automatically Rotating your webhook TLS Credentials](../automatically-rotating-webhook-tls-credentials/).
+
+### √ sp-validator webhook has valid cert {#l5d-sp-validator-webhook-cert-valid}
+
+Example failure:
+
+```bash
+× sp-validator webhook has valid cert
+    secrets "linkerd-sp-validator-tls" not found
+    see https://linkerd.io/checks/#l5d-sp-validator-webhook-cert-valid for hints
+```
+
+Ensure that the `linkerd-sp-validator-k8s-tls` secret exists and contains the
+appropriate `tls.crt` and `tls.key` data entries. For versions before 2.9, the
+secret is named `linkerd-sp-validator-tls` and it should contain the `crt.pem`
+and `key.pem` data entries.
+
+```bash
+× sp-validator webhook has valid cert
+    cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not linkerd-sp-validator.linkerd.svc
+    see https://linkerd.io/checks/#l5d-sp-validator-webhook-cert-valid for hints
+```
+
+Here you need to make sure the certificate was issued specifically for
+`linkerd-sp-validator.linkerd.svc`.
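+
+One way to verify both conditions at once (that the secret exists and that the
+certificate it holds was issued for the right DNS name) is to decode the
+certificate and inspect its subject alternative names. A minimal sketch,
+assuming the 2.9+ secret name shown above:
+
+```bash
+# Decode the sp-validator webhook certificate and confirm that
+# linkerd-sp-validator.linkerd.svc appears among its DNS SANs.
+kubectl -n linkerd get secret linkerd-sp-validator-k8s-tls \
+    -o jsonpath='{.data.tls\.crt}' \
+    | base64 -d \
+    | openssl x509 -noout -text \
+    | grep -A 1 'Subject Alternative Name'
+```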
+ +### √ sp-validator cert is valid for at least 60 days {#l5d-sp-validator-webhook-cert-not-expiring-soon} + +Example failure: + +```bash +‼ sp-validator cert is valid for at least 60 days + certificate will expire on 2020-11-07T17:00:07Z + see https://linkerd.io/checks/#l5d-sp-validator-webhook-cert-not-expiring-soon for hints +``` + +This warning indicates that the expiry of sp-validator webhook +cert is approaching. In order to address this +problem without incurring downtime, you can follow the process outlined in +[Automatically Rotating your webhook TLS Credentials](../automatically-rotating-webhook-tls-credentials/). + +### √ policy-validator webhook has valid cert {#l5d-policy-validator-webhook-cert-valid} + +Example failure: + +```bash +× policy-validator webhook has valid cert + secrets "linkerd-policy-validator-tls" not found + see https://linkerd.io/checks/#l5d-policy-validator-webhook-cert-valid for hints +``` + +Ensure that the `linkerd-policy-validator-k8s-tls` secret exists and contains the +appropriate `tls.crt` and `tls.key` data entries. + +```bash +× policy-validator webhook has valid cert + cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not linkerd-policy-validator.linkerd.svc + see https://linkerd.io/checks/#l5d-policy-validator-webhook-cert-valid for hints +``` + +Here you need to make sure the certificate was issued specifically for +`linkerd-policy-validator.linkerd.svc`. + +### √ policy-validator cert is valid for at least 60 days {#l5d-policy-validator-webhook-cert-not-expiring-soon} + +Example failure: + +```bash +‼ policy-validator cert is valid for at least 60 days + certificate will expire on 2020-11-07T17:00:07Z + see https://linkerd.io/checks/#l5d-policy-validator-webhook-cert-not-expiring-soon for hints +``` + +This warning indicates that the expiry of policy-validator webhook +cert is approaching. In order to address this +problem without incurring downtime, you can follow the process outlined in +[Automatically Rotating your webhook TLS Credentials](../automatically-rotating-webhook-tls-credentials/). + +## The "linkerd-identity-data-plane" checks {#l5d-identity-data-plane} + +### √ data plane proxies certificate match CA {#l5d-identity-data-plane-proxies-certs-match-ca} + +Example warning: + +```bash +‼ data plane proxies certificate match CA + Some pods do not have the current trust bundle and must be restarted: + * emojivoto/emoji-d8d7d9c6b-8qwfx + * emojivoto/vote-bot-588499c9f6-zpwz6 + * emojivoto/voting-8599548fdc-6v64k + see https://linkerd.io/checks/{#l5d-identity-data-plane-proxies-certs-match-ca for hints +``` + +Observing this warning indicates that some of your meshed pods have proxies that +have stale certificates. This is most likely to happen during `upgrade` +operations that deal with cert rotation. In order to solve the problem you can +use `rollout restart` to restart the pods in question. That should cause them to +pick the correct certs from the `linkerd-config` configmap. When `upgrade` is +performed using the `--identity-trust-anchors-file` flag to modify the roots, +the Linkerd components are restarted. 
While this operation is in progress the +`check --proxy` command may output a warning, pertaining to the Linkerd +components: + +```bash +‼ data plane proxies certificate match CA + Some pods do not have the current trust bundle and must be restarted: + * linkerd/linkerd-sp-validator-75f9d96dc-rch4x + * linkerd-viz/tap-68d8bbf64-mpzgb + * linkerd-viz/web-849f74b7c6-qlhwc + see https://linkerd.io/checks/{#l5d-identity-data-plane-proxies-certs-match-ca for hints +``` + +If that is the case, simply wait for the `upgrade` operation to complete. The +stale pods should terminate and be replaced by new ones, configured with the +correct certificates. + +## The "linkerd-api" checks {#l5d-api} + +### √ control plane pods are ready {#l5d-api-control-ready} + +Example failure: + +```bash +× control plane pods are ready + No running pods for "linkerd-sp-validator" +``` + +Verify the state of the control plane pods with: + +```bash +$ kubectl -n linkerd get po +NAME READY STATUS RESTARTS AGE +linkerd-controller-78957587d6-4qfp2 2/2 Running 1 12m +linkerd-destination-5fd7b5d466-szgqm 2/2 Running 1 12m +linkerd-identity-54df78c479-hbh5m 2/2 Running 0 12m +linkerd-proxy-injector-67f8cf65f7-4tvt5 2/2 Running 1 12m +linkerd-sp-validator-59796bdccc-95rn5 2/2 Running 0 12m +``` + +### √ cluster networks contains all node podCIDRs {#l5d-cluster-networks-cidr} + +Example failure: + +```bash +× cluster networks contains all node podCIDRs + node has podCIDR(s) [10.244.0.0/24] which are not contained in the Linkerd clusterNetworks. + Try installing linkerd via --set clusterNetworks=10.244.0.0/24 + see https://linkerd.io/2/checks/#l5d-cluster-networks-cidr for hints +``` + +Linkerd has a `clusterNetworks` setting which allows it to differentiate between +intra-cluster and egress traffic. This warning indicates that the cluster has +a podCIDR which is not included in Linkerd's `clusterNetworks`. Traffic to pods +in this network may not be meshed properly. To remedy this, update the +`clusterNetworks` setting to include all pod networks in the cluster. + +### √ can initialize the client {#l5d-api-control-client} + +Example failure: + +```bash +× can initialize the client + parse http:// bad/: invalid character " " in host name +``` + +Verify that a well-formed `--api-addr` parameter was specified, if any: + +```bash +linkerd check --api-addr " bad" +``` + +### √ can query the control plane API {#l5d-api-control-api} + +Example failure: + +```bash +× can query the control plane API + Post http://8.8.8.8/api/v1/Version: context deadline exceeded +``` + +This check indicates a connectivity failure between the cli and the Linkerd +control plane. 
To verify connectivity, manually connect to the controller pod: + +```bash +kubectl -n linkerd port-forward \ + $(kubectl -n linkerd get po \ + --selector=linkerd.io/control-plane-component=controller \ + -o jsonpath='{.items[*].metadata.name}') \ +9995:9995 +``` + +...and then curl the `/metrics` endpoint: + +```bash +curl localhost:9995/metrics +``` + +## The "linkerd-service-profile" checks {#l5d-sp} + +Example failure: + +```bash +‼ no invalid service profiles + ServiceProfile "bad" has invalid name (must be "..svc.cluster.local") +``` + +Validate the structure of your service profiles: + +```bash +$ kubectl -n linkerd get sp +NAME AGE +bad 51s +linkerd-controller-api.linkerd.svc.cluster.local 1m +``` + +Example failure: + +```bash +‼ no invalid service profiles + the server could not find the requested resource (get serviceprofiles.linkerd.io) +``` + +Validate that the Service Profile CRD is installed on your cluster and that its +`linkerd.io/created-by` annotation matches your `linkerd version` client +version: + +```bash +kubectl get crd/serviceprofiles.linkerd.io -o yaml | grep linkerd.io/created-by +``` + +If the CRD is missing or out-of-date you can update it: + +```bash +linkerd upgrade | kubectl apply -f - +``` + +## The "linkerd-version" checks {#l5d-version} + +### √ can determine the latest version {#l5d-version-latest} + +Example failure: + +```bash +× can determine the latest version + Get https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli: context deadline exceeded +``` + +Ensure you can connect to the Linkerd version check endpoint from the +environment the `linkerd` cli is running: + +```bash +$ curl "https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli" +{"stable":"stable-2.1.0","edge":"edge-19.1.2"} +``` + +### √ cli is up-to-date {#l5d-version-cli} + +Example failure: + +```bash +‼ cli is up-to-date + is running version 19.1.1 but the latest edge version is 19.1.2 +``` + +See the page on [Upgrading Linkerd](../../upgrade/). + +## The "control-plane-version" checks {#l5d-version-control} + +Example failures: + +```bash +‼ control plane is up-to-date + is running version 19.1.1 but the latest edge version is 19.1.2 +‼ control plane and cli versions match + mismatched channels: running stable-2.1.0 but retrieved edge-19.1.2 +``` + +See the page on [Upgrading Linkerd](../../upgrade/). + +## The "linkerd-control-plane-proxy" checks {#linkerd-control-plane-proxy} + +### √ control plane proxies are healthy {#l5d-cp-proxy-healthy} + +This error indicates that the proxies running in the Linkerd control plane are +not healthy. Ensure that Linkerd has been installed with all of the correct +setting or re-install Linkerd as necessary. + +### √ control plane proxies are up-to-date {#l5d-cp-proxy-version} + +This warning indicates the proxies running in the Linkerd control plane are +running an old version. We recommend downloading the latest Linkerd release +and [Upgrading Linkerd](../../upgrade/). + +### √ control plane proxies and cli versions match {#l5d-cp-proxy-cli-version} + +This warning indicates that the proxies running in the Linkerd control plane are +running a different version from the Linkerd CLI. We recommend keeping this +versions in sync by updating either the CLI or the control plane as necessary. + +## The "linkerd-data-plane" checks {#l5d-data-plane} + +These checks only run when the `--proxy` flag is set. 
This flag is intended for +use after running `linkerd inject`, to verify the injected proxies are operating +normally. + +### √ data plane namespace exists {#l5d-data-plane-exists} + +Example failure: + +```bash +$ linkerd check --proxy --namespace foo +... +× data plane namespace exists + The "foo" namespace does not exist +``` + +Ensure the `--namespace` specified exists, or, omit the parameter to check all +namespaces. + +### √ data plane proxies are ready {#l5d-data-plane-ready} + +Example failure: + +```bash +× data plane proxies are ready + No "linkerd-proxy" containers found +``` + +Ensure you have injected the Linkerd proxy into your application via the +`linkerd inject` command. + +For more information on `linkerd inject`, see +[Step 5: Install the demo app](../../getting-started/#step-5-install-the-demo-app) +in our [Getting Started](../../getting-started/) guide. + +### √ data plane proxy metrics are present in Prometheus {#l5d-data-plane-prom} + +Example failure: + +```bash +× data plane proxy metrics are present in Prometheus + Data plane metrics not found for linkerd/linkerd-controller-b8c4c48c8-pflc9. +``` + +Ensure Prometheus can connect to each `linkerd-proxy` via the Prometheus +dashboard: + +```bash +kubectl -n linkerd port-forward svc/linkerd-prometheus 9090 +``` + +...and then browse to +[http://localhost:9090/targets](http://localhost:9090/targets), validate the +`linkerd-proxy` section. + +You should see all your pods here. If they are not: + +- Prometheus might be experiencing connectivity issues with the k8s api server. + Check out the logs and delete the pod to flush any possible transient errors. + +### √ data plane is up-to-date {#l5d-data-plane-version} + +Example failure: + +```bash +‼ data plane is up-to-date + linkerd/linkerd-prometheus-74d66f86f6-6t6dh: is running version 19.1.2 but the latest edge version is 19.1.3 +``` + +See the page on [Upgrading Linkerd](../../upgrade/). + +### √ data plane and cli versions match {#l5d-data-plane-cli-version} + +```bash +‼ data plane and cli versions match + linkerd/linkerd-controller-5f6c45d6d9-9hd9j: is running version 19.1.2 but the latest edge version is 19.1.3 +``` + +See the page on [Upgrading Linkerd](../../upgrade/). + +### √ data plane pod labels are configured correctly {#l5d-data-plane-pod-labels} + +Example failure: + +```bash +‼ data plane pod labels are configured correctly + Some labels on data plane pods should be annotations: + * emojivoto/voting-ff4c54b8d-tv9pp + linkerd.io/inject +``` + +`linkerd.io/inject`, `config.linkerd.io/*` or `config.alpha.linkerd.io/*` should +be annotations in order to take effect. + +### √ data plane service labels are configured correctly {#l5d-data-plane-services-labels} + +Example failure: + +```bash +‼ data plane service labels and annotations are configured correctly + Some labels on data plane services should be annotations: + * emojivoto/emoji-svc + config.linkerd.io/control-port +``` + +`config.linkerd.io/*` or `config.alpha.linkerd.io/*` should +be annotations in order to take effect. + +### √ data plane service annotations are configured correctly {#l5d-data-plane-services-annotations} + +Example failure: + +```bash +‼ data plane service annotations are configured correctly + Some annotations on data plane services should be labels: + * emojivoto/emoji-svc + mirror.linkerd.io/exported +``` + +`mirror.linkerd.io/exported` should +be a label in order to take effect. 
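+
+As a concrete illustration using the `emojivoto` service from the output
+above (a hypothetical fix; adjust the namespace, service name, and label value
+to match your own Link configuration), you could drop the misplaced annotation
+and set the label instead:
+
+```bash
+# Remove the stray annotation (the trailing '-' deletes it) and apply the
+# equivalent label so the service mirror controller can select the service.
+kubectl -n emojivoto annotate service emoji-svc mirror.linkerd.io/exported-
+kubectl -n emojivoto label service emoji-svc mirror.linkerd.io/exported=true
+```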
+
+### √ opaque ports are properly annotated {#linkerd-opaque-ports-definition}
+
+Example failure:
+
+```bash
+× opaque ports are properly annotated
+    * service emoji-svc targets the opaque port 8080 through 8080; add 8080 to its config.linkerd.io/opaque-ports annotation
+    see https://linkerd.io/2/checks/#linkerd-opaque-ports-definition for hints
+```
+
+If a Pod marks a port as opaque by using the `config.linkerd.io/opaque-ports`
+annotation, then any Service which targets that port must also use the
+`config.linkerd.io/opaque-ports` annotation to mark that port as opaque. Having
+a port marked as opaque on the Pod but not the Service (or vice versa) can
+cause inconsistent behavior depending on whether traffic is sent to the Pod
+directly (for example with a headless Service) or through a ClusterIP Service.
+This error can be remedied by adding the `config.linkerd.io/opaque-ports`
+annotation to both the Pod and Service. See
+[Protocol Detection](../../features/protocol-detection/) for more information.
+
+## The "linkerd-ha-checks" checks {#l5d-ha}
+
+These checks are run if Linkerd has been installed in HA mode.
+
+### √ pod injection disabled on kube-system {#l5d-injection-disabled}
+
+Example warning:
+
+```bash
+‼ pod injection disabled on kube-system
+    kube-system namespace needs to have the label config.linkerd.io/admission-webhooks: disabled if HA mode is enabled
+    see https://linkerd.io/checks/#l5d-injection-disabled for hints
+```
+
+Ensure the kube-system namespace has the
+`config.linkerd.io/admission-webhooks:disabled` label:
+
+```bash
+$ kubectl get namespace kube-system -oyaml
+kind: Namespace
+apiVersion: v1
+metadata:
+  name: kube-system
+  annotations:
+    linkerd.io/inject: disabled
+  labels:
+    config.linkerd.io/admission-webhooks: disabled
+```
+
+### √ multiple replicas of control plane pods {#l5d-control-plane-replicas}
+
+Example warning:
+
+```bash
+‼ multiple replicas of control plane pods
+    not enough replicas available for [linkerd-controller]
+    see https://linkerd.io/checks/#l5d-control-plane-replicas for hints
+```
+
+This happens when one of the control plane pods doesn't have at least two
+replicas running. This is likely caused by insufficient node resources.
+
+## The "extensions" checks {#extensions}
+
+When any [Extensions](../extensions/) are installed, the Linkerd binary
+tries to invoke `check --output json` on the extension binaries.
+It is important that the extension binaries implement it.
+For more information, see the [Extension
+developer docs](https://github.com/linkerd/linkerd2/blob/main/EXTENSIONS.md).
+
+Example error:
+
+```bash
+invalid extension check output from \"jaeger\" (JSON object expected)
+```
+
+Make sure that the extension binary implements `check --output json`,
+which returns the healthchecks in the [expected json format](https://github.com/linkerd/linkerd2/blob/main/EXTENSIONS.md#linkerd-name-check).
+
+Example error:
+
+```bash
+× Linkerd command jaeger exists
+```
+
+Make sure that the relevant binary exists in `$PATH`.
+
+For more information about Linkerd extensions, see the
+[Extension developer docs](https://github.com/linkerd/linkerd2/blob/main/EXTENSIONS.md).
+
+## The "linkerd-cni-plugin" checks {#l5d-cni}
+
+These checks run if Linkerd has been installed with the `--linkerd-cni-enabled`
+flag. Alternatively they can be run as part of the pre-checks by providing the
+`--linkerd-cni-enabled` flag. Most of these checks verify that the required
+resources are in place.
If any of them are missing, you can use +`linkerd install-cni | kubectl apply -f -` to re-install them. + +### √ cni plugin ConfigMap exists {#cni-plugin-cm-exists} + +Example error: + +```bash +× cni plugin ConfigMap exists + configmaps "linkerd-cni-config" not found + see https://linkerd.io/checks/#cni-plugin-cm-exists for hints +``` + +Ensure that the linkerd-cni-config ConfigMap exists in the CNI namespace: + +```bash +$ kubectl get cm linkerd-cni-config -n linkerd-cni +NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES +linkerd-linkerd-cni-cni false RunAsAny RunAsAny RunAsAny RunAsAny false hostPath,secret +``` + +Also ensure you have permission to create ConfigMaps: + +```bash +$ kubectl auth can-i create ConfigMaps +yes +``` + +### √ cni plugin PodSecurityPolicy exists {#cni-plugin-psp-exists} + +Example error: + +```bash +× cni plugin PodSecurityPolicy exists + missing PodSecurityPolicy: linkerd-linkerd-cni-cni + see https://linkerd.io/checks/#cni-plugin-psp-exists for hint +``` + +Ensure that the pod security policy exists: + +```bash +$ kubectl get psp linkerd-linkerd-cni-cni +NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES +linkerd-linkerd-cni-cni false RunAsAny RunAsAny RunAsAny RunAsAny false hostPath,secret +``` + +Also ensure you have permission to create PodSecurityPolicies: + +```bash +$ kubectl auth can-i create PodSecurityPolicies +yes +``` + +### √ cni plugin ClusterRole exist {#cni-plugin-cr-exists} + +Example error: + +```bash +× cni plugin ClusterRole exists + missing ClusterRole: linkerd-cni + see https://linkerd.io/checks/#cni-plugin-cr-exists for hints +``` + +Ensure that the cluster role exists: + +```bash +$ kubectl get clusterrole linkerd-cni +NAME AGE +linkerd-cni 54m +``` + +Also ensure you have permission to create ClusterRoles: + +```bash +$ kubectl auth can-i create ClusterRoles +yes +``` + +### √ cni plugin ClusterRoleBinding exist {#cni-plugin-crb-exists} + +Example error: + +```bash +× cni plugin ClusterRoleBinding exists + missing ClusterRoleBinding: linkerd-cni + see https://linkerd.io/checks/#cni-plugin-crb-exists for hints +``` + +Ensure that the cluster role binding exists: + +```bash +$ kubectl get clusterrolebinding linkerd-cni +NAME AGE +linkerd-cni 54m +``` + +Also ensure you have permission to create ClusterRoleBindings: + +```bash +$ kubectl auth can-i create ClusterRoleBindings +yes +``` + +### √ cni plugin Role exists {#cni-plugin-r-exists} + +Example error: + +```bash +× cni plugin Role exists + missing Role: linkerd-cni + see https://linkerd.io/checks/#cni-plugin-r-exists for hints +``` + +Ensure that the role exists in the CNI namespace: + +```bash +$ kubectl get role linkerd-cni -n linkerd-cni +NAME AGE +linkerd-cni 52m +``` + +Also ensure you have permission to create Roles: + +```bash +$ kubectl auth can-i create Roles -n linkerd-cni +yes +``` + +### √ cni plugin RoleBinding exists {#cni-plugin-rb-exists} + +Example error: + +```bash +× cni plugin RoleBinding exists + missing RoleBinding: linkerd-cni + see https://linkerd.io/checks/#cni-plugin-rb-exists for hints +``` + +Ensure that the role binding exists in the CNI namespace: + +```bash +$ kubectl get rolebinding linkerd-cni -n linkerd-cni +NAME AGE +linkerd-cni 49m +``` + +Also ensure you have permission to create RoleBindings: + +```bash +$ kubectl auth can-i create RoleBindings -n linkerd-cni +yes +``` + +### √ cni plugin ServiceAccount exists {#cni-plugin-sa-exists} + +Example error: + +```bash +× cni plugin ServiceAccount exists + 
    missing ServiceAccount: linkerd-cni
+    see https://linkerd.io/checks/#cni-plugin-sa-exists for hints
+```
+
+Ensure that the CNI service account exists in the CNI namespace:
+
+```bash
+$ kubectl get ServiceAccount linkerd-cni -n linkerd-cni
+NAME          SECRETS   AGE
+linkerd-cni   1         45m
+```
+
+Also ensure you have permission to create ServiceAccounts:
+
+```bash
+$ kubectl auth can-i create ServiceAccounts -n linkerd-cni
+yes
+```
+
+### √ cni plugin DaemonSet exists {#cni-plugin-ds-exists}
+
+Example error:
+
+```bash
+× cni plugin DaemonSet exists
+    missing DaemonSet: linkerd-cni
+    see https://linkerd.io/checks/#cni-plugin-ds-exists for hints
+```
+
+Ensure that the CNI DaemonSet exists in the CNI namespace:
+
+```bash
+$ kubectl get ds -n linkerd-cni
+NAME          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
+linkerd-cni   1         1         1       1            1           beta.kubernetes.io/os=linux   14m
+```
+
+Also ensure you have permission to create DaemonSets:
+
+```bash
+$ kubectl auth can-i create DaemonSets -n linkerd-cni
+yes
+```
+
+### √ cni plugin pod is running on all nodes {#cni-plugin-ready}
+
+Example failure:
+
+```bash
+‼ cni plugin pod is running on all nodes
+    number ready: 2, number scheduled: 3
+    see https://linkerd.io/checks/#cni-plugin-ready
+```
+
+Ensure that all the CNI pods are running:
+
+```bash
+$ kubectl get po -n linkerd-cni
+NAME                READY   STATUS    RESTARTS   AGE
+linkerd-cni-rzp2q   1/1     Running   0          9m20s
+linkerd-cni-mf564   1/1     Running   0          9m22s
+linkerd-cni-p5670   1/1     Running   0          9m25s
+```
+
+Ensure that all pods have finished the deployment of the CNI config and binary:
+
+```bash
+$ kubectl logs linkerd-cni-rzp2q -n linkerd-cni
+Wrote linkerd CNI binaries to /host/opt/cni/bin
+Created CNI config /host/etc/cni/net.d/10-kindnet.conflist
+Done configuring CNI. Sleep=true
+```
+
+## The "linkerd-multicluster" checks {#l5d-multicluster}
+
+These checks run if the service mirroring controller has been installed.
+Additionally, they can be run with `linkerd multicluster check`.
+Most of these checks verify that the service mirroring controllers are working
+correctly along with remote gateways. Furthermore, the checks ensure that
+end-to-end TLS is possible between paired clusters.
+
+### √ Link CRD exists {#l5d-multicluster-link-crd-exists}
+
+Example error:
+
+```bash
+× Link CRD exists
+    multicluster.linkerd.io/Link CRD is missing
+    see https://linkerd.io/checks/#l5d-multicluster-link-crd-exists for hints
+```
+
+Make sure the multicluster extension is correctly installed and that the
+`links.multicluster.linkerd.io` CRD is present.
+
+```bash
+$ kubectl get crds | grep multicluster
+NAME                            CREATED AT
+links.multicluster.linkerd.io   2021-03-10T09:58:10Z
+```
+
+### √ Link resources are valid {#l5d-multicluster-links-are-valid}
+
+Example error:
+
+```bash
+× Link resources are valid
+    failed to parse Link east
+    see https://linkerd.io/checks/#l5d-multicluster-links-are-valid for hints
+```
+
+Make sure all the Link objects are specified in the expected format.
+
+### √ remote cluster access credentials are valid {#l5d-smc-target-clusters-access}
+
+Example error:
+
+```bash
+× remote cluster access credentials are valid
+    * secret [east/east-config]: could not find east-config secret
+    see https://linkerd.io/checks/#l5d-smc-target-clusters-access for hints
+```
+
+Make sure that a kubeconfig with the required permissions
+for the specific target cluster is present as a secret correctly + +### √ clusters share trust anchors {#l5d-multicluster-clusters-share-anchors} + +Example errors: + +```bash +× clusters share trust anchors + Problematic clusters: + * remote + see https://linkerd.io/checks/#l5d-multicluster-clusters-share-anchors for hints +``` + +The error above indicates that your trust anchors are not compatible. In order +to fix that you need to ensure that both your anchors contain identical sets of +certificates. + +```bash +× clusters share trust anchors + Problematic clusters: + * remote: cannot parse trust anchors + see https://linkerd.io/checks/#l5d-multicluster-clusters-share-anchors for hints +``` + +Such an error indicates that there is a problem with your anchors on the cluster +named `remote` You need to make sure the identity config aspect of your Linkerd +installation on the `remote` cluster is ok. You can run `check` against the +remote cluster to verify that: + +```bash +linkerd --context=remote check +``` + +### √ service mirror controller has required permissions {#l5d-multicluster-source-rbac-correct} + +Example error: + +```bash +× service mirror controller has required permissions + missing Service mirror ClusterRole linkerd-service-mirror-access-local-resources: unexpected verbs expected create,delete,get,list,update,watch, got create,delete,get,update,watch + see https://linkerd.io/checks/#l5d-multicluster-source-rbac-correct for hints +``` + +This error indicates that the local RBAC permissions of the service mirror +service account are not correct. In order to ensure that you have the correct +verbs and resources you can inspect your ClusterRole and Role object and look at +the rules section. + +Expected rules for `linkerd-service-mirror-access-local-resources` cluster role: + +```bash +$ kubectl --context=local get clusterrole linkerd-service-mirror-access-local-resources -o yaml +kind: ClusterRole +metadata: + labels: + linkerd.io/control-plane-component: linkerd-service-mirror + name: linkerd-service-mirror-access-local-resources +rules: +- apiGroups: + - "" + resources: + - endpoints + - services + verbs: + - list + - get + - watch + - create + - delete + - update +- apiGroups: + - "" + resources: + - namespaces + verbs: + - create + - list + - get + - watch +``` + +Expected rules for `linkerd-service-mirror-read-remote-creds` role: + +```bash +$ kubectl --context=local get role linkerd-service-mirror-read-remote-creds -n linkerd-multicluster -o yaml +kind: Role +metadata: + labels: + linkerd.io/control-plane-component: linkerd-service-mirror + name: linkerd-service-mirror-read-remote-creds + namespace: linkerd-multicluster + rules: +- apiGroups: + - "" + resources: + - secrets + verbs: + - list + - get + - watch +``` + +### √ service mirror controllers are running {#l5d-multicluster-service-mirror-running} + +Example error: + +```bash +× service mirror controllers are running + Service mirror controller is not present + see https://linkerd.io/checks/#l5d-multicluster-service-mirror-running for hints +``` + +Note, it takes a little bit for pods to be scheduled, images to be pulled and +everything to start up. 
If this is a permanent error, you'll want to validate +the state of the controller pod with: + +```bash +$ kubectl --all-namespaces get po --selector linkerd.io/control-plane-component=linkerd-service-mirror +NAME READY STATUS RESTARTS AGE +linkerd-service-mirror-7bb8ff5967-zg265 2/2 Running 0 50m +``` + +### √ all gateway mirrors are healthy {#l5d-multicluster-gateways-endpoints} + +Example errors: + +```bash +‼ all gateway mirrors are healthy + Some gateway mirrors do not have endpoints: + linkerd-gateway-gke.linkerd-multicluster mirrored from cluster [gke] + see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints +``` + +The error above indicates that some gateway mirror services in the source +cluster do not have associated endpoints resources. These endpoints are created +by the Linkerd service mirror controller on the source cluster whenever a link +is established with a target cluster. + +Such an error indicates that there could be a problem with the creation of the +resources by the service mirror controller or the external IP of the gateway +service in target cluster. + +### √ all mirror services have endpoints {#l5d-multicluster-services-endpoints} + +Example errors: + +```bash +‼ all mirror services have endpoints + Some mirror services do not have endpoints: + voting-svc-gke.emojivoto mirrored from cluster [gke] (gateway: [linkerd-multicluster/linkerd-gateway]) + see https://linkerd.io/checks/#l5d-multicluster-services-endpoints for hints +``` + +The error above indicates that some mirror services in the source cluster do not +have associated endpoints resources. These endpoints are created by the Linkerd +service mirror controller when creating a mirror service with endpoints values +as the remote gateway's external IP. + +Such an error indicates that there could be a problem with the creation of the +mirror resources by the service mirror controller or the mirror gateway service +in the source cluster or the external IP of the gateway service in target +cluster. + +### √ all mirror services are part of a Link {#l5d-multicluster-orphaned-services} + +Example errors: + +```bash +‼ all mirror services are part of a Link + mirror service voting-east.emojivoto is not part of any Link + see https://linkerd.io/checks/#l5d-multicluster-orphaned-services for hints +``` + +The error above indicates that some mirror services in the source cluster do not +have associated link. These mirror services are created by the Linkerd +service mirror controller when a remote service is marked to be +mirrored. + +Make sure services are marked to be mirrored correctly at remote, and delete +if there are any unnecessary ones. + +## The "linkerd-viz" checks {#l5d-viz} + +These checks only run when the `linkerd-viz` extension is installed. +This check is intended to verify the installation of linkerd-viz +extension which comprises of `tap`, `web`, +`metrics-api` and optional `grafana` and `prometheus` instances +along with `tap-injector` which injects the specific +tap configuration to the proxies. + +### √ linkerd-viz Namespace exists {#l5d-viz-ns-exists} + +This is the basic check used to verify if the linkerd-viz extension +namespace is installed or not. The extension can be installed by running +the following command: + +```bash +linkerd viz install | kubectl apply -f - +``` + +The installation can be configured by using the +`--set`, `--values`, `--set-string` and `--set-file` flags. 
+See [Linkerd Viz Readme](https://www.github.com/linkerd/linkerd2/tree/main/viz/charts/linkerd-viz/README.md) +for a full list of configurable fields. + +### √ linkerd-viz ClusterRoles exist {#l5d-viz-cr-exists} + +Example failure: + +```bash +× linkerd-viz ClusterRoles exist + missing ClusterRoles: linkerd-linkerd-viz-metrics-api + see https://linkerd.io/checks/#l5d-viz-cr-exists for hints +``` + +Ensure the linkerd-viz extension ClusterRoles exist: + +```bash +$ kubectl get clusterroles | grep linkerd-viz +linkerd-linkerd-viz-metrics-api 2021-01-26T18:02:17Z +linkerd-linkerd-viz-prometheus 2021-01-26T18:02:17Z +linkerd-linkerd-viz-tap 2021-01-26T18:02:17Z +linkerd-linkerd-viz-tap-admin 2021-01-26T18:02:17Z +linkerd-linkerd-viz-web-check 2021-01-26T18:02:18Z +``` + +Also ensure you have permission to create ClusterRoles: + +```bash +$ kubectl auth can-i create clusterroles +yes +``` + +### √ linkerd-viz ClusterRoleBindings exist {#l5d-viz-crb-exists} + +Example failure: + +```bash +× linkerd-viz ClusterRoleBindings exist + missing ClusterRoleBindings: linkerd-linkerd-viz-metrics-api + see https://linkerd.io/checks/#l5d-viz-crb-exists for hints +``` + +Ensure the linkerd-viz extension ClusterRoleBindings exist: + +```bash +$ kubectl get clusterrolebindings | grep linkerd-viz +linkerd-linkerd-viz-metrics-api ClusterRole/linkerd-linkerd-viz-metrics-api 18h +linkerd-linkerd-viz-prometheus ClusterRole/linkerd-linkerd-viz-prometheus 18h +linkerd-linkerd-viz-tap ClusterRole/linkerd-linkerd-viz-tap 18h +linkerd-linkerd-viz-tap-auth-delegator ClusterRole/system:auth-delegator 18h +linkerd-linkerd-viz-web-admin ClusterRole/linkerd-linkerd-viz-tap-admin 18h +linkerd-linkerd-viz-web-check ClusterRole/linkerd-linkerd-viz-web-check 18h +``` + +Also ensure you have permission to create ClusterRoleBindings: + +```bash +$ kubectl auth can-i create clusterrolebindings +yes +``` + +### √ tap API server has valid cert {#l5d-tap-cert-valid} + +Example failure: + +```bash +× tap API server has valid cert + secrets "tap-k8s-tls" not found + see https://linkerd.io/checks/#l5d-tap-cert-valid for hints +``` + +Ensure that the `tap-k8s-tls` secret exists and contains the appropriate +`tls.crt` and `tls.key` data entries. For versions before 2.9, the secret is +named `linkerd-tap-tls` and it should contain the `crt.pem` and `key.pem` data +entries. + +```bash +× tap API server has valid cert + cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not tap.linkerd-viz.svc + see https://linkerd.io/checks/#l5d-tap-cert-valid for hints +``` + +Here you need to make sure the certificate was issued specifically for +`tap.linkerd-viz.svc`. + +### √ tap API server cert is valid for at least 60 days {#l5d-tap-cert-not-expiring-soon} + +Example failure: + +```bash +‼ tap API server cert is valid for at least 60 days + certificate will expire on 2020-11-07T17:00:07Z + see https://linkerd.io/checks/#l5d-webhook-cert-not-expiring-soon for hints +``` + +This warning indicates that the expiry of the tap API Server webhook +cert is approaching. In order to address this +problem without incurring downtime, you can follow the process outlined in +[Automatically Rotating your webhook TLS Credentials](../automatically-rotating-webhook-tls-credentials/). 
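+
+If you want to see exactly how much time is left before rotating, you can read
+the expiry straight from the certificate in the secret. A minimal sketch,
+assuming the 2.9+ secret name `tap-k8s-tls` in the `linkerd-viz` namespace:
+
+```bash
+# Print the notAfter date of the tap API server certificate.
+kubectl -n linkerd-viz get secret tap-k8s-tls \
+    -o jsonpath='{.data.tls\.crt}' \
+    | base64 -d \
+    | openssl x509 -noout -enddate
+```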
+ +### √ tap api service is running {#l5d-tap-api} + +Example failure: + +```bash +× FailedDiscoveryCheck: no response from https://10.233.31.133:443: Get https://10.233.31.133:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) +``` + +tap uses the +[kubernetes Aggregated Api-Server model](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/) +to allow users to have k8s RBAC on top. This model has the following specific +requirements in the cluster: + +- tap Server must be + [reachable from kube-apiserver](https://kubernetes.io/docs/concepts/architecture/master-node-communication/#master-to-cluster) +- The kube-apiserver must be correctly configured to + [enable an aggregation layer](https://kubernetes.io/docs/tasks/access-kubernetes-api/configure-aggregation-layer/) + +### √ linkerd-viz pods are injected {#l5d-viz-pods-injection} + +```bash +× linkerd-viz extension pods are injected + could not find proxy container for tap-59f5595fc7-ttndp pod + see https://linkerd.io/checks/#l5d-viz-pods-injection for hints +``` + +Ensure all the linkerd-viz pods are injected + +```bash +$ kubectl -n linkerd-viz get pods +NAME READY STATUS RESTARTS AGE +grafana-68cddd7cc8-nrv4h 2/2 Running 3 18h +metrics-api-77f684f7c7-hnw8r 2/2 Running 2 18h +prometheus-5f6898ff8b-s6rjc 2/2 Running 2 18h +tap-59f5595fc7-ttndp 2/2 Running 2 18h +web-78d6588d4-pn299 2/2 Running 2 18h +tap-injector-566f7ff8df-vpcwc 2/2 Running 2 18h +``` + +Make sure that the `proxy-injector` is working correctly by running +`linkerd check` + +### √ viz extension pods are running {#l5d-viz-pods-running} + +```bash +× viz extension pods are running + container linkerd-proxy in pod tap-59f5595fc7-ttndp is not ready + see https://linkerd.io/checks/#l5d-viz-pods-running for hints +``` + +Ensure all the linkerd-viz pods are running with 2/2 + +```bash +$ kubectl -n linkerd-viz get pods +NAME READY STATUS RESTARTS AGE +grafana-68cddd7cc8-nrv4h 2/2 Running 3 18h +metrics-api-77f684f7c7-hnw8r 2/2 Running 2 18h +prometheus-5f6898ff8b-s6rjc 2/2 Running 2 18h +tap-59f5595fc7-ttndp 2/2 Running 2 18h +web-78d6588d4-pn299 2/2 Running 2 18h +tap-injector-566f7ff8df-vpcwc 2/2 Running 2 18h +``` + +Make sure that the `proxy-injector` is working correctly by running +`linkerd check` + +### √ prometheus is installed and configured correctly {#l5d-viz-prometheus} + +```bash +× prometheus is installed and configured correctly + missing ClusterRoles: linkerd-linkerd-viz-prometheus + see https://linkerd.io/checks/#l5d-viz-cr-exists for hints +``` + +Ensure all the prometheus related resources are present and running +correctly. 
+ +```bash +❯ kubectl -n linkerd-viz get deploy,cm | grep prometheus +deployment.apps/prometheus 1/1 1 1 3m18s +configmap/prometheus-config 1 3m18s +❯ kubectl get clusterRoleBindings | grep prometheus +linkerd-linkerd-viz-prometheus ClusterRole/linkerd-linkerd-viz-prometheus 3m37s +❯ kubectl get clusterRoles | grep prometheus +linkerd-linkerd-viz-prometheus 2021-02-26T06:03:11Zh +``` + +### √ can initialize the client {#l5d-viz-existence-client} + +Example failure: + +```bash +× can initialize the client + Failed to get deploy for pod metrics-api-77f684f7c7-hnw8r: not running +``` + +Verify that the metrics API pod is running correctly + +```bash +❯ kubectl -n linkerd-viz get pods +NAME READY STATUS RESTARTS AGE +metrics-api-7bb8cb8489-cbq4m 2/2 Running 0 4m58s +tap-injector-6b9bc6fc4-cgbr4 2/2 Running 0 4m56s +tap-5f6ddcc684-k2fd6 2/2 Running 0 4m57s +web-cbb846484-d987n 2/2 Running 0 4m56s +grafana-76fd8765f4-9rg8q 2/2 Running 0 4m58s +prometheus-7c5c48c466-jc27g 2/2 Running 0 4m58s +``` + +### √ viz extension self-check {#l5d-viz-metrics-api} + +Example failure: + +```bash +× viz extension self-check + No results returned +``` + +Check the logs on the viz extensions's metrics API: + +```bash +kubectl -n linkerd-viz logs deploy/metrics-api metrics-api +``` + +## The "linkerd-jaeger" checks {#l5d-jaeger} + +These checks only run when the `linkerd-jaeger` extension is installed. +This check is intended to verify the installation of linkerd-jaeger +extension which comprises of open-census collector and jaeger +components along with `jaeger-injector` which injects the specific +trace configuration to the proxies. + +### √ linkerd-jaeger extension Namespace exists {#l5d-jaeger-ns-exists} + +This is the basic check used to verify if the linkerd-jaeger extension +namespace is installed or not. The extension can be installed by running +the following command + +```bash +linkerd jaeger install | kubectl apply -f - +``` + +The installation can be configured by using the +`--set`, `--values`, `--set-string` and `--set-file` flags. +See [Linkerd Jaeger Readme](https://www.github.com/linkerd/linkerd2/tree/main/jaeger/charts/linkerd-jaeger/README.md) +for a full list of configurable fields. 
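+
+To confirm this check manually, you can look for the extension namespace and
+its workloads directly; the sketch below assumes the default `linkerd-jaeger`
+namespace:
+
+```bash
+# Verify that the extension namespace exists and that its components
+# (collector, jaeger and jaeger-injector) have been deployed.
+kubectl get ns linkerd-jaeger
+kubectl -n linkerd-jaeger get deploy
+```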
+ +### √ collector and jaeger service account exists {#l5d-jaeger-sc-exists} + +Example failure: + +```bash +× collector and jaeger service account exists + missing ServiceAccounts: collector + see https://linkerd.io/checks/#l5d-jaeger-sc-exists for hints +``` + +Ensure the linkerd-jaeger ServiceAccounts exist: + +```bash +$ kubectl -n linkerd-jaeger get serviceaccounts +NAME SECRETS AGE +collector 1 23m +jaeger 1 23m +``` + +Also ensure you have permission to create ServiceAccounts in the linkerd-jaeger +namespace: + +```bash +$ kubectl -n linkerd-jaeger auth can-i create serviceaccounts +yes +``` + +### √ collector config map exists {#l5d-jaeger-oc-cm-exists} + +Example failure: + +```bash +× collector config map exists + missing ConfigMaps: collector-config + see https://linkerd.io/checks/#l5d-jaeger-oc-cm-exists for hints +``` + +Ensure the Linkerd ConfigMap exists: + +```bash +$ kubectl -n linkerd-jaeger get configmap/collector-config +NAME DATA AGE +collector-config 1 61m +``` + +Also ensure you have permission to create ConfigMaps: + +```bash +$ kubectl -n linkerd-jaeger auth can-i create configmap +yes +``` + +### √ jaeger extension pods are injected {#l5d-jaeger-pods-injection} + +```bash +× jaeger extension pods are injected + could not find proxy container for jaeger-6f98d5c979-scqlq pod + see https://linkerd.io/checks/#l5d-jaeger-pods-injections for hints +``` + +Ensure all the jaeger pods are injected + +```bash +$ kubectl -n linkerd-jaeger get pods +NAME READY STATUS RESTARTS AGE +collector-69cc44dfbc-rhpfg 2/2 Running 0 11s +jaeger-6f98d5c979-scqlq 2/2 Running 0 11s +jaeger-injector-6c594f5577-cz75h 2/2 Running 0 10s +``` + +Make sure that the `proxy-injector` is working correctly by running +`linkerd check` + +### √ jaeger extension pods are running {#l5d-jaeger-pods-running} + +```bash +× jaeger extension pods are running + container linkerd-proxy in pod jaeger-59f5595fc7-ttndp is not ready + see https://linkerd.io/checks/#l5d-jaeger-pods-running for hints +``` + +Ensure all the linkerd-jaeger pods are running with 2/2 + +```bash +$ kubectl -n linkerd-jaeger get pods +NAME READY STATUS RESTARTS AGE +jaeger-injector-548684d74b-bcq5h 2/2 Running 0 5s +collector-69cc44dfbc-wqf6s 2/2 Running 0 5s +jaeger-6f98d5c979-vs622 2/2 Running 0 5sh +``` + +Make sure that the `proxy-injector` is working correctly by running +`linkerd check` + +## The "linkerd-buoyant" checks {#l5d-buoyant} + +These checks only run when the `linkerd-buoyant` extension is installed. +This check is intended to verify the installation of linkerd-buoyant +extension which comprises `linkerd-buoyant` CLI, the `buoyant-cloud-agent` +Deployment, and the `buoyant-cloud-metrics` DaemonSet. 
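+
+A quick way to see whether those in-cluster pieces are present before walking
+through the individual checks (assuming the `buoyant-cloud` namespace used by
+the checks below):
+
+```bash
+# List the agent Deployment and metrics DaemonSet installed by linkerd-buoyant.
+kubectl -n buoyant-cloud get deploy/buoyant-cloud-agent ds/buoyant-cloud-metrics
+```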
+ +### √ Linkerd extension command linkerd-buoyant exists + +```bash +‼ Linkerd extension command linkerd-buoyant exists + exec: "linkerd-buoyant": executable file not found in $PATH + see https://linkerd.io/2/checks/#extensions for hints +``` + +Ensure you have the `linkerd-buoyant` cli installed: + +```bash +linkerd-buoyant check +``` + +To install the CLI: + +```bash +curl https://buoyant.cloud/install | sh +``` + +### √ linkerd-buoyant can determine the latest version + +```bash +‼ linkerd-buoyant can determine the latest version + Get "https://buoyant.cloud/version.json": dial tcp: lookup buoyant.cloud: no such host + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure you can connect to the Linkerd Buoyant version check endpoint from the +environment the `linkerd` cli is running: + +```bash +$ curl https://buoyant.cloud/version.json +{"linkerd-buoyant":"v0.4.4"} +``` + +### √ linkerd-buoyant cli is up-to-date + +```bash +‼ linkerd-buoyant cli is up-to-date + CLI version is v0.4.3 but the latest is v0.4.4 + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +To update to the latest version of the `linkerd-buoyant` CLI: + +```bash +curl https://buoyant.cloud/install | sh +``` + +### √ buoyant-cloud Namespace exists + +```bash +× buoyant-cloud Namespace exists + namespaces "buoyant-cloud" not found + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure the `buoyant-cloud` namespace exists: + +```bash +kubectl get ns/buoyant-cloud +``` + +If the namespace does not exist, the `linkerd-buoyant` installation may be +missing or incomplete. To install the extension: + +```bash +linkerd-buoyant install | kubectl apply -f - +``` + +### √ buoyant-cloud Namespace has correct labels + +```bash +× buoyant-cloud Namespace has correct labels + missing app.kubernetes.io/part-of label + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +The `linkerd-buoyant` installation may be missing or incomplete. 
To install the +extension: + +```bash +linkerd-buoyant install | kubectl apply -f - +``` + +### √ buoyant-cloud-agent ClusterRole exists + +```bash +× buoyant-cloud-agent ClusterRole exists + missing ClusterRole: buoyant-cloud-agent + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure that the cluster role exists: + +```bash +$ kubectl get clusterrole buoyant-cloud-agent +NAME CREATED AT +buoyant-cloud-agent 2020-11-13T00:59:50Z +``` + +Also ensure you have permission to create ClusterRoles: + +```bash +$ kubectl auth can-i create ClusterRoles +yes +``` + +### √ buoyant-cloud-agent ClusterRoleBinding exists + +```bash +× buoyant-cloud-agent ClusterRoleBinding exists + missing ClusterRoleBinding: buoyant-cloud-agent + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure that the cluster role binding exists: + +```bash +$ kubectl get clusterrolebinding buoyant-cloud-agent +NAME ROLE AGE +buoyant-cloud-agent ClusterRole/buoyant-cloud-agent 301d +``` + +Also ensure you have permission to create ClusterRoleBindings: + +```bash +$ kubectl auth can-i create ClusterRoleBindings +yes +``` + +### √ buoyant-cloud-agent ServiceAccount exists + +```bash +× buoyant-cloud-agent ServiceAccount exists + missing ServiceAccount: buoyant-cloud-agent + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure that the service account exists: + +```bash +$ kubectl -n buoyant-cloud get serviceaccount buoyant-cloud-agent +NAME SECRETS AGE +buoyant-cloud-agent 1 301d +``` + +Also ensure you have permission to create ServiceAccounts: + +```bash +$ kubectl -n buoyant-cloud auth can-i create ServiceAccount +yes +``` + +### √ buoyant-cloud-id Secret exists + +```bash +× buoyant-cloud-id Secret exists + missing Secret: buoyant-cloud-id + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure that the secret exists: + +```bash +$ kubectl -n buoyant-cloud get secret buoyant-cloud-id +NAME TYPE DATA AGE +buoyant-cloud-id Opaque 4 301d +``` + +Also ensure you have permission to create ServiceAccounts: + +```bash +$ kubectl -n buoyant-cloud auth can-i create ServiceAccount +yes +``` + +### √ buoyant-cloud-agent Deployment exists + +```bash +× buoyant-cloud-agent Deployment exists + deployments.apps "buoyant-cloud-agent" not found + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure the `buoyant-cloud-agent` Deployment exists: + +```bash +kubectl -n buoyant-cloud get deploy/buoyant-cloud-agent +``` + +If the Deployment does not exist, the `linkerd-buoyant` installation may be +missing or incomplete. To reinstall the extension: + +```bash +linkerd-buoyant install | kubectl apply -f - +``` + +### √ buoyant-cloud-agent Deployment is running + +```bash +× buoyant-cloud-agent Deployment is running + no running pods for buoyant-cloud-agent Deployment + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Note, it takes a little bit for pods to be scheduled, images to be pulled and +everything to start up. 
If this is a permanent error, you'll want to validate +the state of the `buoyant-cloud-agent` Deployment with: + +```bash +$ kubectl -n buoyant-cloud get po --selector app=buoyant-cloud-agent +NAME READY STATUS RESTARTS AGE +buoyant-cloud-agent-6b8c6888d7-htr7d 2/2 Running 0 156m +``` + +Check the agent's logs with: + +```bash +kubectl logs -n buoyant-cloud buoyant-cloud-agent-6b8c6888d7-htr7d buoyant-cloud-agent +``` + +### √ buoyant-cloud-agent Deployment is injected + +```bash +× buoyant-cloud-agent Deployment is injected + could not find proxy container for buoyant-cloud-agent-6b8c6888d7-htr7d pod + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure the `buoyant-cloud-agent` pod is injected, the `READY` column should show +`2/2`: + +```bash +$ kubectl -n buoyant-cloud get pods --selector app=buoyant-cloud-agent +NAME READY STATUS RESTARTS AGE +buoyant-cloud-agent-6b8c6888d7-htr7d 2/2 Running 0 161m +``` + +Make sure that the `proxy-injector` is working correctly by running +`linkerd check`. + +### √ buoyant-cloud-agent Deployment is up-to-date + +```bash +‼ buoyant-cloud-agent Deployment is up-to-date + incorrect app.kubernetes.io/version label: v0.4.3, expected: v0.4.4 + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Check the version with: + +```bash +$ linkerd-buoyant version +CLI version: v0.4.4 +Agent version: v0.4.4 +``` + +To update to the latest version: + +```bash +linkerd-buoyant install | kubectl apply -f - +``` + +### √ buoyant-cloud-agent Deployment is running a single pod + +```bash +× buoyant-cloud-agent Deployment is running a single pod + expected 1 buoyant-cloud-agent pod, found 2 + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +`buoyant-cloud-agent` should run as a singleton. Check for other pods: + +```bash +kubectl get po -A --selector app=buoyant-cloud-agent +``` + +### √ buoyant-cloud-metrics DaemonSet exists + +```bash +× buoyant-cloud-metrics DaemonSet exists + deployments.apps "buoyant-cloud-metrics" not found + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure the `buoyant-cloud-metrics` DaemonSet exists: + +```bash +kubectl -n buoyant-cloud get daemonset/buoyant-cloud-metrics +``` + +If the DaemonSet does not exist, the `linkerd-buoyant` installation may be +missing or incomplete. To reinstall the extension: + +```bash +linkerd-buoyant install | kubectl apply -f - +``` + +### √ buoyant-cloud-metrics DaemonSet is running + +```bash +× buoyant-cloud-metrics DaemonSet is running + no running pods for buoyant-cloud-metrics DaemonSet + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Note, it takes a little bit for pods to be scheduled, images to be pulled and +everything to start up. 
If this is a permanent error, you'll want to validate +the state of the `buoyant-cloud-metrics` DaemonSet with: + +```bash +$ kubectl -n buoyant-cloud get po --selector app=buoyant-cloud-metrics +NAME READY STATUS RESTARTS AGE +buoyant-cloud-metrics-kt9mv 2/2 Running 0 163m +buoyant-cloud-metrics-q8jhj 2/2 Running 0 163m +buoyant-cloud-metrics-qtflh 2/2 Running 0 164m +buoyant-cloud-metrics-wqs4k 2/2 Running 0 163m +``` + +Check the agent's logs with: + +```bash +kubectl logs -n buoyant-cloud buoyant-cloud-metrics-kt9mv buoyant-cloud-metrics +``` + +### √ buoyant-cloud-metrics DaemonSet is injected + +```bash +× buoyant-cloud-metrics DaemonSet is injected + could not find proxy container for buoyant-cloud-agent-6b8c6888d7-htr7d pod + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Ensure the `buoyant-cloud-metrics` pods are injected, the `READY` column should +show `2/2`: + +```bash +$ kubectl -n buoyant-cloud get pods --selector app=buoyant-cloud-metrics +NAME READY STATUS RESTARTS AGE +buoyant-cloud-metrics-kt9mv 2/2 Running 0 166m +buoyant-cloud-metrics-q8jhj 2/2 Running 0 166m +buoyant-cloud-metrics-qtflh 2/2 Running 0 166m +buoyant-cloud-metrics-wqs4k 2/2 Running 0 166m +``` + +Make sure that the `proxy-injector` is working correctly by running +`linkerd check`. + +### √ buoyant-cloud-metrics DaemonSet is up-to-date + +```bash +‼ buoyant-cloud-metrics DaemonSet is up-to-date + incorrect app.kubernetes.io/version label: v0.4.3, expected: v0.4.4 + see https://linkerd.io/checks#l5d-buoyant for hints +``` + +Check the version with: + +```bash +$ kubectl -n buoyant-cloud get daemonset/buoyant-cloud-metrics -o jsonpath='{.metadata.labels}' +{"app.kubernetes.io/name":"metrics","app.kubernetes.io/part-of":"buoyant-cloud","app.kubernetes.io/version":"v0.4.4"} +``` + +To update to the latest version: + +```bash +linkerd-buoyant install | kubectl apply -f - +``` diff --git a/linkerd.io/content/2.11/tasks/uninstall-multicluster.md b/linkerd.io/content/2.11/tasks/uninstall-multicluster.md new file mode 100644 index 0000000000..205cda90c8 --- /dev/null +++ b/linkerd.io/content/2.11/tasks/uninstall-multicluster.md @@ -0,0 +1,41 @@ ++++ +title = "Uninstalling Multicluster" +description = "Unlink and uninstall Linkerd multicluster." ++++ + +The Linkerd multicluster components allow for sending traffic from one cluster +to another. For more information on how to set this up, see [installing multicluster](../installing-multicluster/). + +## Unlinking + +Unlinking a cluster will delete all resources associated with that link +including: + +* the service mirror controller +* the Link resource +* the credentials secret +* mirror services + +It is recommended that you use the `unlink` command rather than deleting any +of these resources individually to help ensure that all mirror services get +cleaned up correctly and are not left orphaned. + +To unlink, run the `linkerd multicluster unlink` command and pipe the output +to `kubectl delete`: + +```bash +linkerd multicluster unlink --cluster-name=target | kubectl delete -f - +``` + +## Uninstalling + +Uninstalling the multicluster components will remove all components associated +with Linkerd's multicluster functionality including the gateway and service +account. Before you can uninstall, you must remove all existing links as +described above. Once all links have been removed, run: + +```bash +linkerd multicluster uninstall | kubectl delete -f - +``` + +Attempting to uninstall while at least one link remains will result in an error. 
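+
+If you are unsure whether any links remain, you can list the `Link` resources
+before uninstalling. A minimal check, assuming the extension was installed into
+the default `linkerd-multicluster` namespace:
+
+```bash
+# Any Link resources listed here must first be removed with the unlink command.
+kubectl -n linkerd-multicluster get links.multicluster.linkerd.io
+```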
diff --git a/linkerd.io/content/2.11/tasks/uninstall.md b/linkerd.io/content/2.11/tasks/uninstall.md
new file mode 100644
index 0000000000..742afa7e56
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/uninstall.md
@@ -0,0 +1,52 @@
++++
+title = "Uninstalling Linkerd"
+description = "Linkerd can be easily removed from a Kubernetes cluster."
++++
+
+Removing Linkerd from a Kubernetes cluster requires a few steps: removing any
+data plane proxies, removing all the extensions, and then removing the core
+control plane.
+
+## Removing Linkerd data plane proxies
+
+To remove the Linkerd data plane proxies, you should remove any [Linkerd proxy
+injection annotations](../../features/proxy-injection/) and roll the deployments.
+When Kubernetes recreates the pods, they will not have the Linkerd data plane
+attached.
+
+## Removing extensions
+
+To remove any extension, call its `uninstall` subcommand and pipe its output
+to `kubectl delete -f -`. For the bundled extensions, that means:
+
+```bash
+# To remove Linkerd Viz
+linkerd viz uninstall | kubectl delete -f -
+
+# To remove Linkerd Jaeger
+linkerd jaeger uninstall | kubectl delete -f -
+
+# To remove Linkerd Multicluster
+linkerd multicluster uninstall | kubectl delete -f -
+```
+
+## Removing the control plane
+
+{{< note >}}
+Uninstalling the control plane requires cluster-wide permissions.
+{{< /note >}}
+
+To remove the [control plane](../../reference/architecture/#control-plane), run:
+
+```bash
+linkerd uninstall | kubectl delete -f -
+```
+
+The `linkerd uninstall` command outputs the manifest for all of the Kubernetes
+resources necessary for the control plane, including namespaces, service
+accounts, CRDs, and more; `kubectl delete` then deletes those resources.
+
+This command can also be used to remove control planes that have been partially
+installed. Note that `kubectl delete` will complain about any resources that it
+was asked to delete that hadn't been created, but these errors can be safely
+ignored.
diff --git a/linkerd.io/content/2.11/tasks/upgrade-multicluster.md b/linkerd.io/content/2.11/tasks/upgrade-multicluster.md
new file mode 100644
index 0000000000..d363c9d114
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/upgrade-multicluster.md
@@ -0,0 +1,109 @@
++++
+title = "Upgrading Multicluster in Linkerd 2.9"
+description = "Upgrading Multicluster to Linkerd 2.9."
++++
+
+Linkerd 2.9 changes the way that some of the multicluster components work and
+are installed compared to Linkerd 2.8.x. Users installing the multicluster
+components for the first time with Linkerd 2.9 can ignore these instructions
+and instead refer directly to the
+[installing multicluster instructions](../installing-multicluster/).
+
+Users who installed the multicluster component in Linkerd 2.8.x and wish to
+upgrade to Linkerd 2.9 should follow these instructions.
+
+## Overview
+
+The main difference between multicluster in 2.8 and 2.9 is that in 2.9 we
+create a service mirror controller for each target cluster that a source
+cluster is linked to. The service mirror controller is created as part of the
+`linkerd multicluster link` command instead of the `linkerd multicluster install`
+command. There is also a new CRD type called `Link`, which is used to
+configure the service mirror controller and allows you to specify the label
+selector used to determine which services to mirror.
+
+## Ordering of Cluster Upgrades
+
+Clusters may be upgraded in any order, regardless of whether each cluster is a
+source cluster, a target cluster, or both.
+
+## Target Clusters
+
+A cluster which receives multicluster traffic but does not send multicluster
+traffic requires no special upgrade treatment. It can safely be upgraded by
+just upgrading the main Linkerd control plane:
+
+```bash
+linkerd upgrade | kubectl apply -f -
+```
+
+## Source Clusters
+
+A cluster which sends multicluster traffic must be upgraded carefully to ensure
+that mirror services remain up during the upgrade so as to avoid downtime.
+
+### Control Plane
+
+Begin by upgrading the Linkerd control plane and multicluster resources to
+version 2.9 by running:
+
+```bash
+linkerd upgrade | kubectl apply -f -
+linkerd --context=source multicluster install | kubectl --context=source apply -f -
+linkerd --context=target multicluster install | kubectl --context=target apply -f -
+```
+
+### Label Exported Services
+
+Next, you must apply a label to each exported service in the target cluster.
+This label is how the 2.9 service mirror controller will know to mirror those
+services. The label can be anything you want, but by default we will use
+`mirror.linkerd.io/exported=true`. For each exported service in the target
+cluster, run:
+
+```bash
+kubectl --context=target label svc/<service-name> mirror.linkerd.io/exported=true
+```
+
+Any services not labeled in this way will no longer be mirrored after the
+upgrade is complete.
+
+### Upgrade Link
+
+Next, we re-establish the link. This will create a 2.9 version of the service
+mirror controller. Note that this is the same command that you used to
+establish the link while running Linkerd 2.8, but here we are running it with
+version 2.9 of the Linkerd CLI:
+
+```bash
+linkerd --context=target multicluster link --cluster-name=<cluster-name> \
+  | kubectl --context=source apply -f -
+```
+
+If you used a label other than `mirror.linkerd.io/exported=true` when labeling
+your exported services, you must specify it in the `--selector` flag:
+
+```bash
+linkerd --context=target multicluster link --cluster-name=<cluster-name> \
+  --selector my.cool.label=true | kubectl --context=source apply -f -
+```
+
+There should now be two service mirror deployments running: one from version
+2.8 called `linkerd-service-mirror` and one from version 2.9 called
+`linkerd-service-mirror-<cluster-name>`. All mirror services should remain
+active and healthy.
+
+### Cleanup
+
+The 2.8 version of the service mirror controller can now be safely deleted:
+
+```bash
+kubectl --context=source -n linkerd-multicluster delete deploy/linkerd-service-mirror
+```
+
+Ensure that your cluster is still in a healthy state by running:
+
+```bash
+linkerd --context=source multicluster check
+```
+
+Congratulations, your upgrade is complete!
diff --git a/linkerd.io/content/2.11/tasks/upgrade.md b/linkerd.io/content/2.11/tasks/upgrade.md
new file mode 100644
index 0000000000..e05728c756
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/upgrade.md
@@ -0,0 +1,1030 @@
++++
+title = "Upgrading Linkerd"
+description = "Upgrade Linkerd to the latest version."
+aliases = [
+  "../upgrade/",
+  "../update/"
+]
++++
+
+In this guide, we'll walk you through how to upgrade Linkerd.
+ +Before starting, read through the version-specific upgrade notices below, which +may contain important information you need to be aware of before commencing +with the upgrade process: + +- [Upgrade notice: stable-2.11.0](#upgrade-notice-stable-2-11-0) +- [Upgrade notice: stable-2.10.0](#upgrade-notice-stable-2-10-0) +- [Upgrade notice: stable-2.9.4](#upgrade-notice-stable-2-9-4) +- [Upgrade notice: stable-2.9.3](#upgrade-notice-stable-2-9-3) +- [Upgrade notice: stable-2.9.0](#upgrade-notice-stable-2-9-0) +- [Upgrade notice: stable-2.8.0](#upgrade-notice-stable-2-8-0) +- [Upgrade notice: stable-2.7.0](#upgrade-notice-stable-2-7-0) +- [Upgrade notice: stable-2.6.0](#upgrade-notice-stable-2-6-0) +- [Upgrade notice: stable-2.5.0](#upgrade-notice-stable-2-5-0) +- [Upgrade notice: stable-2.4.0](#upgrade-notice-stable-2-4-0) +- [Upgrade notice: stable-2.3.0](#upgrade-notice-stable-2-3-0) +- [Upgrade notice: stable-2.2.0](#upgrade-notice-stable-2-2-0) + +## Steps to upgrade + +There are three components that need to be upgraded, in turn: + +- [CLI](#upgrade-the-cli) +- [Control Plane](#upgrade-the-control-plane) +- [Data Plane](#upgrade-the-data-plane) + +## Upgrade the CLI + +This will upgrade your local CLI to the latest version. You will want to follow +these instructions for anywhere that uses the Linkerd CLI. For Helm users feel +free to skip to the [Helm section](#with-helm). + +To upgrade the CLI locally, run: + +```bash +curl -sL https://run.linkerd.io/install | sh +``` + +Alternatively, you can download the CLI directly via the +[Linkerd releases page](https://github.com/linkerd/linkerd2/releases/). + +Verify the CLI is installed and running correctly with: + +```bash +linkerd version --client +``` + +Which should display: + +```bash +Client version: {{% latestversion %}} +``` + +{{< note >}} +Until you upgrade the control plane, some new CLI commands may not work. +{{< /note >}} + +You are now ready to [upgrade your control plane](#upgrade-the-control-plane). + +## Upgrade the Control Plane + +Now that you have upgraded the CLI, it is time to upgrade the Linkerd control +plane on your Kubernetes cluster. Don't worry, the existing data plane will +continue to operate with a newer version of the control plane and your meshed +services will not go down. + +{{< note >}} +You will lose the historical data from Prometheus. If you would like to have +that data persisted through an upgrade, take a look at the +[persistence documentation](../../observability/exporting-metrics/) +{{< /note >}} + +### With Linkerd CLI + +Use the `linkerd upgrade` command to upgrade the control plane. This command +ensures that all of the control plane's existing configuration and mTLS secrets +are retained. Notice that we use the `--prune` flag to remove any Linkerd +resources from the previous version which no longer exist in the new version. + +```bash +linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +Next, run this command again with some `--prune-whitelist` flags added. This is +necessary to make sure that certain cluster-scoped resources are correctly +pruned. 
+
+```bash
+linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd \
+  --prune-whitelist=rbac.authorization.k8s.io/v1/clusterrole \
+  --prune-whitelist=rbac.authorization.k8s.io/v1/clusterrolebinding \
+  --prune-whitelist=apiregistration.k8s.io/v1/apiservice -f -
+```
+
+For upgrading a multi-stage installation setup, follow the instructions at
+[Upgrading a multi-stage install](#upgrading-a-multi-stage-install).
+
+Users who have previously saved the Linkerd control plane's configuration to
+files can follow the instructions at
+[Upgrading via manifests](#upgrading-via-manifests)
+to ensure those configurations are retained by the `linkerd upgrade` command.
+
+### With Helm
+
+For a Helm workflow, check out the instructions at
+[Helm upgrade procedure](../install-helm/#helm-upgrade-procedure).
+
+### Verify the control plane upgrade
+
+Once the upgrade process completes, check to make sure everything is healthy
+by running:
+
+```bash
+linkerd check
+```
+
+This will run through a set of checks against your control plane and make sure
+that it is operating correctly.
+
+To verify the Linkerd control plane version, run:
+
+```bash
+linkerd version
+```
+
+Which should display:
+
+```txt
+Client version: {{% latestversion %}}
+Server version: {{% latestversion %}}
+```
+
+Next, we will [upgrade your data plane](#upgrade-the-data-plane).
+
+## Upgrade the Data Plane
+
+With a fully up-to-date CLI running locally and the Linkerd control plane
+running on your Kubernetes cluster, it is time to upgrade the data plane. The
+easiest way to do this is to run a rolling deploy on your services, allowing
+the proxy-injector to inject the latest version of the proxy as they come up.
+
+With `kubectl` 1.15+, this can be as simple as using the
+`kubectl rollout restart` command to restart all your meshed services. For
+example:
+
+```bash
+kubectl -n <namespace> rollout restart deploy
+```
+
+{{< note >}}
+Unless otherwise documented in the release notes, stable release control planes
+should be compatible with the data plane from the previous stable release.
+Thus, data plane upgrades can be done at any point after the control plane has
+been upgraded, including as part of the application's natural deploy cycle. A
+gap of more than one stable version between control plane and data plane is not
+recommended.
+{{< /note >}}
+
+Workloads that were previously injected using the `linkerd inject --manual`
+command can be upgraded by re-injecting the applications in-place. For example:
+
+```bash
+kubectl -n emojivoto get deploy -l linkerd.io/control-plane-ns=linkerd -oyaml \
+  | linkerd inject --manual - \
+  | kubectl apply -f -
+```
+
+### Verify the data plane upgrade
+
+Check to make sure everything is healthy by running:
+
+```bash
+linkerd check --proxy
+```
+
+This will run through a set of checks to verify that the data plane is
+operating correctly, and will list any pods that are still running older
+versions of the proxy.
+
+Congratulations! You have successfully upgraded Linkerd to the newer
+version. If you have any questions, feel free to raise them in the #linkerd2
+channel in the [Linkerd slack](https://slack.linkerd.io/).
+
+## Upgrade notice: stable-2.11.0
+
+There are two breaking changes in the 2.11.0 release: pods in `ingress` no
+longer support non-HTTP traffic to meshed workloads; and the proxy no longer
+forwards traffic to ports that are bound only to localhost. Additionally, users
+of the multi-cluster extension will need to re-link their cluster after
+upgrading.
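+
+Re-linking amounts to re-running the `linkerd multicluster link` command with
+the 2.11 CLI, in the same way the link was originally established. A minimal
+sketch, assuming kubectl contexts named `source` and `target` and a link named
+`target`:
+
+```bash
+# Re-create the Link resource (and service mirror controller) with the new CLI.
+linkerd --context=target multicluster link --cluster-name=target \
+  | kubectl --context=source apply -f -
+```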
+ +### Control plane changes + +The `controller` pod has been removed from the control plane. All configuration +options that previously applied to it are no longer valid (e.g +`publicAPIResources` and all of its nested fields). Additionally, the +destination pod has a new `policy` container that runs the policy controller. + +### Routing breaking changes + +There are two breaking changes to be aware of when it comes to how traffic is +routed. + +First, when the proxy runs in ingress mode (`config.linkerd.io/inject: +ingress`), non-HTTP traffic to meshed pods is no longer supported. To get +around this, you will need to use the `config.linkerd.io/skip-outbound-ports` +annotation on your ingress controller pod. In many cases, ingress mode is no +longer necessary. Before upgrading, it may be worth revisiting [how to use +ingress](../using-ingress/) with Linkerd. + +Second, the proxy will no longer forward traffic to ports only bound on +localhost, such as `127.0.0.1:8080`. Services that want to receive traffic from +other pods should now be bound to a public interface (e.g `0.0.0.0:8080`). This +change prevents ports from being accidentally exposed outside of the pod. + +### Multicluster + +The gateway component has been changed to use a `pause` container instead of +`nginx`. This change should reduce the footprint of the extension; the proxy +routes traffic internally and does not need to rely on `nginx` to receive or +forward traffic. While this will not cause any downtime when upgrading +multicluster, it does affect probing. `linkerd multicluster gateways` will +falsely advertise the target cluster gateway as being down until the clusters +are re-linked. + +Multicluster now supports `NodePort` type services for the gateway. To support +this change, the configuration options in the Helm values file are now grouped +under the `gateway` field. If you have installed the extension with other +options than the provided defaults, you will need to update your `values.yaml` +file to reflect this change in field grouping. + +### Other changes + +Besides the breaking changes described above, there are other minor changes to +be aware of when upgrading from `stable-2.10.x`: + +- `PodSecurityPolicy` (PSP) resources are no longer installed by default as a + result of their deprecation in Kubernetes v1.21 and above. The control plane + and core extensions will now be shipped without PSPs; they can be enabled + through a new install option `enablePSP: true`. +- `tcp_connection_duration_ms` metric has been removed. +- Opaque ports changes: `443` is no longer included in the default opaque ports + list. Ports `4444`, `6379` and `9300` corresponding to Galera, Redis and + ElasticSearch respectively (all server speak first protocols) have been added + to the default opaque ports list. The default ignore inbound ports list has + also been changed to include ports `4567` and `4568`. + +## Upgrade notice: stable-2.10.0 + +If you are currently running Linkerd 2.9.0, 2.9.1, 2.9.2, or 2.9.3 (but *not* +2.9.4), and you *upgraded* to that release using the `--prune` flag (as opposed +to installing it fresh), you will need to use the `linkerd repair` command as +outlined in the [Linkerd 2.9.3 upgrade notes](#upgrade-notice-stable-2-9-3) +before you can upgrade to Linkerd 2.10. + +Additionally, there are two changes in the 2.10.0 release that may affect you. +First, the handling of certain ports and protocols has changed. 
Please read +through our [ports and protocols in 2.10 upgrade +guide](../upgrading-2.10-ports-and-protocols/) for the repercussions. + +Second, we've introduced [extensions](../extensions/) and moved the +default visualization components into a Linkerd-Viz extension. Read on for what +this means for you. + +### Visualization components moved to Linkerd-Viz extension + +With the introduction of [extensions](../extensions/), all of the +Linkerd control plane components related to visibility (including Prometheus, +Grafana, Web, and Tap) have been removed from the main Linkerd control plane +and moved into the Linkerd-Viz extension. This means that when you upgrade to +stable-2.10.0, these components will be removed from your cluster and you will +not be able to run commands such as `linkerd stat` or +`linkerd dashboard`. To restore this functionality, you must install the +Linkerd-Viz extension by running `linkerd viz install | kubectl apply -f -` +and then invoke those commands through `linkerd viz stat`, +`linkerd viz dashboard`, etc. + +```bash +# Upgrade the control plane (this will remove viz components). +linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +# Prune cluster-scoped resources +linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd \ + --prune-whitelist=rbac.authorization.k8s.io/v1/clusterrole \ + --prune-whitelist=rbac.authorization.k8s.io/v1/clusterrolebinding \ + --prune-whitelist=apiregistration.k8s.io/v1/apiservice -f - +# Install the Linkerd-Viz extension to restore viz functionality. +linkerd viz install | kubectl apply -f - +``` + +Helm users should note that configuration values related to these visibility +components have moved to the Linkerd-Viz chart. Please update any values +overrides you have and use these updated overrides when upgrading the Linkerd +chart or installing the Linkerd-Viz chart. See below for a complete list of +values which have moved. + +```bash +helm repo update +# Upgrade the control plane (this will remove viz components). +helm upgrade linkerd2 linkerd/linkerd2 --reset-values -f values.yaml --atomic +# Install the Linkerd-Viz extension to restore viz functionality. +helm install linkerd2-viz linkerd/linkerd2-viz -f viz-values.yaml +``` + +The following values were removed from the Linkerd2 chart. Most of the removed +values have been moved to the Linkerd-Viz chart or the Linkerd-Jaeger chart. 
+ +- `dashboard.replicas` moved to Linkerd-Viz as `dashboard.replicas` +- `tap` moved to Linkerd-Viz as `tap` +- `tapResources` moved to Linkerd-Viz as `tap.resources` +- `tapProxyResources` moved to Linkerd-Viz as `tap.proxy.resources` +- `webImage` moved to Linkerd-Viz as `dashboard.image` +- `webResources` moved to Linkerd-Viz as `dashboard.resources` +- `webProxyResources` moved to Linkerd-Viz as `dashboard.proxy.resources` +- `grafana` moved to Linkerd-Viz as `grafana` +- `grafana.proxy` moved to Linkerd-Viz as `grafana.proxy` +- `prometheus` moved to Linkerd-Viz as `prometheus` +- `prometheus.proxy` moved to Linkerd-Viz as `prometheus.proxy` +- `global.proxy.trace.collectorSvcAddr` moved to Linkerd-Jaeger as `webhook.collectorSvcAddr` +- `global.proxy.trace.collectorSvcAccount` moved to Linkerd-Jaeger as `webhook.collectorSvcAccount` +- `tracing.enabled` removed +- `tracing.collector` moved to Linkerd-Jaeger as `collector` +- `tracing.jaeger` moved to Linkerd-Jaeger as `jaeger` + +Also please note the global scope from the Linkerd2 chart values has been +dropped, moving the config values underneath it into the root scope. Any values +you had customized there will need to be migrated; in particular +`identityTrustAnchorsPEM` in order to conserve the value you set during +install." + +## Upgrade notice: stable-2.9.4 + +See upgrade notes for 2.9.3 below. + +## Upgrade notice: stable-2.9.3 + +### Linkerd Repair + +Due to a known issue in versions stable-2.9.0, stable-2.9.1, and stable-2.9.2, +users who upgraded to one of those versions with the --prune flag (as described +above) will have deleted the `secret/linkerd-config-overrides` resource which is +necessary for performing any subsequent upgrades. Linkerd stable-2.9.3 includes +a new `linkerd repair` command which restores this deleted resource. If you see +unexpected error messages during upgrade such as "failed to read CA: not +PEM-encoded", please upgrade your CLI to stable-2.9.3 and run: + +```bash +linkerd repair | kubectl apply -f - +``` + +This will restore the `secret/linkerd-config-overrides` resource and allow you +to proceed with upgrading your control plane. + +## Upgrade notice: stable-2.9.0 + +### Images are now hosted on ghcr.io + +As of this version images are now hosted under `ghcr.io` instead of `gcr.io`. If +you're pulling images into a private repo please make the necessary changes. + +### Upgrading multicluster environments + +Linkerd 2.9 changes the way that some of the multicluster components work and +are installed compared to Linkerd 2.8.x. Users installing the multicluster +components for the first time with Linkerd 2.9 can ignore these instructions and +instead refer directly to the [installing +multicluster instructions](../installing-multicluster/). + +Users who installed the multicluster component in Linkerd 2.8.x and wish to +upgrade to Linkerd 2.9 should follow the [upgrade multicluster +instructions](../upgrade-multicluster/). + +### Ingress behavior changes + +In previous versions when you injected your ingress controller (Nginx, Traefik, +Ambassador, etc), then the ingress' balancing/routing choices would be +overridden with Linkerd's (using service profiles, traffic splits, etc.). + +As of 2.9 the ingress's choices are honored instead, which allows preserving +things like session-stickiness. Note however that this means per-route metrics +are not collected, traffic splits will not be honored and retries/timeouts are +not applied. 
+ +If you want to revert to the previous behavior, inject the proxy into the +ingress controller using the annotation `linkerd.io/inject: ingress`, as +explained in [using ingress](../using-ingress/) + +### Breaking changes in Helm charts + +Some entries like `controllerLogLevel` and all the Prometheus config have +changed their position in the settings hierarchy. To get a precise view of what +has changed you can compare the +[stable-2.8.1](https://github.com/linkerd/linkerd2/blob/stable-2.8.1/charts/linkerd2/values.yaml) +and +[stable-2.9.0](https://github.com/linkerd/linkerd2/blob/stable-2.9.0/charts/linkerd2/values.yaml) +`values.yaml` files. + +### Post-upgrade cleanup + +In order to better support cert-manager, the secrets +`linkerd-proxy-injector-tls`, `linkerd-sp-validator-tls` and `linkerd-tap-tls` +have been replaced by the secrets `linkerd-proxy-injector-k8s-tls`, +`linkerd-sp-validator-k8s-tls` and `linkerd-tap-k8s-tls` respectively. If you +upgraded through the CLI, please delete the old ones (if you upgraded through +Helm the cleanup was automated). + +## Upgrade notice: stable-2.8.0 + +There are no version-specific notes for upgrading to this release. The upgrade +process detailed above ([upgrade the CLI](#upgrade-the-cli), +[upgrade the control plane](#upgrade-the-control-plane), then +[upgrade the data plane](#upgrade-the-data-plane)) should +work. + +## Upgrade notice: stable-2.7.0 + +### Checking whether any of your TLS certificates are approaching expiry + +This version introduces a set of CLI flags and checks that help you rotate +your TLS certificates. The new CLI checks will warn you if any of your +certificates are expiring in the next 60 days. If you however want to check +the expiration date of your certificates and determine for yourself whether +you should be rotating them, you can execute the following commands. Note that +this will require [step 0.13.3](https://smallstep.com/cli/) and +[jq 1.6](https://stedolan.github.io/jq/). + +Check your trust roots: + +```bash +kubectl -n linkerd get cm linkerd-config -o=jsonpath="{.data}" | \ +jq -r .identityContext.trustAnchorsPem | \ +step certificate inspect --short - + +X.509v3 Root CA Certificate (ECDSA P-256) [Serial: 1] + Subject: identity.linkerd.cluster.local + Issuer: identity.linkerd.cluster.local + Valid from: 2020-01-14T13:23:32Z + to: 2021-01-13T13:23:52Z +``` + +Check your issuer certificate: + +```bash +kubectl -n linkerd get secret linkerd-identity-issuer -o=jsonpath="{.data['crt\.pem']}" | \ +base64 --decode | \ +step certificate inspect --short - + +X.509v3 Root CA Certificate (ECDSA P-256) [Serial: 1] + Subject: identity.linkerd.cluster.local + Issuer: identity.linkerd.cluster.local + Valid from: 2020-01-14T13:23:32Z + to: 2021-01-13T13:23:52Z +``` + +If you determine that you wish to rotate your certificates you can follow +the process outlined in +[Rotating your identity certificates](../rotating_identity_certificates/). +Note that this process uses functionality available in stable-2.7.0. So before +you start your cert rotation, make sure to upgrade. + +When ready, you can begin the upgrade process by +[installing the new CLI](#upgrade-the-cli). + +### Breaking changes in Helm charts + +As part of an effort to follow Helm's best practices the Linkerd Helm +chart has been restructured. As a result most of the keys have been changed. +In order to ensure trouble-free upgrade of your Helm installation, please take +a look at [Helm upgrade procedure](../install-helm/). 
To get a precise +view of what has changed you can compare that +[stable-2.6.0](https://github.com/linkerd/linkerd2/blob/stable-2.6.0/charts/linkerd2/values.yaml) +and [stable-2.7.0](https://github.com/linkerd/linkerd2/blob/stable-2.7.0/charts/linkerd2/values.yaml) +`values.yaml` files. + +## Upgrade notice: stable-2.6.0 + +{{< note >}} +Upgrading to this release from edge-19.9.3, edge-19.9.4, edge-19.9.5 and +edge-19.10.1 will incur data plane downtime, due to a recent change introduced +to ensure zero downtime upgrade for previous stable releases. +{{< /note >}} + +The `destination` container is now deployed as its own `Deployment` workload. +When you are planning the upgrade from one of the edge versions listed above, +be sure to allocate time to restart the data plane once the control plane is +successfully upgraded. This restart can be done at your convenience with the +recommendation that it be done over the course of time appropriate for your +application. + +If you are upgrading from a previous stable version, restarting the data-plane +is __recommended__ as a best practice, although not necessary. + +If you have previously labelled any of your namespaces with the +`linkerd.io/is-control-plane` label so that their pod creation events are +ignored by the HA proxy injector, you will need to update these namespaces +to use the new `config.linkerd.io/admission-webhooks: disabled` label. + +When ready, you can begin the upgrade process by +[installing the new CLI](#upgrade-the-cli). + +## Upgrade notice: stable-2.5.0 + +This release supports Kubernetes 1.12+. + +{{< note >}} +Linkerd 2.5.0 introduced [Helm support](../install-helm/). If Linkerd was +installed via `linkerd install`, it must be upgraded via `linkerd upgrade`. If +Linkerd was installed via Helm, it must be upgraded via Helm. Mixing these two +installation procedures is not supported. +{{< /note >}} + +### Upgrading from stable-2.4.x + +{{< note >}} +These instructions also apply to upgrading from edge-19.7.4, edge-19.7.5, +edge-19.8.1, edge-19.8.2, edge-19.8.3, edge-19.8.4, and edge-19.8.5. +{{< /note >}} + +Use the `linkerd upgrade` command to upgrade the control plane. This command +ensures that all of the control plane's existing configuration and mTLS secrets +are retained. + +```bash +# get the latest stable CLI +curl -sL https://run.linkerd.io/install | sh +``` + +{{< note >}} The linkerd cli installer installs the CLI binary into a +versioned file (e.g. `linkerd-stable-2.5.0`) under the `$INSTALLROOT` (default: +`$HOME/.linkerd`) directory and provides a convenience symlink at +`$INSTALLROOT/bin/linkerd`. + +If you need to have multiple versions of the linkerd cli installed +alongside each other (for example if you are running an edge release on +your test cluster but a stable release on your production cluster) you +can refer to them by their full paths, e.g. `$INSTALLROOT/bin/linkerd-stable-2.5.0` +and `$INSTALLROOT/bin/linkerd-edge-19.8.8`. +{{< /note >}} + +```bash +linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +The options `--prune -l linkerd.io/control-plane-ns=linkerd` above make sure +that any resources that are removed from the `linkerd upgrade` output, are +effectively removed from the system. + +For upgrading a multi-stage installation setup, follow the instructions at +[Upgrading a multi-stage install](#upgrading-a-multi-stage-install). 
+ +Users who have previously saved the Linkerd control plane's configuration to +files can follow the instructions at +[Upgrading via manifests](#upgrading-via-manifests) +to ensure those configuration are retained by the `linkerd upgrade` command. + +Once the `upgrade` command completes, use the `linkerd check` command to confirm +the control plane is ready. + +{{< note >}} +The `stable-2.5` `linkerd check` command will return an error when run against +an older control plane. This error is benign and will resolve itself once the +control plane is upgraded to `stable-2.5`: + +```bash +linkerd-config +-------------- +√ control plane Namespace exists +√ control plane ClusterRoles exist +√ control plane ClusterRoleBindings exist +× control plane ServiceAccounts exist + missing ServiceAccounts: linkerd-heartbeat + see https://linkerd.io/checks/#l5d-existence-sa for hints +``` + +{{< /note >}} + +When ready, proceed to upgrading the data plane by following the instructions at +[Upgrade the data plane](#upgrade-the-data-plane). + +## Upgrade notice: stable-2.4.0 + +This release supports Kubernetes 1.12+. + +### Upgrading from stable-2.3.x, edge-19.4.5, edge-19.5.x, edge-19.6.x, edge-19.7.x + +Use the `linkerd upgrade` command to upgrade the control plane. This command +ensures that all of the control plane's existing configuration and mTLS secrets +are retained. + +```bash +# get the latest stable CLI +curl -sL https://run.linkerd.io/install | sh +``` + +For Kubernetes 1.12+: + +```bash +linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +For Kubernetes pre-1.12 where the mutating and validating webhook +configurations' `sideEffects` fields aren't supported: + +```bash +linkerd upgrade --omit-webhook-side-effects | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +The `sideEffects` field is added to the Linkerd webhook configurations to +indicate that the webhooks have no side effects on other resources. + +For HA setup, the `linkerd upgrade` command will also retain all previous HA +configuration. Note that the mutating and validating webhook configurations are +updated to set their `failurePolicy` fields to `fail` to ensure that un-injected +workloads (as a result of unexpected errors) are rejected during the admission +process. The HA mode has also been updated to schedule multiple replicas of the +`linkerd-proxy-injector` and `linkerd-sp-validator` deployments. + +For users upgrading from the `edge-19.5.3` release, note that the upgrade +process will fail with the following error message, due to a naming bug: + +```bash +The ClusterRoleBinding "linkerd-linkerd-tap" is invalid: roleRef: Invalid value: +rbac.RoleRef{APIGroup:"rbac.authorization.k8s.io", Kind:"ClusterRole", +Name:"linkerd-linkerd-tap"}: cannot change roleRef +``` + +This can be resolved by simply deleting the `linkerd-linkerd-tap` cluster role +binding resource, and re-running the `linkerd upgrade` command: + +```bash +kubectl delete clusterrole/linkerd-linkerd-tap +``` + +For upgrading a multi-stage installation setup, follow the instructions at +[Upgrading a multi-stage install](#upgrading-a-multi-stage-install). + +Users who have previously saved the Linkerd control plane's configuration to +files can follow the instructions at +[Upgrading via manifests](#upgrading-via-manifests) +to ensure those configuration are retained by the `linkerd upgrade` command. 
+ +Once the `upgrade` command completes, use the `linkerd check` command to confirm +the control plane is ready. + +{{< note >}} +The `stable-2.4` `linkerd check` command will return an error when run against +an older control plane. This error is benign and will resolve itself once the +control plane is upgraded to `stable-2.4`: + +```bash +linkerd-config +-------------- +√ control plane Namespace exists +× control plane ClusterRoles exist + missing ClusterRoles: linkerd-linkerd-controller, linkerd-linkerd-identity, linkerd-linkerd-prometheus, linkerd-linkerd-proxy-injector, linkerd-linkerd-sp-validator, linkerd-linkerd-tap + see https://linkerd.io/checks/#l5d-existence-cr for hints +``` + +{{< /note >}} + +When ready, proceed to upgrading the data plane by following the instructions at +[Upgrade the data plane](#upgrade-the-data-plane). + +### Upgrading from stable-2.2.x + +Follow the [stable-2.3.0 upgrade instructions](#upgrading-from-stable-22x-1) +to upgrade the control plane to the stable-2.3.2 release first. Then follow +[these instructions](#upgrading-from-stable-23x-edge-1945-edge-195x-edge-196x-edge-197x) +to upgrade the stable-2.3.2 control plane to `stable-2.4.0`. + +## Upgrade notice: stable-2.3.0 + +`stable-2.3.0` introduces a new `upgrade` command. This command only works for +the `edge-19.4.x` and newer releases. When using the `upgrade` command from +`edge-19.2.x` or `edge-19.3.x`, all the installation flags previously provided +to the `install` command must also be added. + +### Upgrading from stable-2.2.x + +To upgrade from the `stable-2.2.x` release, follow the +[Step-by-step instructions](#step-by-step-instructions-stable-22x). + +Note that if you had previously installed Linkerd with `--tls=optional`, delete +the `linkerd-ca` deployment after successful Linkerd control plane upgrade: + +```bash +kubectl -n linkerd delete deploy/linkerd-ca +``` + +### Upgrading from edge-19.4.x + +```bash +# get the latest stable +curl -sL https://run.linkerd.io/install | sh + +# upgrade the control plane +linkerd upgrade | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +Follow instructions for +[upgrading the data plane](#upgrade-the-data-plane). + +#### Upgrading a multi-stage install + +`edge-19.4.5` introduced a +[Multi-stage install](../install/#multi-stage-install) feature. If you +previously installed Linkerd via a multi-stage install process, you can upgrade +each stage, analogous to the original multi-stage installation process. + +Stage 1, for the cluster owner: + +```bash +linkerd upgrade config | kubectl apply -f - +``` + +Stage 2, for the service owner: + +```bash +linkerd upgrade control-plane | kubectl apply -f - +``` + +{{< note >}} +Passing the `--prune` flag to `kubectl` does not work well with multi-stage +upgrades. It is recommended to manually prune old resources after completing +the above steps. +{{< /note >}} + +#### Upgrading via manifests + +By default, the `linkerd upgrade` command reuses the existing `linkerd-config` +config map and the `linkerd-identity-issuer` secret, by fetching them via the +the Kubernetes API. `edge-19.4.5` introduced a new `--from-manifests` flag to +allow the upgrade command to read the `linkerd-config` config map and the +`linkerd-identity-issuer` secret from a static YAML file. This option is +relevant to CI/CD workflows where the Linkerd configuration is managed by a +configuration repository. 
+ +For release after `edge-20.10.1`/`stable-2.9.0`, you need to add `secret/linkerd-config-overrides` +to the `linkerd-manifest.yaml` by running command: + +```bash +kubectl -n linkerd get \ + secret/linkerd-identity-issuer \ + configmap/linkerd-config \ + secret/linkerd-config-overrides \ + -oyaml > linkerd-manifests.yaml + +linkerd upgrade --from-manifests linkerd-manifests.yaml | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +For release after `stable-2.6.0` and prior to `edge-20.10.1`/`stable-2.9.0`, +you can use this command: + +```bash +kubectl -n linkerd get \ + secret/linkerd-identity-issuer \ + configmap/linkerd-config \ + -oyaml > linkerd-manifests.yaml + +linkerd upgrade --from-manifests linkerd-manifests.yaml | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +For releases prior to `edge-19.8.1`/`stable-2.5.0`, and after `stable-2.6.0`, +you may pipe a full `linkerd install` manifest into the upgrade command: + +```bash +linkerd install > linkerd-install.yaml + +# deploy Linkerd +cat linkerd-install.yaml | kubectl apply -f - + +# upgrade Linkerd via manifests +cat linkerd-install.yaml | linkerd upgrade --from-manifests - +``` + +{{< note >}} +`secret/linkerd-identity-issuer` contains the trust root of Linkerd's Identity +system, in the form of a private key. Care should be taken if storing this +information on disk, such as using tools like +[git-secret](https://git-secret.io/). +{{< /note >}} + +### Upgrading from edge-19.2.x or edge-19.3.x + +```bash +# get the latest stable +curl -sL https://run.linkerd.io/install | sh + +# Install stable control plane, using flags previously supplied during +# installation. +# For example, if the previous installation was: +# linkerd install --proxy-log-level=warn --proxy-auto-inject | kubectl apply -f - +# The upgrade command would be: +linkerd upgrade --proxy-log-level=warn --proxy-auto-inject | kubectl apply --prune -l linkerd.io/control-plane-ns=linkerd -f - +``` + +Follow instructions for +[upgrading the data plane](#upgrade-the-data-plane). + +## Upgrade notice: stable-2.2.0 + +There are two breaking changes in `stable-2.2.0`. One relates to +[Service Profiles](../../features/service-profiles/), the other relates to +[Automatic Proxy Injection](../../features/proxy-injection/). If you are not using +either of these features, you may [skip directly](#step-by-step-instructions-stable-22x) +to the full upgrade instructions. + +### Service Profile namespace location + +[Service Profiles](../../features/service-profiles/), previously defined in the +control plane namespace in `stable-2.1.0`, are now defined in their respective +client and server namespaces. Service Profiles defined in the client namespace +take priority over ones defined in the server namespace. + +### Automatic Proxy Injection opt-in + +The `linkerd.io/inject` annotation, previously opt-out in `stable-2.1.0`, is now +opt-in. + +To enable automation proxy injection for a namespace, you must enable the +`linkerd.io/inject` annotation on either the namespace or the pod spec. For more +details, see the [Automatic Proxy Injection](../../features/proxy-injection/) doc. + +#### A note about application updates + +Also note that auto-injection only works during resource creation, not update. 
+To update the data plane proxies of a deployment that was auto-injected, do one +of the following: + +- Manually re-inject the application via `linkerd inject` (more info below under + [Upgrade the data plane](#upgrade-the-data-plane)) +- Delete and redeploy the application + +Auto-inject support for application updates is tracked on +[github](https://github.com/linkerd/linkerd2/issues/2260) + +## Step-by-step instructions (stable-2.2.x) + +### Upgrade the 2.2.x CLI + +This will upgrade your local CLI to the latest version. You will want to follow +these instructions for anywhere that uses the linkerd CLI. + +To upgrade the CLI locally, run: + +```bash +curl -sL https://run.linkerd.io/install | sh +``` + +Alternatively, you can download the CLI directly via the +[Linkerd releases page](https://github.com/linkerd/linkerd2/releases/). + +Verify the CLI is installed and running correctly with: + +```bash +linkerd version +``` + +Which should display: + +```bash +Client version: {{% latestversion %}} +Server version: stable-2.1.0 +``` + +It is expected that the Client and Server versions won't match at this point in +the process. Nothing has been changed on the cluster, only the local CLI has +been updated. + +{{< note >}} +Until you upgrade the control plane, some new CLI commands may not work. +{{< /note >}} + +### Upgrade the 2.2.x control plane + +Now that you have upgraded the CLI running locally, it is time to upgrade the +Linkerd control plane on your Kubernetes cluster. Don't worry, the existing data +plane will continue to operate with a newer version of the control plane and +your meshed services will not go down. + +To upgrade the control plane in your environment, run the following command. +This will cause a rolling deploy of the control plane components that have +changed. 
+ +```bash +linkerd install | kubectl apply -f - +``` + +The output will be: + +```bash +namespace/linkerd configured +configmap/linkerd-config created +serviceaccount/linkerd-identity created +clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity configured +clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-identity configured +service/linkerd-identity created +secret/linkerd-identity-issuer created +deployment.extensions/linkerd-identity created +serviceaccount/linkerd-controller unchanged +clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller configured +clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-controller configured +service/linkerd-controller-api configured +service/linkerd-destination created +deployment.extensions/linkerd-controller configured +customresourcedefinition.apiextensions.k8s.io/serviceprofiles.linkerd.io configured +serviceaccount/linkerd-web unchanged +service/linkerd-web configured +deployment.extensions/linkerd-web configured +serviceaccount/linkerd-prometheus unchanged +clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-prometheus configured +clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-prometheus configured +service/linkerd-prometheus configured +deployment.extensions/linkerd-prometheus configured +configmap/linkerd-prometheus-config configured +serviceaccount/linkerd-grafana unchanged +service/linkerd-grafana configured +deployment.extensions/linkerd-grafana configured +configmap/linkerd-grafana-config configured +serviceaccount/linkerd-sp-validator created +clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-sp-validator configured +clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-sp-validator configured +service/linkerd-sp-validator created +deployment.extensions/linkerd-sp-validator created +``` + +Check to make sure everything is healthy by running: + +```bash +linkerd check +``` + +This will run through a set of checks against your control plane and make sure +that it is operating correctly. + +To verify the Linkerd control plane version, run: + +```bash +linkerd version +``` + +Which should display: + +```txt +Client version: {{% latestversion %}} +Server version: {{% latestversion %}} +``` + +{{< note >}} +You will lose the historical data from Prometheus. If you would like to have +that data persisted through an upgrade, take a look at the +[persistence documentation](../../observability/exporting-metrics/) +{{< /note >}} + +### Upgrade the 2.2.x data plane + +With a fully up-to-date CLI running locally and Linkerd control plane running on +your Kubernetes cluster, it is time to upgrade the data plane. This will change +the version of the `linkerd-proxy` sidecar container and run a rolling deploy on +your service. + +For `stable-2.3.0`+, if your workloads are annotated with the auto-inject +`linkerd.io/inject: enabled` annotation, then you can just restart your pods +using your Kubernetes cluster management tools (`helm`, `kubectl` etc.). + +{{< note >}} +With `kubectl` 1.15+, you can use the `kubectl rollout restart` command to +restart all your meshed services. For example, + +```bash +kubectl -n rollout restart deploy +``` + +{{< /note >}} + +As the pods are being re-created, the proxy injector will auto-inject the new +version of the proxy into the pods. + +If auto-injection is not part of your workflow, you can still manually upgrade +your meshed services by re-injecting your applications in-place. 
+ +Begin by retrieving your YAML resources via `kubectl`, and pass them through the +`linkerd inject` command. This will update the pod spec with the +`linkerd.io/inject: enabled` annotation. This annotation will be picked up by +Linkerd's proxy injector during the admission phase where the Linkerd proxy will +be injected into the workload. By using `kubectl apply`, Kubernetes will do a +rolling deploy of your service and update the running pods to the latest +version. + +Example command to upgrade an application in the `emojivoto` namespace, composed +of deployments: + +```bash +kubectl -n emojivoto get deploy -l linkerd.io/control-plane-ns=linkerd -oyaml \ + | linkerd inject - \ + | kubectl apply -f - +``` + +Check to make sure everything is healthy by running: + +```bash +linkerd check --proxy +``` + +This will run through a set of checks against both your control plane and data +plane to verify that it is operating correctly. + +You can make sure that you've fully upgraded all the data plane by running: + +```bash +kubectl get po --all-namespaces -o yaml \ + | grep linkerd.io/proxy-version +``` + +The output will look something like: + +```bash +linkerd.io/proxy-version: {{% latestversion %}} +linkerd.io/proxy-version: {{% latestversion %}} +``` + +If there are any older versions listed, you will want to upgrade them as well. diff --git a/linkerd.io/content/2.11/tasks/upgrading-2.10-ports-and-protocols.md b/linkerd.io/content/2.11/tasks/upgrading-2.10-ports-and-protocols.md new file mode 100644 index 0000000000..0ef35cf3db --- /dev/null +++ b/linkerd.io/content/2.11/tasks/upgrading-2.10-ports-and-protocols.md @@ -0,0 +1,121 @@ ++++ +title = "Upgrading to Linkerd 2.10: ports and protocols" +description = "Upgrading to Linkerd 2.10 and handling skip-ports, server-speaks-first protocols, and more." ++++ + +Linkerd 2.10 introduced some significant changes to the way that certain types +of traffic are handled. These changes may require new or different +configuration on your part. + +## What were the changes? + +The majority of traffic "just works" in Linkerd. However, there are certain +types of traffic that Linkerd cannot handle without additional configuration. +This includes "server-speaks-first" protocols such as MySQL and SMTP, as well +(in some Linkerd versions) protocols such as Postgres and Memcache. Linkerd's +protocol detection logic is unable to efficiently handle these protocols. + +In Linkerd 2.9 and earlier, these protocols were handled by simply bypassing +them. Users could mark specific ports as "skip ports" and ensure that traffic +to these ports would not transit Linkerd's data plane proxy. To make this easy, +Linkerd 2.9 and earlier shipped with a default set of skip ports which included +25 (SMTP), 443 (client-initiated TLS), 587 (SMTP), 3306 (MySQL), 5432 +(Postgres), and 11211 (Memcache). + +In the 2.10 release, Linkerd introduced three changes to the way that protocols +are detected and handled: + +1. It added an _opaque ports_ feature, which disables protocol detection on + specific ports. This means Linkerd 2.10 can now handle these protocols and + provide TCP-level metrics, mTLS, etc. +2. It replaced the default list of skip ports with a default list of opaque + ports, covering the same ports. This means that the default behavior for + these protocols is to transit the proxy rather than bypass it. +3. It changed the handling of connections to continue even if protocol + detection times out. This means that attempting e.g. 
server-speaks-first + protocols through the proxy _without_ skip ports or opaque ports + configuration has a better behavior: instead of failing, the proxy will + forward the connection (with service discovery, TCP load balancing, and + mTLS) after a 10-second connect delay. + +## What does this mean for me? + +In short, it means that there are several types of traffic that, in Linkerd 2.9 +and earlier, simply bypassed the proxy by default, but that in Linkerd 2.10 now +transit the proxy. This is a good thing!—you are using Linkerd for a reason, +after all—but it has some implications in certain situations. + +## What do I need to change? + +As part of Linkerd 2.10, you may need to update your configuration in certain +situations. + +### SMTP, MySQL, Postgres, or Memcache traffic to an off-cluster destination + +If you have existing SMTP, MySQL, Postgres, or Memcache traffic to an +off-cluster destination, *on the default port for that protocol*, then you will +need to update your configuration. + +**Behavior in 2.9:** Traffic automatically *skips* the proxy. +**Behavior in 2.10:** Traffic automatically *transits* the proxy, and will incur +a 10-second connect delay. +**Steps to upgrade:** Use `skip-outbound-ports` to mark the port so as to +bypass the proxy. (There is currently no ability to use opaque ports in this +situation.) + +### Client-initiated TLS calls at startup + +If you have client-initiated TLS calls to any destination, on- or off-cluster, +you may have to update your configuration if these connections are made at +application startup and not retried. + +**Behavior in 2.9:** Traffic automatically *skips* the proxy. +**Behavior in 2.10:** Traffic automatically *transits* the proxy. +**Steps to upgrade:** See "Connecting at startup" below. + +### An existing skip-ports configuration + +If you have an existing configuration involving `skip-inbound-ports` or +`skip-outbound-ports` annotations, everything should continue working as is. +However, you may choose to convert this configuration to opaque ports. + +**Behavior in 2.9:** Traffic skips the proxy. +**Behavior in 2.10:** Traffic skips the proxy. +**Steps to upgrade:** Optionally, change this configuration to opaque ports to +take advantage of metrics, mTLS (for meshed destinations), etc. See "Connecting +at startup" below if any of these connections happen at application startup and +are not retried. + +## Note: Connecting at startup + +There is one additional consideration for traffic that previously skipped the +proxy but now transits the proxy. If your application makes connections at +_startup time_, those connections will now require the proxy to be active +before they succeed. Unfortunately, Kubernetes currently provides no container +ordering constraints, so the proxy may not be active before the application +container starts. Thus, if your application does not retry with sufficient +leeway to allow the proxy to start up, it may fail. (This typically manifests +as container restarts.) + +To handle this situation, you have four options: + +1. Ignore the container restarts. (It's "eventually consistent".) +2. Use [linkerd-await](https://github.com/olix0r/linkerd-await) to make the + application container wait for the proxy to be ready before starting. +3. Set a `skip-outbound-ports` annotation to bypass the proxy for that port. + (You will lose Linkerd's features for that connection.) +4. Add retry logic to the application to make it resilient to transient network + failures. 
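+
+For the third option, the annotation goes on the workload's pod template. A
+minimal sketch, assuming a hypothetical Deployment `myapp` in namespace `myns`
+that connects to MySQL on port 3306 at startup:
+
+```bash
+# Bypass the proxy for outbound connections to port 3306 from this workload.
+kubectl -n myns patch deploy myapp --type merge -p \
+  '{"spec":{"template":{"metadata":{"annotations":{"config.linkerd.io/skip-outbound-ports":"3306"}}}}}'
+```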
+
+The last option (adding retry logic) is arguably the "rightest" approach, but
+not always the most practical.
+
+In the future, Kubernetes may provide mechanisms for specifying container
+startup ordering, at which point this will no longer be an issue.
+
+## How do I set an opaque port or skip port?
+
+Ports can be marked as opaque ports or as skip ports via Kubernetes
+annotations. These annotations can be set at the namespace, workload, or
+service level. The `linkerd inject` CLI command provides flags to set these
+annotations; they are also settable as defaults in the Helm config.
diff --git a/linkerd.io/content/2.11/tasks/using-a-private-docker-repository.md b/linkerd.io/content/2.11/tasks/using-a-private-docker-repository.md
new file mode 100644
index 0000000000..3ecfc4751c
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/using-a-private-docker-repository.md
@@ -0,0 +1,52 @@
++++
+title = "Using A Private Docker Repository"
+description = "Using Linkerd with a Private Docker Repository."
++++
+
+In some cases, you will want to use a private Docker repository to store the
+Linkerd images. This scenario requires knowing the names and locations of the
+Docker images used by the Linkerd control and data planes so that you can
+store them in your private repository.
+
+The easiest way to get those images is to use the
+[Linkerd CLI](../../getting-started/#step-1-install-the-cli)
+to pull the images to an internal host and push them to your private repository.
+
+To get the names of the images used by the control plane,
+[install](../../getting-started/#step-1-install-the-cli)
+the Linkerd CLI and run this command:
+
+```bash
+linkerd install --ignore-cluster | grep image: | sed -e 's/^ *//' | sort | uniq
+```
+
+For a stable release, the output will look similar to:
+
+```bash
+image: gcr.io/linkerd-io/controller:stable-2.6.0
+image: gcr.io/linkerd-io/grafana:stable-2.6.0
+image: gcr.io/linkerd-io/proxy-init:v1.2.0
+image: gcr.io/linkerd-io/proxy:stable-2.6.0
+image: gcr.io/linkerd-io/web:stable-2.6.0
+image: prom/prometheus:v2.11.1
+```
+
+All of the Linkerd images are publicly available in the
+[Linkerd Google Container Repository](https://console.cloud.google.com/gcr/images/linkerd-io/GLOBAL/).
+
+Stable images are named using the convention `stable-<version>` and the edge
+images use the convention `edge-<year>.<month>.<release>`.
+
+Examples of each are: `stable-2.6.0` and `edge-2019.11.1`.
+
+Once you have identified which images you want to store in your private
+repository, use the `docker pull <image>` command to pull the images to
+a machine on your network, then use the `docker push` command to push the
+images to your private repository.
+
+Now that the images are hosted by your private repository, you can update
+your deployment configuration to pull from it.
+
+For a more advanced configuration, you can clone the
+[linkerd2 repository](https://github.com/linkerd/linkerd2) to your CI/CD
+system and build specific tags to push to your private repository.
diff --git a/linkerd.io/content/2.11/tasks/using-custom-domain.md b/linkerd.io/content/2.11/tasks/using-custom-domain.md
new file mode 100644
index 0000000000..5d0f14e014
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/using-custom-domain.md
@@ -0,0 +1,31 @@
++++
+title = "Using a Custom Cluster Domain"
+description = "Use Linkerd with a custom cluster domain."
++++
+
+For Kubernetes clusters that use a [custom cluster domain](https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/),
+Linkerd must be installed using the `--cluster-domain` option:
+
+```bash
+linkerd install --cluster-domain=example.org \
+  --identity-trust-domain=example.org \
+  | kubectl apply -f -
+
+# The Linkerd Viz extension also requires a similar setting:
+linkerd viz install --set clusterDomain=example.org | kubectl apply -f -
+
+# And so does the Multicluster extension:
+linkerd multicluster install --set identityTrustDomain=example.org | kubectl apply -f -
+```
+
+This ensures that Linkerd handles all service discovery, routing, service
+profile, and traffic split resources using the `example.org` domain.
+
+{{< note >}}
+The identity trust domain must match the cluster domain for mTLS to work.
+{{< /note >}}
+
+{{< note >}}
+Changing the cluster domain while upgrading Linkerd isn't supported.
+{{< /note >}}
diff --git a/linkerd.io/content/2.11/tasks/using-debug-endpoints.md b/linkerd.io/content/2.11/tasks/using-debug-endpoints.md
new file mode 100644
index 0000000000..3149369f63
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/using-debug-endpoints.md
@@ -0,0 +1,59 @@
++++
+title = "Control Plane Debug Endpoints"
+description = "Linkerd's control plane components provide debug endpoints."
++++
+
+All of the control plane components (with the exception of Grafana) expose
+runtime profiling information through the path `/debug/pprof`, using Go's
+[pprof](https://golang.org/pkg/net/http/pprof/) package.
+
+You can consume the provided data with `go tool pprof` to generate output in
+many formats (PDF, DOT, PNG, etc.).
+
+The following diagnostics are provided (a summary with links is provided at
+`/debug/pprof`):
+
+- allocs: A sampling of all past memory allocations
+- block: Stack traces that led to blocking on synchronization primitives
+- cmdline: The command line invocation of the current program
+- goroutine: Stack traces of all current goroutines
+- heap: A sampling of memory allocations of live objects. You can specify the
+  `gc` GET parameter to run GC before taking the heap sample.
+- mutex: Stack traces of holders of contended mutexes
+- profile: CPU profile. You can specify the duration in the `seconds` GET
+  parameter. After you get the profile file, use the `go tool pprof` command to
+  investigate the profile.
+- threadcreate: Stack traces that led to the creation of new OS threads
+- trace: A trace of execution of the current program. You can specify the
+  duration in the `seconds` GET parameter. After you get the trace file, use
+  the `go tool trace` command to investigate the trace.
+
+## Example Usage
+
+This data is served over the `admin-http` port.
+To find this port, you can examine the pod's YAML or, taking the identity pod
+as an example, issue a command like this:
+
+```bash
+kubectl -n linkerd get po \
+  $(kubectl -n linkerd get pod -l linkerd.io/control-plane-component=identity \
+  -o jsonpath='{.items[0].metadata.name}') \
+  -o=jsonpath='{.spec.containers[*].ports[?(@.name=="admin-http")].containerPort}'
+```
+
+Then use the `kubectl port-forward` command to access that port from outside
+the cluster (in this example the port is 9990):
+
+```bash
+kubectl -n linkerd port-forward \
+  $(kubectl -n linkerd get pod -l linkerd.io/control-plane-component=identity \
+  -o jsonpath='{.items[0].metadata.name}') \
+  9990
+```
+
+It is now possible to use `go tool` to inspect this data.
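+
+For instance, with the port-forward above still running, you can fetch a
+30-second CPU profile from the identity component and explore it interactively
+(9990 is just the local port chosen above):
+
+```bash
+# Fetch a 30-second CPU profile and open the interactive pprof shell;
+# try `top` or `web` at the prompt to see the hottest code paths.
+go tool pprof "http://localhost:9990/debug/pprof/profile?seconds=30"
+```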
+Or, to generate a graph in a PDF file describing memory allocations:
+
+```bash
+go tool pprof -seconds 5 -pdf http://localhost:9990/debug/pprof/allocs
+```
diff --git a/linkerd.io/content/2.11/tasks/using-ingress.md b/linkerd.io/content/2.11/tasks/using-ingress.md
new file mode 100644
index 0000000000..3e10a27834
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/using-ingress.md
@@ -0,0 +1,540 @@
++++
+title = "Ingress traffic"
+description = "Linkerd works alongside your ingress controller of choice."
++++
+
+For reasons of simplicity and composability, Linkerd doesn't provide a built-in
+ingress. Instead, Linkerd is designed to work with existing Kubernetes ingress
+solutions.
+
+Combining Linkerd and your ingress solution requires two things:
+
+1. Configuring your ingress to support Linkerd.
+2. Meshing your ingress pods so that they have the Linkerd proxy installed.
+
+Meshing your ingress pods will allow Linkerd to provide features like L7
+metrics and mTLS the moment the traffic is inside the cluster. (See
+[Adding your service](../adding-your-service/) for instructions on how to mesh
+your ingress.)
+
+Note that some ingress options need to be meshed in "ingress" mode. See details
+below.
+
+Common ingress options that Linkerd has been used with include:
+
+- [Ambassador / Emissary-ingress](#ambassador)
+- [Contour](#contour)
+- [GCE](#gce)
+- [Gloo](#gloo)
+- [Haproxy]({{< ref "#haproxy" >}})
+- [Kong](#kong)
+- [Nginx](#nginx)
+- [Traefik](#traefik)
+
+For a quick start guide to using a particular ingress, please visit the section
+for that ingress. If your ingress is not on that list, never fear—it likely
+works anyway. See [Ingress details](#ingress-details) below.
+
+{{< note >}}
+If your ingress terminates TLS, this TLS traffic (e.g. HTTPS calls from outside
+the cluster) will pass through Linkerd as an opaque TCP stream and Linkerd will
+only be able to provide byte-level metrics for this side of the connection. The
+resulting HTTP or gRPC traffic to internal services, of course, will have the
+full set of metrics and mTLS support.
+{{< /note >}}
+
+## Ambassador (aka Emissary) {id="ambassador"}
+
+Ambassador can be meshed normally. An example manifest for configuring
+Ambassador / Emissary is as follows:
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: web-ambassador
+  namespace: emojivoto
+  annotations:
+    getambassador.io/config: |
+      ---
+      apiVersion: getambassador.io/v2
+      kind: Mapping
+      name: web-ambassador-mapping
+      service: http://web-svc.emojivoto.svc.cluster.local:80
+      host: example.com
+      prefix: /
+spec:
+  selector:
+    app: web-svc
+  ports:
+  - name: http
+    port: 80
+    targetPort: http
+```
+
+For a more detailed guide, we recommend reading [Installing the Emissary
+ingress with the Linkerd service
+mesh](https://buoyant.io/2021/05/24/emissary-and-linkerd-the-best-of-both-worlds/).
+
+## Nginx
+
+Nginx can be meshed normally, but the
+[`nginx.ingress.kubernetes.io/service-upstream`](https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#service-upstream)
+annotation should be set to `"true"`. No further configuration is required.
+
+```yaml
+# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: emojivoto-web-ingress
+  namespace: emojivoto
+  annotations:
+    nginx.ingress.kubernetes.io/service-upstream: "true"
+spec:
+  ingressClassName: nginx
+  defaultBackend:
+    service:
+      name: web-svc
+      port:
+        number: 80
+```
+
+## Traefik
+
+Traefik should be meshed with ingress mode enabled, i.e. with the
+`linkerd.io/inject: ingress` annotation rather than the default `enabled`.
+
+Instructions differ for 1.x and 2.x versions of Traefik.
+
+### Traefik 1.x
+
+The simplest way to use Traefik 1.x as an ingress for Linkerd is to configure a
+Kubernetes `Ingress` resource with the
+`ingress.kubernetes.io/custom-request-headers` annotation like this:
+
+```yaml
+# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: web-ingress
+  namespace: emojivoto
+  annotations:
+    ingress.kubernetes.io/custom-request-headers: l5d-dst-override:web-svc.emojivoto.svc.cluster.local:80
+spec:
+  ingressClassName: traefik
+  rules:
+  - host: example.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: web-svc
+            port:
+              number: 80
+```
+
+The important annotation here is:
+
+```yaml
+ingress.kubernetes.io/custom-request-headers: l5d-dst-override:web-svc.emojivoto.svc.cluster.local:80
+```
+
+Traefik will add an `l5d-dst-override` header to instruct Linkerd what service
+the request is destined for. You'll want to include both the Kubernetes service
+FQDN (`web-svc.emojivoto.svc.cluster.local`) *and* the destination
+`servicePort`.
+
+To test this, you'll want to get the external IP address for your controller. If
+you installed Traefik via Helm, you can get that IP address by running:
+
+```bash
+kubectl get svc --all-namespaces \
+  -l app=traefik \
+  -o='custom-columns=EXTERNAL-IP:.status.loadBalancer.ingress[0].ip'
+```
+
+You can then use this IP with curl:
+
+```bash
+curl -H "Host: example.com" http://external-ip
+```
+
+{{< note >}}
+This solution won't work if you're using Traefik's service weights, as
+Linkerd will always send requests to the service name in `l5d-dst-override`. A
+workaround is to use `traefik.frontend.passHostHeader: "false"` instead.
+{{< /note >}}
+
+### Traefik 2.x
+
+Traefik 2.x adds support for path-based request routing with a Custom Resource
+Definition (CRD) called
+[`IngressRoute`](https://docs.traefik.io/providers/kubernetes-crd/).
+
+If you choose to use `IngressRoute` instead of the default Kubernetes `Ingress`
+resource, then you'll also need to use Traefik's
+[`Middleware`](https://docs.traefik.io/middlewares/headers/) Custom Resource
+Definition to add the `l5d-dst-override` header.
+
+The YAML below uses the Traefik CRDs to produce the same results for the
+`emojivoto` application, as described above.
+ +```yaml +apiVersion: traefik.containo.us/v1alpha1 +kind: Middleware +metadata: + name: l5d-header-middleware + namespace: traefik +spec: + headers: + customRequestHeaders: + l5d-dst-override: "web-svc.emojivoto.svc.cluster.local:80" +--- +apiVersion: traefik.containo.us/v1alpha1 +kind: IngressRoute +metadata: + annotations: + kubernetes.io/ingress.class: traefik + creationTimestamp: null + name: emojivoto-web-ingress-route + namespace: emojivoto +spec: + entryPoints: [] + routes: + - kind: Rule + match: PathPrefix(`/`) + priority: 0 + middlewares: + - name: l5d-header-middleware + services: + - kind: Service + name: web-svc + port: 80 +``` + +## GCE + +The GCE ingress should be meshed with ingress mode enabled, i.e. with the +`linkerd.io/inject: ingress` annotation rather than the default `enabled`. + +This example shows how to use a [Google Cloud Static External IP +Address](https://cloud.google.com/compute/docs/ip-addresses/reserve-static-external-ip-address) +and TLS with a [Google-managed +certificate](https://cloud.google.com/load-balancing/docs/ssl-certificates#managed-certs). + +```yaml +# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19 +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: web-ingress + namespace: emojivoto + annotations: + ingress.kubernetes.io/custom-request-headers: "l5d-dst-override: web-svc.emojivoto.svc.cluster.local:80" + ingress.gcp.kubernetes.io/pre-shared-cert: "managed-cert-name" + kubernetes.io/ingress.global-static-ip-name: "static-ip-name" +spec: + ingressClassName: gce + rules: + - host: example.com + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: web-svc + port: + number: 80 +``` + +To use this example definition, substitute `managed-cert-name` and +`static-ip-name` with the short names defined in your project (n.b. use the name +for the IP address, not the address itself). + +The managed certificate will take about 30-60 minutes to provision, but the +status of the ingress should be healthy within a few minutes. Once the managed +certificate is provisioned, the ingress should be visible to the Internet. + +## Gloo + +Gloo should be meshed with ingress mode enabled, i.e. with the +`linkerd.io/inject: ingress` annotation rather than the default `enabled`. + +As of Gloo v0.13.20, Gloo has native integration with Linkerd, so that the +required Linkerd headers are added automatically. Assuming you installed Gloo +to the default location, you can enable the native integration by running: + +```bash +kubectl patch settings -n gloo-system default \ + -p '{"spec":{"linkerd":true}}' --type=merge +``` + +Gloo will now automatically add the `l5d-dst-override` header to every +Kubernetes upstream. + +Now simply add a route to the upstream, e.g.: + +```bash +glooctl add route --path-prefix=/ --dest-name booksapp-webapp-7000 +``` + +## Contour + +Contour should be meshed with ingress mode enabled, i.e. with the +`linkerd.io/inject: ingress` annotation rather than the default `enabled`. + +The following example uses the +[Contour getting started](https://projectcontour.io/getting-started/) documentation +to demonstrate how to set the required header manually. + +Contour's Envoy DaemonSet doesn't auto-mount the service account token, which +is required for the Linkerd proxy to do mTLS between pods. So first we need to +install Contour uninjected, patch the DaemonSet with +`automountServiceAccountToken: true`, and then inject it. 
+Optionally, you can create a dedicated service account to avoid using the
+`default` one.
+
+```bash
+# install Contour
+kubectl apply -f https://projectcontour.io/quickstart/contour.yaml
+
+# create a service account (optional)
+kubectl apply -f - << EOF
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: envoy
+  namespace: projectcontour
+EOF
+
+# add service account to envoy (optional)
+kubectl patch daemonset envoy -n projectcontour --type json -p='[{"op": "add", "path": "/spec/template/spec/serviceAccount", "value": "envoy"}]'

+# auto mount the service account token (required)
+kubectl patch daemonset envoy -n projectcontour --type json -p='[{"op": "replace", "path": "/spec/template/spec/automountServiceAccountToken", "value": true}]'
+
+# inject linkerd first into the DaemonSet
+kubectl -n projectcontour get daemonset -oyaml | linkerd inject - | kubectl apply -f -
+
+# inject linkerd into the Deployment
+kubectl -n projectcontour get deployment -oyaml | linkerd inject - | kubectl apply -f -
+```
+
+Verify that your Contour and Envoy installation has a running Linkerd sidecar.
+
+Next we'll deploy a demo service:
+
+```bash
+linkerd inject https://projectcontour.io/examples/kuard.yaml | kubectl apply -f -
+```
+
+To route external traffic to your service you'll need to provide an HTTPProxy:
+
+```yaml
+apiVersion: projectcontour.io/v1
+kind: HTTPProxy
+metadata:
+  name: kuard
+  namespace: default
+spec:
+  routes:
+  - requestHeadersPolicy:
+      set:
+      - name: l5d-dst-override
+        value: kuard.default.svc.cluster.local:80
+    services:
+    - name: kuard
+      port: 80
+  virtualhost:
+    fqdn: 127.0.0.1.nip.io
+```
+
+Notice the `l5d-dst-override` header is explicitly set to the target `service`.
+
+Finally, you can test your working service mesh:
+
+```bash
+kubectl port-forward svc/envoy -n projectcontour 3200:80
+# then open http://127.0.0.1.nip.io:3200 in your browser
+```
+
+{{< note >}}
+You should annotate the pod spec with `config.linkerd.io/skip-outbound-ports:
+8001`. The Envoy pod will try to connect to the Contour pod at port 8001
+through TLS, which is not supported under this ingress mode, so you need to
+have the proxy skip that outbound port.
+{{< /note >}}
+
+{{< note >}}
+If you are using Contour with [flagger](https://github.com/weaveworks/flagger),
+the `l5d-dst-override` headers will be set automatically.
+{{< /note >}}
+
+## Kong
+
+Kong should be meshed with ingress mode enabled, i.e. with the
+`linkerd.io/inject: ingress` annotation rather than the default `enabled`.
+
+This example will use the following elements:
+
+- The [Kong chart](https://github.com/Kong/charts)
+- The [emojivoto example application](../../getting-started/)
+
+Before installing emojivoto, install Linkerd and Kong on your cluster. When
+injecting the Kong deployment, use the `--ingress` flag (or annotation).
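+
+For example, if the Kong chart was installed into a `kong` namespace, injection
+might look like the following (the Deployment name `kong-kong` assumes a Helm
+release named `kong`; adjust both to match your installation):
+
+```bash
+# Add the proxy in ingress mode to the Kong deployment and roll it out
+kubectl -n kong get deploy kong-kong -o yaml \
+  | linkerd inject --ingress - \
+  | kubectl apply -f -
+```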
+
+We need to declare these objects as well:
+
+- KongPlugin, a CRD provided by Kong
+- Ingress
+
+```yaml
+apiVersion: configuration.konghq.com/v1
+kind: KongPlugin
+metadata:
+  name: set-l5d-header
+  namespace: emojivoto
+plugin: request-transformer
+config:
+  add:
+    headers:
+    - l5d-dst-override:$(headers.host).svc.cluster.local
+---
+# apiVersion: networking.k8s.io/v1beta1 # for k8s < v1.19
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: web-ingress
+  namespace: emojivoto
+  annotations:
+    konghq.com/plugins: set-l5d-header
+spec:
+  ingressClassName: kong
+  rules:
+  - http:
+      paths:
+      - path: /api/vote
+        pathType: Prefix
+        backend:
+          service:
+            name: web-svc
+            port:
+              name: http
+      - path: /api/list
+        pathType: Prefix
+        backend:
+          service:
+            name: web-svc
+            port:
+              name: http
+```
+
+Here we are explicitly setting the `l5d-dst-override` header in the
+`KongPlugin`. Using [templates as
+values](https://docs.konghq.com/hub/kong-inc/request-transformer/#template-as-value),
+we can take the `host` header from requests and set the `l5d-dst-override`
+value based on it.
+
+Finally, install emojivoto so that its `deploy/vote-bot` targets the
+ingress and includes a `host` header value for the `web-svc.emojivoto` service.
+
+Before applying the injected emojivoto application, make the following changes
+to the `vote-bot` Deployment:
+
+```yaml
+env:
+# Target the Kong ingress instead of the Emojivoto web service
+- name: WEB_HOST
+  value: kong-proxy.kong:80
+# Override the host header on requests so that it can be used to set the l5d-dst-override header
+- name: HOST_OVERRIDE
+  value: web-svc.emojivoto
+```
+
+## Haproxy
+
+{{< note >}}
+There are two different haproxy-based ingress controllers. This example is for
+the [kubernetes-ingress controller by
+haproxytech](https://www.haproxy.com/documentation/kubernetes/latest/) and not
+the [haproxy-ingress controller](https://haproxy-ingress.github.io/).
+{{< /note >}}
+
+Haproxy should be meshed with ingress mode enabled, i.e. with the
+`linkerd.io/inject: ingress` annotation rather than the default `enabled`.
+
+The simplest way to use Haproxy as an ingress for Linkerd is to configure a
+Kubernetes `Ingress` resource with the
+`haproxy.org/request-set-header` annotation like this:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: web-ingress
+  namespace: emojivoto
+  annotations:
+    kubernetes.io/ingress.class: haproxy
+    haproxy.org/request-set-header: |
+      l5d-dst-override web-svc.emojivoto.svc.cluster.local:80
+spec:
+  rules:
+  - host: example.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: web-svc
+            port:
+              number: 80
+```
+
+Unfortunately, there is currently no way to set this header dynamically in
+a global config map by using the service name, namespace, and port as
+variables. This also means that you can't combine more than one service
+ingress rule in an ingress manifest, as each one needs its own
+`haproxy.org/request-set-header` annotation with a hard-coded value.
+
+## Ingress details
+
+In this section we cover how Linkerd interacts with ingress controllers in
+general.
+
+Linkerd can be used with any ingress controller. In order for
+Linkerd to properly apply features such as route-based metrics and traffic
+splitting, Linkerd needs the IP/port of the Kubernetes Service. However, by
+default, many ingresses do their own endpoint selection and pass the IP/port of
+the destination Pod, rather than the Service as a whole.
+
+Thus, combining an ingress with Linkerd takes one of two forms:
+
+1. Configure the ingress to pass the IP and port of the Service as the
+   destination, i.e. to skip its own endpoint selection. (E.g. see
+   [Nginx](#nginx) above.)
+
+2. If this is not possible, then configure the ingress to pass the Service
+   IP/port in a header such as `l5d-dst-override`, `Host`, or `:authority`, and
+   configure Linkerd in *ingress* mode. In this mode, it will read from one of
+   those headers instead.
+
+The most common approach in form #2 is to use the explicit `l5d-dst-override` header.
+
+{{< note >}}
+If requests experience a 2-3 second delay after injecting your ingress
+controller, it is likely because the service of `type: LoadBalancer` is
+obscuring the client source IP. You can fix this by setting
+`externalTrafficPolicy: Local` in the ingress' service definition.
+{{< /note >}}
+
+{{< note >}}
+While the Kubernetes Ingress API definition allows a `backend`'s `servicePort`
+to be a string value, only numeric `servicePort` values can be used with
+Linkerd. If a string value is encountered, Linkerd will default to using port
+80.
+{{< /note >}}
diff --git a/linkerd.io/content/2.11/tasks/using-psp.md b/linkerd.io/content/2.11/tasks/using-psp.md
new file mode 100644
index 0000000000..8a5d1f792f
--- /dev/null
+++ b/linkerd.io/content/2.11/tasks/using-psp.md
@@ -0,0 +1,121 @@
++++
+title = "Linkerd and Pod Security Policies (PSP)"
+description = "Using Linkerd with pod security policies enabled."
++++
+
+The Linkerd control plane comes with its own minimally privileged
+[Pod Security Policy](https://kubernetes.io/docs/concepts/policy/pod-security-policy/)
+and the associated RBAC resources. This Pod Security Policy is enforced only if
+the `PodSecurityPolicy` admission controller is enabled.
+
+To view the definition of the control plane's Pod Security Policy, run:
+
+```bash
+kubectl describe psp -l linkerd.io/control-plane-ns=linkerd
+```
+
+Adjust the value of the above label to match your control plane's namespace.
+
+Notice that to minimize attack surface, all Linux capabilities are dropped from
+the control plane's Pod Security Policy, with the exception of the `NET_ADMIN`
+and `NET_RAW` capabilities. These capabilities provide the `proxy-init` init
+container with runtime privilege to rewrite the pod's `iptables` rules. Note
+that adding these capabilities to the Pod Security Policy doesn't make the
+container a
+[`privileged`](https://kubernetes.io/docs/concepts/workloads/pods/pod/#privileged-mode-for-pod-containers)
+container. The control plane's Pod Security Policy prevents container privilege
+escalation with the `allowPrivilegeEscalation: false` policy. To understand the
+full implication of the `NET_ADMIN` and `NET_RAW` capabilities, refer to the
+Linux capabilities [manual](https://www.man7.org/linux/man-pages/man7/capabilities.7.html).
+
+More information on the `iptables` rules used by the `proxy-init` init
+container can be found on the [Architecture](../../reference/architecture/#linkerd-init)
+page.
+
+If your environment disallows the operation of containers with escalated Linux
+capabilities, Linkerd can be installed with its [CNI plugin](../../features/cni/),
+which doesn't require the `NET_ADMIN` and `NET_RAW` capabilities.
+
+Linkerd doesn't provide any default Pod Security Policy for the data plane
+because the policies will vary depending on the security requirements of your
+application.
The security context requirement for the Linkerd proxy sidecar +container will be very similar to that defined in the control plane's Pod +Security Policy. + +For example, the following Pod Security Policy and RBAC will work with the injected +`emojivoto` demo application: + +```yaml +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: linkerd-emojivoto-data-plane +spec: + allowPrivilegeEscalation: false + fsGroup: + ranges: + - max: 65535 + min: 10001 + rule: MustRunAs + readOnlyRootFilesystem: true + allowedCapabilities: + - NET_ADMIN + - NET_RAW + - NET_BIND_SERVICE + requiredDropCapabilities: + - ALL + runAsUser: + rule: RunAsAny + seLinux: + rule: RunAsAny + supplementalGroups: + ranges: + - max: 65535 + min: 10001 + rule: MustRunAs + volumes: + - configMap + - emptyDir + - projected + - secret + - downwardAPI + - persistentVolumeClaim +--- + +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: emojivoto-psp + namespace: emojivoto +rules: +- apiGroups: ['policy','extensions'] + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: ['linkerd-emojivoto-data-plane'] +--- + +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: emojivoto-psp + namespace: emojivoto +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: emojivoto-psp +subjects: +- kind: ServiceAccount + name: default + namespace: emojivoto +- kind: ServiceAccount + name: emoji + namespace: emojivoto +- kind: ServiceAccount + name: voting + namespace: emojivoto +- kind: ServiceAccount + name: web + namespace: emojivoto +``` + +Note that the Linkerd proxy only requires the `NET_ADMIN` and `NET_RAW` +capabilities when running without Linkerd CNI, and it's run with UID `2102`. diff --git a/linkerd.io/content/2.11/tasks/using-the-debug-container.md b/linkerd.io/content/2.11/tasks/using-the-debug-container.md new file mode 100644 index 0000000000..eb6f3466fe --- /dev/null +++ b/linkerd.io/content/2.11/tasks/using-the-debug-container.md @@ -0,0 +1,102 @@ ++++ +title = "Using the Debug Sidecar" +description = "Inject the debug container to capture network packets." ++++ + +Debugging a service mesh can be hard. When something just isn't working, is +the problem with the proxy? With the application? With the client? With the +underlying network? Sometimes, nothing beats looking at raw network data. + +In cases where you need network-level visibility into packets entering and +leaving your application, Linkerd provides a *debug sidecar* with some helpful +tooling. Similar to how [proxy sidecar +injection](../../features/proxy-injection/) works, you add a debug sidecar to +a pod by setting the `config.linkerd.io/enable-debug-sidecar: "true"` annotation +at pod creation time. For convenience, the `linkerd inject` command provides an +`--enable-debug-sidecar` option that does this annotation for you. + +(Note that the set of containers in a Kubernetes pod is not mutable, so simply +adding this annotation to a pre-existing pod will not work. It must be present +at pod *creation* time.) + +The debug sidecar image contains +[`tshark`](https://www.wireshark.org/docs/man-pages/tshark.html), `tcpdump`, +`lsof`, and `iproute2`. Once installed, it starts automatically logging all +incoming and outgoing traffic with `tshark`, which can then be viewed with +`kubectl logs`. Alternatively, you can use `kubectl exec` to access the +container and run commands directly. 
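+
+The annotation itself, when set directly on a workload rather than through the
+CLI, lives in the pod template. A minimal sketch, with the surrounding
+Deployment fields omitted:
+
+```yaml
+spec:
+  template:
+    metadata:
+      annotations:
+        # Ask Linkerd's proxy injector to add the debug sidecar when the pod
+        # is created
+        config.linkerd.io/enable-debug-sidecar: "true"
+```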
+ +For instance, if you've gone through the [Linkerd Getting +Started](../../getting-started/) guide and installed the +*emojivoto* application, and wish to debug traffic to the *voting* service, you +could run: + +```bash +kubectl -n emojivoto get deploy/voting -o yaml \ + | linkerd inject --enable-debug-sidecar - \ + | kubectl apply -f - +``` + +to deploy the debug sidecar container to all pods in the *voting* service. +(Note that there's only one pod in this deployment, which will be recreated +to do this--see the note about pod mutability above.) + +You can confirm that the debug container is running by listing +all the containers in pods with the `voting-svc` label: + +```bash +kubectl get pods -n emojivoto -l app=voting-svc \ + -o jsonpath='{.items[*].spec.containers[*].name}' +``` + +Then, you can watch live tshark output from the logs by simply running: + +```bash +kubectl -n emojivoto logs deploy/voting linkerd-debug -f +``` + +If that's not enough, you can exec to the container and run your own commands +in the context of the network. For example, if you want to inspect the HTTP headers +of the requests, you could run something like this: + +```bash +kubectl -n emojivoto exec -it \ + $(kubectl -n emojivoto get pod -l app=voting-svc \ + -o jsonpath='{.items[0].metadata.name}') \ + -c linkerd-debug -- tshark -i any -f "tcp" -V -Y "http.request" +``` + +A real-world error message written by the proxy that the debug sidecar is +effective in troubleshooting is a `Connection Refused` error like this one: + + ```log +ERR! [