
mimir-distributed helm chart not following Restricted Pod Security Standard as claimed by Grafana docs #5758

Closed
dorkamotorka opened this issue Aug 16, 2023 · 3 comments

@dorkamotorka

Describe the bug

As per the Grafana Mimir documentation, it should be possible to install the mimir-distributed Helm chart while following the Kubernetes Restricted Pod Security Standard. When I deploy on GKE Autopilot, however, this does not hold true. All components, such as the ruler, compactor, ingester, Alertmanager, and store-gateway, require this Helm configuration:

  containerSecurityContext:
    readOnlyRootFilesystem: false
    runAsNonRoot: false
    runAsUser: 0

in order to avoid errors such as "read-only file system" or "permission denied" when accessing certain directories. The configuration above obviously goes against Kubernetes security best practices. Anybody should be able to reproduce this by simply deploying the mimir-distributed Helm chart onto GKE Autopilot. Note that I'm using GCS buckets for the components whose configuration allows it, but as far as I can tell Mimir still tries to write some temporary files to the local filesystem.
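For reference, a sketch of a containerSecurityContext that satisfies the Restricted profile (illustrative values; readOnlyRootFilesystem is not strictly required by Restricted, but it is a common hardening default):

containerSecurityContext:
  # Illustrative values compatible with the Restricted Pod Security Standard
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  seccompProfile:
    type: RuntimeDefault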

Output of helm version:

version.BuildInfo{Version:"v3.12.0", GitCommit:"c9f554d75773799f72ceef38c51210f1842a1dea", GitTreeState:"clean", GoVersion:"go1.20.4"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.16", GitCommit:"51e33fadff13065ae5518db94e84598293965939", GitTreeState:"clean", BuildDate:"2023-07-19T12:26:21Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.14-gke.2700", GitCommit:"20f1946282011a3f0cec885eaafe3decc9c367c9", GitTreeState:"clean", BuildDate:"2023-06-22T09:23:35Z", GoVersion:"go1.19.9 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}

To Reproduce

Deploy the mimir-distributed Helm chart onto a GKE Autopilot cluster.

Expected behavior

I expect to be able to deploy the mimir-distributed Helm chart while complying with the latest Kubernetes security recommendations.

Environment

  • Infrastructure: GKE
  • Deployment tool: Helm
@dimitarvdimitrov
Contributor

Can you share your values file? I'm a bit surprised this is the case, because none of the components should be configured to write to the root filesystem with the default values.yaml.

@dorkamotorka
Author

Hey @dimitarvdimitrov, here you go:

mimir:
  config: |
    usage_stats:
      installation_mode: helm

    activity_tracker:
      filepath: /active-query-tracker/activity.log

    server:
      log_format: "logfmt"
      log_level: "debug"
      grpc_server_max_concurrent_streams: 1000
      grpc_server_max_connection_age: 2m
      grpc_server_max_connection_age_grace: 5m
      grpc_server_max_connection_idle: 1m

    common:
      storage:
        backend: gcs

    multitenancy_enabled: true

    # Check https://grafana.com/docs/mimir/latest/references/configuration-parameters/#frontend when modifying
    frontend:
      # NOTE: This was modified from the *-headless service (Is there a downside?)
      {{- if .Values.query_scheduler.enabled }}
      scheduler_address: {{ template "mimir.fullname" . }}-query-scheduler.{{ .Release.Namespace }}.svc:{{ include "mimir.serverGrpcListenPort" . }}
      {{- end }}
      # Downstream URL of Mimir Querier, because some API calls just directly go to the downstream Querier
      downstream_url: http://{{ template "mimir.fullname" . }}-querier.{{ .Release.Namespace }}.svc:{{ include "mimir.serverHttpListenPort" . }}
      address: {{ template "mimir.fullname" . }}-query-frontend.{{ .Release.Namespace }}.svc
      port: {{ include "mimir.serverGrpcListenPort" . }}

    # Check https://grafana.com/docs/mimir/latest/references/configuration-parameters/#frontend_worker when modifying
    frontend_worker:
      # NOTE: This was modified from the *-headless service (Is there a downside?)
      {{- if .Values.query_scheduler.enabled }}
      scheduler_address: {{ template "mimir.fullname" . }}-query-scheduler.{{ .Release.Namespace }}.svc:{{ include "mimir.serverGrpcListenPort" . }}
      # NOTE: This was modified from the *-headless service (Is there a downside?)
      {{- else }}
      frontend_address: {{ template "mimir.fullname" . }}-query-frontend.{{ .Release.Namespace }}.svc:{{ include "mimir.serverGrpcListenPort" . }}
      {{- end }}

    blocks_storage:
      gcs:
        bucket_name: {{ .Values.blocks_bucket_name }}

    alertmanager_storage:
      gcs:
        bucket_name: {{ .Values.alert_bucket_name }}

    ruler_storage:
      gcs:
        bucket_name: {{ .Values.ruler_bucket_name }}

    ingester:
      ring:
        final_sleep: 0s
        num_tokens: 512
        tokens_file_path: /data/tokens
        unregister_on_shutdown: false
      
    ingester_client:
      grpc_client_config:
        max_recv_msg_size: 104857600
        max_send_msg_size: 104857600

    limits:
      # Limit queries to 500 days. You can override this on a per-tenant basis.
      max_total_query_length: 12000h
      # Adjust max query parallelism to 16x sharding, without sharding we can run 15d queries fully in parallel.
      # With sharding we can further shard each day another 16 times. 15 days * 16 shards = 240 subqueries.
      max_query_parallelism: 240
      # Avoid caching results newer than 10m because some samples can be delayed
      # This prevents caching incomplete results
      max_cache_freshness: 10m

    memberlist:
      abort_if_cluster_join_fails: false
      compression_enabled: false
      join_members:
      - dns+{{ include "mimir.fullname" . }}-gossip-ring.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }}:{{ include "mimir.memberlistBindPort" . }}
  
    querier:
      # With query sharding we run more but smaller queries. We must strike a balance
      # which allows us to process more sharded queries in parallel when requested, but not overload
      # queriers during non-sharded queries.
      max_concurrent: 16

    query_scheduler:
      # Increase from default of 100 to account for queries created by query sharding
      max_outstanding_requests_per_tenant: 800

    alertmanager:
      data_dir: /data
      enable_api: true
      external_url: /alertmanager
      {{- if .Values.alertmanager.fallbackConfig }}
      fallback_config_file: /configs/alertmanager_fallback_config.yaml
      {{- end }}

    ruler:
      # NOTE: This was modified from the *-headless service (Is there a downside?)
      alertmanager_url: dnssrvnoa+http://_http-metrics._tcp.{{ template "mimir.fullname" . }}-alertmanager.{{ .Release.Namespace }}.svc.{{ .Values.global.clusterDomain }}/alertmanager
      enable_api: true
      query_frontend:
        address: {{ template "mimir.fullname" . }}-query-frontend.{{ .Release.Namespace }}.svc:{{ include "mimir.serverGrpcListenPort" . }}

    runtime_config:
      file: /var/{{ include "mimir.name" . }}/runtime.yaml

compactor:
  containerSecurityContext:
    readOnlyRootFilesystem: false
    runAsNonRoot: false
    runAsUser: 0
ingester:
  zoneAwareReplication:
    enabled: false
  containerSecurityContext:
    readOnlyRootFilesystem: false
    runAsNonRoot: false
    runAsUser: 0
alertmanager:
  containerSecurityContext:
    readOnlyRootFilesystem: false
    runAsNonRoot: false
    runAsUser: 0
ruler:
  containerSecurityContext:
    readOnlyRootFilesystem: false
    runAsNonRoot: false
    runAsUser: 0
store_gateway:
  zoneAwareReplication:
    enabled: false
  containerSecurityContext:
    readOnlyRootFilesystem: false
    runAsNonRoot: false
    runAsUser: 0

minio:
  enabled: false

query_frontend:
  replicas: 1

query_scheduler:
  enabled: true
  replicas: 1

querier:
  replicas: 3

distributor:
  replicas: 1

overrides_exporter:
  enabled: false

nginx:
  enabled: false

gateway:
  enabledNonEnterprise: true
  ingress:
    enabled: false

@dimitarvdimitrov
Contributor

The need to enable root filesystem access comes from Mimir's own default values, which use the current working directory for storing files. The Helm chart overrides these defaults so that only attached volumes are used, so by default the chart doesn't need root filesystem access.

filepath: /active-query-tracker/activity.log

tokens_file_path: /data/tokens

However, since you've set the mimir.config value, these overrides do not propagate down to the rendered ConfigMap. It's best to apply your configuration changes via mimir.structuredConfig instead of mimir.config. Check out Manage the configuration of Grafana Mimir with Helm for more details.
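For illustration, a minimal sketch of the same settings expressed through mimir.structuredConfig, assuming the values you shared above (bucket names are placeholders); everything you don't set here keeps the chart's defaults, including the file paths that point at mounted volumes:

mimir:
  structuredConfig:
    # Only the values you actually change; everything else keeps the chart defaults.
    multitenancy_enabled: true
    common:
      storage:
        backend: gcs
    blocks_storage:
      gcs:
        bucket_name: <blocks-bucket>        # placeholder
    alertmanager_storage:
      gcs:
        bucket_name: <alertmanager-bucket>  # placeholder
    ruler_storage:
      gcs:
        bucket_name: <ruler-bucket>         # placeholder
    limits:
      max_total_query_length: 12000h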

I'm closing this because it seems that this is not an issue with the chart. Reopen if you think the chart is still non-compliant.
