ClickHouse Keeper in RO mode due to incorrect permissions on snapshots directory #1524

Open
linux-wizard opened this issue Oct 10, 2024 · 4 comments
Labels
Keeper ClickHouse Keeper issues

Comments

@linux-wizard

I deployed ClickHouse Keeper using clickhouse-operator 0.24.0 with 3 nodes and a PVC.
Unfortunately, ClickHouse Keeper is in read-only mode because it cannot write to /var/lib/clickhouse-keeper/coordination/logs/, which has incorrect permissions.

Below is the error message:

2024.10.07 16:52:07.388939 [ 1 ] {} <Error> void DB::Changelog::readChangelogAndInitWriter(uint64_t, uint64_t): Code: 76. DB::ErrnoException: Cannot open file /var/lib/clickhouse-keeper/coordination/logs/changelog_1_100000.bin: , errno: 13, strerror: Permission denied. (CANNOT_OPEN_FILE), Stack trace (when copying this message, always include the lines below):

I can deploy a working ClickHouse Keeper with clickhouse-operator 0.23.7 when not using a PVC:

# ---
# # Fake Service to drop-in replacement Zookeeper with CHK
# apiVersion: v1
# kind: Service
# metadata:
#   # DNS would be like zookeeper.namespace.svc
#   name: zookeeper
#   labels:
#     app: zookeeper
# spec:
#   ports:
#     - port: 2181
#       name: client
#     - port: 7000
#       name: prometheus
#   selector:
#     app: clickhouse-keeper
#     what: node
---
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: xxxxxxx
  labels:
    app: clickhouse-keeper
spec:
  configuration:
    clusters:
      - name: "chk-3"
        layout:
          replicasCount: 3
    settings:
      logger/level: "trace"
      logger/console: "true"
      listen_host: "0.0.0.0"
      keeper_server/storage_path: /var/lib/clickhouse-keeper
      keeper_server/tcp_port: "2181"
      keeper_server/four_letter_word_white_list: "*"
      keeper_server/coordination_settings/raft_logs_level: "information"
      keeper_server/raft_configuration/server/port: "9444"
      prometheus/endpoint: "/metrics"
      prometheus/port: "7000"
      prometheus/metrics: "true"
      prometheus/events: "true"
      prometheus/asynchronous_metrics: "true"
      prometheus/status_info: "false"

  defaults:
    templates:
      # Templates are specified as default for all clusters
      podTemplate: default
    
  templates:
      podTemplates:
        - name: default
          spec:
            # affinity removed to allow use in single node test environment
            affinity:
              podAntiAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  - labelSelector:
                      matchExpressions:
                        - key: "app"
                          operator: In
                          values:
                            - clickhouse-keeper
                    topologyKey: "kubernetes.io/hostname"
            containers:
              - name: clickhouse-keeper
                imagePullPolicy: IfNotPresent
                image: "clickhouse/clickhouse-keeper:24-alpine"
                resources:
                  requests:
                    memory: "256M"
                    cpu: "1"
                  limits:
                    memory: "4Gi"
                    cpu: "2"
      # volumeClaimTemplates:
      #   - name: both-paths
      #     spec:
      #       storageClassName: gp3-retain
      #       accessModes:
      #         - ReadWriteOnce
      #       resources:
      #         requests:
      #           storage: 10Gi

It seems that by default /var/lib/clickhouse-keeper/coordination/{logs,snapshots} are owned by root, but we need to ensure that everyone has write access.
Below are the permissions when not using a PVC:

chk-edp-global-finance-1:/# ls -ltrh /var/lib/clickhouse-keeper/
total 8K     
drwxr-xr-x    4 root     root          35 Oct  9 11:32 coordination
-rw-r-----    1 clickhou clickhou      36 Oct  9 11:32 uuid
drwxr-x---    2 clickhou clickhou       6 Oct  9 11:32 rocksdb
-rw-r-----    1 clickhou clickhou      23 Oct  9 11:32 state
drwxr-x---    2 clickhou clickhou      31 Oct  9 11:32 preprocessed_configs
chk-edp-global-finance-1:/# ls -ltrh /var/lib/clickhouse-keeper/coordination/
total 0      
drwxrwxrwx    2 root     root          38 Oct  9 11:32 snapshots
drwxrwxrwx    2 root     root          41 Oct  9 11:32 logs

However, I believe it would be better to have these directories owned by root:clickhouse with rwxrwx--- (770) permissions.

@alex-zaitsev alex-zaitsev added the Keeper ClickHouse Keeper issues label Oct 10, 2024
@alex-zaitsev
Member

Would it help if you add securityContext as described here? #1370

Note that CHK is not compatible between 0.23.7 and 0.24.0; see the migration guide: https://github.com/Altinity/clickhouse-operator/blob/0.24.0/docs/keeper_migration_from_23_to_24.md

@chengjoey
Contributor

> Would it help if you add securityContext as described here? #1370

+1, this should be helpful

spec:
  securityContext:
    fsGroup: 101
    fsGroupChangePolicy: OnRootMismatch
    runAsGroup: 101
    runAsUser: 101
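
For context, here is a minimal sketch of where that block could sit in the manifest from the original report. It assumes the pod-level securityContext goes under the pod template's spec, next to containers, and that UID/GID 101 matches the user the Keeper image runs as; neither is confirmed in this thread, so check both against the image (for example with `id` inside the container) and against #1370 before relying on it.

```yaml
# Sketch only: the securityContext placement and the 101 IDs are assumptions
# taken from the suggestion above, not a verified operator default.
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: xxxxxxx
spec:
  defaults:
    templates:
      podTemplate: default
  templates:
    podTemplates:
      - name: default
        spec:
          # Pod-level security context: fsGroup asks Kubernetes to make
          # mounted volumes group-owned by GID 101, so a process running
          # as 101:101 can write to them.
          securityContext:
            fsGroup: 101
            # Only re-own the volume when its root directory does not
            # already match, which avoids a recursive chown on every start.
            fsGroupChangePolicy: OnRootMismatch
            runAsGroup: 101
            runAsUser: 101
          containers:
            - name: clickhouse-keeper
              image: "clickhouse/clickhouse-keeper:24-alpine"
```

The remaining settings and the PVC definition would stay as in the original manifest.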

@alex-zaitsev
Member

@chengjoey, we are hesitant to add it to the code by default, but maybe it is a good thing to do.

@jaitaiwan

IMO it should really be added by default if those are the permissions the container requires to run. I can't think of any reason this would be disadvantageous.
