The operator does not watch clickhouse Keeper StatefulSets #1597

Open
mandreasik opened this issue Dec 12, 2024 · 2 comments

@mandreasik

Hi,

I'm using clickhouse-operator version 0.24.0 and I've encountered the following issue:

The operator does not watch clickhouse Keeper StatefulSets.

I accidentally deleted one of the clickhouse Keeper StatefulSets, and the operator did not take any actions. This left me with a broken cluster as the pod vanished along with the StatefulSet.

I tried to force the operator to take action by modifying something in the ClickHouseKeeperInstallation settings section, but after that all pods were killed, the operator started only the first clickhouse-keeper pod, and then it hung forever.

On the other hand, clickhouse instances have this feature. If you delete a StatefulSet, it immediately gets restored in the cluster, as one would expect from the operator.

How to reproduce:

  1. Use the example cluster configuration: https://github.com/Altinity/clickhouse-operator/blob/master/docs/chk-examples/02-extended-3-nodes.yaml
  2. Delete one of the StatefulSets.
  3. Check the CHK resource status:
NAME       CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE
extended   1          3       Completed                     4m57s
  4. Check the pods:
kubectl get pods -l clickhouse-keeper.altinity.com/chk=extended
NAME                          READY   STATUS    RESTARTS   AGE
chk-extended-cluster1-0-0-0   1/1     Running   0          5m20s
chk-extended-cluster1-0-1-0   1/1     Running   0          5m20s

Congratulations, you have a degraded cluster ;/
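
The mismatch above (the CHK reports 3 hosts while only 2 pods are running) can be checked mechanically. This is a minimal sketch, not part of the operator: it parses the `kubectl get pods` text output and lists the expected pods that are absent. The pod naming pattern `chk-<name>-<cluster>-0-<replica>-0` is inferred from the two pod names shown above and is an assumption, as are the helper names `running_pods` and `missing_hosts`.

```python
def running_pods(kubectl_output):
    """Extract pod names from `kubectl get pods` text output, skipping the header row."""
    names = []
    for line in kubectl_output.strip().splitlines()[1:]:
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def missing_hosts(chk_name, cluster, replicas, pods):
    """Return expected pod names that are not in `pods`.

    Assumption: pods are named chk-<name>-<cluster>-0-<replica>-0, as inferred
    from the output in the issue (single shard, one pod per StatefulSet).
    """
    expected = [f"chk-{chk_name}-{cluster}-0-{r}-0" for r in range(replicas)]
    return [name for name in expected if name not in pods]


# Pod listing copied from the issue.
SAMPLE = """\
NAME                          READY   STATUS    RESTARTS   AGE
chk-extended-cluster1-0-0-0   1/1     Running   0          5m20s
chk-extended-cluster1-0-1-0   1/1     Running   0          5m20s
"""

if __name__ == "__main__":
    print(missing_hosts("extended", "cluster1", 3, running_pods(SAMPLE)))
```

Under the assumed naming pattern, the sample output yields one missing host, `chk-extended-cluster1-0-2-0`, even though the CHK status still reads `Completed`.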

@alex-zaitsev
Member

alex-zaitsev commented Dec 27, 2024

@mandreasik, StatefulSet deletion is not detected for CHI either. You need to trigger a reconcile to get the missing StatefulSets recreated.
The best way to do that is to add .spec.TaskID with some unique string; that will trigger a reconcile and create any missing objects.

P.S. Use 0.24.2; it includes a number of fixes for CHK reliability.
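
The workaround described in this comment could be scripted roughly as follows. This is a sketch under stated assumptions, not an official procedure: it builds a JSON merge patch that sets a unique task ID and assembles a `kubectl patch` command line without running it. The field is written as `taskID` (lowercase `t`) on the assumption that this is how the CRD spells it, `chk` is assumed to be a valid resource short name, and the helper names and the `manual-reconcile-` prefix are hypothetical.

```python
import json
import shlex
import time


def build_reconcile_patch(task_id=None):
    """Return a JSON merge patch that sets spec.taskID to a unique string.

    Assumption: the CRD field is spelled `taskID`; adjust if your version differs.
    """
    if task_id is None:
        # Any string that differs from the previous value should do; a timestamp is one option.
        task_id = f"manual-reconcile-{int(time.time())}"
    return json.dumps({"spec": {"taskID": task_id}})


def build_kubectl_command(name, patch, resource="chk", namespace=None):
    """Assemble the argv for `kubectl patch`; nothing is executed here."""
    cmd = ["kubectl"]
    if namespace:
        cmd += ["-n", namespace]
    cmd += ["patch", resource, name, "--type", "merge", "-p", patch]
    return cmd


if __name__ == "__main__":
    patch = build_reconcile_patch()
    # shlex.join quotes the JSON so the printed line can be pasted into a shell.
    print(shlex.join(build_kubectl_command("extended", patch)))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; review the output before applying it to a real cluster.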

@mandreasik
Author

@alex-zaitsev thank you for the workaround. I have double-checked, and you're right that CHI also does not detect the deletion of a StatefulSet.

To be honest, I would really like the operator to detect such cases itself instead of requiring a manual reconcile. I hope it will be implemented some day.
