Etcd Pods Crashloop in GreptimeDB Deployment leading to data loss #5218
Comments
This is unexpected. etcd is designed to withstand machine failures. An etcd cluster automatically recovers from temporary failures (e.g., machine reboots) and tolerates up to (N-1)/2 permanent failures (rounded down) for a cluster of N members. See the etcd DR document: https://etcd.io/docs/v3.3/op-guide/recovery/
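To make the (N-1)/2 bound concrete, here is a minimal Python sketch of the quorum arithmetic. The function names `quorum` and `fault_tolerance` are made up for illustration and are not part of etcd or GreptimeDB; the sketch only restates the majority rule and says nothing about what went wrong in this particular deployment.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Permanent member failures the cluster survives: floor((N-1)/2)."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return (members - 1) // 2


if __name__ == "__main__":
    # Print the quorum size and tolerated failures for common cluster sizes.
    for n in (1, 3, 4, 5, 7):
        print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

A 3-member cluster therefore keeps quorum with one member down but not two, and a 4-member cluster tolerates no more failures than a 3-member one, which is why odd sizes are usually chosen.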
Can you give me more context on how to simulate the network partition? I can use your scenarios to try to reproduce the issue.
@cbisht31 In my experience, it might be:
@killme2008 @zyy17 seems like this issue is fixed in the latest
What type of bug is this?
Crash
What subsystems are affected?
Distributed Cluster, Frontend, Datanode
Minimal reproduce step
What did you expect to see?
What did you see instead?
What operating system did you use?
Windows 10 x64
What version of GreptimeDB did you use?
0.9.5
Relevant log output and stack trace