[k8s.io] Rescheduler [Serial] should ensure that critical pod is scheduled in case there is no resources available {Kubernetes e2e suite} #32531
[FLAKE-PING] @mtaufen This flaky-test issue would love to have more attention.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-staging/31/
Failed: [k8s.io] Rescheduler [Serial] should ensure that critical pod is scheduled in case there is no resources available {Kubernetes e2e suite}
Failed: [k8s.io] Rescheduler [Serial] should ensure that critical pod is scheduled in case there is no resources available {Kubernetes e2e suite}
cc/ @piosz
Automatic merge from submit-queue (batch tested with PRs 42762, 42739, 42425, 42778)

Fixed potential OutOfSync of nodeInfo.

The cloned NodeInfo still shares the same resource objects with the cache, which can leave `requestedResource` and Pods out of sync: for example, if a pod is deleted, `requestedResource` is updated but the Pods in the cloned info are not. Found this while investigating #32531, but it does not seem to be the root cause, as nodeInfo is read-only in predicates & priorities.

Sample code for `&(*)`:

```
package main

import (
	"fmt"
)

type Resource struct {
	A int
}

type Node struct {
	Res *Resource
}

func main() {
	r1 := &Resource{A: 10}
	n1 := &Node{Res: r1}

	// &(*p) yields the same pointer as p; it does not copy the value.
	r2 := &(*n1.Res)
	r2.A = 11

	fmt.Printf("%t, %d %d\n", r1 == r2, r1, r2)
}
```

Output:

```
true, &{11} &{11}
```
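For contrast with the `&(*)` sample above, here is a minimal sketch of a clone that copies the pointed-to value first, so the clone and the original no longer share a `Resource`. It reuses the `Resource` and `Node` types from the sample; `cloneNode` is a hypothetical helper written for illustration and is not the code from the merged PR.

```go
package main

import "fmt"

type Resource struct {
	A int
}

type Node struct {
	Res *Resource
}

// cloneNode is a hypothetical helper (not from the Kubernetes source).
// It copies the Resource value into a new variable and points the clone
// at that copy, so later writes through the clone do not reach the original.
func cloneNode(n *Node) *Node {
	if n == nil {
		return nil
	}
	c := &Node{}
	if n.Res != nil {
		r := *n.Res // value copy of the struct
		c.Res = &r  // address of the copy, not of the original
	}
	return c
}

func main() {
	n1 := &Node{Res: &Resource{A: 10}}
	c := cloneNode(n1)
	c.Res.A = 11

	// prints "false, 10 11"
	fmt.Printf("%t, %d %d\n", n1.Res == c.Res, n1.Res.A, c.Res.A)
}
```

Assigning `*n.Res` to a local variable is what forces the copy; taking the address of the dereference directly, as in `&(*n.Res)`, gives back the original pointer.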
@davidopp please correct me if I'm wrong, but tolerations/taints have already been migrated to fields in HEAD. This means that to fix this issue we need to migrate the rescheduler to use fields. I'll do it very soon.
@piosz Do you think this issue indicates a regression that should block 1.6, which would require that a fix be available asap? Or is it a test-only issue that can be moved to the 1.6.1 or 1.7 milestone?
@marun the former one. See kubernetes-retired/contrib#2382
+1
#42686 was closed in favor of this issue, since we expect the bot to re-open this
Status: @piosz is working on a fix
Automatic merge from submit-queue (batch tested with PRs 43106, 43110)

Bumped rescheduler version to 0.3.0

fix #32531

kubernetes-retired/contrib#2474 needs to be merged first

cc @ethernetdan @marun @k82cn @aveshagarwal
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubernetes-e2e-gke-serial/2248/
Failed: [k8s.io] Rescheduler [Serial] should ensure that critical pod is scheduled in case there is no resources available {Kubernetes e2e suite}
Previous issues for this test: #31277 #31347 #31710 #32260