e2e for work suspension resume #5354
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5354      +/-   ##
==========================================
+ Coverage   29.01%   29.36%   +0.34%
==========================================
  Files         632      632
  Lines       43862    43862
==========================================
+ Hits        12728    12878     +150
+ Misses      30218    30050     -168
- Partials      916      934      +18
Flags with carried forward coverage won't be shown.
Just a reminder, the tests are failing:
|
@RainbowMango thanks for the logs. As mentioned in the PR description, there seems to be a bug in the code, because suspended dispatching still allows updates and deletion. I'm debugging this issue right now but wanted to push up the e2e changes in case anyone has ideas on why it's failing.
Hi @a7i Is there anything I can do for you? |
Thanks @XiShanYongYe-Chang Here is the test case:
=> Observe that Deployment is not deleted ✅
=> Observe that Deployment is deleted ❌ This is because when the Deployment is deleted (step 4) in the karmada control plane then:
And since you can't update a resource that has a deletion timestamp, I cannot unpause it.
Signed-off-by: Amir Alavi <[email protected]>
/retest |
I think I figured it out 🤞🏼 |
/retest |
Hi @a7i, I tested this case and the Work resource ended up being left behind. This is not supposed to happen, and it was probably caused by my earlier comment. The desired behavior is probably what you started with: the pause operation cannot affect the deletion of the resource.
Then in that case we don't need an e2e test for deletion resume. I can fix up the intended behavior in a separate PR and remove the invalid test-case from this PR. Thoughts? |
Thanks a lot @a7i I agree with you. I'm sorry my previous comment wasn't well thought out and caused this problem. |
All good. I most certainly appreciate your feedback and guidance on this feature! ❤️ |
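The behavior agreed on in this exchange (suspension holds back updates but must not block deletion) can be sketched as a small decision function. This is only an illustration under stated assumptions: `decide`, `action`, and the two boolean inputs are hypothetical names, not Karmada's actual controller code.

```go
package main

import "fmt"

// action is what a hypothetical reconciler would do with a Work object.
type action string

const (
	actionDelete action = "delete" // remove the resource from the member cluster
	actionSkip   action = "skip"   // leave the member cluster untouched
	actionSync   action = "sync"   // create or update in the member cluster
)

// decide is a hypothetical helper, NOT Karmada's implementation.
// suspended: dispatching is suspended for this Work.
// deleting:  the Work carries a deletion timestamp.
func decide(suspended, deleting bool) action {
	switch {
	case deleting:
		// Deletion is checked first so that suspension can never
		// leave a Work behind after its owner is deleted.
		return actionDelete
	case suspended:
		return actionSkip
	default:
		return actionSync
	}
}

func main() {
	fmt.Println(decide(true, false))  // suspended, not deleting
	fmt.Println(decide(true, true))   // suspended, deleting
	fmt.Println(decide(false, false)) // normal sync
}
```

Checking deletion before suspension is the point of the sketch: with the opposite order, a suspended Work with a deletion timestamp could neither be cleaned up nor resumed, which is the stuck state described above.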
Thanks
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: XiShanYongYe-Chang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @a7i, I found an occasional e2e failure: https://github.com/karmada-io/karmada/actions/runs/10553025531/job/29232824137?pr=5423
• [FAILED] [300.256 seconds]
[Suspend] clusterPropagation testing suspend the ClusterPropagationPolicy dispatching [It] suspends Work
/home/runner/work/karmada/karmada/test/e2e/clusterpropagationpolicy_test.go:1077
Timeline >>
STEP: Creating ClusterPropagationPolicy(clusterrole-tmxsv) @ 08/26/24 03:46:47.688
STEP: Creating ClusterRole(system:test-clusterrole-tmxsv) @ 08/26/24 03:46:47.701
STEP: Updating ClusterPropagationPolicy(clusterrole-tmxsv) spec @ 08/26/24 03:46:47.705
[FAILED] in [It] - /home/runner/work/karmada/karmada/test/e2e/clusterpropagationpolicy_test.go:1085 @ 08/26/24 03:51:47.742
STEP: Removing ClusterPropagationPolicy(clusterrole-tmxsv) @ 08/26/24 03:51:47.929
STEP: Remove ClusterRole(system:test-clusterrole-tmxsv) @ 08/26/24 03:51:47.936
<< Timeline
[FAILED] Timed out after 300.000s.
Expected
<bool>: false
to equal
<bool>: true
In [It] at: /home/runner/work/karmada/karmada/test/e2e/clusterpropagationpolicy_test.go:1085 @ 08/26/24 03:51:47.742
Full Stack Trace
github.com/karmada-io/karmada/test/e2e.init.func7.4.2()
/home/runner/work/karmada/karmada/test/e2e/clusterpropagationpolicy_test.go:1085 +0x288
ClusterResourceBinding{
"kind": "ClusterResourceBinding",
"apiVersion": "work.karmada.io/v1alpha2",
"metadata": {
"name": "system.test-clusterrole-tmxsv-clusterrole",
"uid": "8b0c2f62-7b21-43cc-95fa-6656c01fa6e1",
"resourceVersion": "21296",
"generation": 4,
"creationTimestamp": "2024-08-26T03:46:47Z",
"deletionTimestamp": "2024-08-26T03:51:47Z",
"deletionGracePeriodSeconds": 0,
"labels": {
"clusterresourcebinding.karmada.io/permanent-id": "4ede7d27-97b9-475b-abf0-3b87ee46a16c"
},
"annotations": {
"policy.karmada.io/applied-placement": "{\"clusterAffinity\":{\"clusterNames\":[\"member1\"]},\"clusterTolerations\":[{\"key\":\"cluster.karmada.io/not-ready\",\"operator\":\"Exists\",\"effect\":\"NoExecute\",\"tolerationSeconds\":30},{\"key\":\"cluster.karmada.io/unreachable\",\"operator\":\"Exists\",\"effect\":\"NoExecute\",\"tolerationSeconds\":30}]}"
},
"ownerReferences": [{
"apiVersion": "rbac.authorization.k8s.io/v1",
"kind": "ClusterRole",
"name": "system:test-clusterrole-tmxsv",
"uid": "6ab3bd00-f68a-4ff5-9ac2-45a21b81eea1",
"controller": true,
"blockOwnerDeletion": true
}],
"finalizers": ["karmada.io/cluster-resource-binding-controller"]
},
"spec": {
"resource": {
"apiVersion": "rbac.authorization.k8s.io/v1",
"kind": "ClusterRole",
"name": "system:test-clusterrole-tmxsv",
"uid": "6ab3bd00-f68a-4ff5-9ac2-45a21b81eea1",
"resourceVersion": "11592"
},
"clusters": [{
"name": "member1"
}],
"placement": {
"clusterAffinity": {
"clusterNames": ["member1"]
},
"clusterTolerations": [{
"key": "cluster.karmada.io/not-ready",
"operator": "Exists",
"effect": "NoExecute",
"tolerationSeconds": 30
}, {
"key": "cluster.karmada.io/unreachable",
"operator": "Exists",
"effect": "NoExecute",
"tolerationSeconds": 30
}]
},
"schedulerName": "default-scheduler",
"conflictResolution": "Abort"
},
"status": {
"schedulerObservedGeneration": 3,
"lastScheduledTime": "2024-08-26T03:50:16Z",
"conditions": [{
"type": "Scheduled",
"status": "True",
"lastTransitionTime": "2024-08-26T03:46:47Z",
"reason": "Success",
"message": "Binding has been scheduled successfully."
}, {
"type": "FullyApplied",
"status": "True",
"lastTransitionTime": "2024-08-26T03:46:47Z",
"reason": "FullyAppliedSuccess",
"message": "All works have been successfully applied"
}],
"aggregatedStatus": [{
"clusterName": "member1",
"applied": true,
"health": "Unknown"
}]
}
}
It seems the ClusterResourceBinding (CRB) is not in the expected state and the Work doesn't exist, but I can't find the root cause. Do you have any ideas?
I will take a look today. Thank you for the logs! |
@a7i another failure, same case, see: https://github.com/karmada-io/karmada/actions/runs/10579723458/job/29313224967?pr=4045 |
Unfortunately I'm having a hard time reproducing locally:
|
Giving this a try: #5440 |
I couldn't reproduce it locally either. Is it possible that different e2e cases affect each other?
Yes, that's my thinking as well. ClusterRole is cluster-scoped, so we cannot isolate it to a single namespace. This is supported by the fact that the same test for PropagationPolicy is passing.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adding more e2e tests based on the proposal
Which issue(s) this PR fixes:
Part of #5217
Special notes for your reviewer:
Note that 2 of the tests are NOT passing, which indicates that there's a bug in execution / work sync somewhere
Does this PR introduce a user-facing change?: