Deleting environment fails with: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on #1050
Comments
The only thing that comes to my mind is that maybe the deletion of resources is somewhat too fast? When the failure occurs, both the OCP cluster and the test host are located in the same datacenter, minimizing network delays. When I execute it from my local machine it always passes, but then the test execution happens from the other end of the globe. An additional note: the test failures are pretty much random, sometimes 2, 4 or 6 tests fail out of the whole 50+ test suite. /cc @rcernich
I played a bit with the parameters and I made it work by setting the

I'm not sure what to do with this ticket: should I close it, or do you want to investigate why it actually failed to remove the resources? To me it looks like a bug, and the above can be seen as a workaround.
Is the issue related to the order in which the resources are deleted? It almost seems like the user is getting nuked before everything else is removed. Regarding the random failures, I think there are some things in OpenShift/k8s that occur asynchronously. If we could add some blocking logic so we wait until things are complete, that might help eliminate these random types of failures. (For example, routes are updated asynchronously and we often have issues where tests fail because the route hasn't been updated in haproxy yet.)
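One way to add that kind of blocking logic is to poll until the route actually responds instead of assuming it is ready right after creation. A minimal sketch using Awaitility; the `isRouteServing` helper, the URL handling, and the 30-second timeout are illustrative assumptions, not part of Cube:

```java
import static org.awaitility.Awaitility.await;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.TimeUnit;

public class RouteReadiness {

    // Hypothetical helper: true once haproxy actually serves the route.
    static boolean isRouteServing(String routeUrl) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(routeUrl).openConnection();
            conn.setConnectTimeout(2_000);
            conn.setReadTimeout(2_000);
            return conn.getResponseCode() < 500;
        } catch (Exception e) {
            return false; // not resolvable or not routed yet
        }
    }

    public static void main(String[] args) {
        String routeUrl = args[0]; // the route exposed for the test

        // Block until the asynchronously updated route responds, instead of
        // failing the test because haproxy hasn't picked up the change yet.
        await().atMost(30, TimeUnit.SECONDS)
               .pollInterval(1, TimeUnit.SECONDS)
               .until(() -> isRouteServing(routeUrl));
    }
}
```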
I'm facing the same problem after using Arquillian Cube OpenShift with an OpenShift cluster in version 3.7 (I didn't have the problem in 3.6). After some debugging I think the problem is related to a race condition in the fabric8 kubernetes client. After deleting (non-cascaded) the DeploymentConfig, the DeploymentConfigReaper tries to delete all ReplicationControllers managed by the deleted DeploymentConfig. The newly created ReplicationControllerOperationsImpl then performs a deletion with an implicit scaleDownToZero. It does this by loading the current state of the ReplicationController at the beginning of the operation (T1), performing the operation on the locally stored object, and then evaluating the patch. To evaluate the patch it loads the current state of the ReplicationController again (T2). This becomes problematic when Kubernetes itself also modifies the related ReplicationController after the deletion of the DeploymentConfig, by removing the 'metadata/ownerReferences' entry pointing to the deleted DeploymentConfig. If this internal modification falls between T1 and T2, the fabric8 client tries to re-add the ownerReference while patching the scale-down, as it detects the diff.
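A sketch of that T1/T2 window, written against the fabric8 client API; this illustrates the sequence described above and is not the actual `ReplicationControllerOperationsImpl` code:

```java
import io.fabric8.kubernetes.api.model.ReplicationController;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ScaleDownRace {

    static void scaleDownToZero(KubernetesClient client, String ns, String name) {
        // T1: snapshot of the RC, still carrying the ownerReference to the
        // just-deleted DeploymentConfig.
        ReplicationController t1 = client.replicationControllers()
                .inNamespace(ns).withName(name).get();
        t1.getSpec().setReplicas(0); // the local scale-down

        // ... here the Kubernetes garbage collector strips
        // metadata/ownerReferences from the live object ...

        // T2: fresh state, ownerReferences already removed by the cluster.
        ReplicationController t2 = client.replicationControllers()
                .inNamespace(ns).withName(name).get();

        // A patch computed as diff(t2, t1) now contains an
        // "add /metadata/ownerReferences" operation, which the admission
        // controller rejects with the error from the issue title.
    }
}
```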
In conjunction with the fact that Kubernetes since 1.7 validates ownerReferences in the admission controller, this results in the error. So maybe it's better to exclude the ownerReferences from the patch set, as is also recommended in the Kubernetes documentation.
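Excluding them could look roughly like the following. A minimal sketch assuming the diff is computed with zjsonpatch (which the fabric8 client uses internally for patches); the filtering helper is an illustration, not the client's actual code:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.flipkart.zjsonpatch.JsonDiff;

public class OwnerReferenceSafePatch {

    /**
     * Diffs the live state (T2) against the locally modified state (from T1)
     * and drops every operation touching metadata/ownerReferences, so a
     * concurrently removed owner reference is never re-added by the patch.
     */
    static JsonNode diffWithoutOwnerReferences(JsonNode live, JsonNode desired) {
        ArrayNode patch = (ArrayNode) JsonDiff.asJson(live, desired);
        ArrayNode filtered = JsonNodeFactory.instance.arrayNode();
        for (JsonNode op : patch) {
            String path = op.get("path").asText();
            if (!path.startsWith("/metadata/ownerReferences")) {
                filtered.add(op);
            }
        }
        return filtered;
    }
}
```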
So this is a fabric8 kubernetes client issue that should be handled there? Maybe we could open an issue there so they are aware.
I've opened an issue upstream, as I'm also experiencing the same behaviour.
Looks like there is a kubernetes-client v4.0.4 with a fix.
Issue Overview
I see some (random) errors like this:

> cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on
This execution is part of a CI run in Jenkins. This is not happening on my laptop. I have the exact same `oc` version available here and there. The OCP (3.9) environment I use is the same, the user used for the OCP env is the same, and the commands executed are the same. The difference is in the host OS (Fedora locally, RHEL on CI) and in the location of these two machines. My assumption is that the tests pass and the error happens only when Cube tries to clean up the environment.
Any clue?