Deleting environment fails with: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on #1050
Comments
The only thing that comes to my mind is that maybe the deletion of resources is somewhat too fast? When the failure occurs, both the OCP cluster and the test host are located in the same datacenter, minimizing network delays. When I execute it from my local machine it always passes, but then the test execution happens from the other end of the globe. An additional note: the test failures are pretty much random, sometimes 2, 4 or 6 tests fail out of the whole 50+ test suite. /cc @rcernich
I played a bit with the parameters and I made it work by setting the

I'm not sure what to do with this ticket: should I close it, or do you want to investigate why it actually failed to remove the resources? To me it looks like a bug, and the above can be seen as a workaround.
Is the issue related to the order in which the resources are deleted? It almost seems like the user is getting nuked before everything else is removed. Regarding the random failures, I think there are some things in OpenShift/k8s that occur asynchronously. If we could add some blocking logic so we wait until things are complete, that might help eliminate these random types of failures. (For example, routes are updated asynchronously and we often have issues where tests fail because the route hasn't been updated in haproxy yet.)
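One way to add that kind of blocking logic is to poll until the route actually responds instead of assuming it is ready right after creation. A minimal sketch using Awaitility; the `isRouteServing` helper, the URL handling, and the 30-second timeout are illustrative assumptions, not part of Cube:

```java
import static org.awaitility.Awaitility.await;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.TimeUnit;

public class RouteReadiness {

    // Hypothetical helper: true once haproxy actually serves the route.
    static boolean isRouteServing(String routeUrl) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(routeUrl).openConnection();
            conn.setConnectTimeout(2_000);
            conn.setReadTimeout(2_000);
            return conn.getResponseCode() < 500;
        } catch (Exception e) {
            return false; // not resolvable or not routed yet
        }
    }

    public static void main(String[] args) {
        String routeUrl = args[0]; // the route exposed for the test

        // Block until the asynchronously updated route responds, instead of
        // failing the test because haproxy hasn't picked up the change yet.
        await().atMost(30, TimeUnit.SECONDS)
               .pollInterval(1, TimeUnit.SECONDS)
               .until(() -> isRouteServing(routeUrl));
    }
}
```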
I'm facing the same problem after using Arquillian Cube OpenShift with an OpenShift cluster in version 3.7 (I didn't have the problem in 3.6). After some debugging I think the problem is related to a race condition in the fabric8 kubernetes client. After deleting (non-cascaded) the DeploymentConfig, the DeploymentConfigReaper tries to delete all ReplicationControllers managed by the deleted DeploymentConfig. The newly created ReplicationControllerOperationsImpl then performs a deletion with an implicit scaleDownToZero. It does this by loading the current state of the ReplicationController at the beginning of the operation (T1), performing the operation on the locally stored object, and then evaluating the patch. To evaluate the patch it loads the current state of the ReplicationController again (T2). This becomes problematic when Kubernetes itself also modifies the related ReplicationController after the deletion of the DeploymentConfig, by removing the 'metadata/ownerReferences' entry pointing to the deleted DeploymentConfig. If this internal modification falls between T1 and T2, the fabric8 client tries to re-add the ownerReference while patching the scale-down, as it detects the diff.
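A sketch of that T1/T2 window, written against the fabric8 client API; this illustrates the sequence described above and is not the actual `ReplicationControllerOperationsImpl` code:

```java
import io.fabric8.kubernetes.api.model.ReplicationController;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ScaleDownRace {

    static void scaleDownToZero(KubernetesClient client, String ns, String name) {
        // T1: snapshot of the RC, still carrying the ownerReference to the
        // just-deleted DeploymentConfig.
        ReplicationController t1 = client.replicationControllers()
                .inNamespace(ns).withName(name).get();
        t1.getSpec().setReplicas(0); // the local scale-down

        // ... here the Kubernetes garbage collector strips
        // metadata/ownerReferences from the live object ...

        // T2: fresh state, ownerReferences already removed by the cluster.
        ReplicationController t2 = client.replicationControllers()
                .inNamespace(ns).withName(name).get();

        // A patch computed as diff(t2, t1) now contains an
        // "add /metadata/ownerReferences" operation, which the admission
        // controller rejects with the error from the issue title.
    }
}
```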
In conjunction with the fact that Kubernetes since 1.7 validates ownerReferences in the admission controller, this results in the error. So maybe it's better to exclude the ownerReferences from the patch set, as is also recommended in the Kubernetes documentation.
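Excluding them could look roughly like the following. A minimal sketch assuming the diff is computed with zjsonpatch (which the fabric8 client uses internally for patches); the filtering helper is an illustration, not the client's actual code:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.flipkart.zjsonpatch.JsonDiff;

public class OwnerReferenceSafePatch {

    /**
     * Diffs the live state (T2) against the locally modified state (from T1)
     * and drops every operation touching metadata/ownerReferences, so a
     * concurrently removed owner reference is never re-added by the patch.
     */
    static JsonNode diffWithoutOwnerReferences(JsonNode live, JsonNode desired) {
        ArrayNode patch = (ArrayNode) JsonDiff.asJson(live, desired);
        ArrayNode filtered = JsonNodeFactory.instance.arrayNode();
        for (JsonNode op : patch) {
            String path = op.get("path").asText();
            if (!path.startsWith("/metadata/ownerReferences")) {
                filtered.add(op);
            }
        }
        return filtered;
    }
}
```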
So this is a fabric8 kubernetes client issue that should be handled there? Maybe we could open an issue there so they are aware.
I've opened an issue upstream, as I'm also experiencing the same behaviour.
Looks like there is a kubernetes-client v4.0.4 with a fix.
Issue Overview
I see some (random) errors like this:

> cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on
This execution is part of a CI run in Jenkins. This is not happening on my laptop. I have the exact same `oc` version available here and there. The OCP (3.9) environment I use is the same, the user used for the OCP env is the same, and the commands executed are the same. The difference is in the host OS (Fedora locally, RHEL on CI) and in the location of these two machines. My assumption is that the tests pass and the error happens only when Cube tries to clean up the environment.
Any clue?