Refactor Hive Directory #3765

razo7 · 2024-08-08T14:05:00Z

Which issue this PR addresses:

Group the Hive files from the hack directory into the hack/hive directory.
Add a main function and use the functions from hack/utils.sh.
Hive installation timed out when using the hack/hive-dev-install.sh script, so we remove the redundant timeout wait.

.
.
.
clusterrolebinding.rbac.authorization.k8s.io/hive-operator-rolebinding created
deployment.apps/hive-operator created
deployment.apps/hive-operator condition met
Error from server (NotFound): deployments.apps "hive-controllers" not found

What this PR does / why we need it:

Simplify the Hive scripts and their creation, and use common functions.
Align with the production Hive installation by not waiting for the hive-controllers deployment to be Available and the pod to be Ready. It is the operator's responsibility to manage its resources, and the extra wait is overkill that even causes the Hive installation to fail. Rerunning the script succeeded.
Fixing the Hive installation timeout issue would help the automation in Containerized Full RP Dev Automation #3764.
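
The NotFound error in the log above looks like what `kubectl wait` reports when the object it is asked about has not been created yet, which is why rerunning the script a little later succeeds. If a bounded wait is ever wanted again, one option is to poll for the deployment's existence first and only then wait for the Available condition. This is a minimal sketch, not the code from this PR: the function names, retry counts, and the 5m timeout are hypothetical, and `KUBECTL` is assumed to point at the kubectl binary as in the script.

```shell
#!/bin/bash
# Sketch only: function names, retry counts and timeouts are illustrative assumptions.
KUBECTL="${KUBECTL:-kubectl}"

# Poll until the named deployment exists, giving up after max_tries attempts.
wait_for_deployment_to_exist() {
    local namespace="$1" name="$2" max_tries="${3:-30}" delay="${4:-10}"
    local try
    for ((try = 1; try <= max_tries; try++)); do
        if "$KUBECTL" get deployment "$name" --namespace "$namespace" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
    done
    echo "deployment $name not found in $namespace after $max_tries attempts" >&2
    return 1
}

# "kubectl wait" is only called once the object is known to exist.
wait_for_deployment_available() {
    local namespace="$1" name="$2"
    wait_for_deployment_to_exist "$namespace" "$name" "${3:-30}" "${4:-10}" || return 1
    "$KUBECTL" wait --for=condition=Available --timeout=5m \
        --namespace "$namespace" "deployment/$name"
}
```

A caller would run something like `wait_for_deployment_available "$HIVE_OPERATOR_NS" hive-controllers` and treat a non-zero return as a failed installation.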

Test plan for issue:

Not needed

Is there any documentation that needs to be updated for this PR?

Yes, because the file locations changed.

How do you know this will function as expected in production?

razo7 · 2024-08-08T14:06:05Z

/azp run ci, e2e

azure-pipelines · 2024-08-08T14:06:19Z

Azure Pipelines successfully started running 2 pipeline(s).

hawkowl

I don't think we want to remove it entirely, but increase it to some amount that Hive will surely be operational by. If we do want to just trust in the operator, then we should instead put in some output of the script that indicates how to tell when Hive is operational and you can use it (e.g. "check readiness via kubectl x y z to monitor rollout").
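
To illustrate the second option, the script could end by printing the commands a developer can use to watch the rollout. This is a rough sketch under assumptions: the function name is hypothetical, the default namespace `hive` and the 5m timeout are placeholders, and the two commands shown are common kubectl ways to follow a deployment, not necessarily the ones the PR ended up printing.

```shell
#!/bin/bash
# Sketch only: the default namespace and the function name are assumptions.
KUBECTL="${KUBECTL:-kubectl}"
HIVE_OPERATOR_NS="${HIVE_OPERATOR_NS:-hive}"

# Print hints instead of blocking; the operator creates hive-controllers on its own schedule.
print_hive_rollout_hints() {
    cat <<EOF
Hive operator installed. The operator creates hive-controllers asynchronously.
To monitor the rollout, run either of:
  $KUBECTL rollout status deployment/hive-controllers --namespace $HIVE_OPERATOR_NS
  $KUBECTL wait --for=condition=Available deployment/hive-controllers --namespace $HIVE_OPERATOR_NS --timeout=5m
EOF
}
```

Calling `print_hive_rollout_hints` as the last step of the installation keeps the script non-blocking while still telling the user how to confirm that Hive is usable.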

tsatam · 2024-08-12T13:15:20Z

These waits/timeouts were intentionally added in order to solve for issues we were facing when deploying Hive into ARO production clusters, where the deployment itself was "healthy" but the Hive operator was degraded due to various issues (namely missing OpenShift-specific CRDs in the AKS clusters we run Hive in).

Without these checks, the script would exit "successfully" and thus our deployment would be marked as successful, despite Hive being degraded.

That being said, I don't necessarily think we need to retain this specific solution to that problem, and there's probably a better long-term solution (e.g. release pipeline takes Hive health metrics into account).

For now I think we could make this change if truly desired, since this script isn't itself used for production Hive deployments. I just backported the changes we made to the production deployment script here.

razo7 · 2024-08-27T10:43:27Z

hack/hive/hive-dev-install.sh

+	echo "$PULL_SECRET" > /tmp/.tmp-secret
+	# Using dry-run allows updates to work seamlessly
+	$KUBECTL create secret generic hive-global-pull-secret --from-file=.dockerconfigjson=/tmp/.tmp-secret --type=kubernetes.io/dockerconfigjson --namespace $HIVE_OPERATOR_NS -o yaml --dry-run=client | $KUBECTL apply -f - 2>/dev/null
+	rm -f /tmp/.tmp-secret


Not used in production, so it might be possible to remove it from here as well.
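
If this step is kept, the fixed `/tmp/.tmp-secret` path could be replaced with a file from `mktemp`, which avoids a predictable file name and makes sure the file is removed even when the apply fails. The following is only a sketch of that idea: the function name and its arguments are hypothetical, and unlike the snippet above it does not silence stderr from `apply`.

```shell
#!/bin/bash
# Sketch only: apply_pull_secret and its arguments are illustrative assumptions.
KUBECTL="${KUBECTL:-kubectl}"

apply_pull_secret() {
    local namespace="$1" pull_secret="$2" secret_file status
    # mktemp creates a uniquely named file readable only by the current user.
    secret_file="$(mktemp)" || return 1
    printf '%s' "$pull_secret" > "$secret_file"
    # Rendering with --dry-run=client and piping into apply lets reruns update the secret.
    "$KUBECTL" create secret generic hive-global-pull-secret \
        --from-file=.dockerconfigjson="$secret_file" \
        --type=kubernetes.io/dockerconfigjson \
        --namespace "$namespace" \
        -o yaml --dry-run=client | "$KUBECTL" apply -f -
    # Without pipefail this is the exit status of the apply step.
    status=$?
    rm -f "$secret_file"
    return "$status"
}
```

In the script this would be called as `apply_pull_secret "$HIVE_OPERATOR_NS" "$PULL_SECRET"`.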

razo7 · 2024-08-27T12:06:40Z

/azp run ci, e2e

azure-pipelines · 2024-08-27T12:06:53Z

Azure Pipelines successfully started running 2 pipeline(s).

razo7 · 2024-08-28T04:59:58Z

/azp run e2e

azure-pipelines · 2024-08-28T05:00:10Z

Azure Pipelines successfully started running 1 pipeline(s).

hack/hive/hive-config/generate.go

Group the Hive files under hack directory to hack/hive

Group the Hive files under hack directory to hack/hive, and refactor Hive installation using main function and utils.sh

razo7 · 2024-09-10T08:42:41Z

/azp run ci, e2e

azure-pipelines · 2024-09-10T08:42:54Z

Azure Pipelines successfully started running 2 pipeline(s).

Trust in the operator installation and print two options to monitor Hive deployment rollout

Use double quotes to prevent word splitting, break long lines into multiple lines, use '-n' over '! -z', simplify the if check, use consistent function declaration syntax, and set the trap outside main, after cleanup is declared
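
The style points in this commit message can be shown in a few lines. This is a stand-alone sketch, not an excerpt from hack/hive/hive-dev-install.sh: `WORK_DIR`, `count_args`, and the `label` value are made up for the illustration.

```shell
#!/bin/bash
# Sketch only: all names below are illustrative, not taken from the Hive script.
WORK_DIR=""

cleanup() {
    # '-n' is the direct form of the older '! -z' test.
    if [ -n "$WORK_DIR" ]; then
        rm -rf "$WORK_DIR"
    fi
}

# Registered at top level, after cleanup is declared, so it covers every exit path.
trap cleanup EXIT

count_args() {
    echo "$#"
}

main() {
    WORK_DIR="$(mktemp -d)"
    local label="hive dev"
    count_args "$label"   # quoted: passed as one argument, prints 1
    # shellcheck disable=SC2086
    count_args $label     # unquoted, for contrast only: split into two, prints 2
}

main "$@"
```

Declaring every function with the same `name() { ... }` form and keeping the `trap` outside `main` means the cleanup handler is in place before any work starts.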

razo7 · 2024-09-11T10:21:03Z

/azp run ci, e2e

azure-pipelines · 2024-09-11T10:21:15Z

Azure Pipelines successfully started running 2 pipeline(s).

tiguelu · 2024-09-11T10:31:43Z

I don't think we want to remove it entirely, but increase it to some amount that Hive will surely be operational by. If we do want to just trust in the operator, then we should instead put in some output of the script that indicates how to tell when Hive is operational and you can use it (e.g. "check readiness via kubectl x y z to monitor rollout").

Thank you Amber. A log line has been added to print commands to check readiness.

tiguelu

All requested changes have been addressed. LGTM.

* Move Hive hack files under one directory
  Group the Hive files under hack directory to hack/hive
* Refactor Hive installation and hack files location
  Group the Hive files under hack directory to hack/hive, and refactor Hive installation using main function and utils.sh
* Print troubleshooting for Hive deployment rollout
  Trust in the operator installation and print two options to monitor Hive deployment rollout
* Small fixes for hive installation script
  Use double quote to prevent word splitting, break long line into multiple, use '-n' over '! -z', simpler if check, use consistent function declaration syntax, trap outside main and after cleanup is declared

razo7 changed the title ~~Remove timeout for Hive installation~~ Remove Timeout for Hive Installation Aug 11, 2024

hawkowl requested changes Aug 12, 2024

razo7 mentioned this pull request Aug 14, 2024

Containerized Full RP Dev Automation #3764

Open

razo7 force-pushed the time-out-hive-installation branch from 5d0538a to ffbf745 Compare August 27, 2024 10:05

razo7 changed the title ~~Remove Timeout for Hive Installation~~ Refactor Hive Directory Aug 27, 2024

razo7 force-pushed the time-out-hive-installation branch 2 times, most recently from df61f07 to 4045606 Compare August 27, 2024 10:36

razo7 force-pushed the time-out-hive-installation branch from 4045606 to a51641c Compare August 27, 2024 10:38

SudoBrendan requested changes Sep 5, 2024

razo7 added 2 commits September 8, 2024 09:54

Move Hive hack files under one directory

1c393b5

Group the Hive files under hack directory to hack/hive

Refactor Hive installation and hack files location

e5739dd

Group the Hive files under hack directory to hack/hive, and refactor Hive installation using main function and utils.sh

razo7 force-pushed the time-out-hive-installation branch from a51641c to e5739dd Compare September 8, 2024 06:54

razo7 requested a review from bitoku as a code owner September 8, 2024 06:54

razo7 added 2 commits September 11, 2024 13:15

Print troubleshooting for Hive deployment rollout

c460acf

Trust in the operator installation and print two options to monitor Hive deployment rollout

small fixes for hive installation script

398b186

Use double quotes to prevent word splitting, break long lines into multiple lines, use '-n' over '! -z', simplify the if check, use consistent function declaration syntax, and set the trap outside main, after cleanup is declared

tiguelu approved these changes Sep 11, 2024

tiguelu requested review from hawkowl and SudoBrendan September 11, 2024 12:24

tiguelu merged commit abf4167 into master Sep 11, 2024
24 checks passed

tiguelu deleted the time-out-hive-installation branch September 11, 2024 12:31
