Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wip: add new wait components "node_ready" and "no_pressure" #7611

Closed
wants to merge 52 commits into from

Conversation

medyagh
Copy link
Member

@medyagh medyagh commented Apr 11, 2020

closes #7610 and #6820

Before this PR:

when we were waiting for different components we didn't communicate to the user what we were waiting on.

😄  minikube v1.9.2 on Darwin 10.13.6
✨  Using the docker driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🔥  Creating Kubernetes in docker container with (CPUs=2) (2 available), Memory=4000MB (7964MB available) ...
🐳  Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🌟  Enabling addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube"

After this PR :

  • new default wait for node_health which checks for disk pressure, memory pressure, pid pressure and network and if Node is Ready or not
  • add add node health for soft start
  • add new UI for Wait components (indented letting user know what we are waiting on)

(note the errors belllow are dummy errors, I filpped the if statements to make them produce errors)

default start (doesnt check for node pressure on first start but it does on a soft start)

😄  minikube v1.9.2 on Darwin 10.13.6
✨  Using the docker driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating docker container (CPUs=2, Memory=4000MB) ...
🐳  Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🕵️   Verifying Kubernetes Components:
    🔎 verifying api server ...
    🔎 verifying system pods ...
🌟  Enabling addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube"

--wait=true with no pressure

Imperfection: (need suggestion how to fix)
enabling addons talks over verifying in parallel.

medmac@~/workspace/minikube (disk_pressure) $ make && ./out/minikube start --driver=docker --wait=true
make: `out/minikube' is up to date.
😄  minikube v1.9.2 on Darwin 10.13.6
✨  Using the docker driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating docker container (CPUs=2, Memory=4000MB) ...
🐳  Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🕵️   Verifying Kubernetes Components:
    🔎 verifying api server ...
    🔎 verifying system pods ...
    🔎 verifying default service account ...
🌟  Enabling addons: default-storageclass, storage-provisioner
    🔎 verifying apps running ...
    🔎 verifying node pressure ...
    🔎 verifying node status
🏄  Done! kubectl is now configured to use "minikube"


--wait=true with pressure errors

medmac@~/workspace/minikube (disk_pressure) $ make && ./out/minikube start --driver=docker --wait=true
make: `out/minikube' is up to date.
😄  minikube v1.9.2 on Darwin 10.13.6
✨  Using the docker driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating docker container (CPUs=2, Memory=4000MB) ...
🐳  Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🕵️   Verifying Kubernetes Components:
    🔎 verifying api server ...
    🔎 verifying system pods ...
🌟  Enabling addons: default-storageclass, storage-provisioner
    🔎 verifying default service account ...
    🔎 verifying apps running ...
    🔎 verifying node pressure ...

❗  The node minikube has ran out of disk space.
💡  Please free up disk or prune images.
🛑  Consider increasing Docker Desktop's disk size.
📘  Documentation: https://docs.docker.com/docker-for-mac/space/


💣  failed to start node: startup failed: Wait failed: verifying no_pressure: node has unwanted condition "MemoryPressure" : Reason "KubeletHasSufficientMemory" Message: "kubelet has sufficient memory available"

😿  minikube is exiting due to an error. If the above message is not useful, open an issue:
👉  https://github.com/kubernetes/minikube/issues/new/choose

Soft start (bare no flags) (warns and ends in 4s)

😄  minikube v1.9.2 on Darwin 10.13.6
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🏃  Updating the running docker "minikube" container ...
🐳  Preparing Kubernetes v1.18.0 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16

❗  The node minikube has ran out of disk space.
💡  Please free up disk or prune images.
🛑  Consider increasing Docker Desktop's disk size.
📘  Documentation: https://docs.docker.com/docker-for-mac/space/

🌟  Enabling addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube"
real    0m4.618s
user    0m1.510s

note 1:

I tried to cheat and defer the enable addons to be printed later, that might be okay for many cases but sometimes it might print while it saying verifying ...
but I actually fix it for real in this other PR #7705

aditional notes

on my machine the 'no_pressure' flag does NOT add any significat time to minikube start.
I suggest after this PR has some miles on it, we could make it a default wait

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 11, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 11, 2020
@medyagh medyagh changed the title wip: add disk_health wait component wip: add node_health wait component Apr 11, 2020
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 11, 2020
@TravisBuddy
Copy link

Travis tests have failed

Hey @medyagh,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: 79028670-7be4-11ea-a6ea-cf6f0436e876

@medyagh medyagh changed the title wip: add node_health wait component add "node" wait component for detecting disk , memory, pid, network pressure Apr 11, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 11, 2020
@medyagh medyagh requested a review from tstromberg April 11, 2020 11:50
@TravisBuddy
Copy link

Travis tests have failed

Hey @medyagh,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: e4bb1110-7bea-11ea-a6ea-cf6f0436e876

@TravisBuddy
Copy link

Travis tests have failed

Hey @medyagh,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: 2d0d2a70-7beb-11ea-a6ea-cf6f0436e876

@TravisBuddy
Copy link

Travis tests have failed

Hey @medyagh,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: a51b4420-7beb-11ea-a6ea-cf6f0436e876

@TravisBuddy
Copy link

Travis tests have failed

Hey @medyagh,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: f163ec10-7beb-11ea-a6ea-cf6f0436e876

@TravisBuddy
Copy link

Travis tests have failed

Hey @medyagh,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: a7170560-7bf1-11ea-a6ea-cf6f0436e876

@medyagh
Copy link
Member Author

medyagh commented Apr 11, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Apr 11, 2020
@minikube-pr-bot
Copy link

kvm2 Driver
docker Driver

@medyagh medyagh changed the title add "node" wait component for detecting disk , memory, pid, network pressure "verify node pressure" and add "wait for node ready" to --wait Apr 11, 2020
@medyagh medyagh changed the title "verify node pressure" and add "wait for node ready" to --wait add "verify node pressure" and add "wait for node ready" Apr 11, 2020
@medyagh
Copy link
Member Author

medyagh commented Apr 11, 2020

/re-test-this-please

Copy link
Contributor

@tstromberg tstromberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work! It doesn't look too painful after all.

Added some comments because I would really love to see us allow for some sort of automatic cleanup if minikube start encounters disk pressure.

pkg/minikube/bootstrapper/bsutil/kverify/kverify.go Outdated Show resolved Hide resolved
pkg/minikube/bootstrapper/bsutil/kverify/kverify.go Outdated Show resolved Hide resolved
pkg/minikube/bootstrapper/kubeadm/kubeadm.go Outdated Show resolved Hide resolved
site/content/en/docs/commands/start.md Outdated Show resolved Hide resolved
@tstromberg
Copy link
Contributor

Two comments on the output:

  • We should make sure the addons output does not appear in the middle of the verifying output. Maybe just include Enabled addons: .... afterwards.

This line is overly verbose:

* node minikube has unwanted condition DiskPressure : Reason KubeletHasNoDiskPressure Message: kubelet has no disk pressure

Instead, perhaps just:

  • minikube reported: kubelet has no disk pressure

That is assuming the reasons make sense.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 16, 2020
@k8s-ci-robot
Copy link
Contributor

@medyagh: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 16, 2020
@minikube-pr-bot
Copy link

kvm2 Driver
Times for minikube: [64.29482813300001 65.98805650300001 63.00980550100001]
Average time for minikube: 64.43089671233334

Times for Minikube (PR 7611): [63.513170417999994 65.143545664 56.62860894599999]
Average time for Minikube (PR 7611): 61.76177500933333

Averages Time Per Log

+--------------------------------+-----------+--------------------+
|              LOG               | MINIKUBE  | MINIKUBE (PR 7611) |
+--------------------------------+-----------+--------------------+
| * minikube v1.9.2 on Debian    |  0.058408 |           0.057294 |
|                           9.11 |           |                    |
| * Using the kvm2 driver based  |  0.020181 |           0.020573 |
| on existing profile            |           |                    |
| * Starting control plane node  |  0.002920 |           0.002599 |
| minikube in cluster minikube   |           |                    |
| * Creating kvm2 VM (CPUs=2,    | 41.229088 |          38.709599 |
| Memory=3700MB, Disk=20000MB)   |           |                    |
| ...                            |           |                    |
| * Preparing Kubernetes v1.18.0 | 20.762859 |          21.090829 |
| on Docker 19.03.8 ...          |           |                    |
| * Enabling addons:             |  2.272912 |           0.522651 |
| default-storageclass,          |           |                    |
| storage-provisioner            |           |                    |
| * Done! kubectl is now         |  0.081545 |           0.078756 |
| configured to use "minikube"   |           |                    |
|                                |  0.002983 |           1.186066 |
+--------------------------------+-----------+--------------------+

docker Driver
Times for minikube: [24.968650715000003 25.5098627 24.742061878]
Average time for minikube: 25.073525097666664

Times for Minikube (PR 7611): [25.774788335000004 33.53963895300001 25.423349158999997]
Average time for Minikube (PR 7611): 28.245925482333334

Averages Time Per Log

+----------------------------------------+-----------+--------------------+
|                  LOG                   | MINIKUBE  | MINIKUBE (PR 7611) |
+----------------------------------------+-----------+--------------------+
| * minikube v1.9.2 on Debian            |  0.071507 |           0.070723 |
|                                   9.11 |           |                    |
| * Using the docker driver              |  0.002421 |           0.002117 |
| based on existing profile              |           |                    |
| * Starting control plane node          |  0.054036 |           0.054425 |
| minikube in cluster minikube           |           |                    |
| * Creating docker container            |  7.173350 |           7.519346 |
| (CPUs=2, Memory=3700MB) ...            |           |                    |
| * Preparing Kubernetes v1.18.0         |  0.000198 |           0.000206 |
| on Docker 19.03.2 ...                  |           |                    |
|   -                                    | 16.472782 |          19.061878 |
| kubeadm.pod-network-cidr=10.244.0.0/16 |           |                    |
| * Enabling addons:                     |  1.236761 |           0.054342 |
| default-storageclass,                  |           |                    |
| storage-provisioner                    |           |                    |
| * Done! kubectl is now                 |  0.058925 |           0.059857 |
| configured to use "minikube"           |           |                    |
|                                        |  0.003545 |           0.001400 |
+----------------------------------------+-----------+--------------------+

@medyagh medyagh changed the title add new wait components "node_ready" and "no_pressure" wip: add new wait components "node_ready" and "no_pressure" Apr 17, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 17, 2020
@medyagh medyagh closed this Apr 17, 2020
@medyagh medyagh deleted the disk_pressure branch May 2, 2020 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warn user on start if node is under disk pressure
7 participants