Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of node reboot while there were running VMs #790

Merged
merged 2 commits into from
Nov 8, 2018

Conversation

jellonek
Copy link
Contributor

@jellonek jellonek commented Oct 26, 2018

It's done by removal from metadata store sandboxes with missing netns on host during Virtlet startup, so later in GC process any VM defined in libvirt will be also removed as orphaned - same with its other resources.

This process does not involve GC of network using CNI, as there is no netns for particular sandbox, so network GC has to be done according to particular CNI plugin rules in such situation.

Closes #783


This change is Reviewable

Copy link
Contributor

@lukaszo lukaszo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 approvals obtained (waiting on @jellonek)


pkg/libvirttools/gc.go, line 125 at r1 (raw file):

func netNsExists(path string) bool {
	_, err := os.Stat(path)
	return !os.IsNotExist(err)

Is this enough? File may exists but may not be a ns. It happened before we were using mount propagation

Copy link
Contributor Author

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 0 of 2 approvals obtained (waiting on @jellonek)


pkg/libvirttools/gc.go, line 125 at r1 (raw file):

Previously, lukaszo (Łukasz Oleś) wrote…

Is this enough? File may exists but may not be a ns. It happened before we were using mount propagation

IMO it should be enough if everything is properly configured. IMO we should not add additional check for misconfiguration there, the better place for that would be pre flight check in diag.

Copy link
Contributor

@lukaszo lukaszo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 change requests, 0 of 2 approvals obtained

Copy link
Contributor

@ivan4th ivan4th left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 approvals obtained (waiting on @jellonek)


pkg/libvirttools/gc.go, line 125 at r1 (raw file):

func netNsExists(path string) bool {
	_, err := os.Stat(path)
	return !os.IsNotExist(err)

First of all, if there' a non-nil error for which IsNotExist is false, there should be a warning logged by Virtlet. Second, netns dir is not smth too special, there's a set of directories where network namespaces are bind-mounted. It's easy to imagine smth going wrong with /var/run/netns cleanup on reboot so some dirs remain in place while losing their bind mounts, and "preflight" may not catch this (too special case, the checker will grow too complex if we verify every such detail). I'd strongly suggest that we check for a mountpoint here. It can be done using mountpoint -q DIR command, for example (pure-Go way is more involved https://github.com/kubernetes/kubernetes/blob/05f277c6bca09f6cab5aaa25417db20ef1296dea/pkg/util/mount/mount.go#L253:6 )

Copy link
Contributor Author

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 1 of 2 approvals obtained (waiting on @ivan4th)


pkg/libvirttools/gc.go, line 125 at r1 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

First of all, if there' a non-nil error for which IsNotExist is false, there should be a warning logged by Virtlet. Second, netns dir is not smth too special, there's a set of directories where network namespaces are bind-mounted. It's easy to imagine smth going wrong with /var/run/netns cleanup on reboot so some dirs remain in place while losing their bind mounts, and "preflight" may not catch this (too special case, the checker will grow too complex if we verify every such detail). I'd strongly suggest that we check for a mountpoint here. It can be done using mountpoint -q DIR command, for example (pure-Go way is more involved https://github.com/kubernetes/kubernetes/blob/05f277c6bca09f6cab5aaa25417db20ef1296dea/pkg/util/mount/mount.go#L253:6 )

Ok, I'm adding this additional logging.
IMO if there is something leaved in /var/run/netns it means system issue and imo virtlet is not a tool to detect system issues. Virtlet should check only for entries created by itself. Afaik the entry in this subdir must be a valid mount with nsfs (so ok, i'm adding this check) and i'm not sure if that would be working with bind mounts (i can imagine the case when whole dir is bind mounted but iproute2 (and other things which are using /var/run/netns based on ideas from iproute2) are expecting there nsfs on entries.

Copy link
Contributor Author

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 1 of 2 approvals obtained (waiting on @ivan4th)


pkg/libvirttools/gc.go, line 125 at r1 (raw file):

Previously, jellonek (Piotr Skamruk) wrote…

Ok, I'm adding this additional logging.
IMO if there is something leaved in /var/run/netns it means system issue and imo virtlet is not a tool to detect system issues. Virtlet should check only for entries created by itself. Afaik the entry in this subdir must be a valid mount with nsfs (so ok, i'm adding this check) and i'm not sure if that would be working with bind mounts (i can imagine the case when whole dir is bind mounted but iproute2 (and other things which are using /var/run/netns based on ideas from iproute2) are expecting there nsfs on entries.

Refactored. Logs and checking for mounted ns on entry added. Done.

Copy link
Contributor

@ivan4th ivan4th left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 7 files at r2.
Reviewable status: 1 change requests, 1 of 2 approvals obtained (waiting on @jellonek)


pkg/utils/mountinfo.go, line 35 at r2 (raw file):

// MountPointChecker provides methods to check if directory entry is a mount point,
// what is its source and its filesystem

MountPointChecker is used to check if a directory entry is a mount point and returns its source info and filesystem type


pkg/utils/mountinfo.go, line 37 at r2 (raw file):

// what is its source and its filesystem
type MountPointChecker interface {
	CheckMountPointInfo(string) (mountEntry, bool)

Does it make sense for an exported method to return an unexported struct? Also, exported methods need to be documented


pkg/utils/mountinfo.go, line 38 at r2 (raw file):

type MountPointChecker interface {
	CheckMountPointInfo(string) (mountEntry, bool)
	IsPathAnNs(string) bool

Exported methods need to be documented.


pkg/utils/mountinfo.go, line 47 at r2 (raw file):

var _ MountPointChecker = mountPointChecker{}

// NewMountPointChecker returns new instance of MountPointChecker

a new instance


pkg/utils/mountinfo.go, line 65 at r2 (raw file):

		if err != nil {
			return mountPointChecker{}, err
		}

switch perhaps would look a tad better? But I don't insist


pkg/utils/mountinfo.go, line 70 at r2 (raw file):

		line = strings.Trim(line, "\n")

		// split and interpret entries acording to 3.5 paragraph in

parse entries according to section 3.5 in


pkg/utils/mountinfo.go, line 82 at r2 (raw file):

// CheckMountPointInfo chekcs if entry is a mountpoint and if so returns
// mountInfo for it

checks (typo)

if the path is a mountpoint with nsfs filesystem type

also, this doc should probably be moved to the interface and here it should be "CheckMountPointInfo implements CheckMountPointInfo method of MountPointChecker interface"


pkg/utils/mountinfo.go, line 88 at r2 (raw file):

}

// IsPathAnNs verifies if provided path is mountpoint with nsfs filesystem type

if the path is a mountpoint with nsfs filesystem type

also, this doc should probably be moved to the interface and here it should be "IsPathNs implements IsPathAnNs method of MountPointChecker interface"


pkg/utils/mountinfo.go, line 92 at r2 (raw file):

	_, err := os.Stat(path)
	if err != nil {
		if os.IsNotExist(err) {

maybe !os.IsNotExist(err) ? b.c. if IsNotExist() returns true this means that (the lack of) existence is verified


pkg/utils/mountinfo.go, line 111 at r2 (raw file):

type fakeMountPointChecker struct {
}

struct {} (on a single line) would look a bit better IMO


pkg/utils/mountinfo.go, line 113 at r2 (raw file):

}

// FakeMountPointChecker is defined there for static type checking and for unittest

how does it help with static type checking outside the unittests? IIUC the code above it is buildable on all platforms (although it can really work only on Linux)


pkg/utils/mountinfo.go, line 121 at r2 (raw file):

}

// IsPathAnNs is fake implementation for MountPointChecker interface

is a fake implementation of IsPathAnd

Copy link
Contributor Author

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 1 of 2 approvals obtained (waiting on @ivan4th)


pkg/utils/mountinfo.go, line 35 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

MountPointChecker is used to check if a directory entry is a mount point and returns its source info and filesystem type

Done


pkg/utils/mountinfo.go, line 37 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

Does it make sense for an exported method to return an unexported struct? Also, exported methods need to be documented

Yes, it makes sense - exported fields from such structs are still accessible to other packages. Only instance of such type can not be created outside of this package.
Docstring copied from below.


pkg/utils/mountinfo.go, line 38 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

Exported methods need to be documented.

Docstring copied from below.


pkg/utils/mountinfo.go, line 47 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

a new instance

Done.


pkg/utils/mountinfo.go, line 65 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

switch perhaps would look a tad better? But I don't insist

Yeah, good idea.


pkg/utils/mountinfo.go, line 70 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

parse entries according to section 3.5 in

Done.


pkg/utils/mountinfo.go, line 82 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

checks (typo)

if the path is a mountpoint with nsfs filesystem type

also, this doc should probably be moved to the interface and here it should be "CheckMountPointInfo implements CheckMountPointInfo method of MountPointChecker interface"

This method does not do a validation of if filesystem type is nsfs, it only returns mountInfo for mountpoints or empty value (and false as second value) if it's not found in mountinfo.
Typo fixed and rephrased docstring moved to struct definition.


pkg/utils/mountinfo.go, line 88 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

if the path is a mountpoint with nsfs filesystem type

also, this doc should probably be moved to the interface and here it should be "IsPathNs implements IsPathAnNs method of MountPointChecker interface"

Done.


pkg/utils/mountinfo.go, line 92 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

maybe !os.IsNotExist(err) ? b.c. if IsNotExist() returns true this means that (the lack of) existence is verified

You are correct. In case when IsNotExist it should silently return false and should log error only for other cases.


pkg/utils/mountinfo.go, line 111 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

struct {} (on a single line) would look a bit better IMO

Done.


pkg/utils/mountinfo.go, line 113 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

how does it help with static type checking outside the unittests? IIUC the code above it is buildable on all platforms (although it can really work only on Linux)

You are right - it's not for static type checking, it's only for gotest (in manager and in libvirttools). Done.


pkg/utils/mountinfo.go, line 121 at r2 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

is a fake implementation of IsPathAnd

Done.

Copy link
Contributor

@ivan4th ivan4th left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 7 files at r2, 1 of 1 files at r3.
Reviewable status: 1 change requests, 1 of 2 approvals obtained

@ivan4th ivan4th merged commit d9cde84 into master Nov 8, 2018
@ivan4th ivan4th deleted the jell/fixreboots branch November 8, 2018 15:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants