Skip to content
This repository has been archived by the owner on Nov 9, 2020. It is now read-only.

Docker does not send umount after mount failure. #320

Closed
kerneltime opened this issue May 5, 2016 · 10 comments
Closed

Docker does not send umount after mount failure. #320

kerneltime opened this issue May 5, 2016 · 10 comments
Assignees
Milestone

Comments

@kerneltime
Copy link
Contributor

We need to check with docker for the behavior of the engine on mount failures.
There seems to be relevant discussion here: https://github.com/docker/docker/pull/21015/files
This issue is to follow up with Docker and figure out the right recovery of refcount status after mount failure.

  1. Should we decrement on mount failure (which makes umount handling complicated)
  2. We depend on docker always sending umounts.
@kerneltime kerneltime self-assigned this May 5, 2016
@kerneltime kerneltime added this to the TP milestone May 5, 2016
@pdhamdhere
Copy link
Contributor

We confirmed Docker indeed sends unmount request for failed mounts on more than one setup. Ritesh is going to give this a try on clean environment. Will re-open or file a new issue if we see this again.

@kerneltime
Copy link
Contributor Author

This is a bug in docker. if -w is passed during run, internally docker does not clean up (send unmount).
Normal Log on failure (includes docker and plugin log)

May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.879817062Z" level=debug msg="Calling POST /v1.23/containers/create"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.880548779Z" level=debug msg="form data: {\"AttachStderr\":true,\"AttachStdin\":false,\"AttachStdout\":true,\"Cmd\":[\"cat\",\"/data/file\"],\"Domainname\":\"\",\"Entrypoint\":null,\"Env\":[],\"HostConfig\":{\"AutoRemove\":false,\"Binds\":[\"radio2016:/data\"],\"BlkioBps\":0,\"BlkioDeviceReadBps\":null,\"BlkioDeviceReadIOps\":null,\"BlkioDeviceWriteBps\":null,\"BlkioDeviceWriteIOps\":null,\"BlkioIOps\":0,\"BlkioWeight\":0,\"BlkioWeightDevice\":null,\"CapAdd\":null,\"CapDrop\":null,\"Cgroup\":\"\",\"CgroupParent\":\"\",\"ConsoleSize\":[0,0],\"ContainerIDFile\":\"\",\"CpuCount\":0,\"CpuPercent\":0,\"CpuPeriod\":0,\"CpuQuota\":0,\"CpuShares\":0,\"CpusetCpus\":\"\",\"CpusetMems\":\"\",\"Devices\":[],\"DiskQuota\":0,\"Dns\":[],\"DnsOptions\":[],\"DnsSearch\":[],\"ExtraHosts\":null,\"GroupAdd\":null,\"IpcMode\":\"\",\"Isolation\":\"\",\"KernelMemory\":0,\"Links\":null,\"LogConfig\":{\"Config\":{},\"Type\":\"\"},\"Memory\":0,\"MemoryReservation\":0,\"MemorySwap\":0,\"MemorySwappiness\":-1,\"NetworkMode\":\"default\",\"OomKillDisable\":false,\"OomScoreAdj\":0,\"PidMode\":\"\",\"PidsLimit\":0,\"PortBindings\":{},\"Privileged\":false,\"PublishAllPorts\":false,\"ReadonlyRootfs\":false,\"RestartPolicy\":{\"MaximumRetryCount\":0,\"Name\":\"no\"},\"SandboxSize\":0,\"SecurityOpt\":null,\"ShmSize\":0,\"StorageOpt\":null,\"UTSMode\":\"\",\"Ulimits\":null,\"UsernsMode\":\"\",\"VolumeDriver\":\"\",\"VolumesFrom\":null},\"Hostname\":\"\",\"Image\":\"busybox\",\"Labels\":{},\"NetworkingConfig\":{\"EndpointsConfig\":{}},\"OnBuild\":null,\"OpenStdin\":false,\"StdinOnce\":false,\"Tty\":false,\"User\":\"\",\"Volumes\":{},\"WorkingDir\":\"\"}"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.899373490Z" level=debug msg="container mounted via layerStore: /var/lib/docker/overlay/7a018b4d33979d6aedfdf638cc7175a754dd66149e05a561ed66535add263e6d/merged"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.899957317Z" level=debug msg="copying image data from 1b1dc474a6c8a81e5ca978545414dc7ae699895f00bc1776324e219f0d47730d:/data, to radio2016"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.904220227Z" level=debug msg="Calling POST /v1.23/containers/1b1dc474a6c8a81e5ca978545414dc7ae699895f00bc1776324e219f0d47730d/attach?stderr=1&stdout=1&stream=1"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.905298577Z" level=debug msg="attach: stdout: begin"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.905978497Z" level=debug msg="attach: stderr: begin"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.906917754Z" level=debug msg="Calling POST /v1.23/containers/1b1dc474a6c8a81e5ca978545414dc7ae699895f00bc1776324e219f0d47730d/start"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.910935261Z" level=debug msg="container mounted via layerStore: /var/lib/docker/overlay/7a018b4d33979d6aedfdf638cc7175a754dd66149e05a561ed66535add263e6d/merged"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.911741477Z" level=debug msg="Assigning addresses for endpoint cranky_wilson's interface on network bridge"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.912290781Z" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, <nil>, map[])"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.919231621Z" level=debug msg="Assigning addresses for endpoint cranky_wilson's interface on network bridge"
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.928945451Z" level=debug msg="Programming external connectivity on endpoint cranky_wilson (d2bc58f7bb466dd11ab29368e474175816561b0daa7dc5658e33d1d23318602c)"
2016-05-06 20:07:07.935254635 +0000 UTC [INFO] Mounting volume name=radio2016 
2016-05-06 20:07:07.935335565 +0000 UTC [DEBUG] vmdkOps.Attach name=radio2016
2016-05-06 20:07:07.935731702 +0000 UTC [WARNING] Failed to connect to ESX over vsocket
2016-05-06 20:07:07.935816918 +0000 UTC [ERROR] Failed to mount error="Failed to connect to ESX over vsocket" name=radio2016 
May 06 20:07:07 photon-machine docker[2095]: time="2016-05-06T20:07:07.942648069Z" level=debug msg="Revoking external connectivity on endpoint cranky_wilson (d2bc58f7bb466dd11ab29368e474175816561b0d
aa7dc5658e33d1d23318602c)"
May 06 20:07:08 photon-machine docker[2095]: time="2016-05-06T20:07:08.092797949Z" level=debug msg="Releasing addresses for endpoint cranky_wilson's interface on network bridge"May 06 20:07:08 photon-machine docker[2095]: time="2016-05-06T20:07:08.093481286Z" level=debug msg=
"ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.2)"2016-05-06 20:07:08.100071566 +0000 UTC [INFO] Unmounting Volume name=radio2016 
2016-05-06 20:07:08.100140506 +0000 UTC [ERROR] Failed to unmount volume. Now trying to detach... mountpoint="/mnt/vmdk/radio2016" error="invalid argument" 
2016-05-06 20:07:08.100182593 +0000 UTC [DEBUG] vmdkOps.Detach name=radio2016
2016-05-06 20:07:08.100557026 +0000 UTC [WARNING] Failed to connect to ESX over vsocket
2016-05-06 20:07:08.10065241 +0000 UTC [ERROR] Failed to unmount name=radio2016 error="Failed to co
nnect to ESX over vsocket" 

Logs for bug during failure ( includes docker and plugin logs)

root@photon-machine [ ~ ]# docker run  --rm -v radio2016:/data -w /data busybox cat /data/file
May 06 20:07:22 photon-machine docker[2095]: time="2016-05-06T20:07:22.226977123Z" level=debug msg="Calling POST /v1.23/containers/create"
May 06 20:07:22 photon-machine docker[2095]: time="2016-05-06T20:07:22.227550840Z" level=debug msg="form data: {\"AttachStderr\":true,\"AttachStdin\":false,\"AttachStdout\":true,\"Cmd\":[\"cat\",\"/data/file\"],\"Domainname\":\"\",\"Entrypoint\":null,\"Env\":[],\"HostConfig\":{\"AutoRemove\":false,\"Binds\":[\"radio2016:/data\"],\"BlkioBps\":0,\"BlkioDeviceReadBps\":null,\"BlkioDeviceReadIOps\":null,\"BlkioDeviceWriteBps\":null,\"BlkioDeviceWriteIOps\":null,\"BlkioIOps\":0,\"BlkioWeight\":0,\"BlkioWeightDevice\":null,\"CapAdd\":null,\"CapDrop\":null,\"Cgroup\":\"\",\"CgroupParent\":\"\",\"ConsoleSize\":[0,0],\"ContainerIDFile\":\"\",\"CpuCount\":0,\"CpuPercent\":0,\"CpuPeriod\":0,\"CpuQuota\":0,\"CpuShares\":0,\"CpusetCpus\":\"\",\"CpusetMems\":\"\",\"Devices\":[],\"DiskQuota\":0,\"Dns\":[],\"DnsOptions\":[],\"DnsSearch\":[],\"ExtraHosts\":null,\"GroupAdd\":null,\"IpcMode\":\"\",\"Isolation\":\"\",\"KernelMemory\":0,\"Links\":null,\"LogConfig\":{\"Config\":{},\"Type\":\"\"},\"Memory\":0,\"MemoryReservation\":0,\"MemorySwap\":0,\"MemorySwappiness\":-1,\"NetworkMode\":\"default\",\"OomKillDisable\":false,\"OomScoreAdj\":0,\"PidMode\":\"\",\"PidsLimit\":0,\"PortBindings\":{},\"Privileged\":false,\"PublishAllPorts\":false,\"ReadonlyRootfs\":false,\"RestartPolicy\":{\"MaximumRetryCount\":0,\"Name\":\"no\"},\"SandboxSize\":0,\"SecurityOpt\":null,\"ShmSize\":0,\"StorageOpt\":null,\"UTSMode\":\"\",\"Ulimits\":null,\"UsernsMode\":\"\",\"VolumeDriver\":\"\",\"VolumesFrom\":null},\"Hostname\":\"\",\"Image\":\"busybox\",\"Labels\":{},\"NetworkingConfig\":{\"EndpointsConfig\":{}},\"OnBuild\":null,\"OpenStdin\":false,\"StdinOnce\":false,\"Tty\":false,\"User\":\"\",\"Volumes\":{},\"WorkingDir\":\"/data\"}"
May 06 20:07:22 photon-machine docker[2095]: time="2016-05-06T20:07:22.244314184Z" level=debug msg="container mounted via layerStore: /var/lib/docker/overlay/752bb2f7a90c487baf52ffa85d81dfd5b93a9cb9e1faee323a939c2481416600/merged"
May 06 20:07:22 photon-machine docker[2095]: time="2016-05-06T20:07:22.245067104Z" level=debug msg="copying image data from a04dc67c3a068a1a3022fe5d8dc23c10c6389265e1b61b1a74c8ed30865dc741:/data, to radio2016"
2016-05-06 20:07:22.245899957 +0000 UTC [INFO] Mounting volume name=radio2016 
2016-05-06 20:07:22.245975121 +0000 UTC [DEBUG] vmdkOps.Attach name=radio2016
2016-05-06 20:07:22.246375617 +0000 UTC [WARNING] Failed to connect to ESX over vsocket
2016-05-06 20:07:22.246423964 +0000 UTC [ERROR] Failed to mount name=radio2016 error="Failed to connect to ESX over vsocket" 
May 06 20:07:22 photon-machine docker[2095]: time="2016-05-06T20:07:22.251241798Z" level=error msg="Handler for POST /v1.23/containers/create returned error: VolumeDriver.Mount: Failed to connect to ESX over vsocket"
docker: Error response from daemon: VolumeDriver.Mount: Failed to connect to ESX over vsocket.
See 'docker run --help'.

@kerneltime kerneltime reopened this May 6, 2016
@kerneltime
Copy link
Contributor Author

Will search existing bugs if one is not filed will open a issue with docker.

@kerneltime
Copy link
Contributor Author

Opened moby/moby#22564

@pdhamdhere
Copy link
Contributor

This is a Docker issue and not a TP blocker. User can workaround this issue by either skipping -w or using nocopy. Moving to GA.

@pdhamdhere pdhamdhere modified the milestones: v1 GA, TP May 17, 2016
kerneltime pushed a commit that referenced this issue May 19, 2016
@pdhamdhere
Copy link
Contributor

CC @govint

@govint
Copy link
Contributor

govint commented Aug 16, 2016

Behavior is the same with Docker 1.12,

First time, docker reaches to the plugin to mount the disk, it fails.

Thereafter docker runs the container creating a folder on the local host FS and starts the container with the given working dir.

The plugin has reference counted already the volume and won't let the volume to be removed.

A restart of the plugin is what it takes to clear up the issue.

$ docker run --rm -it -w /data -v newvol156:/data busybox
docker: Error response from daemon: VolumeDriver.Mount: Mount failed.
See 'docker run --help'.
$ docker run --rm -it -w /data -v newvol156:/data busybox
/data #
/data # pwd
/data

$ docker volume rm newvol156
Error response from daemon: Error while removing volume newvol156: remove newvol156: VolumeDriver.Remove: Remove failure - volume is still mounted. volume=newvol156, refcount=1

$ sudo service docker-volume-vsphere restart

docker-volume-vsphere stop/waiting
docker-volume-vsphere start/running, process 17437

$ docker volume rm newvol156
newvol156

I'll check if Docker is planning on fixing this or will try submitting a change to docker issue #22564

@govint
Copy link
Contributor

govint commented Oct 4, 2016

Creating a new issue for the plugin, Docker is fixing this with a change to never send an unmount for a failed mount (docker PR 27116). Closing this issue.

@govint govint closed this as completed Oct 4, 2016
@kerneltime kerneltime reopened this Oct 29, 2016
@kerneltime
Copy link
Contributor Author

Using this issue to track the work we need to do once Docker releases moby/moby#27116
If you have filed an issue for that work then reference it here and close the issue with the label wontfix or invalid as no code change was part of closing this issue.

@govint
Copy link
Contributor

govint commented Nov 23, 2016

Closing as #608 will address changes to the plugin to handle new docker behavior on mount failure.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants