0ct0pu5/moby

Author	SHA1	Message	Date
Shishir Mahajan	0e633ee14a	Fixes Issue # 23418: Race condition between device deferred removal and resume device. Problem Description: An example scenario that involves deferred removal 1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and mounted at some point in time during image creation. 2. While image creation is in progress, a privileged container is started from another image and the host's mount name space is shared with this container ('docker run --privileged -v /:/host'). 3. Image creation completes and the base device gets unmounted. However, as the privileged container still holds a reference on the base image mount point, the base device cannot be removed right away. So it gets flagged for deferred removal. 4. Next, the privileged container terminates and thus its reference to the base image mount point gets released. The base device (which is flagged for deferred removal) may now be cleaned up by the device-mapper. This opens up an opportunity for a race between a 'kworker' thread (executing the do_deferred_remove() function) and the Docker daemon (executing the CreateSnapDevice() function). This PR cancels the deferred removal if the device is marked for it, and reschedules the deferred removal after the device is resumed successfully. Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-08-02 10:33:58 -04:00
yuzou	d4a2bcc9ac	Add detail error logs when 'Unknown Device' error happens if devicemapper storage is used. Signed-off-by: yuzou <zouyu7@huawei.com>	2016-06-30 13:06:14 +08:00
Shishir Mahajan	cac6658da0	Modularize dm.use_deferred_removal and dm.use_deferred_deletion logic. Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-06-13 12:05:46 -04:00
Yong Tang	a72b45dbec	Fix logrus formatting This fix tries to fix logrus formatting by removing `f` from `logrus.[Error\|Warn\|Debug\|Fatal\|Panic\|Info]f` when formatting string is not present. This fix fixes #23459. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2016-06-11 13:16:55 -07:00
Antonio Murdaca	44ccbb317c	*: fix logrus.Warn[f] Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2016-06-11 19:42:38 +02:00
Antonio Murdaca	b18062122d	graphtest: fix cleanup logic device Base should not exists on failure: --- FAIL: TestDevmapperCreateBase (0.06s) graphtest_unix.go:122: stat /tmp/docker-graphtest-079240530/devicemapper/mnt/Base/rootfs/a subdir: no such file or directory --- FAIL: TestDevmapperCreateSnap (0.00s) graphtest_unix.go:219: devmapper: device Base already exists. it should be: --- FAIL: TestDevmapperCreateBase (0.25s) graphtest_unix.go:122: stat /tmp/docker-graphtest-828994195/devicemapper/mnt/Base/rootfs/a subdir: no such file or directory --- FAIL: TestDevmapperCreateSnap (0.13s) graphtest_unix.go:122: stat /tmp/docker-graphtest-828994195/devicemapper/mnt/Snap/rootfs/a subdir: no such file or directory Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2016-05-31 20:08:57 +02:00
Shishir Mahajan	09d0720e2f	Fixes Issue # 22992: docker commit failing. 1) docker create / run / start: this would create a snapshot device and mounts it onto the filesystem. So the first time GET operation is called. it will create the rootfs directory and return the path to rootfs 2) Now when I do docker commit. It will call the GET operation second time. This time the refcount will check that the count > 1 (count=2). so the rootfs already exists, it will just return the path to rootfs. Earlier it was just returning the mp: /var/lib/docker/devicemapper/mnt/{ID} and hence the inconsistent paths error. Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-05-27 14:35:46 -04:00
Michael Crosby	5b6b8df0c1	Add reference counting to aufs Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-05-23 15:57:23 -07:00
Michael Crosby	1ba05cdb6a	Add fast path for fsmagic supported drivers For things that we can check if they are mounted by using their fsmagic we should use that and for others do it the slow way. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-05-23 15:57:23 -07:00
Michael Crosby	009ee16bef	Restore ref count Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2016-05-23 15:57:23 -07:00
Brian Goff	227c83826a	Merge pull request #21945 from rhvgoyal/export-min-free-space Export Mininum Thin Pool Free Space through docker info	2016-05-02 20:20:08 -04:00
David Calavera	8a0d2d8e57	Merge pull request #22168 from cpuguy83/22116_hack_in_layer_refcounts Add refcounts to graphdrivers that use fsdiff	2016-04-22 15:17:12 -07:00
Brian Goff	7342060b07	Add refcounts to graphdrivers that use fsdiff This makes sure fsdiff doesn't try to unmount things that shouldn't be. Note: This is intended as a temporary solution to have as minor a change as possible for 1.11.1. A bigger change will be required in order to support container re-attach. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2016-04-21 12:19:57 -04:00
Vivek Goyal	55a9b8123d	Export Mininum Thin Pool Free Space through docker info Right now there is no way to know what's the minimum free space threshold daemon is applying. It would be good to export it through docker info and then user knows what's the current value. Also this could be useful to higher level management tools which can look at this value and setup their own internal thresholds for image garbage collection etc. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2016-04-21 15:42:23 +00:00
mYmNeo	34a66a14af	Grow the container rootfs when it is necessary Signed-off-by: mYmNeo <thomassong@tencent.com>	2016-04-12 09:27:47 +08:00
Shishir Mahajan	45dc5b46e2	parseStorageOpt: return size rather than updating devInfo.Size field Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-04-11 10:34:13 -04:00
Stefan J. Wernli	ef5bfad321	Adding readOnly parameter to graphdriver Create method Since the layer store was introduced, the level above the graphdriver now differentiates between read/write and read-only layers. This distinction is useful for graphdrivers that need to take special steps when creating a layer based on whether it is read-only or not. Adding this parameter allows the graphdrivers to differentiate, which in the case of the Windows graphdriver, removes our dependence on parsing the id of the parent for "-init" in order to infer this information. This will also set the stage for unblocking some of the layer store unit tests in the next preview build of Windows. Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>	2016-04-06 13:52:53 -07:00
Sebastiaan van Stijn	b8f38747e6	Improve udev unsupported error message Show a different message if a dynamic binary is running, but doesn't have udev sync support. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2016-04-01 13:31:44 -07:00
Shishir Mahajan	b16decfccf	CLI flag for docker create(run) to change block device size. Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-03-28 10:05:18 -04:00
Brian Goff	65d79e3e5e	Move layer mount refcounts to mountedLayer Instead of implementing refcounts at each graphdriver, implement this in the layer package which is what the engine actually interacts with now. This means interacting directly with the graphdriver is no longer explicitly safe with regard to Get/Put calls being refcounted. In addition, with the containerd, layers may still be mounted after a daemon restart since we will no longer explicitly kill containers when we shutdown or startup engine. Because of this ref counts would need to be repopulated. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2016-03-23 14:42:52 -07:00
Tonis Tiigi	e91de9fb9d	Revert "Move layer mount refcounts to mountedLayer" This reverts commit `563d0711f8`. Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>	2016-03-23 00:33:02 -07:00
Tõnis Tiigi	92a3ece35a	Merge pull request #21107 from cpuguy83/one_ctr_to_rule_them_all Move layer mount refcounts to mountedLayer	2016-03-22 21:19:00 -07:00
Brian Goff	563d0711f8	Move layer mount refcounts to mountedLayer Instead of implementing refcounts at each graphdriver, implement this in the layer package which is what the engine actually interacts with now. This means interacting directly with the graphdriver is no longer explicitly safe with regard to Get/Put calls being refcounted. In addition, with the containerd, layers may still be mounted after a daemon restart since we will no longer explicitly kill containers when we shutdown or startup engine. Because of this ref counts would need to be repopulated. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2016-03-22 11:36:28 -04:00
Kenfe-Mickael Laventure	8af4f89cba	Remove unneeded references to execDriver This includes: - updating the docs - removing dangling variables Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>	2016-03-21 13:06:08 -07:00
Vivek Goyal	4141a00921	Fix the assignment to wrong variable We should be assigning value to minFreeMetadata instead of minFreeData. This is copy/paste error. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2016-03-17 15:19:08 +00:00
Brian Goff	37a1fadae6	Merge pull request #21097 from thaJeztah/dont-run-without-udev-sync Fail when devicemapper doesn't support udev-sync	2016-03-14 21:18:01 -04:00
Sebastiaan van Stijn	de64171510	Fail when devicemapper doesn't support udev-sync Now that we provide dynamic binaries for all platforms, we shouldn't try to run docker without udev sync support. This change changes the previous warning to an Error, unless the user explicitly overrides the warning, in which case they're at their own risk. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2016-03-10 19:13:44 +01:00
Vivek Goyal	2e222f69b3	devmapper: Add a new option dm.min_free_space Once thin pool gets full, bad things can happen. Especially in case of xfs it is possible that xfs keeps on retrying IO infinitely (for certain kind of IO) and container hangs. One way to mitigate the problem is that once thin pool is about to get full, start failing some of the docker operations like pulling new images or creation of new containers. That way user will get warning ahead of time and can try to rectify it by creating more free space in thin pool. This can be done either by deleting existing images/containers or by adding more free space to thin pool. This patch adds a new option dm.min_free_space to devicemapper graph driver. Say one specifies dm.min_free_space=10%. This means at least 10% of data and metadata blocks should be free in pool before new device creation is allowed, otherwise operation will fail. By default min_free_space is 10%. User can change it by specifying dm.min_free_space=X% on command line. A value of 0% will disable the check. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2016-03-07 20:27:39 +00:00
Stefan Weil	2eee613326	Fix some typos in comments and strings Most of them were found and fixed by codespell. Signed-off-by: Stefan Weil <sw@weilnetz.de>	2016-02-22 20:27:15 +01:00
Sebastiaan van Stijn	661d75f398	Merge pull request #19123 from shishir-a412ed/rootfs_size_configurable daemon option (--storage-opt dm.basesize) for increasing the base device size on daemon restart	2016-01-13 13:22:08 -08:00
Shishir Mahajan	e47112d3e8	daemon option (--storage-opt dm.basesize) for increasing the base device size on daemon restart Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-01-13 13:57:31 -05:00
Vivek Goyal	2dccb562df	Mark device ID free only if device actually got deleted Right now if somebody has enabled deferred device deletion, then deleteTransaction() returns success even if device could not be deleted. It has been marked for deferred deletion. Right now we will mark device ID free and potentially use it again when somebody tries to create new container. And that's wrong. Device ID is not free yet. It will become free once devices has actually been deleted by the goroutine later. So move the location of call to markDeviceIDFree() to a place where we know device actually got deleted and was not marked for deferred deletion. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2016-01-11 18:57:37 +00:00
Vincent Batts	af59752712	loopback: separate loop logic from devicemapper The loopback logic is not technically exclusive to the devicemapper driver. This reorganizes the code such that the loopback code is usable outside of the devicemapper package and driver. Signed-off-by: Vincent Batts <vbatts@redhat.com>	2015-12-18 10:57:43 -05:00
David Calavera	4fef42ba20	Replace pkg/units with docker/go-units. Signed-off-by: David Calavera <david.calavera@gmail.com>	2015-12-16 12:26:49 -05:00
Antonio Murdaca	f22ee02c6d	devmapper: store base device fs type After the very first init of the graph `docker info` correctly shows the base fs type under `Backing Filesystem`. This information isn't stored anywhere. After a restart (w/o erasing `/var/lib/docker`) `docker info` shows an empty string under `Backing Filesystem`. This patch records the base fs type after the first run in the metadata or, to fix old devices that don't have this info in the metadata, just probe the fs type of the base device at graph startup. Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2015-12-15 09:33:19 +01:00
Chris Dituri	0aa6ace6e6	Make daemon/graphdriver/devmapper log messages with a common, consistent prefix. Closes #16667 Uses the prefix "devmapper:" for all the fmt and logrus error, debug, and info messages. Signed-off-by: Chris Dituri <csdituri@gmail.com>	2015-12-14 21:35:13 -06:00
Justas Brazauskas	927b334ebf	Fix typos found across repository Signed-off-by: Justas Brazauskas <brazauskasjustas@gmail.com>	2015-12-13 18:04:12 +02:00
Christopher Jones	7c077c2c34	Fixed typo change deivce to device. This changes deivce to device in daemon, test and docs. Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>	2015-12-10 15:23:05 -06:00
Antonio Murdaca	037cbcec98	devmapper: remove unused var Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2015-12-10 08:28:02 +01:00
Liu Hua	f7bdb97357	Fix Put without Get in devicemapper Signed-off-by: Liu Hua <sdu.liu@huawei.com>	2015-12-03 22:22:25 +08:00
Vivek Goyal	a489e685c0	devmapper: Log start and end of filesystem creation ext4 filesystem creation can take a long time on 100G thin device and systemd might time out and kill docker service. Often user is left thinking why docker is taking so long and logs don't give any hint. Log an info message in journal for start and end of filesystem creation. That way a user can look at logs and figure out that filesystem creation is taking long time. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-12-01 13:05:46 +00:00
Michael Crosby	1ecb9a40db	Merge pull request #17974 from anusha-ragunathan/fsMagic Fix devmapper backend in docker info	2015-11-17 11:44:48 -08:00
Alexander Morozov	4dda67b801	Merge pull request #16452 from rhatdan/btrfs-selinux Relabel BTRFS Content on container Creation	2015-11-17 11:03:40 -08:00
Sebastiaan van Stijn	cf824d9749	Merge pull request #17479 from coolljt0725/show_warning Show warning when user specify dm.basesize for already initialized devicemapper driver	2015-11-15 08:51:33 +01:00
Anusha Ragunathan	fdc2641c2b	Fix devmapper backend in docker info Signed-off-by: Anusha Ragunathan <anusha@docker.com>	2015-11-13 21:05:47 -08:00
Dan Walsh	1716d497a4	Relabel BTRFS Content on container Creation This change will allow us to run SELinux in a container with BTRFS back end. We continue to work on fixing the kernel/BTRFS but this change will allow SELinux Security separation on BTRFS. It basically relabels the content on container creation. Just relabeling -init directory in BTRFS use case. Everything looks like it works. I don't believe tar/archive stores the SELinux labels, so we are good as far as docker commit. Tested Speed on startup with BTRFS on top of loopback directory. BTRFS not on loopback should get even better performance on startup time. The more inodes inside of the container image will increase the relabel time. This patch will give people who care more about security the option of running BTRFS with SELinux. Those who don't want to take the slow down can disable SELinux either in individual containers or for all containers by continuing to disable SELinux in the daemon. Without relabel: > time docker run --security-opt label:disable fedora echo test test real 0m0.918s user 0m0.009s sys 0m0.026s With Relabel test real 0m1.942s user 0m0.007s sys 0m0.030s Signed-off-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Dan Walsh <dwalsh@redhat.com>	2015-11-11 14:49:27 -05:00
Vivek Goyal	07ff17fb85	devmapper: Switch to xfs as default filesystem if supported If platform supports xfs filesystem then use xfs as default filesystem for container rootfs instead of ext4. Reason being that ext4 is pre-allocating a lot of metadata (around 1.8GB on 100G thin volume) and that can take long enough on AWS storage that systemd times out and docker fails to start. If one disables pre-allocation of ext4 metadata, then it will be allocated when containers are mounted and we will have multiple copies of metadata per container. For a 100G thin device, it was around 1.5GB of metadata per container. ext4 has an optimization to skip zeroing if discards are issued and underlying device guarantees that zero will be returned when discarded blocks are read back. devicemapper thin devices don't offer that guarantee so ext4 optimization does not kick in. In fact given discards are optional and can be dropped on the floor if need be, it looks like it might not be possible to guarantee that all the blocks got discarded and if read back zero will be returned. Signed-off-by: Anusha Ragunathan <anusha@docker.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-11-11 12:07:35 -05:00
Vivek Goyal	83a34e000b	devmapper: Warn if user specified a filesystem and base device already has fs If user wants to use a filesystem it can be specified using dm.fs=<filesystem> option. It is possible that docker already had base image and a filesystem on that. Later if user wants to change file system using dm.fs= option and restarts docker, that's not possible. Warn user about it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-11-11 12:07:35 -05:00
Lei Jitang	e035d27223	Show warning when user specify dm.basesize for already initialized devicemapper driver Signed-off-by: Lei Jitang <leijitang@huawei.com>	2015-11-10 14:50:19 +08:00
Vivek Goyal	2c8b7c597a	devmapper: Provide more error information if blkid fails Right now if blkid fails we are just logging a debug message and don't return the actual error to caller. Caller gets the error message that thin pool base device UUID verification failed and it might give impression that thin pool changed. But that's not the case. Thin pool is in such a state that we could not even query the thin device UUID. Return error message appropriately to make situation more clear. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2015-11-06 08:21:20 -05:00
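
The dm.min_free_space commit above (2e222f69b3) describes a threshold check: new device creation is refused unless at least X% of both data and metadata blocks are free, and 0% disables the check. The following is a minimal Go sketch of that rule as the commit message states it, not the driver's actual code. The names `poolUsage`, `checkMinFree`, and `canCreateDevice` are hypothetical, and the block counts are made-up inputs rather than values read from a real thin pool.

```go
package main

import (
	"errors"
	"fmt"
)

// poolUsage is a hypothetical snapshot of block counts for one area
// (data or metadata) of a thin pool.
type poolUsage struct {
	Used  uint64
	Total uint64
}

// checkMinFree returns an error when fewer than minFreePercent percent of
// the blocks are free. A value of 0 disables the check, matching the
// behaviour described in the commit message.
func checkMinFree(kind string, u poolUsage, minFreePercent uint64) error {
	if minFreePercent == 0 {
		return nil
	}
	if u.Used > u.Total {
		return errors.New(kind + ": used blocks exceed total blocks")
	}
	// Sketch only: this multiplication could overflow for extremely
	// large block counts.
	minFree := u.Total * minFreePercent / 100
	free := u.Total - u.Used
	if free < minFree {
		return fmt.Errorf("%s: %d free blocks, need at least %d (%d%%)",
			kind, free, minFree, minFreePercent)
	}
	return nil
}

// canCreateDevice applies the same threshold to data and metadata blocks.
func canCreateDevice(data, metadata poolUsage, minFreePercent uint64) error {
	if err := checkMinFree("data", data, minFreePercent); err != nil {
		return err
	}
	return checkMinFree("metadata", metadata, minFreePercent)
}

func main() {
	data := poolUsage{Used: 950, Total: 1000}
	metadata := poolUsage{Used: 10, Total: 100}
	if err := canCreateDevice(data, metadata, 10); err != nil {
		fmt.Println("refusing to create device:", err)
		return
	}
	fmt.Println("device creation allowed")
}
```

With the sample numbers in `main`, only 5% of data blocks are free, so the sketch refuses the operation; raising the free data blocks to 10% or passing a threshold of 0 lets it through.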


257 commits