Commit graph

212 commits

Author SHA1 Message Date
Brian Goff
bccebdac18 Merge pull request #32468 from coolljt0725/clean_thin
devicemapper: remove thin pool if 'initDevmapper' failed
2017-06-13 07:34:15 -05:00
Brian Goff
05ad14fc1b Merge pull request #31104 from cpuguy83/dm_lvmsetup
Add option to auto-configure blkdev for devmapper
2017-05-05 07:35:24 -04:00
Brian Goff
5ef07d79c4 Add option to auto-configure blkdev for devmapper
Instead of forcing users to manually configure a block device to use
with devmapper, this gives the user the option to let the devmapper
driver configure a device for them.

Adds several new options to the devmapper storage-opts:

- dm.directlvm_device="" - path to the block device to configure for
  direct-lvm
- dm.thinp_percent=95 - sets the percentage of space to use for
  storage from the passed in block device
- dm.thinp_metapercent=1 - sets the percentage of space to for metadata
  storage from the passed in block device
- dm.thinp_autoextend_threshold=80 - sets the threshold for when `lvm`
  should automatically extend the thin pool as a percentage of the total
  storage space
- dm.thinp_autoextend_percent=20 - sets the percentage to increase the
  thin pool by when an autoextend is triggered.

Defaults are taken from
[here](https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver/#/configure-direct-lvm-mode-for-production)

The only option that is required is `dm.directlvm_device` for docker to
set everything up.

Changes to these settings are not currently supported and will error
out.
Future work could support allowing changes to these values.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2017-05-03 13:49:15 -04:00
Antonio Murdaca
abbbf91498
Switch to using opencontainers/selinux for selinux bindings
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-04-24 21:29:47 +02:00
Lei Jitang
ea22d7ab91 devicemapper: remove thin pool if 'initDevmapper' failed
if initDevmapper failed after creating thin-pool, the thin-pool will not be removed,
this would cause we can't use the same lvm to create another thin-pool.

Signed-off-by: Lei Jitang <leijitang@huawei.com>
2017-04-10 13:11:39 -04:00
Lei Jitang
6e25bb2ed6 devicemapper: fix suspend removed device
when doing devices.cancelDeferredRemoval, the device could have been removed
and return ErrEnxio, but it continue to check if it is need to do suspend.
doSuspend := devinfo != nil && devinfo.Exists != 0 uses a devinfo which is
get before devices.cancelDeferredRemoval(baseInfo), it is outdate, the device
has been removed and there is no need to do suspend. If do suspend it will return
devicemapper: Error running deviceSuspend dm_task_run failed.

Signed-off-by: Lei Jitang <leijitang@huawei.com>
2017-03-01 21:58:14 +08:00
Tonis Tiigi
e9864cc0ec Update storage driver options link
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-02-16 18:32:17 -08:00
Vincent Demeester
e880120fe2 Merge pull request #28500 from rhvgoyal/remove-unused-param
devmapper: get rid of unused device id argument in unregisterDevice()
2017-01-12 18:02:00 +01:00
unclejack
3a42518042 daemon: return directly without ifs where possible
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
2016-12-14 22:36:58 +02:00
Vivek Goyal
288f933ed5 devmapper: get rid of unused device id argument in unregisterDevice()
There is no need to populate device id during unregisterDevice(). Nobody
makes use of this information. We just need to remove file associated
with device and that file is looked up using the hash and not the
device id which is used for thin pool operations.

So get rid of device id argument to unregisterDevice().

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-11-16 16:16:46 -05:00
Daehyeok Mun
fa710e504b Fix logrus formatting
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.

Fixed issue #23459

Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
2016-10-31 22:05:01 -06:00
Elena Morozova
64238fef8c all: replace loop with single append
Signed-off-by: Elena Morozova <lelenanam@gmail.com>
2016-10-13 13:31:52 -07:00
Brian Goff
c2f57291ac Merge pull request #26898 from YuPengZTE/devErrorsNew
In error, the first letter is low-case letter
2016-09-27 10:10:57 -04:00
YuPengZTE
110ab746ba In error, the first letter is low-case letter
Signed-off-by: YuPengZTE <yu.peng36@zte.com.cn>
2016-09-27 10:40:07 +08:00
Yanqiang Miao
664ad19486 Optimized debug print in the 'deviceset.go'
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
2016-09-12 15:34:17 +08:00
Vivek Goyal
6cc55dd65b devmapper: Fail to start container if xfs_nospace_max_retries can't be enforced
We just introduced a new tunable dm.xfs_nospace_max_retries. But this tunable
will work only on new kernels where xfs supports this feature. On older
kernels xfs does not allow tuning this behavior.

There are two issues. First one is that if xfsSetNospaceRetries() fails,
it returns error but leaves the device activated and mounted. We should
be unmounting the device and deactivate it before returning.
 
Second issue is, if docker is started on older kernel, with
dm.xfs_nospace_max_retries specified, then docker will silently ignore the
fact that /sys file to tweak this behavior is not present and will continue.
But I think it might be better to fail container creation/start if kernel
does not support this feature.

This patch fixes it. After this patch, user will get an error like following
when container is run.

# docker run -ti fedora bash
docker: Error response from daemon: devmapper: user specified daemon option dm.xfs_nospace_max_retries but it does not seem to be supported on this system :open /sys/fs/xfs/dm-5/error/metadata/ENOSPC/max_retries: no such file or directory.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-09-07 14:03:01 -04:00
Vivek Goyal
4f0017b9ad devmapper: Provide a knob dm.xfs_nospace_max_retries
When xfs filesystem is being used on top of thin pool, xfs can get ENOSPC
errors from thin pool when thin pool is full. As of now xfs retries the
IO and keeps on retrying and does not give up. This can result in container
application being stuck for a very long time. In fact I have seen instances
of unkillable processes. So that means once thin pool is full and process
gets stuck, container can't be stopped/killed either and only option left
seems to be power recycle of the box.

In another instance, writer did not block but failed after a while. But
when I tried to exit/stop the container, unmounting xfs hanged and only
thing I could do was power cycle the machine.

Now upstream kernel has committed patches where it allows user space to
customize user space behavior in case of errors. One of the knobs is
max_retries, which specifies how many times an IO should be retried
when ENOSPC is encountered.

This patch sets provides a tunable knob (dm.xfs_nospace_max_retries) so
that user can specify value for max_retries and tune xfs behavior. If
one sets this value to 0, xfs will not retry IO when ENOSPC error is
encountered. It will instead give up and shutdown filesystem.

This knob can be useful if one is running into unkillable
processes/containers issue on top of xfs.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-09-01 11:38:09 -04:00
Shishir Mahajan
0e633ee14a Fixes Issue # 23418: Race condition between device deferred removal and resume device.
Problem Description:

An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).

This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.

Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-08-02 10:33:58 -04:00
yuzou
d4a2bcc9ac Add detail error logs when 'Unknown Device' error happens if devicemapper storage is used.
Signed-off-by: yuzou <zouyu7@huawei.com>
2016-06-30 13:06:14 +08:00
Shishir Mahajan
cac6658da0 Modularize dm.use_deferred_removal and dm.use_deferred_deletion logic.
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-06-13 12:05:46 -04:00
Yong Tang
a72b45dbec Fix logrus formatting
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.

This fix fixes #23459.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-06-11 13:16:55 -07:00
Antonio Murdaca
44ccbb317c *: fix logrus.Warn[f]
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-06-11 19:42:38 +02:00
Antonio Murdaca
b18062122d graphtest: fix cleanup logic
device Base should not exists on failure:

--- FAIL: TestDevmapperCreateBase (0.06s)
    graphtest_unix.go:122: stat
/tmp/docker-graphtest-079240530/devicemapper/mnt/Base/rootfs/a subdir:
no such file or directory
--- FAIL: TestDevmapperCreateSnap (0.00s)
    graphtest_unix.go:219: devmapper: device Base already
exists.

it should be:

--- FAIL: TestDevmapperCreateBase (0.25s)
	graphtest_unix.go:122: stat
/tmp/docker-graphtest-828994195/devicemapper/mnt/Base/rootfs/a subdir:
no such file or directory
--- FAIL: TestDevmapperCreateSnap (0.13s)
	graphtest_unix.go:122: stat
/tmp/docker-graphtest-828994195/devicemapper/mnt/Snap/rootfs/a subdir:
no such file or directory

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-05-31 20:08:57 +02:00
Brian Goff
227c83826a Merge pull request #21945 from rhvgoyal/export-min-free-space
Export Mininum Thin Pool Free Space through docker info
2016-05-02 20:20:08 -04:00
Vivek Goyal
55a9b8123d Export Mininum Thin Pool Free Space through docker info
Right now there is no way to know what's the minimum free space threshold
daemon is applying. It would be good to export it through docker info and
then user knows what's the current value. Also this could be useful to
higher level management tools which can look at this value and setup their
own internal thresholds for image garbage collection etc.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-04-21 15:42:23 +00:00
mYmNeo
34a66a14af Grow the container rootfs when it is necessary
Signed-off-by: mYmNeo <thomassong@tencent.com>
2016-04-12 09:27:47 +08:00
Shishir Mahajan
45dc5b46e2 parseStorageOpt: return size rather than updating devInfo.Size field
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-04-11 10:34:13 -04:00
Sebastiaan van Stijn
b8f38747e6 Improve udev unsupported error message
Show a different message if a dynamic binary
is running, but doesn't have udev sync support.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-01 13:31:44 -07:00
Shishir Mahajan
b16decfccf CLI flag for docker create(run) to change block device size.
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-03-28 10:05:18 -04:00
Brian Goff
65d79e3e5e Move layer mount refcounts to mountedLayer
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.

In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-03-23 14:42:52 -07:00
Tonis Tiigi
e91de9fb9d Revert "Move layer mount refcounts to mountedLayer"
This reverts commit 563d0711f8.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-03-23 00:33:02 -07:00
Brian Goff
563d0711f8 Move layer mount refcounts to mountedLayer
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.

In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-03-22 11:36:28 -04:00
Vivek Goyal
4141a00921 Fix the assignment to wrong variable
We should be assigning value to minFreeMetadata instead of minFreeData. This
is copy/paste error.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-03-17 15:19:08 +00:00
Brian Goff
37a1fadae6 Merge pull request #21097 from thaJeztah/dont-run-without-udev-sync
Fail when devicemapper doesn't support udev-sync
2016-03-14 21:18:01 -04:00
Sebastiaan van Stijn
de64171510 Fail when devicemapper doesn't support udev-sync
Now what we provide dynamic binaries for all plaforms,
we shouldn't try to run docker without udev sync support.

This change changes the previous warning to an Error,
unless the user explicitly overrides the warning, in
which case they're at their own risk.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-03-10 19:13:44 +01:00
Vivek Goyal
2e222f69b3 devmapper: Add a new option dm.min_free_space
Once thin pool gets full, bad things can happen. Especially in case of xfs
it is possible that xfs keeps on retrying IO infinitely (for certain kind
of IO) and container hangs. 

One way to mitigate the problem is that once thin pool is about to get full,
start failing some of the docker operations like pulling new images or
creation of new containers. That way user will get warning ahead of time
and can try to rectify it by creating more free space in thin pool. This
can be done either by deleting existing images/containers or by adding more
free space to thin pool.

This patch adds a new option dm.min_free_space to devicemapper graph
driver. Say one specifies dm.min_free_space=10%. This means atleast
10% of data and metadata blocks should be free in pool before new device
creation is allowed, otherwise operation will fail.

By default min_free_space is 10%. User can change it by specifying
dm.min_free_space=X% on command line. A value of 0% will disable the
check.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-03-07 20:27:39 +00:00
Stefan Weil
2eee613326 Fix some typos in comments and strings
Most of them were found and fixed by codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-02-22 20:27:15 +01:00
Sebastiaan van Stijn
661d75f398 Merge pull request #19123 from shishir-a412ed/rootfs_size_configurable
daemon option (--storage-opt dm.basesize) for increasing the base device size on daemon restart
2016-01-13 13:22:08 -08:00
Shishir Mahajan
e47112d3e8 daemon option (--storage-opt dm.basesize) for increasing the base device size on daemon restart
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
2016-01-13 13:57:31 -05:00
Vivek Goyal
2dccb562df Mark device ID free only if device actually got deleted
Right now if somebody has enabled deferred device deletion, then
deleteTransaction() returns success even if device could not be deleted. It
has been marked for deferred deletion. Right now we will mark device ID free
and potentially use it again when somebody tries to create new container. And
that's wrong. Device ID is not free yet. It will become free once devices
has actually been deleted by the goroutine later.

So move the location of call to markDeviceIDFree() to a place where we know
device actually got deleted and was not marked for deferred deletion.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2016-01-11 18:57:37 +00:00
Vincent Batts
af59752712 loopback: separate loop logic from devicemapper
The loopback logic is not technically exclusive to the devicemapper
driver. This reorganizes the code such that the loopback code is usable
outside of the devicemapper package and driver.

Signed-off-by: Vincent Batts <vbatts@redhat.com>
2015-12-18 10:57:43 -05:00
David Calavera
4fef42ba20 Replace pkg/units with docker/go-units.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-16 12:26:49 -05:00
Antonio Murdaca
f22ee02c6d devmapper: store base device fs type
After the very first init of the graph `docker info` correctly shows the
base fs type under `Backing Filesystem`. This information isn't stored
anywhere. After a restart (w/o erasing `/var/lib/docker`) `docker info`
shows an empty string under `Backing Filesystem`.
This patch records the base fs type after the first run in the metadata
or, to fix old devices that don't have this info in the metadata, just
probe the fs type of the base device at graph startup.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2015-12-15 09:33:19 +01:00
Chris Dituri
0aa6ace6e6 Make daemon/graphdriver/devmapper log messages with a common, consistent prefix.
Closes #16667

Uses the prefix "devmapper:" for all the fmt and logrus error, debug, and info messages.

Signed-off-by: Chris Dituri <csdituri@gmail.com>
2015-12-14 21:35:13 -06:00
Justas Brazauskas
927b334ebf Fix typos found across repository
Signed-off-by: Justas Brazauskas <brazauskasjustas@gmail.com>
2015-12-13 18:04:12 +02:00
Christopher Jones
7c077c2c34 Fixed typo change deivce to device.
This changes deivce to device in daemon, test and docs.

Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
2015-12-10 15:23:05 -06:00
Liu Hua
f7bdb97357 Fix Put without Get in devicemapper
Signed-off-by: Liu Hua <sdu.liu@huawei.com>
2015-12-03 22:22:25 +08:00
Vivek Goyal
a489e685c0 devmapper: Log start and end of filesystem creation
ext4 filesystem creation can take a long time on 100G thin device and
systemd might time out and kill docker service. Often user is left thinking
why docker is taking so long and logs don't give any hint. Log an info
message in journal for start and end of filesystem creation. That way
a user can look at logs and figure out that filesystem creation is
taking long time.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2015-12-01 13:05:46 +00:00
Michael Crosby
1ecb9a40db Merge pull request #17974 from anusha-ragunathan/fsMagic
Fix devmapper backend in docker info
2015-11-17 11:44:48 -08:00
Sebastiaan van Stijn
cf824d9749 Merge pull request #17479 from coolljt0725/show_warning
Show warning when user specify dm.basesize for already initialized devicemapper driver
2015-11-15 08:51:33 +01:00