As part of making graphdrivers support pluginv2, a PluginGetter
interface was necessary for cleaner separation and avoiding import
cycles.
This commit creates a PluginGetter interface and makes pluginStore
implement it. Then the pluginStore object is created in the daemon
(rather than by the plugin manager) and passed to plugin init as
well as to the different subsystems (eg. graphdrivers, volumedrivers).
A side effect of this change was that some code was moved out of
experimental. This is good, since plugin support will be stable soon.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
We just introduced a new tunable dm.xfs_nospace_max_retries. But this tunable
will work only on new kernels where xfs supports this feature. On older
kernels xfs does not allow tuning this behavior.
There are two issues. First one is that if xfsSetNospaceRetries() fails,
it returns error but leaves the device activated and mounted. We should
be unmounting the device and deactivate it before returning.
Second issue is, if docker is started on older kernel, with
dm.xfs_nospace_max_retries specified, then docker will silently ignore the
fact that /sys file to tweak this behavior is not present and will continue.
But I think it might be better to fail container creation/start if kernel
does not support this feature.
This patch fixes it. After this patch, user will get an error like following
when container is run.
# docker run -ti fedora bash
docker: Error response from daemon: devmapper: user specified daemon option dm.xfs_nospace_max_retries but it does not seem to be supported on this system :open /sys/fs/xfs/dm-5/error/metadata/ENOSPC/max_retries: no such file or directory.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
When xfs filesystem is being used on top of thin pool, xfs can get ENOSPC
errors from thin pool when thin pool is full. As of now xfs retries the
IO and keeps on retrying and does not give up. This can result in container
application being stuck for a very long time. In fact I have seen instances
of unkillable processes. So that means once thin pool is full and process
gets stuck, container can't be stopped/killed either and only option left
seems to be power recycle of the box.
In another instance, writer did not block but failed after a while. But
when I tried to exit/stop the container, unmounting xfs hanged and only
thing I could do was power cycle the machine.
Now upstream kernel has committed patches where it allows user space to
customize user space behavior in case of errors. One of the knobs is
max_retries, which specifies how many times an IO should be retried
when ENOSPC is encountered.
This patch sets provides a tunable knob (dm.xfs_nospace_max_retries) so
that user can specify value for max_retries and tune xfs behavior. If
one sets this value to 0, xfs will not retry IO when ENOSPC error is
encountered. It will instead give up and shutdown filesystem.
This knob can be useful if one is running into unkillable
processes/containers issue on top of xfs.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
In the common case where the user is using /var/lib/docker and
an image with less than 60 layers, forking is not needed. Calculate
whether absolute paths can be used and avoid forking to mount in
those cases.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
If we are running in a user namespace, don't try to mknod as
it won't be allowed. libcontainer will bind-mount the host's
devices over files in the container anyway, so it's not needed.
The chrootarchive package does a chroot (without mounting /proc) before
its work, so we cannot check /proc/self/uid_map when we need to. So
compute it in advance and pass it along with the tar options.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Problem Description:
An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).
This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
Truncated dir name can't give any useful information, print whole dir
name will.
Bad debug log is like this:
```
DEBU[2449] aufs error unmounting /var/lib/doc: no such file or directory
```
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Use module name logrus instead of log.
Use logrus.[Error|Warn|Debug|Fatal|Panic|Info]f instead of w/o f
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Add option to skip kernel check for older kernels which have been patched to support multiple lower directories in overlayfs.
Fixes#24023
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Now that Windows base images can be loaded directly into docker via "docker load" of a specialized tar file (with docker pull support on the horizon) we no longer have need of the custom images code path that loads images from a shared central location. Removing that code and it's call points.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Currently when overlay creates a whiteout file then the overlay2 layer is archived,
the correct tar header will be created for the whiteout file, but the tar logic will then attempt to open the file causing a failure.
When tar encounters such failures the file is skipped and excluded for the archive, causing the whiteout to be ignored.
By skipping the copy of empty files, no open attempt will be made on whiteout files.
Fixes#23863
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This patch introduces a new experimental engine-level plugin management
with a new API and command line. Plugins can be distributed via a Docker
registry, and their lifecycle is managed by the engine.
This makes plugins a first-class construct.
For more background, have a look at issue #20363.
Documentation is in a separate commit. If you want to understand how the
new plugin system works, you can start by reading the documentation.
Note: backwards compatibility with existing plugins is maintained,
albeit they won't benefit from the advantages of the new system.
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Symlinks are currently not getting cleaned up when removing layers since only the root directory is removed.
On remove, read the link file and remove the associated link from the link directory.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Diff apply is sometimes producing a different change list causing the tests to fail.
Overlay has a known issue calculating diffs of files which occur within the same second they were created.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
This fix fixes#23459.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Adds a new overlay driver which uses multiple lower directories to create the union fs.
Additionally it uses symlinks and relative mount paths to allow a depth of 128 and stay within the mount page size limit.
Diffs and done directly over a single directory allowing diffs to be done efficiently and without the need fo the naive diff driver.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)