if there is no cookie set in dm task, or flag DM_UDEV_DISABLE_LIBRARY_FALLBACK
is cleared for a DM_DEV_REMOVE task, libdevmapper will fallback to clean up the
symlink under /dev/mapper by itself, no matter the device removal is executed
immediately or deferred by the kernel.In some cases, the removal is deferred by the
kernel, while the symlink is deleted directly by libdevmapper, when docker tries to
activate the device again, the deferred removal will be canceld, but the symlink will
not show up again, so docker's attempt to mount the device by the symlink will fail,
and it will eventually leads to a `docker start/diff` error.
Fixes#24671
Signed-off-by: Ji.Zhilong <zhilongji@gmail.com>
Problem Description:
An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).
This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
This fix fixes#23459.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
The loopback logic is not technically exclusive to the devicemapper
driver. This reorganizes the code such that the loopback code is usable
outside of the devicemapper package and driver.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Closes#16667
Uses the prefix "devicemapper:" for all the fmt and logrus error, debug, and info messages.
Signed-off-by: Chris Dituri <csdituri@gmail.com>
Finally here is the patch to implement deferred deletion functionality.
Deferred deleted devices are marked as "Deleted" in device meta file.
First we try to delete the device and only if deletion fails and user has
enabled deferred deletion, device is marked for deferred deletion.
When docker starts up again, we go through list of deleted devices and
try to delete these again.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Often it happens that docker is not able to shutdown/remove the thin
pool it created because some device has leaked into some mount name
space. That means device is in use and that means pool can't be removed.
Docker will leave pool as it is and exit. Later when user starts the
docker, it finds pool is already there and docker uses it. But docker
does not know it is same pool which is using the loop devices. Now
docker thinks loop devices are not being used. That means it does not
display the data correctly in "docker info", giving user wrong information.
This patch tries to detect if loop devices as created by docker are
being used for pool and fills in the right details in "docker info".
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Deferred reove functionality was added to library later. So in old version
of library it did not report deferred_remove field.
Create a new function which also gets deferred_remove field and it will be
called only on newer version of library.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
If a device has been scheduled for deferred deactivation and container
is started again and we need to activate device again, we need to cancel
the deferred deactivation which is already scheduled on the device.
Create a method for the same.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
A lot of time device mapper devices leak across mount namespace which docker
does not know about and when docker tries to deactivate/delete device,
operation fails as device is open in some mount namespace.
Create a mechanism where one can defer the device deactivation/deletion
so that docker operation does not fail and device automatically goes
away when last reference to it is dropped.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
UdevWait() is deferred and takes uint cookie as an argument. As arguments
to deferred functions are calculated at the time of call, it is possible
that any update to cookie later by libdm are not taken into account when
UdevWait() is called. Hence use a pointer to uint as argument to UdevWait()
function.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
devmapper graph driver retries device removal 1000 times in case of failure
and if this fills up console with 1000 messages (when daemon is running in
debug mode). So remove these debug messages.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
DM_UDEV_DISABLE_LIBRARY_FALLBACK is disabled by most applications today
when using device-mapper, and ensuring that device-mapper is in sync
with udev. This flag instructs devicemapper to not fallback to creating
the device nodes itself. In the case of udev sync not being supported,
devicemapper will attempt to create the devices in a timely manner,
regardless of udev.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Currently devicemapper CreateDevice and CreateSnapDevice keep on retrying
device creation till a suitable device id is found.
With new transaction mechanism we need to store device id in transaction
before it has been created.
So change the logic in such a way that caller decides the devices Id to
use. If that device Id is not available, caller bumps up the device Id
and retries.
That way caller can update transaciton too when it tries a new Id. Transaction
related patches will come later in the series.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Currently we set up a cookie and upon failure not call UdevWait(). This
does not cleanup the cookie and associated semaphore and system will
soon max out on total number of semaphores.
To avoid this, call UdevWait() even in failure path which in turn will
cleanup associated semaphore.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Otherwise udev can unecessarily execute various rules (and issue
scanning IO, etc) against the thin-pool -- which can never be a
top-level device.
Docker-DCO-1.1-Signed-off-by: Mike Snitzer <snitzer@redhat.com> (github: snitm)