It is easy for one to use docker for a while, shut it down and restart
docker with different set of storage options for device mapper driver
which will effectively change the thin pool. That means any of the
metadata stored in /var/lib/docker/devicemapper/metadata/ is not valid
for the new pool and user will run into various kind of issues like
container not found in the pool etc.
Users think that their images or containers are lost but it might just
be the case of configuration issue. People might use wrong metadata
with wrong pool.
To detect such situations, save UUID of base image and once docker
starts later, query and compare the UUID of base image with the
stored one. If they don't match, fail the initialization with the
error that UUID failed to match.
That way user will be forced to cleanup /var/lib/docker/ directory
and start docker again.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Export image/container metadata stored in graph driver. Right now 3 fields
DeviceId, DeviceSize and DeviceName are being exported from devicemapper.
Other graph drivers can export fields as they see fit.
This data can be used to mount the thin device outside of docker and tools
can look into image/container and do some kind of inspection.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Right now devicemapper mounts thin device using online discards by default
and passes mount option "discard". Generally people discourage usage of
online discards as they can be a drain on performance. Instead it is
recommended to use fstrim once in a while to reclaim the space.
In case of containers, we recommend to keep data volumes separate. So
there might not be lot of rm, unlink operations going on and there might
not be lot of space being freed by containers. So it might not matter
much if we don't reclaim that free space in pool.
User can still pass mount option explicitly using dm.mountopt=discard to
enable discards if they would like to.
So this is more like setting the containers by default for better performance
instead of better space efficiency in pool. And user can change the behavior
if they don't like default behavior.
Reported-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
If device is being reactivated before it could go away and deferred
deactivation is scheduled on it, cancel it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This will help with debugging as one could just do "docker info" and figure
out of deferred removal is enabled or not.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Provide a new command line knob dm.deferred_device_removal which will enable
deferred device deactivation if driver and library support it.
This patch also checks for library support and driver version.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This provides an override for forcing the daemon to still attempt
running the devicemapper driver even when udev sync is not supported.
Intended to be a very clear impairment for those choosing to use it. If
udev sync is false, there will still be an error in the daemon logs,
even when the override is in place. The docs have an explicit WARNING.
Including link to the docs for users that encounter this daemon error
during an upgrade.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Right now we try device removal at the interval of 10ms and keep on trying
till either device is removed or 10 seconds are over. That means if device
is busy, we will try 1000 times in those 10 seconds.
Sounds too high a frequency of deivce removal retrial. All the logs are
filled easily. I think it is a good idea to slow down a bit and retry at
the interval of 100ms instead of 10ms.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
During device removal, we are first waiting for device to close() in a tight
loop for 10 seconds. I am not sure why do we need it. First of all we come
here once the umount() is successful so device should be free. For some reason
of device is temporarily busy, then removeDevice() logic retries device removal
logic in a loop for 10 seconds and that should cover it. Can't see why one
more 10 seoncds loop is required before attempting device removal.
One loop should be able to cover all the temporary device busy conditions and
if condition is not temporary then 10 seconds loop is not going to help anyway.
So instead of two loops of 10 seconds each, I am converting it to a single
loop of 20 seconds. May be 10 second loop is good enough but for now I am
keeping it 20 seconds to avoid any regressions.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Currently in device removal path (device deactivation), we wait
for 10 seconds for devive to actually go away. waitRemove().
In current code this is not required. If dm removal task has completed
and one has done the wait on udev cookie, then device is gone and there
is no need to write another loop to wait for device removal.
This patch removes the waitRemove() which waits for 10 seconds after
device removal. This seems unnecessary.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
devmapper graph driver retries device removal 1000 times in case of failure
and if this fills up console with 1000 messages (when daemon is running in
debug mode). So remove these debug messages.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
There are issues with libdm logging. Right now if docker daemon is run
in debug mode, logging by libdm is too verbose. And if a device can't
be removed, thousands of messages fill the console and one can not see
what's going on.
This patch removes devicemapper.LogInitVerbose() call as that call will
only work if docker was not registering its own log handler with libdm.
For some reason docker registers one with libdm and libdm hands over
all the messages to docker (including debug ones). And now it is up to
devmapper backend to figure out which ones should go to console and
which ones should not.
So by default log only fatal messages from libdm. One can easily modify
the code to change it for debugging purposes.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
when initializing the devmapper driver, attempt to sync udev and device
mapper. If udev sync is not supported, print a warning. Eventually we'll
likely bail here to avoid unpredictable behavior for users.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Presenly the "Data file:" shows either the loopback _file_ or the block device.
With this, the "Data file:" will always show the device, and if it is a
loopback, then there will additionally be a "Data loop file:".
(Same for "Metadata file:")
Signed-off-by: Vincent Batts <vbatts@redhat.com>
syscall.Unmount failed sometimes when user interrupted exporting,
for example a Ctrl-C, or pipe to commands which closed the pipe early,
like "docker export <container_name> | file -"; this syscall.Unmount
could sometimes return EBUSY and didn't actually umount the filesystem;
which would cause a following export command fail to mount;
change to lazy Unmount with MNT_DETACH can fix the problem, this is
the same behavior as in Shutdown;
```text
time="2015-01-03T21:27:26Z" level=error msg="Warning: error unmounting device
34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e: device or resource busy"
```
```
$ docker export thirsty_ardinghelli | file -
/dev/stdin: POSIX tar archive
time="2015-01-03T21:58:17Z" level=fatal msg="write /dev/stdout: broken pipe"
$ docker export thirsty_ardinghelli
time="2015-01-03T21:54:33Z" level=fatal msg="Error: thirsty_ardinghelli: Error getting container
34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e from driver devicemapper:
Error mounting '/dev/mapper/docker-253:0-3148372-34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e'
on '/var/lib/docker/devicemapper/mnt/34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e': device or resource busy"
```
Signed-off-by: Derek Che <drc@yahoo-inc.com>
Use transaction logic during device deletion and do rollback if transaction
is not complete. Following is the sequence of events.
- Open transaction and save to metafile
- Delete device from pool
- Delete device metadata file from disk
- Close Transaction
If docker crashes without closing transaction then rollback will take
place upon next docker start.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Finally this patch uses the notion of transaction for device or snapshot
device creation.
Following is sequence of event.
- Open a trasaction and save details in a file.
- Create a new device/snapshot device
- If a new device id is used, refresh transaction with new device id details.
- Create device metadata file
- Close transaction.
If docker crashes anywhere in between without closing transaction, then
upon next start, docker will figure out that there was a pending transaction
and it will roll back transaction. That is it will do following.
- Delete Device from pool
- Delete device metadata file
- Remove transaction file to mark no transaction is pending.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Finally, we seem to have all the bits to keep track of all used device
Ids and find a free device Id to use when creating a new device. Start
using it.
Ideally we should completely move away from retry logic when pool returns
-EEXISTS. For now I have retained that logic and I simply output a warning.
When things are stable, we should be able to get rid of it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Open code createDevice() and createSnapDevice() and move all the logic
in the caller.
This is a sheer code reorganization so that all device Id allocation
logic is in one function. That way in case of erros, one can easily
cleanup and mark device Id free again. (Later patches benefit from
it).
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Right now we are accessing devices.NextDeviceId directly and also
incrementing it at various places.
Instead provide a helper function which is responsile for
incrementing NextDeviceId and return next deviceId.
This is just code structuring. This will help later once we
convert this function to find a free device Id and it goes
through a bitmap of used/free device Ids.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
When docker starts, build a used/free Device Id map from the per
device meta files we already have. These meta files have the data
which device Ids are in use. Parse these files and mark device as
used in the map.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Currently devicemapper backend does not keep track of used device Ids in
the pool. It tries a device Id and if that device Id exists in pool, it
tries with a different Id and keeps on doing this in a loop till it succeeds.
This worked fine so far but now we are moving to transaction based
device creation and deletion. We will keep deviceId information in
transaction which will be rolled back if docker crashed before transaction
was complete.
If we store a deviceId in transaction and later figure out it already
existed in pool and docker crashed, then we will rollback and remove
that existing device Id from pool (which we should not have).
That means, we should know free device Id in pool in advance before
we put that device Id in transaction.
Hence this patch creates a bitmap (one bit each for a deviceId), and
sets the bit if device Id is used otherwise resets it. This patch
is just preparing the ground right now. Actual usage will follow
in later patches.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Right now setupBaseImage() uses deleteDevice() to delete uninitialized
base image while rest of the code uses DeleteDevice(). Change it and
use a common function everywhere for the sake of uniformity.
I can't see what harm can be done by doing little extra locking done
by DeleteDevice().
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Very soon we will have the notion of an open transaction and keep its
details in a metafile.
When a new transaction is opened, we allocate a new transaction Id,
do the device creation/deletion and then we will close the transaction.
I thought that OpenTransactionId better represents the semantics of
transaction Id associated with an open transaction instead of NewtransactionId.
This patch just does the renaming. No functionality change.
I have also introduced a structure "Transaction" which will keep all
the details associated with a transaction. Later patches will add more
fields in this structure.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Currently new transaction Id is created using allocateTransactionId()
function. This function takes NewTransactionId and bumps up by one
to create NewTransactionId.
I think ideally we should be bumping up devices.TransactionId by 1
to come up with NewTransactionId. Because idea is that devices.TransactionId
contains the current pool transaction Id and to come up with a new
transaction Id bump it up by one.
Current code is not wrong as we are keeping NewTransactionId and
TransactionId in sync. But it will be more direct if we look at
devices.TransactionId to come up with NewTransactionId. That way
we don't have to even initialize NewTransactionId during startup
as first time somebody wants to do a transaction, it will be
allocated fresh.
So simplify the code a bit. No functionality change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>