Make sure adapter.removeNetworks executes during task Remove
adapter.removeNetworks was being skipped for cases when
isUnknownContainer(err) was true after adapter.remove was executed
This fix eliminates the nil return case forcing the function
to continue executing unless there is a true error
Fixes https://github.com/moby/moby/issues/39225
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Commit 5cd62852fa added a lock around call to unix.Mount() to
avoid the race in aufs kernel code related to xino file creation
and removal. While this is going to be fixed in the kernel, we still
need to support the current aufs, so some kind of fix is required.
A think a better fix (rather than a lock) is to add a random suffix
to the file name (note it is and was a separate file per mount,
never mind the same file name -- the file is created/opened and
removed instantly, so each mount deals with its own file).
With the suffix added, the possibility to hit the race is extremely
low, and we don't have to do any locking.
Note we don't add any more characters, instead we're replacing
`xino` with four random characters in the 0-9a-z range.
See also: https://sourceforge.net/p/aufs/mailman/message/36674769/
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Currently the API spec would allow `"443/tcp": [null]`, but what should
be allowed is `"443/tcp": null`
Signed-off-by: Dominic Tubach <dominic.tubach@to.com>
Errors were being ignored and always telling the user that the path
doesn't exist even if it was some other problem, such as a permission
error.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This test is dependent on the search results returned by Docker Hub, which
can change at any moment, and causes this test to be unpredictable.
Removing this test instead of trying to catch up with Docker Hub any time
the results change, because it's effectively testing Docker Hub, and not
the daemon.
Unit tests are already in place to test the core functionality of the daemon,
so it should be safe to remove this test.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Trying to start a container that is already running is not an
error condition, so a `304 Not Modified` should be returned instead
of a `409 Conflict`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`chmod` is a legacy syscall, and not present on arm64, which
caused this test to fail.
Add `fchmodat` to the profile so that this test can run both
on x64 and arm64.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Running a bundled aufs benchmark sometimes results in this warning:
> WARN[0001] Couldn't run auplink before unmount /tmp/aufs-tests/aufs/mnt/XXXXX error="exit status 22" storage-driver=aufs
If we take a look at what aulink utility produces on stderr, we'll see:
> auplink:proc_mnt.c:96: /tmp/aufs-tests/aufs/mnt/XXXXX: Invalid argument
and auplink exits with exit code of 22 (EINVAL).
Looking into auplink source code, what happens is it tries to find a
record in /proc/self/mounts corresponding to the mount point (by using
setmntent()/getmntent_r() glibc functions), and it fails.
Some manual testing, as well as runtime testing with lots of printf
added on mount/unmount, as well as calls to check the superblock fs
magic on mount point (as in graphdriver.Mounted(graphdriver.FsMagicAufs, target)
confirmed that this record is in fact there, but sometimes auplink
can't find it. I was also able to reproduce the same error (inability
to find a mount in /proc/self/mounts that should definitely be there)
using a small C program, mocking what `auplink` does:
```c
#include <stdio.h>
#include <err.h>
#include <mntent.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
FILE *fp;
struct mntent m, *p;
char a[4096];
char buf[4096 + 1024];
int found =0, lines = 0;
if (argc != 2) {
fprintf(stderr, "Usage: %s <mountpoint>\n", argv[0]);
exit(1);
}
fp = setmntent("/proc/self/mounts", "r");
if (!fp) {
err(1, "setmntent");
}
setvbuf(fp, a, _IOLBF, sizeof(a));
while ((p = getmntent_r(fp, &m, buf, sizeof(buf)))) {
lines++;
if (!strcmp(p->mnt_dir, argv[1])) {
found++;
}
}
printf("found %d entries for %s (%d lines seen)\n", found, argv[1], lines);
return !found;
}
```
I have also wrote a few other C proggies -- one that reads
/proc/self/mounts directly, one that reads /proc/self/mountinfo instead.
They are also prone to the same occasional error.
It is not perfectly clear why this happens, but so far my best theory
is when a lot of mounts/unmounts happen in parallel with reading
contents of /proc/self/mounts, sometimes the kernel fails to provide
continuity (i.e. it skips some part of file or mixes it up in some
other way). In other words, this is a kernel bug (which is probably
hard to fix unless some other interface to get a mount entry is added).
Now, there is no real fix, and a workaround I was able to come up
with is to retry when we got EINVAL. It usually works on the second
attempt, although I've once seen it took two attempts to go through.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Do not use filepath.Walk() as there's no requirement to recursively
go into every directory under mnt -- a (non-recursive) list of
directories in mnt is sufficient.
With filepath.Walk(), in case some container will fail to unmount,
it'll go through the whole container filesystem which is both
excessive and useless.
This is similar to commit f1a4592297 ("devmapper.shutdown:
optimize")
While at it, raise the priority of "unmount error" message from debug
to a warning. Note we don't have to explicitly add `m` as unmount error (from
pkg/mount) will have it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case there are a big number of layers, so that mount data won't fit
into a single memory page (4096 bytes on most platforms, which is good
enough for about 40 layers, depending on how long graphdriver root path
is), we supply additional layers with O_REMOUNT, as described in aufs
documentation.
Problem is, the current implementation does that one layer at a time
(i.e. there is one mount syscall per each additional layer).
Optimize the code to supply as many layers as we can fit in one page
(basically reusing the same code as for the original mount).
Note, per aufs docs, "[a]t remount-time, the options are interpreted
in the given order, e.g. left to right" so we should be good.
Tested on an image with ~100 layers.
Before (35 syscalls):
> [pid 22756] 1556919088.686955 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", "aufs", 0, "br:/mnt/volume_sfo2_09/docker-au"...) = 0 <0.000504>
> [pid 22756] 1556919088.687643 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", 0xc000c451b0, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.000105>
> [pid 22756] 1556919088.687851 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", 0xc000c451ba, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.000098>
> ..... (~30 lines skipped for clarity)
> [pid 22756] 1556919088.696182 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", 0xc000c45310, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.000266>
After (2 syscalls):
> [pid 24352] 1556919361.799889 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/8e7ba189e347a834e99eea4ed568f95b86cec809c227516afdc7c70286ff9a20", "aufs", 0, "br:/mnt/volume_sfo2_09/docker-au"...) = 0 <0.001717>
> [pid 24352] 1556919361.801761 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/8e7ba189e347a834e99eea4ed568f95b86cec809c227516afdc7c70286ff9a20", 0xc000dbecb0, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.001358>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Apparently there is some kind of race in aufs kernel module code,
which leads to the errors like:
[98221.158606] aufs au_xino_create2:186:dockerd[25801]: aufs.xino create err -17
[98221.162128] aufs au_xino_set:1229:dockerd[25801]: I/O Error, failed creating xino(-17).
[98362.239085] aufs au_xino_create2:186:dockerd[6348]: aufs.xino create err -17
[98362.243860] aufs au_xino_set:1229:dockerd[6348]: I/O Error, failed creating xino(-17).
[98373.775380] aufs au_xino_create:767:dockerd[27435]: open /dev/shm/aufs.xino(-17)
[98389.015640] aufs au_xino_create2:186:dockerd[26753]: aufs.xino create err -17
[98389.018776] aufs au_xino_set:1229:dockerd[26753]: I/O Error, failed creating xino(-17).
[98424.117584] aufs au_xino_create:767:dockerd[27105]: open /dev/shm/aufs.xino(-17)
So, we have to have a lock around mount syscall.
While at it, don't call the whole Unmount() on an error path, as
it leads to bogus error from auplink flush.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Use mount.Unmount() which ignores EINVAL ("not mounted") error,
and provides better error diagnostics (so we don't have to explicitly
add target to error messages).
2. Since we're ignoring "not mounted" error, we can call
multiple unmounts without any locking -- but since "auplink flush"
is still involved and can produce an error in logs, let's keep
the check for fs being mounted (it's just a statfs so should be fast).
2. While at it, improve the "can't unmount" error message in Put().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Both mount and unmount calls are already protected by fine-grained
(per id) locks in Get()/Put() introduced in commit fc1cf1911b
("Add more locking to storage drivers"), so there's no point in
having a global lock in mount/unmount.
The only place from which unmount is called without any locking
is Cleanup() -- this is to be addressed in the next patch.
This reverts commit 824c24e680.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As pointed out by Tonis, there's a race between ReleaseRWLayer()
and GetRWLayer():
```
----- goroutine 1 ----- ----- goroutine 2 -----
ReleaseRWLayer()
m := ls.mounts[l.Name()]
...
m.deleteReference(l)
m.hasReferences()
... GetRWLayer()
... mount := ls.mounts[id]
ls.driver.Remove(m.mountID)
ls.store.RemoveMount(m.name) return mount.getReference()
delete(ls.mounts, m.Name())
----------------------- -----------------------
```
When something like this happens, GetRWLayer will return
an RWLayer without a storage. Oops.
There might be more races like this, and it seems the best
solution is to lock by layer id/name by using pkg/locker.
With this in place, name collision could not happen, so remove
the part of previous commit that protected against it in
CreateRWLayer (temporary nil assigmment and associated rollback).
So, now we have
* layerStore.mountL sync.Mutex to protect layerStore.mount map[]
(against concurrent access);
* mountedLayer's embedded `sync.Mutex` to protect its references map[];
* layerStore.layerL (which I haven't touched);
* per-id locker, to avoid name conflicts and concurrent operations
on the same rw layer.
The whole rig seems to look more readable now (mutexes use is
straightforward, no nested locks).
Reported-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is an additon to commit 1fea38856a ("Remove v1.10 migrator")
aka PR #38265. Since that one, CreateRWLayerByGraphID() is not
used anywhere, so let's drop it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
full diff: 9ff9b57c34...5ac07abef4
brings in:
- docker/libnetwork#2376 Forcing a nil IP specified in PortBindings to IPv4zero (0.0.0.0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Sometimes this test fails (allegedly due to problems with Docker Hub),
but it fails later than it should, for example:
> 01:20:34.845 assertion failed: expression is false: strings.Count(outSearchCmdStars, "[OK]") <= strings.Count(outSearchCmd, "[OK]"): The quantity of images with stars should be less than that of all images: <...>
This, with non-empty list of images following, means that the initial
`docker search busybox` command returned not enough results. So, add
a check that `docker search busybox` returns something.
While at it,
* raise the number of stars to 10;
* simplify check for number of lines (no need to count [OK]'s);
* improve error message.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>