`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.
Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
RHEL/CentOS 3.10 kernels report that kernel-memory accounting is supported,
but it actually does not work.
Runc (when compiled for those kernels) will be compiled without kernel-memory
support, so even though the daemon may be reporting that it's supported,
it actually is not.
This cause tests to fail when testing against a daemon that's using a runc
version without kmem support.
For now, skip these tests based on the kernel version reported by the daemon.
This should fix failures such as:
```
FAIL: /go/src/github.com/docker/docker/integration-cli/docker_cli_run_unix_test.go:499: DockerSuite.TestRunWithKernelMemory
assertion failed:
Command: /usr/bin/docker run --kernel-memory 50M --name test1 busybox cat /sys/fs/cgroup/memory/memory.kmem.limit_in_bytes
ExitCode: 0
Error: <nil>
Stdout: 9223372036854771712
Stderr: WARNING: You specified a kernel memory limit on a kernel older than 4.0. Kernel memory limits are experimental on older kernels, it won't work as expected and can cause your system to be unstable.
Failures:
Expected stdout to contain "52428800"
FAIL: /go/src/github.com/docker/docker/integration-cli/docker_cli_update_unix_test.go:125: DockerSuite.TestUpdateKernelMemory
/go/src/github.com/docker/docker/integration-cli/docker_cli_update_unix_test.go:136:
...open /go/src/github.com/docker/docker/integration-cli/docker_cli_update_unix_test.go: no such file or directory
... obtained string = "9223372036854771712"
... expected string = "104857600"
----------------------------------------------------------------------
FAIL: /go/src/github.com/docker/docker/integration-cli/docker_cli_update_unix_test.go:139: DockerSuite.TestUpdateKernelMemoryUninitialized
/go/src/github.com/docker/docker/integration-cli/docker_cli_update_unix_test.go:149:
...open /go/src/github.com/docker/docker/integration-cli/docker_cli_update_unix_test.go: no such file or directory
... value = nil
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
* Replaces `cocks` with `cerf` as the former might be perceived as
offensive by some people (as pointed out by @jeking3
[here](https://github.com/moby/moby/pull/37157#commitcomment-31758059))
* Removes a duplicate entry for `burnell`
* Re-arranges the entry for `sutherland` to ensure that the names are in
sorted order
* Adds entries for `shamir` and `wilbur`
Signed-off-by: Debayan De <debayande@users.noreply.github.com>
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.
Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:
> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted
or
> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted
Before this patch, it was just
> operation not permitted
[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It has been pointed out that we're ignoring EINVAL from umount(2)
everywhere, so let's move it to a lower-level function. Also, its
implementation should be the same for any UNIX incarnation, so
let's consolidate it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
syscall.Stat (and Lstat), unlike functions from os pkg,
return "raw" errors (like EPERM or EINVAL), and those are
propagated up the function call stack unchanged, and gets
logged and/or returned to the user as is.
Wrap those into os.PathError{} so the error message will
at least have function name and file name.
Note we use Capitalized function names to distinguish
between functions in os and ours.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Ubuntu kernel supports overlayfs in user namespaces.
However, Docker had previously crafting overlay opaques directly
using mknod(2) and setxattr(2), which are not supported in userns.
Tested with LXD, Ubuntu 18.04, kernel 4.15.0-36-generic #39-Ubuntu.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This fix tries to address the issue raised in 37038 where
there were no memory.kernelTCP support for linux.
This fix add MemoryKernelTCP to HostConfig, and pass
the config to runtime-spec.
Additional test case has been added.
This fix fixes 37038.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
If fixes an error in sameFsTime which was using `==` to compare two times. The correct way is to use go's built-in timea.Equals(timeb).
In changes_windows, it uses sameFsTime to compare mTim of a `system.StatT` to allow TestChangesDirsMutated to operate correctly now.
Note there is slight different between the Linux and Windows implementations of detecting changes. Due to https://github.com/moby/moby/issues/9874,
and the fix at https://github.com/moby/moby/pull/11422, Linux does not consider a change to the directory time as a change. Windows on NTFS
does. See https://github.com/moby/moby/pull/37982 for more information. The result in `TestChangesDirsMutated`, `dir3` is NOT considered a change
in Linux, but IS considered a change on Windows. The test mutates dir3 to have a mtime of +1 second.
With a handful of tests still outstanding, this change ports most of the unit tests under pkg/archive to Windows.
It provides an implementation of `copyDir` in tests for Windows. To make a copy similar to Linux's `cp -a` while preserving timestamps
and links to both valid and invalid targets, xcopy isn't sufficient. So I used robocopy, but had to circumvent certain exit codes that
robocopy exits with which are warnings. Link to article describing this is in the code.
This function ensures the argument is the mount point
(i.e. if it's not, it bind mounts it to itself).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. There is no need to specify rw argument -- bind mounts are
read-write by default.
2. There is no point in parsing /proc/self/mountinfo after performing
a mount, especially if we don't check whether the fs is mounted or
not -- the only outcome from it could be an error from our mountinfo
parser, which makes no sense in this context.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using a value such as `--cpuset-mems=1-9223372036854775807` would cause
`dockerd` to run out of memory allocating a map of the values in the
validation code. Set limits to the normal limit of the number of CPUs,
and improve the error handling.
Reported by Huawei PSIRT.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This should eliminate a bunch of new (go-1.11 related) validation
errors telling that the code is not formatted with `gofmt -s`.
No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).
Patch generated with:
> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The code in Close() that removes the watches was not working,
because it first sets `w.closed = true` and then calls w.close(),
which starts with
```
if w.closed {
return errPollerClosed
}
```
Fix by setting w.closed only after calling w.remove() for all the
files being watched.
While at it, remove the duplicated `delete(w.watches, name)` code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There is no need to wait for up to 200ms in order to close
the file descriptor once the chClose is received.
This commit might reduce the chances for occasional "The process
cannot access the file because it is being used by another process"
error on Windows, where an opened file can't be removed.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
* Add cool, crazy, charming, magical and sweet as a adjectives (Aug 18)
* Add four male scientists to the list - faraday, maxwell, sutherland, and moore (Aug 21)
* Add four female scientists to the list - cannon, moser and rhodes (Aug 28)
Signed-off-by: Yadnyawalkya Tale <yadnyawalkyatale@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Since go-1.11beta1 archive/tar, tar headers with Typeflag == TypeRegA
(numeric 0) (which is the default unless explicitly initialized) are
modified to have Typeflag set to either tar.TypeReg (character value
'0', not numeric 0) or tar.TypeDir (character value '5') [1].
This results in different Typeflag value in the resulting header,
leading to a different Checksum, and causing the following test
case errors:
> 12:09:14 --- FAIL: TestTarSums (0.05s)
> 12:09:14 tarsum_test.go:393: expecting
> [tarsum+sha256:8bf12d7e67c51ee2e8306cba569398b1b9f419969521a12ffb9d8875e8836738],
> but got
> [tarsum+sha256:75258b2c5dcd9adfe24ce71eeca5fc5019c7e669912f15703ede92b1a60cb11f]
> ... (etc.)
All the other code explicitly sets the Typeflag field, but this test
case is not, causing the incompatibility with Go 1.11. Therefore,
the fix is to set TypeReg explicitly, and change the expected checksums
in test cases).
Alternatively, we can vendor archive/tar again (for the 100th time),
but given that the issue is limited to the particular test case it
does not make sense.
This fixes the test for all Go versions.
[1] https://go-review.googlesource.com/c/go/+/85656
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
Addresses https://github.com/moby/moby/pull/35089#issuecomment-367802698.
This change enables the daemon to automatically select an image under LCOW
that can be used if the API doesn't specify an explicit platform.
For example:
FROM supertest2014/nyan
ADD Dockerfile /
And docker build . will download the linux image (not a multi-manifest image)
And similarly docker pull ubuntu will match linux/amd64
This makes it a bit simpler to remove this interface for v2 plugins
and not break external projects (libnetwork and swarmkit).
Note that before we remove the `Client()` interface from `CompatPlugin`
libnetwork and swarmkit must be updated to explicitly check for the v1
client interface as is done int his PR.
This is just a minor tweak that I realized is needed after trying to
implement the needed changes on libnetwork.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Sometimes docker-master CI fails on rhel4+selinux configuration,
like this:
--- FAIL: TestMount (0.12s)
--- FAIL: TestMount/none-remount,size=128k (0.01s)
mounter_linux_test.go:209: unexpected mount option "seclabel" expected "rw,size=128k"
--- FAIL: TestMount/none-remount,ro,size=128k (0.01s)
mounter_linux_test.go:209: unexpected mount option "seclabel" expected "ro,size=128k"
Earlier, commit 8bebd42df2 (PR #34965) fixed this failure,
but not entirely (i.e. the test is now flaky). It looks like
either selinux detection code is not always working (it won't
work in d-in-d), or the kernel might or might not add 'seclabel'
option).
As the subject of this test case is definitely not selinux,
it can just ignore the option added by it.
While at it, fix error messages:
- add missing commas;
- fix a typo;
- allow for clear distinction between mount
and vfs (per-superblock) options.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In pkg/term/proxy.go and pkg/term/proxy_test.go, check if escapeKeys is empty and if it is, return the one key read
Signed-off-by: Patrik Cyvoct <patrik@ptrk.io>
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".
Many vendored packages still use x/net/context, so vendor entry remains
for now.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It should check `os.Geteuid` with `uid` instead of `os.Getegid`.
On the container (where the tests run), the uid and gid seems to be
the same, thus this doesn't fail.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
govet complains (when using standard "context" package):
> the cancel function returned by context.WithTimeout should be called,
> not discarded, to avoid a context leak (vet)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Now, every Unmount() call takes a burden to parse the whole nine yards
of /proc/self/mountinfo to figure out whether the given mount point is
mounted or not (and returns an error in case parsing fails somehow).
Instead, let's just call umount() and ignore EINVAL, which results
in the same behavior, but much better performance.
Note that EINVAL is returned from umount(2) not only in the case when
`target` is not mounted, but also for invalid flags. As the flags are
hardcoded here, it can't be the case.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The mountinfo parser implemented via `fmt.Sscanf()` is slower than the one
using `strings.Split()` and `strconv.Atoi()`. This rewrite helps to speed it
up to a factor of 8x, here is a result from go bench:
> BenchmarkParsingScanf-4 300 22294112 ns/op
> BenchmarkParsingSplit-4 3000 2780703 ns/op
I tried other approaches, such as using `fmt.Sscanf()` for the first
three (integer) fields and `strings.Split()` for the rest, but it slows
things down considerably:
> BenchmarkParsingMixed-4 1000 8827058 ns/op
Note the old code uses `fmt.Sscanf`, when a linear search for '-' field,
when a split for the last 3 fields. The new code relies on a single
split.
I have also added more comments to aid in future development.
Finally, the test data is fixed to now have white space before the first field.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The flow of getSourceMount was:
1 get all entries from /proc/self/mountinfo
2 do a linear search for the `source` directory
3 if found, return its data
4 get the parent directory of `source`, goto 2
The repeated linear search through the whole mountinfo (which can have
thousands of records) is inefficient. Instead, let's just
1 collect all the relevant records (only those mount points
that can be a parent of `source`)
2 find the record with the longest mountpath, return its data
This was tested manually with something like
```go
func TestGetSourceMount(t *testing.T) {
mnt, flags, err := getSourceMount("/sys/devices/msr/")
assert.NoError(t, err)
t.Logf("mnt: %v, flags: %v", mnt, flags)
}
```
...but it relies on having a specific mount points on the system
being used for testing.
[v2: add unit tests for ParentsFilter]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is only interested only one or a few entries, not all of them.
One good example is `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()` which is only
interested in mount under a specific path.
This commit adds `filter` argument to `GetMounts()` to implement
two things:
1. filter out entries a caller is not interested in
2. stop processing if a caller is found what it wanted
`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.
A few filters are implemented:
- `PrefixFilter`: filters out all entries not under `prefix`
- `SingleEntryFilter`: looks for a specific entry
Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` is using `PrefixFilter()`.
Unit tests are added to check filters are working.
[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Prevent changing the tar output by setting the format to
PAX and keeping the times truncated.
Without this change the archiver will produce different tar
archives with different hashes with go 1.10.
The addition of the access and changetime timestamps would
also cause diff comparisons to fail.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove invalid flush commands, flush should only occur when file
has been completely written. This is already handle, remove these calls.
Ensure data gets written after EOF in correct order and before close.
Remove gname and uname from sum for hash compatibility.
Update tarsum tests for gname/uname removal.
Return valid length after eof.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When the authz response buffer limit is hit, perform a flush.
This prevents excessive buffer sizes, especially on large responses
(e.g. `/containers/<id>/archive` or `/containers/<id>/export`).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Sorting by mount point length can be implemented in a more
straightforward fashion since Go 1.8 introduced sort.Slice()
with an ability to provide a less() function in place.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This makes `go test .` to pass if run as non-root user, skipping
those tests that require superuser privileges (for `mount`).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
dm_task_deferred_remove is not supported by all distributions, due to
out-dated versions of devicemapper. However, in the case where the
devicemapper library was updated without rebuilding Docker (which can
happen in some distributions) then we should attempt to dynamically load
the relevant object rather than try to link to it.
This can only be done if Docker was built dynamically, for obvious
reasons.
In order to avoid having issues arise when dlsym(3) was unnecessary,
gate the whole dlsym(3) logic behind a buildflag that we disable by
default (libdm_dlsym_deferred_remove).
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Before this change, volume management was relying on the fact that
everything the plugin mounts is visible on the host within the plugin's
rootfs. In practice this caused some issues with mount leaks, so we
changed the behavior such that mounts are not visible on the plugin's
rootfs, but available outside of it, which breaks volume management.
To fix the issue, allow the plugin to scope the path correctly rather
than assuming that everything is visible in `p.Rootfs`.
In practice this is just scoping the `PropagatedMount` paths to the
correct host path.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The plugin spec says that plugins can live in one of:
- /var/run/docker/plugins/<name>.sock
- /var/run/docker/plugins/<name>/<name>.sock
- /etc/docker/plugins/<name>.[json,spec]
- /etc/docker/plugins/<name>/<name>.<json,spec>
- /usr/lib/docker/plugins/<name>.<json,spec>
- /usr/lib/docker/plugins/<name>/<name>.<json,spec>
However, the plugin scanner which is used by the volume list API was
doing `filepath.Walk`, which will walk the entire tree for each of the
supported paths.
This means that even v2 plugins in
`/var/run/docker/plugins/<id>/<name>.sock` were being detected as a v1
plugin.
When the v1 plugin loader tried to load such a plugin it would log an
error that it couldn't find it because it doesn't match one of the
supported patterns... e.g. when in a subdir, the subdir name must match
the plugin name for the socket.
There is no behavior change as the error is only on the `Scan()` call,
which is passing names to the plugin registry when someone calls the
volume list API.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
The re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
The Golang built-in gzip library is serialized, and fairly slow
at decompressing. It also only decompresses on demand, versus
pipelining decompression.
This change switches to using the pigz external command
for gzip decompression, as opposed to using the built-in
golang one. This code is not vendored, but will be used
if it autodetected as part of the OS.
This also switches to using context, versus a manually
managed channel to manage cancellations, and synchronization.
There is a little bit of weirdness around manually having
to cancel in the error cases.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
When a recursive unmount fails, don't bother parsing the mount table to check
if what we expected to be a mountpoint is still mounted. `EINVAL` is
returned when you try to unmount something that is not a mountpoint, the
other cases of `EINVAL` would not apply here unless everything is just
wrong. Parsing the mount table over and over is relatively expensive,
especially in the code path that it's in.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
if thin device is deteled and the metadata exists, you can not
delete related containers. This patch ignore Nodata errors for
thin device deletion
Signed-off-by: Liu Hua <sdu.liu@huawei.com>
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there"
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.
Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.
NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).
For more details, a quote from my runc commit 6f82d4b (July 2015):
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Standard golang's `os.MkdirAll()` function returns "not a directory" error
in case a directory to be created already exists but is not a directory
(e.g. a file). Our own `idtools.MkdirAs*()` functions do not replicate
the behavior.
This is a bug since all `Mkdir()`-like functions are expected to ensure
the required directory exists and is indeed a directory, and return an
error otherwise.
As the code is using our in-house `system.Stat()` call returning a type
which is incompatible with that of golang's `os.Stat()`, I had to amend
the `system` package with `IsDir()`.
A test case is also provided.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Update golang.org/x/sys to 95c6576299259db960f6c5b9b69ea52422860fce in
order to get the unix.Utsname with byte array instead of int8/uint8
members.
This allows to use simple byte slice to string conversions instead of
using charsToString or its open-coded version.
Also see golang/go#20753 for details.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
To ensure that namesgenerator binary outputs random name
by initializing Seed.
Signed-off-by: Mizuki Urushida <z11111001011@gmail.com>
not use init function.
Signed-off-by: Mizuki Urushida <z11111001011@gmail.com>
In some circumstances we were not properly releasing plugin references,
leading to failures in removing a plugin with no way to recover other
than restarting the daemon.
1. If volume create fails (in the driver)
2. If a driver validation fails (should be rare)
3. If trying to get a plugin that does not match the passed in capability
Ideally the test for 1 and 2 would just be a unit test, however the
plugin interfaces are too complicated as `plugingetter` relies on
github.com/pkg/plugin/Client (a concrete type), which will require
spinning up services from within the unit test... it just wouldn't be a
unit test at this point.
I attempted to refactor this a bit, but since both libnetwork and
swarmkit are reliant on `plugingetter` as well, this would not work.
This really requires a re-write of the lower-level plugin management to
decouple these pieces.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In some cases (e.g. NFS), a chown may technically be a no-op but still
return `EPERM`, so only call `chown` when neccessary.
This is particularly problematic for docker users bind-mounting an NFS
share into a container.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Update golang.org/x/sys to 8dbc5d05d6edcc104950cc299a1ce6641235bc86 in
order to get the Major, Minor and Mkdev functions for every unix-like
OS. Use them instead of the locally defined versions which currently use
the Linux specific device major/minor encoding.
This means that the device number should now be properly encoded on e.g.
Darwin, FreeBSD or Solaris.
Also, the SIGUNUSED constant was removed from golang.org/x/sys/unix in
https://go-review.googlesource.com/61771 as it is also removed from the
respective glibc headers.
Remove it from signal.SignalMap as well after the golang.org/x/sys
re-vendoring.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
With `rprivate` there exists a race where a reference to a mount has
propagated to the new namespace, when `rprivate` is set the parent
namespace is not able to remove the mount due to that reference.
With `rslave` unmounts will propagate correctly into the namespace and
prevent the sort of transient errors that are possible with `rprivate`.
This is a similar fix to 117c92745b
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
The missing console mode constants were added to go-ansiterm in
Azure/go-ansiterm#23. Use these constants instead of defining them
locally.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
The promise package represents a simple enough concurrency pattern that
replicating it in place is sufficient. To end the propagation of this
package, it has been removed and the uses have been inlined.
While this code could likely be refactored to be simpler without the
package, the changes have been minimized to reduce the possibility of
defects. Someone else may want to do further refactoring to remove
closures and reduce the number of goroutines in use.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
If mount fails, the reason might be right there in the kernel log ring buffer.
Let's include it in the error message, it might be of great help.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Static build with devmapper is impossible now since libudev is required
and no static version of libudev is available (as static libraries are
not supported by systemd which udev is part of).
This should not hurt anyone as "[t]he primary user of static builds
is the Editions, and docker in docker via the containers, and none
of those use device mapper".
Also, since the need for static libdevmapper is gone, there is no need
to self-compile libdevmapper -- let's use the one from Debian Stretch.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Make sure to call C.free on C string allocated using C.CString in every
exit path.
C.CString allocates memory in the C heap using malloc. It is the callers
responsibility to free them. See
https://golang.org/cmd/cgo/#hdr-Go_references_to_C for details.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Use CreateEvent, OpenEvent (which both map to the respective *EventW
function) and PulseEvent from golang.org/x/sys instead of local copies.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Douglas Curtis <dougcurtis1@gmail.com>
Commenting out tests for now
Signed-off-by: Doug Curtis <dougcurtis1@gmail.com>
Added unit test for CopyInfoDestionationPath.
Signed-off-by: Doug Curtis <dougcurtis1@gmail.com>
Removing integration-cli test case additions
Signed-off-by: Doug Curtis <dougcurtis1@gmail.com>
Removing extra spaces between archive_unix_test.go test cases
Signed-off-by: Doug Curtis <dougcurtis1@gmail.com>
Fixed gofmt issues in archive_unix_test.go
Signed-off-by: Doug Curtis <dougcurtis1@gmail.com>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- Remove unused function and variables from the package
- Remove usage of it from `profiles/apparmor` where it wasn't required
- Move the package to `daemon/logger/templates` where it's only used
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
The BSD and Solaris versions of term.MakeRaw already set VMIN and VTIME
explicitly such that a read returns when one character is available.
cfmakeraw (which was previously used) in glibc also sets these values
explicitly, so it should be done in the Linux version of MakeRaw as well
to be consistent.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Let's use latest lvm2 sources to compile the libdevmapper library.
Initial reason for compiling devmapper lib from sources was a need to
have the static version of the library at hand, in order to build
the static dockerd, but note that the same headers/solib are used
for dynamic build (dynbinary) as well.
The reason for this patch is to enable the deferral removal feature.
The supplied devmapper library (and headers) are too old, lacking the
needed functions, so the daemon is built with 'libdm_no_deferred_remove'
build tag (see the check in hack/make.sh). Because of this, even if the
kernel dm driver is perfectly able to support the feature, it can not
be used. For more details and background story, see [1].
Surely, one can't just change the version number. While at it:
- improve the comments;
- remove obsoleted URLs;
- remove s390 and ppc configure updates that are no longer needed;
- use pkg-config instead of hardcoding the flags (newer lib added
some more dependencies);
[1] https://github.com/moby/moby/issues/34298
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. devmapper_wrapper_{,no_}deferred_remove.go:
Comments about LibraryDeferredRemovalSupport were very totally
misleading to me. This thing has nothing to do with either static
or dynamic linking (but with build tags). Fix the comment accordingly.
2. devmapper.go:
Reveal the source of those magic device* constants.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Switch some more usage of the Stat function and the Stat_t type from the
syscall package to golang.org/x/sys. Those were missing in PR #33399.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
I am getting the following warning from gcc when compiling the daemon:
> # github.com/docker/docker/pkg/devicemapper
> pkg/devicemapper/devmapper_wrapper.go: In function ‘log_cb’:
> pkg/devicemapper/devmapper_wrapper.go:20:2: warning: ignoring return
> value of ‘vasprintf’, declared with attribute warn_unused_result
> [-Wunused-result]
> vasprintf(&buffer, f, ap);
> ^
vasprintf(3) man page says if the function returns -1, the buffer is
undefined, so we should not use it. In practice, I assume, this never
happens so we just return.
Introduced by https://github.com/moby/moby/pull/33845 that resulted in
commit 63328c6 ("devicemapper: remove 256 character limit of libdm logs")
Cc: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Also remove the test flag from pkg/term and jsut checkuid directly.
Fixed a problem with a pkg/term test that was leaving the terminal in a bad
state.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Use IoctlGetInt/IoctlSetInt from golang.org/x/sys/unix (where
applicable) instead of manually reimplementing them.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use IoctlGetTermios/IoctlSetTermios from golang.org/x/sys/unix instead
of manually reimplementing them.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Use unix.Prctl() instead of manually reimplementing it using
unix.RawSyscall. Also use unix.SECCOMP_MODE_FILTER instead of locally
defining it.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
With docker-17.06.0 some images pulled do not extract properly. Some files don't appear in correct directories. This may or may not cause the pull to fail. These images can't be pushed or saved. 17.06 is the first version of Docker built with go1.8.
Cause
There are multiple updates to the tar package in go1.8.
https://go-review.googlesource.com/c/32234/ disables using "prefix" field when new tar archives are being written. Prefix field was previously set when a record in the archive used a path longer than 100 bytes.
Another change https://go-review.googlesource.com/c/31444/ makes the reader ignore the "prefix" field value if the record is in GNU format. GNU format defines that same area should be used for access and modified times. If the "prefix" field is not read, a file will only be extracted by the basename.
The problem is that with a previous version of the golang archive package headers could be written, that use the prefix field while at the same time setting the header format to GNU. This happens when numeric fields are big enough that they can not be written as octal strings and need to be written in binary. Usually, this shouldn't happen: uid, gid, devmajor, devminor can use up to 7 bytes, size and timestamp can use 11. If one of the records does overflow it switches the whole writer to GNU mode and all next files will be saved in GNU format.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Currently, names are maintained by a separate system called "registrar".
This means there is no way to atomically snapshot the state of
containers and the names associated with them.
We can add this atomicity and simplify the code by storing name
associations in the memdb. This removes the need for pkg/registrar, and
makes snapshots a lot less expensive because they no longer need to copy
all the names. This change also avoids some problematic behavior from
pkg/registrar where it returns slices which may be modified later on.
Note that while this change makes the *snapshotting* atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
The case where we are trying to do a remount with changed filesystem specific options was missing,
we need to call `mount` as well here to change those options.
See #33844 for where we need this, as we change `tmpfs` options.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Use the symlink xattr syscall wrappers Lgetxattr and Lsetxattr from
x/sys/unix (introduced in golang/sys@b90f89a) instead of providing own
wrappers. Leave the functionality of system.Lgetxattr intact with
respect to the retry with a larger buffer, but switch it to use
unix.Lgetxattr. Also leave system.Lsetxattr intact (even though it's
just a wrapper around the corresponding function from unix) in order to
keep moby building for !linux.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Due to the CL https://go-review.googlesource.com/c/39608/ in
x/sys/windows which changed the definitions of STD_INPUT_HANDLE,
STD_OUTPUT_HANDLE and STD_ERROR_HANDLE, we get the following failure
after re-vendoring x/sys/windows:
07:47:01 # github.com/docker/docker/pkg/term
07:47:01 pkg/term/term_windows.go:82: constant 4294967286 overflows int
07:47:01 pkg/term/term_windows.go:88: constant 4294967285 overflows int
07:47:01 pkg/term/term_windows.go:94: constant 4294967284 overflows int
07:47:12 Build step 'Execute shell' marked build as failure
Temporarily switch back pkg/term to use these constants from the syscall
package and add a comment about it.
To really fix this, go-ansiterm should probably be switched to use
x/sys/windows.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
libcontainer/user does not build at all on Windows any more, and
this was breaking the client on Windows with upstream `runc`. As
these functions are not used anyway, just split out and stop
checking `runtime`.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Enables other subsystems to watch actions for a plugin(s).
This will be used specifically for implementing plugins on swarm where a
swarm controller needs to watch the state of a plugin.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
docker run --name=test ubuntu /bin/sh -c "cd /tmp && echo hi > a && ln a b" && docker cp test:/tmp tmp_
test
link /root/tmp/a /root/tmp_/b: no such file or directory
Signed-off-by: yangshukui <yangshukui@huawei.com>
Go 1.9 (golang/go@66b5a2f) removed file type bits from
archive/tar.FileInfoHeader().
This commit ensures file type bits are filled even on Go 1.9 for
compatibility.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
LogInit used to act as a manual way of registering the *necessary*
pkg/devicemapper logging callbacks. In addition, it was used to split up
the logic of pkg/devicemapper into daemon/graphdriver/devmapper (such
that some things were logged from libdm).
The manual aspect of this API was completely non-sensical and was just
begging for incorrect usage of pkg/devicemapper, so remove that semantic
and always register our own libdm callbacks.
In addition, recombine the split out logging callbacks into
pkg/devicemapper so that the default logger is local to the library and
also shown to be the recommended logger. This makes the code
substantially easier to read. Also the new DefaultLogger now has
configurable upper-bound for the log level, which allows for dynamically
changing the logging level.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
e07d3cd9a ("devmapper: Fix libdm logging") removed all of the callers of
DmLogInitVerbose, but we still kept around the wrapper. However, the
libdm dm_log_init_verbose API changes the verbosity of the *default*
libdm logger. Because pkg/devicemapper internally *relies* on using
logging callbacks to understand what errors were encountered by libdm,
this wrapper is useless (it only makes sense for the default logger
which we do not user).
Any user not inside Docker of this function almost certainly was not
using this API correctly, because pkg/devicemapper will misbehave if our
logging callbacks were not registered.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This limit is unecessary and can lead to the truncation of long libdm
logs (which is quite annoying).
Fixes: b440ec013 ("device-mapper: Move all devicemapper spew to log through utils.Debugf().")
Signed-off-by: Aleksa Sarai <asarai@suse.de>
There have been some cases where umount, a device can be busy for a very
short duration. Maybe its udev rules, or maybe it is runc related races
or probably it is something else. We don't know yet.
If deferred removal is enabled but deferred deletion is not, then for the
case of "docker run -ti --rm fedora bash", a container will exit, device
will be deferred removed and then immediately a call will come to delete
the device. It is possible that deletion will fail if device was busy
at that time.
A device can't be deleted if it can't be removed/deactivated first. There
is only one exception and that is when deferred deletion is on. In that
case graph driver will keep track of deleted device and try to delete it
later and return success to caller.
Always make sure that device deactivation is synchronous when device is
being deleted (except the case when deferred deletion is enabled).
This should also take care of small races when device is busy for a short
duration and it is being deleted.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This function was only used inside gitutils,
and is written specifically for the requirements
there.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>