When building images in a user-namespaced container, v3 capabilities are
stored including the root UID of the creator of the user-namespace.
This UID does not make sense outside the build environment however. If
the image is run in a non-user-namespaced runtime, or if a user-namespaced
runtime uses a different UID, the capabilities requested by the effective
bit will not be honoured by `execve(2)` due to this mismatch.
Instead, we convert v3 capabilities to v2, dropping the root UID on the
fly.
Signed-off-by: Eric Mountain <eric.mountain@datadoghq.com>
(cherry picked from commit 95eb490780)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
overlay2 no longer sets `archive.OverlayWhiteoutFormat` when
running in UserNS, so we can remove the complicated logic in the
archive package.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 6322dfc217)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
=== RUN TestUntarParentPathPermissions
archive_unix_test.go:171: assertion failed: error is not nil: chown /tmp/TestUntarParentPathPermissions694189715/foo: operation not permitted
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
(cherry picked from commit f23c1c297d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit edb62a3ace fixed a bug in MkdirAllAndChown()
that caused the specified permissions to not be applied correctly. As a result
of that bug, the configured umask would be applied.
When extracting archives, Unpack() used 0777 permissions when creating missing
parent directories for files that were extracted.
Before edb62a3ace, this resulted in actual
permissions of those directories to be 0755 on most configurations (using a
default 022 umask).
Creating these directories should not depend on the host's umask configuration.
This patch changes the permissions to 0755 to match the previous behavior,
and to reflect the original intent of using 0755 as default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 25ada76437)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fix#41803
Also attempt to mknod devices.
Mknodding devices are likely to fail, but still worth trying when
running with a seccomp user notification.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit d5d5cccb7e)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The implementation in libcontainer/system is quite complicated,
and we only use it to detect if user-namespaces are enabled.
In addition, the implementation in containerd uses a sync.Once,
so that detection (and reading/parsing `/proc/self/uid_map`) is
only performed once.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`func init()` is evil here, and the logrus calls are being made before
the logger is even setup.
It also means in order to use pigz you have to restart the daemon.
Instead this takes a small hit and resolves pigz on each extraction.
In the grand scheme of decompressing this is a very small hit.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.
This commit was generated by the following bash script:
```
set -e -u -o pipefail
for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
$file
goimports -w $file
done
```
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It makes sense to use mount package here because
- it no longer requires /proc to be mounted
- it provides verbose errors so the caller doesn't have to
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There is a race condition in pkg/archive when using `cmd.Start` for pigz
and xz where the `*bufio.Reader` could be returned to the pool while the
command is still writing to it, and then picked up and used by a new
command.
The command is wrapped in a `CommandContext` where the process will be
killed when the context is cancelled, however this is not instantaneous,
so there's a brief window while the command is still running but the
`*bufio.Reader` was already returned to the pool.
wrapReadCloser calls `cancel()`, and then `readBuf.Close()` which
eventually returns the buffer to the pool. However, because cmdStream
runs `cmd.Wait` in a go routine that we never wait for to finish, it is
not safe to return the reader to the pool yet. We need to ensure we
wait for `cmd.Wait` to finish!
Signed-off-by: Stephen Benjamin <stephen@redhat.com>
also renamed the non-windows variant of this file to be
consistent with other files in this package
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As reported in docker/for-linux/issues/484, since Docker 18.06
docker cp with a destination file name fails with the following error:
> archive/tar: cannot encode header: Format specifies USTAR; and USTAR cannot encode Name="a_very_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx_long_filename_that_is_101_characters"
The problem is caused by changes in Go 1.10 archive/tar, which
mis-guesses the tar stream format as USTAR (rather than PAX),
which, in turn, leads to inability to specify file names
longer than 100 characters.
This tar stream is sent by TarWithOptions() (which, since we switched to
Go 1.10, explicitly sets format=PAX for every file, see FileInfoHeader(),
and before Go 1.10 it was PAX by default). Unfortunately, the receiving
side, RebaseArchiveEntries(), which calls tar.Next(), mistakenly guesses
header format as USTAR, which leads to the above error.
The fix is easy: set the format to PAX in RebaseArchiveEntries()
where we read the tar stream and change the file name.
A unit test is added to prevent future regressions.
NOTE this code is not used by dockerd, but rather but docker cli
(also possibly other clients), so this needs to be re-vendored
to cli in order to take effect.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Ubuntu kernel supports overlayfs in user namespaces.
However, Docker had previously crafting overlay opaques directly
using mknod(2) and setxattr(2), which are not supported in userns.
Tested with LXD, Ubuntu 18.04, kernel 4.15.0-36-generic #39-Ubuntu.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Signed-off-by: John Howard <jhoward@microsoft.com>
If fixes an error in sameFsTime which was using `==` to compare two times. The correct way is to use go's built-in timea.Equals(timeb).
In changes_windows, it uses sameFsTime to compare mTim of a `system.StatT` to allow TestChangesDirsMutated to operate correctly now.
Note there is slight different between the Linux and Windows implementations of detecting changes. Due to https://github.com/moby/moby/issues/9874,
and the fix at https://github.com/moby/moby/pull/11422, Linux does not consider a change to the directory time as a change. Windows on NTFS
does. See https://github.com/moby/moby/pull/37982 for more information. The result in `TestChangesDirsMutated`, `dir3` is NOT considered a change
in Linux, but IS considered a change on Windows. The test mutates dir3 to have a mtime of +1 second.
With a handful of tests still outstanding, this change ports most of the unit tests under pkg/archive to Windows.
It provides an implementation of `copyDir` in tests for Windows. To make a copy similar to Linux's `cp -a` while preserving timestamps
and links to both valid and invalid targets, xcopy isn't sufficient. So I used robocopy, but had to circumvent certain exit codes that
robocopy exits with which are warnings. Link to article describing this is in the code.
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Prevent changing the tar output by setting the format to
PAX and keeping the times truncated.
Without this change the archiver will produce different tar
archives with different hashes with go 1.10.
The addition of the access and changetime timestamps would
also cause diff comparisons to fail.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The Golang built-in gzip library is serialized, and fairly slow
at decompressing. It also only decompresses on demand, versus
pipelining decompression.
This change switches to using the pigz external command
for gzip decompression, as opposed to using the built-in
golang one. This code is not vendored, but will be used
if it autodetected as part of the OS.
This also switches to using context, versus a manually
managed channel to manage cancellations, and synchronization.
There is a little bit of weirdness around manually having
to cancel in the error cases.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Update golang.org/x/sys to 8dbc5d05d6edcc104950cc299a1ce6641235bc86 in
order to get the Major, Minor and Mkdev functions for every unix-like
OS. Use them instead of the locally defined versions which currently use
the Linux specific device major/minor encoding.
This means that the device number should now be properly encoded on e.g.
Darwin, FreeBSD or Solaris.
Also, the SIGUNUSED constant was removed from golang.org/x/sys/unix in
https://go-review.googlesource.com/61771 as it is also removed from the
respective glibc headers.
Remove it from signal.SignalMap as well after the golang.org/x/sys
re-vendoring.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
The promise package represents a simple enough concurrency pattern that
replicating it in place is sufficient. To end the propagation of this
package, it has been removed and the uses have been inlined.
While this code could likely be refactored to be simpler without the
package, the changes have been minimized to reduce the possibility of
defects. Someone else may want to do further refactoring to remove
closures and reduce the number of goroutines in use.
Signed-off-by: Stephen J Day <stephen.day@docker.com>