Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.
Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.
This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a fix for https://github.com/docker/for-linux/issues/1012.
The code was not considering that C strings are NULL-terminated so
we need to leave one extra byte.
Without this fix, the testcase in https://github.com/docker/for-linux/issues/1012
fails with
```
Step 61/1001 : RUN echo 60 > 60
---> Running in dde85ac3b1e3
Removing intermediate container dde85ac3b1e3
---> 80a12a18a241
Step 62/1001 : RUN echo 61 > 61
error creating overlay mount to /23456789112345678921234/overlay2/d368abcc97d6c6ebcf23fa71225e2011d095295d5d8c9b31d6810bea748bdf07-init/merged: no such file or directory
```
with the output of `dmesg -T` as:
```
[Sat Dec 19 02:35:40 2020] overlayfs: failed to resolve '/23456789112345678921234/overlay2/89e435a1b24583c463abb73e8abfad8bf8a88312ef8253455390c5fa0a765517-init/wor': -2
```
with this fix, you get the expected:
```
Step 126/1001 : RUN echo 125 > 125
---> Running in 2f2e56da89e0
max depth exceeded
```
Signed-off-by: Oscar Bonilla <6f6231@gmail.com>
This ensures the storage-opts applies to all operations by the graph
drivers, replacing the merging of storage-opts into container storage
config at container-creation time, and hence applying storage-opts to
non-container operations like `COPY` and `ADD` in the builder.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
This applies the 127GB default WCOW Sandbox size to not just `RUN` under
`docker build` (as was previously the case) but to `COPY` and `ADD`
under `docker build` and also to `docker run`.
It also removes an inconsistency that the 127GB size was not applied
when `--platform windows` was not passed to `docker build`, but WCOW was
still used as a platform default, e.g. Docker Desktop for Windows in
Windows Containers mode.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
full diff: https://github.com/moby/sys/compare/mountinfo/v0.1.3...mountinfo/v0.4.0
> Note that this dependency uses submodules, providing "github.com/moby/sys/mount"
> and "github.com/moby/sys/mountinfo". Our vendoring tool (vndr) currently doesn't
> support submodules, so we vendor the top-level moby/sys repository (which contains
> both) and pick the most recent tag, which could be either `mountinfo/vXXX` or
> `mount/vXXX`.
github.com/moby/sys/mountinfo v0.4.0
--------------------------------------------------------------------------------
Breaking changes:
- `PidMountInfo` is now deprecated and will be removed before v1.0; users should switch to `GetMountsFromReader`
Fixes and improvements:
- run filter after all fields are parsed
- correct handling errors from bufio.Scan
- documentation formatting fixes
github.com/moby/sys/mountinfo v0.3.1
--------------------------------------------------------------------------------
- mount: use MNT_* flags from golang.org/x/sys/unix on freebsd
- various godoc and CI fixes
- mountinfo: make GetMountinfoFromReader Linux-specific
- Add support for OpenBSD in addition to FreeBSD
- mountinfo: use idiomatic naming for fields
github.com/moby/sys/mountinfo v0.2.0
--------------------------------------------------------------------------------
Bug fixes:
- Fix path unescaping for paths with double quotes
Improvements:
- Mounted: speed up by adding fast paths using openat2 (Linux-only) and stat
- Mounted: relax path requirements (allow relative, non-cleaned paths, symlinks)
- Unescape fstype and source fields
- Documentation improvements
Testing/CI:
- Unit tests: exclude darwin
- CI: run tests under Fedora 32 to test openat2
- TestGetMounts: fix for Ubuntu build system
- Makefile: fix ignoring test failures
- CI: add cross build
github.com/moby/sys/mount v0.1.1
--------------------------------------------------------------------------------
https://github.com/moby/sys/releases/tag/mount%2Fv0.1.1
Improvements:
- RecursiveUnmount: add a fast path (#26)
- Unmount: improve doc
- fix CI linter warning on Windows
Testing/CI:
- Unit tests: exclude darwin
- Makefile: fix ignoring test failures
- CI: add cross build
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This pulls in the migration of go-winio/backuptar from the bundled fork
of archive/tar from Go 1.6 to using Go's current archive/tar unmodified.
This fixes the failure to import an OCI layer (tar stream) containing a
file larger than 8gB.
Fixes: #40444
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
The implementation in libcontainer/system is quite complicated,
and we only use it to detect if user-namespaces are enabled.
In addition, the implementation in containerd uses a sync.Once,
so that detection (and reading/parsing `/proc/self/uid_map`) is
only performed once.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.
This commit was generated by the following bash script:
```
set -e -u -o pipefail
for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
$file
goimports -w $file
done
```
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
`fuse-overlayfs` provides rootless overlayfs functionality without depending
on any kernel patch.
Aside from rootless, `fuse-overlayfs` could be potentially used for eliminating
`chown()` calls that happen in userns-remap mode, because `fuse-overlayfs` also
provides shiftfs functionality.
System requirements:
* fuse-overlayfs needs to be installed. Tested with 0.7.6.
* kernel >= 4.18
Unit test: `go test -exec sudo -v ./daemon/graphdriver/fuse-overlayfs`
The implementation is based on Podman's `overlay` driver which supports
both kernel-mode overlayfs and fuse-overlayfs in the single driver instance:
https://github.com/containers/storage/blob/39a8d5ed/drivers/overlay/overlay.go
However, Moby's implementation aims to decouple `fuse-overlayfs` driver from the
kernel-mode driver (`overlay2`) for simplicity.
Fix#40218
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Now that we do check if overlay is working by performing an actual
overlayfs mount, there's no need in extra checks for the kernel version
or the filesystem type. Actual mount check is sufficient.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Before this commit, overlay check was performed by looking for
`overlay` in /proc/filesystem. This obviously might not work
for rootless Docker (fs is there, but one can't use it as non-root).
This commit changes the check to perform the actual mount, by reusing
the code previously written to check for multiple lower dirs support.
The old check is removed from both drivers, as well as the additional
check for the multiple lower dirs support in overlay2 since it's now
a part of the main check.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This moves supportsMultipleLowerDir() to overlayutils
so it can be used from both overlay and overlay2.
The only changes made were:
* replace logger with logrus
* don't use workDirName mergedDirName constants
* add mnt var to improve readability a bit
This is a preparation for the next commit.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As caught by staticcheck (after disabling the default exclusion rules);
Based on the comment, this break was indeed meant to break the
loop and return the error.
```
daemon/graphdriver/aufs/mount.go:54:4: SA4011: ineffective break statement. Did you mean to break out of the outer loop? (staticcheck)
break
^
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
```
daemon/graphdriver/btrfs/btrfs.go:609:5: SA4003: no value of type uint64 is less than 0 (staticcheck)
if driver.options.size <= 0 {
^
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
```
daemon/graphdriver/aufs/aufs_test.go:746:8: SA4021: x = append(y) is equivalent to x = y (staticcheck)
ids = append(ids[2:])
^
```
Also pre-allocating the ids slice while we're at it.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It has been pointed out that sometimes device mapper unit tests
fail with the following diagnostics:
> --- FAIL: TestDevmapperSetup (0.02s)
> graphtest_unix.go:44: graphdriver: loopback attach failed
> graphtest_unix.go:48: loopback attach failed
The root cause is the absence of udev inside the container used
for testing, which causes device nodes (/dev/loop*) to not be
created.
The test suite itself already has a workaround, but it only
creates 8 devices (loop0 till loop7). It might very well be
the case that the first few devices are already used by the
system (on my laptop 15 devices are busy).
The fix is to raise the number of devices being manually created.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Here, err is never non-nil as it was checked earlier.
Fixes the following linter warning:
> daemon/graphdriver/copy/copy.go:136:10: nilness: impossible condition: nil != nil (govet)
> if err != nil {
> ^
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Format the source according to latest goimports.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
suppressing the "SA9003: empty branch (staticcheck)" instead of commenting-out
or removing these lines because removing/commenting these lines causes a ripple
effect of changes, and there's still a to-do below.
```
13:06:14 daemon/graphdriver/graphtest/graphbench_unix.go:175:3: SA9003: empty branch (staticcheck)
13:06:14 if applyDiffSize != diffSize {
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When mounting overlays which have children, enforce that
the mount is always performed as read only. Newer versions
of the kernel return a device busy error when a lower directory
is in use as an upper directory in another overlay mount.
Adds committed file to indicate when an overlay is being used
as a parent, ensuring it will no longer be mounted with an
upper directory.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
btrfs_noversion was added in d7c37b5a28
for distributions that did not have the `btrfs/version.h` header file.
Seeing how all of the distributions we currently support do have the
`btrfs/version.h` file we should probably just remove this build flag
altogether.
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Protect access to q.quotas map, and lock around changing nextProjectID.
Techinically, the lock in findNextProjectID() is not needed as it is
only called during initialization, but one can never be too careful.
Fixes: 52897d1c09 ("projectquota: utility class for project quota controls")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>