The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.
This patch moves our own uses of the package to use the new module.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fixes a (theoretical?) panic if ID would be shorter than 12
characters. Also trim the ID _after_ cutting off the suffix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
daemon/graphdriver/aufs/aufs.go:239:80: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:249:27: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/graphtest/testutil.go:271:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/graphtest/graphbench_unix.go:179:32: empty-block: this block is empty, you can remove it (revive)
daemon/graphdriver/zfs/zfs.go:375:48: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/overlay/overlay.go:248:89: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:636:21: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1150:70: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1613:30: empty-lines: extra empty line at the end of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1645:65: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/btrfs/btrfs.go:53:101: empty-lines: extra empty line at the start of a block (revive)
daemon/graphdriver/devmapper/deviceset.go:1944:89: empty-lines: extra empty line at the start of a block (revive)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Finish the refactor which was partially completed with commit
34536c498d, passing around IdentityMapping structs instead of pairs of
[]IDMap slices.
Existing code which uses []IDMap relies on zero-valued fields to be
valid, empty mappings. So in order to successfully finish the
refactoring without introducing bugs, their replacement therefore also
needs to have a useful zero value which represents an empty mapping.
Change IdentityMapping to be a pass-by-value type so that there are no
nil pointers to worry about.
The functionality provided by the deprecated NewIDMappingsFromMaps
function is required by unit tests to to construct arbitrary
IdentityMapping values. And the daemon will always need to access the
mappings to pass them to the Linux kernel. Accommodate these use cases
by exporting the struct fields instead. BuildKit currently depends on
the UIDs and GIDs methods so we cannot get rid of them yet.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Trying to build Docker images with buildkit using a ZFS-backed storage
was unreliable due to apparent race condition between adding and
removing layers to the storage (see: https://github.com/moby/buildkit/issues/1758).
The issue describes a similar problem with the BTRFS driver that was
resolved by adding additional locking based on the scheme used in the
OverlayFS driver. This commit replicates the scheme to the ZFS driver
which makes the problem as reported in the issue stop happening.
Signed-off-by: Tomasz Mańko <hi@jaen.me>
Do not use 0701 perms.
0701 dir perms allows anyone to traverse the docker dir.
It happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs because it allows traversal AND critically
container users need to be able to do execute things.
0701 on lower directories also happens to allow any user to modify
things in, for instance, the overlay upper dir which neccessarily
has 0755 permissions.
This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.
This prevents anyone but the remapped root to traverse our directories
(which is required for userns with runc).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 93ac040bf0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.
Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.
This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: https://github.com/moby/sys/compare/mountinfo/v0.1.3...mountinfo/v0.4.0
> Note that this dependency uses submodules, providing "github.com/moby/sys/mount"
> and "github.com/moby/sys/mountinfo". Our vendoring tool (vndr) currently doesn't
> support submodules, so we vendor the top-level moby/sys repository (which contains
> both) and pick the most recent tag, which could be either `mountinfo/vXXX` or
> `mount/vXXX`.
github.com/moby/sys/mountinfo v0.4.0
--------------------------------------------------------------------------------
Breaking changes:
- `PidMountInfo` is now deprecated and will be removed before v1.0; users should switch to `GetMountsFromReader`
Fixes and improvements:
- run filter after all fields are parsed
- correct handling errors from bufio.Scan
- documentation formatting fixes
github.com/moby/sys/mountinfo v0.3.1
--------------------------------------------------------------------------------
- mount: use MNT_* flags from golang.org/x/sys/unix on freebsd
- various godoc and CI fixes
- mountinfo: make GetMountinfoFromReader Linux-specific
- Add support for OpenBSD in addition to FreeBSD
- mountinfo: use idiomatic naming for fields
github.com/moby/sys/mountinfo v0.2.0
--------------------------------------------------------------------------------
Bug fixes:
- Fix path unescaping for paths with double quotes
Improvements:
- Mounted: speed up by adding fast paths using openat2 (Linux-only) and stat
- Mounted: relax path requirements (allow relative, non-cleaned paths, symlinks)
- Unescape fstype and source fields
- Documentation improvements
Testing/CI:
- Unit tests: exclude darwin
- CI: run tests under Fedora 32 to test openat2
- TestGetMounts: fix for Ubuntu build system
- Makefile: fix ignoring test failures
- CI: add cross build
github.com/moby/sys/mount v0.1.1
--------------------------------------------------------------------------------
https://github.com/moby/sys/releases/tag/mount%2Fv0.1.1
Improvements:
- RecursiveUnmount: add a fast path (#26)
- Unmount: improve doc
- fix CI linter warning on Windows
Testing/CI:
- Unit tests: exclude darwin
- Makefile: fix ignoring test failures
- CI: add cross build
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.
This commit was generated by the following bash script:
```
set -e -u -o pipefail
for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
$file
goimports -w $file
done
```
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.
Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:
> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted
or
> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted
Before this patch, it was just
> operation not permitted
[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is only interested only one or a few entries, not all of them.
One good example is `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()` which is only
interested in mount under a specific path.
This commit adds `filter` argument to `GetMounts()` to implement
two things:
1. filter out entries a caller is not interested in
2. stop processing if a caller is found what it wanted
`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.
A few filters are implemented:
- `PrefixFilter`: filters out all entries not under `prefix`
- `SingleEntryFilter`: looks for a specific entry
Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` is using `PrefixFilter()`.
Unit tests are added to check filters are working.
[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Now all of the storage drivers use the field "storage-driver" in their log
messages, which is set to name of the respective driver.
Storage drivers changed:
- Aufs
- Btrfs
- Devicemapper
- Overlay
- Overlay 2
- Zfs
Signed-off-by: Alejandro GonzÃlez Hevia <alejandrgh11@gmail.com>
This was added in #36047 just as a way to make sure the tree is fully
unmounted on shutdown.
For ZFS this could be a breaking change since there was no unmount before.
Someone could have setup the zfs tree themselves. It would be better, if
we really do want the cleanup to actually the unpacked layers checking
for mounts rather than a blind recursive unmount of the root.
BTRFS does not use mounts and does not need to unmount anyway.
These was only an unmount to begin with because for some reason the
btrfs tree was being moutned with `private` propagation.
For the other graphdrivers that still have a recursive unmount here...
these were already being unmounted and performing the recursive unmount
shouldn't break anything. If anyone had anything mounted at the
graphdriver location it would have been unmounted on shutdown anyway.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The idea behind making the graphdrivers private is to prevent leaking
mounts into other namespaces.
Unfortunately this is not really what happens.
There is one case where this does work, and that is when the namespace
was created before the daemon's namespace.
However with systemd each system servie winds up with it's own mount
namespace. This causes a race betwen daemon startup and other system
services as to if the mount is actually private.
This also means there is a negative impact when other system services
are started while the daemon is running.
Basically there are too many things that the daemon does not have
control over (nor should it) to be able to protect against these kinds
of leakages. One thing is certain, setting the graphdriver roots to
private disconnects the mount ns heirarchy preventing propagation of
unmounts... new mounts are of course not propagated either, but the
behavior is racey (or just bad in the case of restarting services)... so
it's better to just be able to keep mount propagation in tact.
It also does not protect situations like `-v
/var/lib/docker:/var/lib/docker` where all mounts are recursively bound
into the container anyway.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This commit is a set of fixes and improvement for zfs graph driver,
in particular:
1. Remove mount point after umount in `Get()` error path, as well
as in `Put()` (with `MNT_DETACH` flag). This should solve "failed
to remove root filesystem for <ID> .... dataset is busy" error
reported in Moby issue 35642.
To reproduce the issue:
- start dockerd with zfs
- docker run -d --name c1 --rm busybox top
- docker run -d --name c2 --rm busybox top
- docker stop c1
- docker rm c1
Output when the bug is present:
```
Error response from daemon: driver "zfs" failed to remove root
filesystem for XXX : exit status 1: "/sbin/zfs zfs destroy -r
scratch/docker/YYY" => cannot destroy 'scratch/docker/YYY':
dataset is busy
```
Output when the bug is fixed:
```
Error: No such container: c1
```
(as the container has been successfully autoremoved on stop)
2. Fix/improve error handling in `Get()` -- do not try to umount
if `refcount` > 0
3. Simplifies unmount in `Get()`. Specifically, remove call to
`graphdriver.Mounted()` (which checks if fs is mounted using
`statfs()` and check for fs type) and `mount.Unmount()` (which
parses `/proc/self/mountinfo`). Calling `unix.Unmount()` is
simple and sufficient.
4. Add unmounting of driver's home to `Cleanup()`.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This patch adds the capability for the VFS graphdriver to use
XFS project quotas. It reuses the existing quota management
code that was created by overlay2 on XFS.
It doesn't rely on a filesystem whitelist, but instead
the quota-capability detection code.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Make sure to call C.free on C string allocated using C.CString in every
exit path.
C.CString allocates memory in the C heap using malloc. It is the callers
responsibility to free them. See
https://golang.org/cmd/cgo/#hdr-Go_references_to_C for details.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
This allows for easy extension of adding more parameters to existing
parameters list. Otherwise adding a single parameter changes code
at so many places.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Use module name logrus instead of log.
Use logrus.[Error|Warn|Debug|Fatal|Panic|Info]f instead of w/o f
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>