Since commit "seccomp: Sync fields with runtime-spec fields"
(5d244675bd) we support to specify the
DefaultErrnoRet to be used.
Before that commit it was not specified and EPERM was used by default.
This commit keeps the same behaviour but just makes it explicit that the
default is EPERM.
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
full diff: https://github.com/containerd/containerd/compare/v1.5.4...v1.5.5
Welcome to the v1.5.5 release of containerd!
The fifth patch release for containerd 1.5 updates runc to 1.0.1 and contains
other minor updates.
Notable Updates
- Update runc binary to 1.0.1
- Update pull logic to try next mirror on non-404 response
- Update pull authorization logic on redirect
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Welcome to the v1.5.5 release of containerd!
The fifth patch release for containerd 1.5 updates runc to 1.0.1 and contains
other minor updates.
Notable Updates
- Update runc binary to 1.0.1
- Update pull logic to try next mirror on non-404 response
- Update pull authorization logic on redirect
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The existing code was the exact equivalent of bytes.HasPrefix();
// HasPrefix tests whether the byte slice s begins with prefix.
func HasPrefix(s, prefix []byte) bool {
return len(s) >= len(prefix) && Equal(s[0:len(prefix)], prefix)
}
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Let clients choose object types to compute disk usage of.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
If no seccomp policy is requested, then the built-in default policy in
dockerd applies. This has no rule for "clone3" defined, nor any default
errno defined. So when runc receives the config it attempts to determine
a default errno, using logic defined in its commit:
7a8d7162f9
As explained in the above commit message, runc uses a heuristic to
decide which errno to return by default:
[quote]
The solution applied here is to prepend a "stub" filter which returns
-ENOSYS if the requested syscall has a larger syscall number than any
syscall mentioned in the filter. The reason for this specific rule is
that syscall numbers are (roughly) allocated sequentially and thus newer
syscalls will (usually) have a larger syscall number -- thus causing our
filters to produce -ENOSYS if the filter was written before the syscall
existed.
[/quote]
Unfortunately clone3 appears to one of the edge cases that does not
result in use of ENOSYS, instead ending up with the historical EPERM
errno.
Latest glibc (2.33.9000, in Fedora 35 rawhide) will attempt to use
clone3 by default. If it sees ENOSYS then it will automatically
fallback to using clone. Any other errno is treated as a fatal
error. Thus when docker seccomp policy triggers EPERM from clone3,
no fallback occurs and programs are thus unable to spawn threads.
The clone3 syscall is much more complicated than clone, most notably its
flags are not exposed as a directly argument any more. Instead they are
hidden inside a struct. This means that seccomp filters are unable to
apply policy based on values seen in flags. Thus we can't directly
replicate the current "clone" filtering for "clone3". We can at least
ensure "clone3" returns ENOSYS errno, to trigger fallback to "clone"
at which point we can filter on flags.
Fixes: https://github.com/moby/moby/issues/42680
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This utility was just a shallow wrapper around executing the regular
expression, and in some cases, we didn't even use the error it returned,
so better to inline the code instead of abstracting it away.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This way, there's no need to pass down the regular expression, and the
validation is "just another" validator (which we already pass).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Compile the regular expression, instead of 'ad-hoc'. For this to work, I moved
the splitting was moved out of parseMountRaw() into ParseMountRaw(), and the
former was renamed to parseMount(). This function still receives the 'raw' string,
as it's used to include the "raw" spec for inclusion in error messages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add hints for "Failed to destroy btrfs snapshot <DIR> for <ID>: operation not permitted" on rootless
Related to issue 41762
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
`/containers/<name>/copy` endpoint was deprecated in 1.8 and errors
since 1.12. See https://github.com/moby/moby/pull/22149 for more info.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
(*PatternMatcher).Matches includes a special case for when the pattern
matches a parent dir, even though it doesn't match the current path.
However, it assumes that the parent dir which would match the pattern
must have the same number of separators as the pattern itself. This
doesn't hold true with a patern like "**/foo". A file foo/bar would have
len(parentPathDirs) == 1, which is less than the number of path
len(pattern.dirs) == 2... therefore this check would be skipped.
Given that "**/foo" matches "foo", I think it's a bug that the "parent
subdir matches" check is being skipped in this case.
It seems safer to loop over the parent subdirs and check each against
the pattern. It's possible there is a safe optimization to check only a
certain subset, but the existing logic seems unsafe.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Discovered a few instances, where loop variable is incorrectly used
within a test closure, which is marked as parallel.
Few of these were actually loops over singleton slices, therefore the issue
might not have surfaced there (yet), but it is good to fix there as
well, as this is an incorrect pattern used across different tests.
Signed-off-by: Roman Volosatovs <roman.volosatovs@docker.com>
The daemon uses a priority list to automatically select the best-matching storage
driver for the backing filesystem that is used.
Historically, overlay2 was not supported on Btrfs and ZFS, and the daemon would
automatically pick the `btrfs` or `zfs` storage driver if that was the Backing
File System.
Commits 649e4c8889 and e226aea280
improved our detection to check if overlay2 was supported on the backing file-
system, allowing overlay2 to be used on top of Btrfs or ZFS, but did not change
the priority list.
While both Btrfs and ZFS have advantages for certain use-cases, and provide
advanced features that are not available to overlay2, they also are known
to require more "handholding", and are generally considered to be mostly
useful for "advanced" users.
This patch changes the storage-driver priority list, to prefer overlay2 (if
supported by the backing filesystem), and effectively makes btrfs and zfs
opt-in storage drivers.
This change does not affect existing installations; the daemon will detect
the storage driver that was previously in use (based on the presence of
storage directories in `/var/lib/docker`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>