Commit graph

6725 commits

Author SHA1 Message Date
Sebastiaan van Stijn
a1150245cc
Update to Go 1.17.0, and gofmt with Go 1.17
Movified from 686be57d0a, and re-ran
gofmt again to address for files not present in 20.10 and vice-versa.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 686be57d0a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-07 23:27:50 +02:00
Sebastiaan van Stijn
88ca5cec4e
daemon: fix error-message for minimum allowed kernel-memory limit
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 3c44ade6d0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-04 19:26:18 +02:00
Sebastiaan van Stijn
32fe0bbb91
daemon: use RWMutex for stateCounter
Use an RWMutex to allow concurrent reads of these counters

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 699174347c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-03-25 09:38:53 +01:00
Sebastiaan van Stijn
7f375bcff4
Merge pull request from GHSA-2mm7-x5h6-5pvq
[20.10] oci: inheritable capability set should be empty
2022-03-23 22:10:17 +01:00
Samuel Karp
dd38613d0c
oci: inheritable capability set should be empty
The Linux kernel never sets the Inheritable capability flag to anything
other than empty.  Moby should have the same behavior, and leave it to
userspace code within the container to set a non-empty value if desired.

Reported-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Samuel Karp <skarp@amazon.com>
(cherry picked from commit 0d9a37d0c2)
Signed-off-by: Samuel Karp <skarp@amazon.com>
2022-03-17 14:17:00 -07:00
Brian Goff
7f44d606f9
Merge pull request #43166 from thaJeztah/20.10_backport_fix_update_sync
[20.10 backport] Fix for lack of syncronization in daemon/update.go
2022-02-17 11:08:56 -08:00
Akihiro Suda
879dd468dc
Merge pull request #43215 from thaJeztah/20.10_backport_fix_overlay_fuse_permissions
[20.10 backport] daemon/graphdriver/fuse-overlayfs: Init(): fix directory permissions (staticcheck)
2022-02-12 12:10:29 +09:00
Sebastiaan van Stijn
9edb93886a
Merge pull request #43151 from thaJeztah/20.10_backport_containerd_15
[20.10 backport] update containerd binary v1.5.9, runc v1.0.3, and some script changes
2022-02-10 20:36:31 +01:00
Sebastiaan van Stijn
878b9de935
daemon/graphdriver/fuse-overlayfs: Init(): fix directory permissions (staticcheck)
daemon/graphdriver/fuse-overlayfs/fuseoverlayfs.go:101:63: SA9002: file mode '700' evaluates to 01274; did you mean '0700'? (staticcheck)
        if err := idtools.MkdirAllAndChown(path.Join(home, linkDir), 700, currentID); err != nil {
                                                                     ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f9fb5d4f25)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-08 13:38:29 +01:00
dmytro.iakovliev
b2684c1857
Fix for lack of syncromization in daemon/update.go
Signed-off-by: dmytro.iakovliev <dmytro.iakovliev@zodiacsystems.com>
(cherry picked from commit 58825ffc32)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:55:17 +01:00
Kazuyoshi Kato
8268f70ebb
daemon/logger: replace flaky TestFollowLogsHandleDecodeErr
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
(cherry picked from commit c91e09bee2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:40:19 +01:00
Kazuyoshi Kato
78d0b936b8
daemon/logger: refactor followLogs to write more unit tests
followLogs() is getting really long (170+ lines) and complex.
The function has multiple inner functions that mutate its variables.

To refactor the function, this change introduces follow{} struct.
The inner functions are now defined as ordinal methods, which are
accessible from tests.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
(cherry picked from commit 7a10f5a558)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:40:16 +01:00
Kazuyoshi Kato
39519221c2
daemon/logger: test followLogs' handleDecodeErr case
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
(cherry picked from commit f2e458ebc5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:38:36 +01:00
Kazuyoshi Kato
ada1b01de1
daemon/logger: read the length header correctly
Before this change, if Decode() couldn't read a log record fully,
the subsequent invocation of Decode() would read the record's non-header part
as a header and cause a huge heap allocation.

This change prevents such a case by having the intermediate buffer in
the decoder struct.

Fixes #42125.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
(cherry picked from commit 48d387a757)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:38:32 +01:00
Sebastiaan van Stijn
fb45fe614d
info: remove "expected" check for tini version
These checks were added when we required a specific version of containerd
and runc (different versions were known to be incompatible). I don't think
we had a similar requirement for tini, so this check was redundant. Let's
remove the check altogether.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit b585c64e2b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:48 +01:00
Albin Kerouanton
f9df098e76 fluentd: Turn ForceStopAsyncSend true when async connect is used
The flag ForceStopAsyncSend was added to fluent logger lib in v1.5.0 (at
this time named AsyncStop) to tell fluentd to abort sending logs
asynchronously as soon as possible, when its Close() method is called.
However this flag was broken because of the way the lib was handling it
(basically, the lib could be stucked in retry-connect loop without
checking this flag).

Since fluent logger lib v1.7.0, calling Close() (when ForceStopAsyncSend
is true) will really stop all ongoing send/connect procedure,
wherever it's stucked.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit bd61629b6b)
Signed-off-by: Wesley <wppttt@amazon.com>
2022-01-13 01:06:12 +00:00
Sebastiaan van Stijn
660b9962e4
daemon.WithCommonOptions() fix detection of user-namespaces
Commit dae652e2e5 added support for non-privileged
containers to use ICMP_PROTO (used for `ping`). This option cannot be set for
containers that have user-namespaces enabled.

However, the detection looks to be incorrect; HostConfig.UsernsMode was added
in 6993e891d1 / ee2183881b,
and the property only has meaning if the daemon is running with user namespaces
enabled. In other situations, the property has no meaning.
As a result of the above, the sysctl would only be set for containers running
with UsernsMode=host on a daemon running with user-namespaces enabled.

This patch adds a check if the daemon has user-namespaces enabled (RemappedRoot
having a non-empty value), or if the daemon is running inside a user namespace
(e.g. rootless mode) to fix the detection.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a826ca3aef)

---
The cherry-pick was almost clean but `userns.RunningInUserNS()` -> `sys.RunningInUserNS()`.

Fix docker/buildx issue 561
---

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-12-15 18:20:07 +09:00
Akihiro Suda
0e76a0a418
info: unset cgroup-related fields when CgroupDriver == none
Fix issue 42151

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 039e9670cb)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-11-09 18:43:44 +09:00
Cam
3285c27503 Fix log statement 'failed to exit' timeout accuracy
log statement should reflect how long it actually waited, not how long
it theoretically could wait based on the 'seconds' integer passed in.

Signed-off-by: Cam <gh@sparr.email>
(cherry picked from commit d15ce134ef)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-10-20 22:35:44 +00:00
Cam
a4bcd4c64f docker daemon container stop refactor
this refactors the Stop command to fix a few issues and behaviors that
dont seem completely correct:

1. first it fixes a situation where stop could hang forever (#41579)
2. fixes a behavior where if sending the
stop signal failed, then the code directly sends a -9 signal. If that
fails, it returns without waiting for the process to exit or going
through the full docker kill codepath.
3. fixes a behavior where if sending the stop signal failed, then the
code sends a -9 signal. If that succeeds, then we still go through the
same stop waiting process, and may even go through the docker kill path
again, even though we've already sent a -9.
4. fixes a behavior where the code would wait the full 30 seconds after
sending a stop signal, even if we already know the stop signal failed.

fixes #41579

Signed-off-by: Cam <gh@sparr.email>
(cherry picked from commit 8e362b75cb)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-10-20 22:35:40 +00:00
Cam
bed624fdc9 docker kill: fix bug where failed kills didnt fallback to unix kill
1. fixes #41587
2. removes potential infinite Wait and goroutine leak at end of kill
function

fixes #41587

Signed-off-by: Cam <gh@sparr.email>
(cherry picked from commit e57a365ab1)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-10-20 22:33:15 +00:00
Brian Goff
93ac040bf0 Lock down docker root dir perms.
Do not use 0701 perms.
0701 dir perms allows anyone to traverse the docker dir.
It happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs because it allows traversal AND critically
container users need to be able to do execute things.

0701 on lower directories also happens to allow any user to modify
     things in, for instance, the overlay upper dir which neccessarily
     has 0755 permissions.

This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.

This prevents anyone but the remapped root to traverse our directories
(which is required for userns with runc).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-08-19 20:40:15 +00:00
Sebastiaan van Stijn
cc5a381cbc
Merge pull request #42633 from thaJeztah/20.10_backport_warn_on_non_matching_platform
[20.10 backport] docker pull: warn when pulled single-arch image does not match --platform
2021-07-15 20:46:08 +02:00
Sebastiaan van Stijn
7429792eed
docker pull: warn when pulled single-arch image does not match --platform
This takes the same approach as was implemented on `docker build`, where a warning
is printed if `FROM --platform=...` is used (added in 399695305c)

Before:

    docker rmi armhf/busybox
    docker pull --platform=linux/s390x armhf/busybox

    Using default tag: latest
    latest: Pulling from armhf/busybox
    d34a655120f5: Pull complete
    Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
    Status: Downloaded newer image for armhf/busybox:latest
    docker.io/armhf/busybox:latest

With this change:

    docker rmi armhf/busybox
    docker pull --platform=linux/s390x armhf/busybox

    Using default tag: latest
    latest: Pulling from armhf/busybox
    d34a655120f5: Pull complete
    Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
    Status: Downloaded newer image for armhf/busybox:latest
    WARNING: image with reference armhf/busybox was found but does not match the specified platform: wanted linux/s390x, actual: linux/arm64
    docker.io/armhf/busybox:latest

And daemon logs print:

   WARN[2021-04-26T11:19:37.153572667Z] ignoring platform mismatch on single-arch image  error="image with reference armhf/busybox was found but does not match the specified platform: wanted linux/s390x, actual: linux/arm64" image=armhf/busybox

When pulling without specifying `--platform, no warning is currently printed (but we can add a warning in future);

    docker rmi armhf/busybox
    docker pull armhf/busybox

    Using default tag: latest
    latest: Pulling from armhf/busybox
    d34a655120f5: Pull complete
    Digest: sha256:8e51389cdda2158935f2b231cd158790c33ae13288c3106909324b061d24d6d1
    Status: Downloaded newer image for armhf/busybox:latest
    docker.io/armhf/busybox:latest

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 424c0eb3c0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-13 17:06:58 +02:00
Akihiro Suda
869b50e10b
rootless: disable overlay2 if running with SELinux
Kernel 5.11 introduced support for rootless overlayfs, but incompatible with SELinux.

On the other hand, fuse-overlayfs is compatible.

Close issue 42333

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 4300a52606)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-07-06 18:57:39 +09:00
Drew Erny
89edb68e89
Fix possible overlapping IPs
A node is no longer using its load balancer IP address when it no longer
has tasks that use the network that requires that load balancer. When
this occurs, the swarmkit manager will free that IP in IPAM, and may
reaassign it.

When a task shuts down cleanly, it attempts removal of the networks it
uses, and if it is the last task using those networks, this removal
succeeds, and the load balancer IP is freed.

However, this behavior is absent if the container fails. Removal of the
networks is never attempted.

To address this issue, I amend the executor. Whenever a node load
balancer IP is removed or changed, that information is passedd to the
executor by way of the Configure method. By keeping track of the set of
node NetworkAttachments from the previous call to Configure, we can
determine which, if any, have been removed or changed.

At first, this seems to create a race, by which a task can be attempting
to start and the network is removed right out from under it. However,
this is already addressed in the controller. The controller will attempt
to recreate missing networks before starting a task.

Signed-off-by: Drew Erny <derny@mirantis.com>
(cherry picked from commit 0d9b0ed678)
Signed-off-by: Ameya Gawde <agawde@mirantis.com>
2021-06-18 10:13:59 -07:00
Brian Goff
12b03bcb27 Error string match: do not match command path
Whether or not the command path is in the error message is a an
implementation detail.
For example, on Windows the only reason this ever matched was because it
dumped the entire container config into the error message, but this had
nothing to do with the actual error.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 225e046d9d)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-04-27 18:46:33 +00:00
Tibor Vass
8728dd246c
Merge pull request #42263 from AkihiroSuda/move-cgroup2-out-of-experimental-20.10
[20.10 backport] Move cgroup v2 out of experimental
2021-04-09 15:06:18 -07:00
Akihiro Suda
255c79a1e8
Move cgroup v2 out of experimental
We have upgraded runc to rc93 and added CI for cgroup 2.
So we can move cgroup v2 out of experimental.

Fix issue 41916

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 1d2a660093)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-07 13:55:48 +09:00
Akihiro Suda
8088859bab
btrfs: Allow unprivileged user to delete subvolumes (kernel >= 4.18)
Fix issue 41762

Cherry-pick "drivers: btrfs: Allow unprivileged user to delete subvolumes" from containers/storage
831e32b6bd

> In btrfs, subvolume can be deleted by IOC_SNAP_DESTROY ioctl but there
> is one catch: unprivileged IOC_SNAP_DESTROY call is restricted by default.
>
> This is because IOC_SNAP_DESTROY only performs permission checks on
> the top directory(subvolume) and unprivileged user might delete dirs/files
> which cannot be deleted otherwise. This restriction can be relaxed if
> user_subvol_rm_allowed mount option is used.
>
> Although the above ioctl had been the only way to delete a subvolume,
> btrfs now allows deletion of subvolume just like regular directory
> (i.e. rmdir sycall) since kernel 4.18.
>
> So if we fail to cleanup subvolume in subvolDelete(), just fallback to
> system.EnsureRmoveall() to try to cleanup subvolumes again.
> (Note: quota needs privilege, so if quota is enabled we do not fallback)
>
> This fix will allow non-privileged container works with btrfs backend.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 62b5194f62)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-06 14:45:01 +09:00
Tibor Vass
88bd96d6e5
Merge pull request #42233 from AkihiroSuda/fix-rootless-bind-EPERM-20.10
[20.10 backport] rootless: bind mount: fix "operation not permitted"
2021-04-01 07:41:54 -07:00
Akihiro Suda
c1e7924f7c
archive: do not use overlayWhiteoutConverter for UserNS
overlay2 no longer sets `archive.OverlayWhiteoutFormat` when
running in UserNS, so we can remove the complicated logic in the
archive package.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 6322dfc217)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-01 19:00:42 +09:00
Akihiro Suda
22dc1597b9
overlay2: doesSupportNativeDiff: add fast path for userns
When running in userns, returns error (i.e. "use naive, not native")
immediately.

No substantial change to the logic.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 67aa418df2)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-01 19:00:37 +09:00
Akihiro Suda
daae27bfce
overlay2: call d.naiveDiff.ApplyDiff when useNaiveDiff==true
Previously, `d.naiveDiff.ApplyDiff` was not used even when
`useNaiveDiff()==true`

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit dd97134232)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-01 19:00:32 +09:00
Akihiro Suda
e974cb638c
rootless: bind mount: fix "operation not permitted"
The following was failing previously, because `getUnprivilegedMountFlags()` was not called:
```console
$ sudo mount -t tmpfs -o noexec none /tmp/foo
$ $ docker --context=rootless run -it --rm -v /tmp/foo:/mnt:ro alpine
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:520: container init caused: rootfs_linux.go:60: mounting "/tmp/foo" to rootfs at "/home/suda/.local/share/docker/overlay2/b8e7ea02f6ef51247f7f10c7fb26edbfb308d2af8a2c77915260408ed3b0a8ec/merged/mnt" caused: operation not permitted: unknown.
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 248f98ef5e)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-01 18:45:23 +09:00
Brian Goff
60aa0f2f6b
Merge pull request #42079 from thaJeztah/20.10_backport_update_docs_links
[20.10 backport] Update documentation links
2021-03-25 12:48:52 -07:00
Sebastiaan van Stijn
5a697ae130
Merge pull request #42174 from thaJeztah/20.10_backport_41820_fix_json_unexpected_eof
[20.10 backport] Fix handling for json-file io.UnexpectedEOF
2021-03-20 10:10:24 +01:00
Brian Goff
969bde2009
jsonfile: more defensive reader implementation
Tonis mentioned that we can run into issues if there is more error
handling added here. This adds a custom reader implementation which is
like io.MultiReader except it does not cache EOF's.
What got us into trouble in the first place is `io.MultiReader` will
always return EOF once it has received an EOF, however the error
handling that we are going for is to recover from an EOF because the
underlying file is a file which can have more data added to it after
EOF.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 5a664dc87d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-19 18:18:55 +01:00
Brian Goff
cb501700e8
Fix handling for json-file io.UnexpectedEOF
When the multireader hits EOF, we will always get EOF from it, so we
cannot store the multrireader fro later error handling, only for the
decoder.

Thanks @tobiasstadler for pointing this error out.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 4be98a38e7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-19 18:18:52 +01:00
Akihiro Suda
2d39a44c1c
overlayutils/userxattr.go: add "fast path" for kernel >= 5.11.0
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit a8008f7313)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-19 03:36:07 +09:00
Akihiro Suda
95d2b686be
overlay2: support "userxattr" option (kernel 5.11)
The "userxattr" option is needed for mounting overlayfs inside a user namespace with kernel >= 5.11.

The "userxattr" option is NOT needed for the initial user namespace (aka "the host").

Also, Ubuntu (since circa 2015) and Debian (since 10) with kernel < 5.11 can mount the overlayfs in a user namespace without the "userxattr" option.

The corresponding kernel commit: 2d2f2d7322ff43e0fe92bf8cccdc0b09449bf2e1
> **ovl: user xattr**
>
> Optionally allow using "user.overlay." namespace instead of "trusted.overlay."
> ...
> Disable redirect_dir and metacopy options, because these would allow privilege escalation through direct manipulation of the
> "user.overlay.redirect" or "user.overlay.metacopy" xattrs.

Fix issue 42055

Related to containerd/containerd PR 5076

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 11ef8d3ba9)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-19 03:35:59 +09:00
Sebastiaan van Stijn
04d9b581e9
Update documentation links
- Using "/go/" redirects for some topics, which allows us to
  redirect to new locations if topics are moved around in the
  documentation.
- Updated some old URLs to their new location.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 328de0b8d9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-25 21:54:39 +01:00
Nathan Carlson
0e001154f9
Check the length of the correct variable #42039
Signed-off-by: Nathan Carlson <carl4403@umn.edu>
(cherry picked from commit 8d73c1ad68)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-18 22:23:34 +01:00
Sebastiaan van Stijn
df2cfb4d33
Merge pull request #42045 from cpuguy83/20.10_fallback_manifest_on_bad_plat
[20.10] Fallback to manifest list when no platform match
2021-02-18 21:37:34 +01:00
Tibor Vass
caa48de224
Merge pull request #41974 from thaJeztah/20.10_backport_for_linux_1169_plugins_custom_runtime-panic
[20.10 backport] Add shim config for custom runtimes for plugins
2021-02-18 12:36:21 -08:00
Tibor Vass
ff486ae873
Merge pull request #41973 from thaJeztah/20.10_backport_fix_builder_inconsisent_platform
[20.10 backport] Fix builder inconsistent error on buggy platform
2021-02-18 12:32:53 -08:00
Tibor Vass
b81e649d2b
Merge pull request #41977 from thaJeztah/20.10_backport_minor_fixes
[20.10 backport] assorted small fixes, docs changes, and contrib
2021-02-18 12:29:07 -08:00
Brian Goff
3beb2e4422 Move cpu variant checks into platform matcher
Wrap platforms.Only and fallback to our ignore mismatches due to  empty
CPU variants. This just cleans things up and makes the logic re-usable
in other places.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 50f39e7247)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-02-18 20:12:07 +00:00
Brian Goff
0caf485abb Fallback to manifest list when no platform match
In some cases, in fact many in the wild, an image may have the incorrect
platform on the image config.
This can lead to failures to run an image, particularly when a user
specifies a `--platform`.
Typically what we see in the wild is a manifest list with an an entry
for, as an example, linux/arm64 pointing to an image config that has
linux/amd64 on it.

This change falls back to looking up the manifest list for an image to
see if the manifest list shows the image as the correct one for that
platform.

In order to accomplish this we need to traverse the leases associated
with an image. Each image, if pulled with Docker 20.10, will have the
manifest list stored in the containerd content store with the resource
assigned to a lease keyed on the image ID.
So we look up the lease for the image, then look up the assocated
resources to find the manifest list, then check the manifest list for a
platform match, then ensure that manifest referes to our image config.

This is only used as a fallback when a user specified they want a
particular platform and the image config that we have does not match
that platform.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 4be5453215)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-02-18 20:12:00 +00:00
Brian Goff
ab5711e619
Fix builder inconsistent error on buggy platform
When pulling an image by platform, it is possible for the image's
configured platform to not match what was in the manifest list.
The image itself is buggy because either the manifest list is incorrect
or the image config is incorrect. In any case, this is preventing people
from upgrading because many times users do not have control over these
buggy images.

This was not a problem in 19.03 because we did not compare on platform
before. It just assumed if we had the image it was the one we wanted
regardless of platform, which has its own problems.

Example Dockerfile that has this problem:

```Dockerfile
FROM --platform=linux/arm64 k8s.gcr.io/build-image/debian-iptables:buster-v1.3.0
RUN echo hello
```

This fails the first time you try to build after it finishes pulling but
before performing the `RUN` command.
On the second attempt it works because the image is already there and
does not hit the code that errors out on platform mismatch (Actually it
ignores errors if an image is returned at all).

Must be run with the classic builder (DOCKER_BUILDKIT=0).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 399695305c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-17 21:20:46 +01:00