Commit graph

39506 commits

Author SHA1 Message Date
Sebastiaan van Stijn
fd32c70031
update containerd binary to v1.5.5
Welcome to the v1.5.5 release of containerd!

The fifth patch release for containerd 1.5 updates runc to 1.0.1 and contains
other minor updates.

Notable Updates

- Update runc binary to 1.0.1
- Update pull logic to try next mirror on non-404 response
- Update pull authorization logic on redirect

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 4a07b89e9a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:46 +01:00
Sebastiaan van Stijn
302114634c
update containerd binary v1.4.8
Update to containerd 1.4.8 to address [CVE-2021-32760][1].

[1]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-32760

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cf1328cd46)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:44 +01:00
Sebastiaan van Stijn
1cd13dcb6c
Update containerd binary to v1.5.3
full diff: https://github.com/containerd/containerd/compare/v1.5.2...v1.5.3

Welcome to the v1.5.3 release of containerd!

The third patch release for containerd 1.5 updates runc to 1.0.0 and contains
various other fixes.

Notable Updates

- Update runc binary to 1.0.0
- Send pod UID to CNI plugins as K8S_POD_UID
- Fix invalid validation error checking
- Fix error on image pull resume
- Fix User Agent sent to registry authentication server
- Fix symlink resolution for disk mounts on Windows

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 5ae2af41ee)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:42 +01:00
Sebastiaan van Stijn
5f09d5c76a
update containerd binary to v1.5.2
full diff: https://github.com/containerd/containerd/compare/v1.5.1...v1.5.2

The second patch release for containerd 1.5 is a security release to update
runc for CVE-2021-30465

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8e3186fc8f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:40 +01:00
Sebastiaan van Stijn
23f23c99ed
update containerd binary to v1.5.1
full diff: https://github.com/containerd/containerd/compare/v1.5.0...v1.5.1

Notable Updates

- Update runc to rc94
- Fix registry mirror authorization logic in CRI plugin
- Fix regression in cri-cni-release to include cri tools

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 22c0291333)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:38 +01:00
Sebastiaan van Stijn
f036a34c5b
update containerd binary to v1.5.0
Welcome to the v1.5.0 release of containerd!

The sixth major release of containerd includes many stability improvements
and code organization changes to make contribution easier and make future
features cleaner to develop. This includes bringing CRI development into the
main containerd repository and switching to Go modules. This release also
brings support for the Node Resource Interface (NRI).

Highlights
--------------------------------------------------------------------------------

*Project Organization*

- Merge containerd/cri codebase into containerd/containerd
- Move to Go modules
- Remove selinux build tag
- Add json log format output option for daemon log

*Snapshots*

- Add configurable overlayfs path
- Separate overlay implementation from plugin
- Native snapshotter configuration and plugin separation
- Devmapper snapshotter configuration and plugin separation
- AUFS snapshotter configuration and plugin separation
- ZFS snapshotter configuration and plugin separation
- Pass custom snapshot labels when creating snapshot
- Add platform check for snapshotter support when unpacking
- Handle loopback mounts
- Support userxattr mount option for overlay in user namespace
- ZFS snapshotter implementation of usage

*Distribution*

- Improve registry response errors
- Improve image pull performance over HTTP 1.1
- Registry configuration package
- Add support for layers compressed with zstd
- Allow arm64 to fallback to arm (v8, v7, v6, v5)

*Runtime*

- Add annotations to containerd task update API
- Add logging binary support when terminal is true
- Runtime support on FreeBSD

*Windows*

- Implement windowsDiff.Compare to allow outputting OCI images
- Optimize WCOW snapshotter to commit writable layers as read-only parent layers
- Optimize LCOW snapshotter use of scratch layers

*CRI*

- Add NRI injection points cri#1552
- Add support for registry host directory configuration
- Update privileged containers to use current capabilities instead of known capabilities
- Add pod annotations to CNI call
- Enable ocicrypt by default
- Support PID NamespaceMode_TARGET

Impactful Client Updates
--------------------------------------------------------------------------------

This release has changes which may affect projects which import containerd.

*Switch to Go modules*

containerd and all containerd sub-repositories are now using Go modules. This
should help make importing easier for handling transitive dependencies. As of
this release, containerd still does not guarantee client library compatibility
for 1.x versions, although best effort is made to minimize impact from changes
to exported Go packages.

*CRI plugin moved to main repository*

With the CRI plugin moving into the main repository, imports under github.com/containerd/cri/
can now be found github.com/containerd/containerd/pkg/cri/.
There are no changes required for end users of CRI.

*Library changes*

oci

The WithAllCapabilities has been removed and replaced with WithAllCurrentCapabilities
and WithAllKnownCapabilities. WithAllKnownCapabilities has similar
functionality to the previous WithAllCapabilities with added support for newer
capabilities. WithAllCurrentCapabilities can be used to give privileged
containers the same set of permissions as the calling process, preventing errors
when privileged containers attempt to get more permissions than given to the
caller.

*Configuration changes*

New registry.config_path for CRI plugin

registry.config_path specifies a directory to look for registry hosts
configuration. When resolving an image name during pull operations, the CRI
plugin will look in the <registry.config_path>/<image hostname>/ directory
for host configuration. An optional hosts.toml file in that directory may be
used to configure which hosts will be used for the pull operation as well
host-specific configurations. Updates under that directory do not require
restarting the containerd daemon.

Enable registry.config_path in the containerd configuration file.

    [plugins."io.containerd.grpc.v1.cri".registry]
       config_path = "/etc/containerd/certs.d"
    Configure registry hosts, such as /etc/containerd/certs.d/docker.io/hosts.toml
    for any image under the docker.io namespace (any image on Docker Hub).

    server = "https://registry-1.docker.io"

    [host."https://public-mirror.example.com"]
      capabilities = ["pull"]
    [host."https://docker-mirror.internal"]
      capabilities = ["pull", "resolve"]
      ca = "docker-mirror.crt"

If no hosts.toml configuration exists in the host directory, it will fallback
to check certificate files based on Docker's certificate file
pattern (".crt" files for CA certificates and ".cert"/".key" files for client
certificates).

*Deprecation of registry.mirrors and registry.configs in CRI plugin*

Mirroring and TLS can now be configured using the new registry.config_path
option. Existing configurations may be migrated to new host directory
configuration. These fields are only deprecated with no planned removal,
however, these configurations cannot be used while registry.config_path is
defined.

*Version 1 schema is deprecated*

Version 2 of the containerd configuration toml is recommended format and the
default. Starting this version, a deprecation warning will be logged when
version 1 is used.

To check version, see the version value in the containerd toml configuration.

    version=2

FreeBSD Runtime Support (Experimental)
--------------------------------------------------------------------------------

This release includes changes that allow containerd to run on FreeBSD with a
compatible runtime, such as runj. This
support should be considered experimental and currently there are no official
binary releases for FreeBSD. The runtimes used by containerd are maintained
separately and have their own stability guarantees. The containerd project
strives to be compatible with any runtime which aims to implement containerd's
shim API and OCI runtime specification.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 9b2f55bc1c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:36 +01:00
Sebastiaan van Stijn
1dd37750a6
Revert "[20.10] update containerd binary to v1.4.5"
This reverts commit 01f734cb4f.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:34 +01:00
Sebastiaan van Stijn
b097d29705
Revert "[20.10] update containerd binary to v1.4.6"
This reverts commit 56541eca9a.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:32 +01:00
Sebastiaan van Stijn
de656f9da4
Revert "[20.10] update containerd binary to v1.4.7"
This reverts commit 793340a33a.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:30 +01:00
Sebastiaan van Stijn
9e36f77577
Revert "[20.10] update containerd binary v1.4.8"
This reverts commit 067918a8c3.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:28 +01:00
Sebastiaan van Stijn
eb2acf2fb3
Revert "[20.10] update containerd binary to v1.4.9"
This reverts commit e8fb8f7acd.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:26 +01:00
Sebastiaan van Stijn
4e838e50ea
Revert "[20.10] update containerd binary to v1.4.10"
This reverts commit 6835d15f55.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:24 +01:00
Sebastiaan van Stijn
79fd9c1541
Revert "[20.10] update containerd binary to v1.4.11"
This reverts commit 129a2000cf.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:22 +01:00
Sebastiaan van Stijn
13de46fd4b
Revert "[20.10] update containerd binary to v1.4.12"
This reverts commit d47de2a4c7.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-20 09:24:19 +01:00
Sebastiaan van Stijn
22ff2ed34b
Merge pull request #43147 from PettitWesley/backport-fluentd-fix
[20.10 backport] backport fluentd log driver async connect fix
2022-01-20 09:23:15 +01:00
Sebastiaan van Stijn
b106f7dfd0
Merge pull request #43153 from thaJeztah/20.10_bump_go_1.16.13
[20.10] update Go to 1.16.13
2022-01-18 17:37:54 +01:00
Sebastiaan van Stijn
aa92e697cb
[20.10] update Go to 1.16.13
go1.16.13 (released 2022-01-06) includes fixes to the compiler, linker, runtime,
and the net/http package. See the Go 1.16.13 milestone on our issue tracker for
details:

https://github.com/golang/go/issues?q=milestone%3AGo1.16.13+label%3ACherryPickApproved

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-13 16:44:11 +01:00
Albin Kerouanton
f9df098e76 fluentd: Turn ForceStopAsyncSend true when async connect is used
The flag ForceStopAsyncSend was added to fluent logger lib in v1.5.0 (at
this time named AsyncStop) to tell fluentd to abort sending logs
asynchronously as soon as possible, when its Close() method is called.
However this flag was broken because of the way the lib was handling it
(basically, the lib could be stucked in retry-connect loop without
checking this flag).

Since fluent logger lib v1.7.0, calling Close() (when ForceStopAsyncSend
is true) will really stop all ongoing send/connect procedure,
wherever it's stucked.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit bd61629b6b)
Signed-off-by: Wesley <wppttt@amazon.com>
2022-01-13 01:06:12 +00:00
Albin Kerouanton
81fc02b7e1 vendor: github.com/fluent/fluent-logger-golang v1.8.0
Updates the fluent logger library to v1.8.0. Following PRs/commits were
merged since last bump:

* [Add callback for error handling when using
  async](https://github.com/fluent/fluent-logger-golang/pull/97)
* [Fix panic when accessing unexported struct
  field](https://github.com/fluent/fluent-logger-golang/pull/99)
* [Properly stop logger during (re)connect
  failure](https://github.com/fluent/fluent-logger-golang/pull/82)
* [Support a TLS-enabled connection](e5d6aa13b7)

See https://github.com/fluent/fluent-logger-golang/compare/v1.6.1..v1.8.0

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
(cherry picked from commit e24d61b7ef)
Signed-off-by: Wesley <wppttt@amazon.com>
2022-01-13 01:05:52 +00:00
Cam
d6f3add5c6 vendor: github.com/fluent/fluent-logger-golang 1.6.1
Updates the fluent logger library. Namely this fixes a couple places
where the library could panic when closing and writing to channels.

see https://github.com/fluent/fluent-logger-golang/pull/93 and
https://github.com/fluent/fluent-logger-golang/pull/95

closes #40829
closes #32567

Signed-off-by: Cam <gh@sparr.email>
(cherry picked from commit a6a98d6928)
Signed-off-by: Wesley <wppttt@amazon.com>
2022-01-13 01:05:12 +00:00
Tianon Gravi
b1fc0c84de
Merge pull request #43084 from AkihiroSuda/cherrypick-42736
[20.10 backport] daemon.WithCommonOptions() fix detection of user-namespaces
2022-01-07 16:09:12 -08:00
Sebastiaan van Stijn
660b9962e4
daemon.WithCommonOptions() fix detection of user-namespaces
Commit dae652e2e5 added support for non-privileged
containers to use ICMP_PROTO (used for `ping`). This option cannot be set for
containers that have user-namespaces enabled.

However, the detection looks to be incorrect; HostConfig.UsernsMode was added
in 6993e891d1 / ee2183881b,
and the property only has meaning if the daemon is running with user namespaces
enabled. In other situations, the property has no meaning.
As a result of the above, the sysctl would only be set for containers running
with UsernsMode=host on a daemon running with user-namespaces enabled.

This patch adds a check if the daemon has user-namespaces enabled (RemappedRoot
having a non-empty value), or if the daemon is running inside a user namespace
(e.g. rootless mode) to fix the detection.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit a826ca3aef)

---
The cherry-pick was almost clean but `userns.RunningInUserNS()` -> `sys.RunningInUserNS()`.

Fix docker/buildx issue 561
---

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-12-15 18:20:07 +09:00
Justin Cormack
459d0dfbbb
Merge pull request #43077 from thaJeztah/20.10_bump_go_1.16.12
[20.10] update Go to 1.16.12
2021-12-12 10:11:51 +00:00
Sebastiaan van Stijn
a621bc007b
[20.10] update Go to 1.16.12
go1.16.12 (released 2021-12-09) includes security fixes to the syscall and net/http
packages. See the Go 1.16.12 milestone on the issue tracker for details:

https://github.com/golang/go/issues?q=milestone%3AGo1.16.12+label%3ACherryPickApproved

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-12-12 01:15:00 +01:00
Brian Goff
4bed71ae9a
Merge pull request #43063 from thaJeztah/20.10_bump_go_1.16.11
[20.10] update Go to 1.16.11
2021-12-08 10:09:17 -08:00
Sebastiaan van Stijn
f4daf9dd08
[20.10] update Go to 1.16.11
go1.16.11 (released 2021-12-02) includes fixes to the compiler, runtime, and the
net/http, net/http/httptest, and time packages. See the Go 1.16.11 milestone on
the issue tracker for details:

https://github.com/golang/go/issues?q=milestone%3AGo1.16.11+label%3ACherryPickApproved

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-12-06 10:16:12 +01:00
Sebastiaan van Stijn
847da184ad
Merge pull request #43004 from AkihiroSuda/cherrypick-42152
[20.10 backport] info: unset cgroup-related fields when CgroupDriver == none
2021-11-18 01:21:59 +01:00
Sebastiaan van Stijn
4a27cd1a1b
Merge pull request #43027 from thaJeztah/20.10_backport_update_image_spec
[20.10 backport] vendor: github.com/opencontainers/image-spec v1.0.2
2021-11-18 01:16:31 +01:00
Sebastiaan van Stijn
7568123fc4
Merge pull request #43023 from thaJeztah/20.10_bump_buildkit
[20.10] vendor: github.com/moby/buildkit v0.8.3-4-gbc07b2b8
2021-11-18 00:18:46 +01:00
Sebastiaan van Stijn
c98869341b
Merge pull request #43024 from thaJeztah/20.10_containerd_1.4.12
[20.10] update containerd binary to v1.4.12
2021-11-18 00:04:51 +01:00
Sebastiaan van Stijn
dc015972bb
vendor: github.com/opencontainers/image-spec v1.0.2
- Bring mediaType out of reserved status
- specs-go: adding mediaType to the index and manifest structures

full diff: https://github.com/opencontainers/image-spec/compare/v1.0.1...v1.0.2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cef0a7c14e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-18 00:03:29 +01:00
Sebastiaan van Stijn
11b3bfee6c
Merge pull request #43028 from thaJeztah/20.10_fix_vendor_conf
[20.10] fix vendor validation
2021-11-17 23:08:36 +01:00
Sebastiaan van Stijn
e0108db2bd
[20.10] fix vendor validation
Looks like vndr didn't like the replace rule missing a scheme;

    github.com/docker/distribution: Err: exit status 128, out: fatal: repository 'github.com/samuelkarp/docker-distribution' does not exist
    github.com/containerd/containerd: Err: exit status 128, out: fatal: repository 'github.com/moby/containerd' does not exist

While at it, I also replaced the schem for go-immutable-radix, because GitHub
is deprecating the git:// protocol.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-17 22:13:44 +01:00
Sebastiaan van Stijn
d47de2a4c7
[20.10] update containerd binary to v1.4.12
The twelfth patch release for containerd 1.4 contains a few minor bug fixes
and an update to mitigate CVE-2021-41190.

Notable Updates

* Handle ambiguous OCI manifest parsing GHSA-5j5w-g665-5m35
* Update pull to try next mirror for non-404 errors
* Update pull to handle of non-https urls in descriptors

See the changelog for complete list of changes

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-17 21:01:41 +01:00
Sebastiaan van Stijn
da9c983789
[20.10] vendor: github.com/moby/buildkit v0.8.3-4-gbc07b2b8
imageutil: make mediatype detection more stricter to mitigate CVE-2021-41190.

full diff: 244e8cde63...bc07b2b81b

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-17 20:40:17 +01:00
Sebastiaan van Stijn
10106a0f66
Merge pull request from GHSA-xmmx-7jpf-fx42
[20.10] vendor: update github.com/docker/distribution and github.com/containerd/containerd
2021-11-17 20:34:50 +01:00
Samuel Karp
c1f352c4b1
distribution: validate blob type
Signed-off-by: Samuel Karp <skarp@amazon.com>
2021-11-15 14:25:52 -08:00
Samuel Karp
c96ed28f2f
vendor: update github.com/containerd/containerd
Signed-off-by: Samuel Karp <skarp@amazon.com>
2021-11-15 14:25:52 -08:00
Akihiro Suda
7bd682c48c
Merge pull request #43008 from thaJeztah/20.10_backport_fix_TestBuildUserNamespaceValidateCapabilitiesAreV2
[20.10 backport] TestBuildUserNamespaceValidateCapabilitiesAreV2: cleanup daemon storage
2021-11-11 15:36:36 +09:00
Sebastiaan van Stijn
7677aeafd7
TestBuildUserNamespaceValidateCapabilitiesAreV2: cleanup daemon storage
This should help with Jenkins failing to clean up the Workspace:

- make sure "cleanup" is also called in the defer for all daemons. keeping
  the daemon's storage around prevented Jenkins from cleaning up.
- close client connections and some readers (just to be sure)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit eea2758761)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-10 14:12:12 +01:00
Sebastiaan van Stijn
34eb6fbe60
testutil: daemon.Cleanup(): cleanup more directories
The storage-driver directory caused Jenkins cleanup to fail. While at it, also
removing other directories that we do not include in the "bundles" that are
stored as Jenkins artifacts.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 1a15a1a061)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-10 14:12:09 +01:00
Sebastiaan van Stijn
c7b97c306a
Merge pull request #42987 from thaJeztah/20.10_backport_create_panic_log_without_readonly
[20.10 backport] [Windows]] cmd/dockerd: create panic.log file without readonly flag
2021-11-09 22:05:55 +01:00
Akihiro Suda
0e76a0a418
info: unset cgroup-related fields when CgroupDriver == none
Fix issue 42151

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
(cherry picked from commit 039e9670cb)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-11-09 18:43:44 +09:00
Tianon Gravi
d9b2f4d0c8
Merge pull request #42989 from thaJeztah/20.10_bump_go_1.16.10
[20.10] Update Go to 1.16.10
2021-11-08 13:05:28 -08:00
Sebastiaan van Stijn
c7edd308ad
[20.10] Update Go to 1.16.10
go1.16.10 (released 2021-11-04) includes security fixes to the archive/zip and
debug/macho packages, as well as bug fixes to the compiler, linker, runtime, the
misc/wasm directory, and to the net/http package. See the Go 1.16.10 milestone
for details: https://github.com/golang/go/issues?q=milestone%3AGo1.16.10+label%3ACherryPickApproved

From the announcement e-mail:

[security] Go 1.17.3 and Go 1.16.10 are released

We have just released Go versions 1.17.3 and 1.16.10, minor point releases.
These minor releases include two security fixes following the security policy:

- archive/zip: don't panic on (*Reader).Open
  Reader.Open (the API implementing io/fs.FS introduced in Go 1.16) can be made
  to panic by an attacker providing either a crafted ZIP archive containing
  completely invalid names or an empty filename argument.
  Thank you to Colin Arnott, SiteHost and Noah Santschi-Cooney, Sourcegraph Code
  Intelligence Team for reporting this issue. This is CVE-2021-41772 and Go issue
  golang.org/issue/48085.
- debug/macho: invalid dynamic symbol table command can cause panic
  Malformed binaries parsed using Open or OpenFat can cause a panic when calling
  ImportedSymbols, due to an out-of-bounds slice operation.
  Thanks to Burak Çarıkçı - Yunus Yıldırım (CT-Zer0 Crypttech) for reporting this
  issue. This is CVE-2021-41771 and Go issue golang.org/issue/48990.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-05 11:10:49 +01:00
Samuel Karp
b3456925ca
vendor: update github.com/docker/distribution
Signed-off-by: Samuel Karp <skarp@amazon.com>
2021-11-04 14:41:33 -07:00
Aleksandr Chebotov
6611c72b65
cmd/dockerd: create panic.log file without readonly flag
Signed-off-by: Aleksandr Chebotov <v-aleche@microsoft.com>
(cherry picked from commit b865204042)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-11-03 14:30:08 +01:00
Sebastiaan van Stijn
9a309c6165
Merge pull request #42971 from thaJeztah/20.10_backport_fix_TestCreateServiceSecretFileMode
[20.10 backport] Fix race in TestCreateServiceSecretFileMode, TestCreateServiceConfigFileMode
2021-11-03 14:29:53 +01:00
Sebastiaan van Stijn
4b9a3dac46
Fix race in TestCreateServiceSecretFileMode, TestCreateServiceConfigFileMode
Looks like this test was broken from the start, and fully relied on a race
condition. (Test was added in 65ee7fff02)

The problem is in the service's command: `ls -l /etc/config || /bin/top`, which
will either:

- exit immediately if the secret is mounted correctly at `/etc/config` (which it should)
- keep running with `/bin/top` if the above failed

After the service is created, the test enters a race-condition, checking for 1
task to be running (which it ocassionally is), after which it proceeds, and looks
up the list of tasks of the service, to get the log output of `ls -l /etc/config`.

This is another race: first of all, the original filter for that task lookup did
not filter by `running`, so it would pick "any" task of the service (either failed,
running, or "completed" (successfully exited) tasks).

In the meantime though, SwarmKit kept reconciling the service, and creating new
tasks, so even if the test was able to get the ID of the correct task, that task
may already have been exited, and removed (task-limit is 5 by default), so only
if the test was "lucky", it would be able to get the logs, but of course, chances
were likely that it would be "too late", and the task already gone.

The problem can be easily reproduced when running the steps manually:

    echo 'CONFIG' | docker config create myconfig -

    docker service create --config source=myconfig,target=/etc/config,mode=0777 --name myservice busybox sh -c 'ls -l /etc/config || /bin/top'

The above creates the service, but it keeps retrying, because each task exits
immediately (followed by SwarmKit reconciling and starting a new task);

    mjntpfkkyuuc1dpay4h00c4oo
    overall progress: 0 out of 1 tasks
    1/1: ready     [======================================>            ]
    verify: Detected task failure
    ^COperation continuing in background.
    Use `docker service ps mjntpfkkyuuc1dpay4h00c4oo` to check progress.

And checking the tasks for the service reveals that tasks exit cleanly (no error),
but _do exit_, so swarm just keeps up reconciling, and spinning up new tasks;

    docker service ps myservice --no-trunc
    ID                          NAME              IMAGE                                                                                    NODE             DESIRED STATE   CURRENT STATE                     ERROR     PORTS
    2wmcuv4vffnet8nybg3he4v9n   myservice.1       busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57   docker-desktop   Ready           Ready less than a second ago
    5p8b006uec125iq2892lxay64    \_ myservice.1   busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57   docker-desktop   Shutdown        Complete less than a second ago
    k8lpsvlak4b3nil0zfkexw61p    \_ myservice.1   busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57   docker-desktop   Shutdown        Complete 6 seconds ago
    vsunl5pi7e2n9ol3p89kvj6pn    \_ myservice.1   busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57   docker-desktop   Shutdown        Complete 11 seconds ago
    orxl8b6kt2l6dfznzzd4lij4s    \_ myservice.1   busybox:latest@sha256:f7ca5a32c10d51aeda3b4d01c61c6061f497893d7f6628b92f822f7117182a57   docker-desktop   Shutdown        Complete 17 seconds ago

This patch changes the service's command to `sleep`, so that a successful task
(after successfully performing `ls -l /etc/config`) continues to be running until
the service is deleted. With that change, the service should (usually) reconcile
immediately, which removes the race condition, and should also make it faster :)

This patch changes the tests to use client.ServiceLogs() instead of using the
service's tasklist to directly access container logs. This should also fix some
failures that happened if some tasks failed to start before reconciling, in which
case client.TaskList() (with the current filters), could return more tasks than
anticipated (as it also contained the exited tasks);

    === RUN   TestCreateServiceSecretFileMode
        create_test.go:291: assertion failed: 2 (int) != 1 (int)
    --- FAIL: TestCreateServiceSecretFileMode (7.88s)
    === RUN   TestCreateServiceConfigFileMode
        create_test.go:355: assertion failed: 2 (int) != 1 (int)
    --- FAIL: TestCreateServiceConfigFileMode (7.87s)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 13cff6d583)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-27 12:30:35 +02:00
Sebastiaan van Stijn
e2f740de44
Merge pull request #42959 from thaJeztah/20.10_backport_fix_racey_health_test
[20.10 backport] Fix racey TestHealthKillContainer
2021-10-22 17:32:27 +02:00