Commit graph

7050 commits

Author SHA1 Message Date
Sebastiaan van Stijn
a5f6500958
replace deprecated gotest.tools' env.Patch() with t.SetEnv()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-28 12:12:39 +02:00
Sebastiaan van Stijn
c6cc03747d
daemon/images: use gotest.tools for tests, and use sub-tests
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-27 15:36:14 +02:00
Sebastiaan van Stijn
c3d7a0c603
Fix validation of IpcMode, PidMode, UTSMode, CgroupnsMode
These HostConfig properties were not validated until the OCI spec for the container
was created, which meant that `container run` and `docker create` would accept
invalid values, and the invalid value would not be detected until `start` was
called, returning a 500 "internal server error", as well as errors from containerd
("cleanup: failed to delete container from containerd: no such container") in the
daemon logs.

As a result, a faulty container was created, and the container state remained
in the `created` state.

This patch:

- Updates `oci.WithNamespaces()` to return the correct `errdefs.InvalidParameter`
- Updates `verifyPlatformContainerSettings()` to validate these settings, so that
  an error is returned when _creating_ the container.

Before this patch:

    docker run -dit --ipc=shared --name foo busybox
    2a00d74e9fbb7960c4718def8f6c74fa8ee754030eeb93ee26a516e27d4d029f
    docker: Error response from daemon: Invalid IPC mode: shared.

    docker ps -a --filter name=foo
    CONTAINER ID   IMAGE     COMMAND   CREATED              STATUS    PORTS     NAMES
    2a00d74e9fbb   busybox   "sh"      About a minute ago   Created             foo

After this patch:

    docker run -dit --ipc=shared --name foo busybox
    docker: Error response from daemon: invalid IPC mode: shared.

     docker ps -a --filter name=foo
    CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

An integration test was added to verify the new validation, which can be run with:

    make BIND_DIR=. TEST_FILTER=TestCreateInvalidHostConfig DOCKER_GRAPHDRIVER=vfs test-integration

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-25 17:41:51 +02:00
Sebastiaan van Stijn
d633169483
Merge pull request #43484 from ndeloof/create_host_path
introduce CreateMountpoint for parity between binds and mounts
2022-05-19 23:06:01 +02:00
Cory Snider
a67e159909 daemon/logger: hold LogFile lock less on ReadLogs
Reduce the amount of time ReadLogs holds the LogFile fsop lock by
releasing it as soon as all the files are opened, before parsing the
compressed file headers.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:23:18 -04:00
Cory Snider
01915a725e daemon/logger: follow LogFile without file watches
File watches have been a source of complexity and unreliability in the
LogFile follow implementation, especially when combined with file
rotation. File change events can be unreliably delivered, especially on
Windows, and the polling fallback adds latency. Following across
rotations has never worked reliably on Windows. Without synchronization
between the log writer and readers, race conditions abound: readers can
read from the file while a log entry is only partially written, leading
to decode errors and necessitating retries.

In addition to the complexities stemming from file watches, the LogFile
follow implementation had complexity from needing to handle file
truncations, and (due to a now-fixed bug in the polling file watcher
implementation) evictions to unlock the log file so it could be rotated.
Log files are now always rotated, never truncated, so these situations
no longer need to be handled by the follow code.

Rewrite the LogFile follow implementation in terms of waiting until
LogFile notifies it that a new message has been written to the log file.
The LogFile informs the follower of the file offset of the last complete
write so that the follower knows not to read past that, preventing it
from attempting to decode partial messages and making retries
unnecessary. Synchronization between LogFile and its followers is used
at critical points to prevent missed notifications of writes and races
between file rotations and the follower opening files for read.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
6d5bc07189 daemon/logger: fix refcounting decompressed files
The refCounter used for sharing temporary decompressed log files and
tracking when the files can be deleted is keyed off the source file's
path. But the path of a log file is not stable: it is renamed on each
rotation. Consequently, when logging is configured with both rotation
and compression, multiple concurrent readers of a container's logs could
read logs out of order, see duplicates or decompress a log file which
has already been decompressed.

Replace refCounter with a new implementation, sharedTempFileConverter,
which is agnostic to the file path, keying off the source file's
identity instead. Additionally, sharedTempFileConverter handles the full
lifecycle of the temporary file, from creation to deletion. This is all
abstracted from the consumer: all the bookkeeping and cleanup is handled
behind the scenes when Close() is called on the returned reader value.
Only one file descriptor is used per temporary file, which is shared by
all readers.

A channel is used for concurrency control so that the lock can be
acquired inside a select statement. While not currently utilized, this
makes it possible to add support for cancellation to
sharedTempFileConverter in the future.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
49aa66b597 daemon/logger: rotate log files, never truncate
Truncating the current log file while a reader is still reading through
it results in log lines getting missed. In contrast, rotating the file
allows readers who have the file open can continue to read from it
undisturbed. Rotating frees up the file name for the logger to create a
new file in its place. This remains true even when max-file=1; the
current log file is "rotated" from its name without giving it a new one.

On POSIXy filesystem APIs, rotating the last file is straightforward:
unlink()ing a file name immediately deletes the name from the filesystem
and makes it available for reuse, even if processes have the file open
at the time. Windows on the other hand only makes the name available
for reuse once the file itself is deleted, which only happens when no
processes have it open. To reuse the file name while the file is still
in use, the file needs to be renamed. So that's what we have to do:
rotate the file to a temporary name before marking it for deletion.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
990b0e28ba daemon/logger/local: fix appending newlines
The json-file driver appends a newline character to log messages with
PLogMetaData.Last set, but the local driver did not. Alter the behavior
of the local driver to match that of the json-file driver.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
3844d1a3d1 daemon/logger: drain readers when logger is closed
The LogFile follower would stop immediately upon the producer closing.
The close signal would race the file watcher; if a message were to be
logged and the logger immediately closed, the follower could miss that
last message if the close signal (formerly ProducerGone) was to win the
race. Add logic to perform one more round of reading when the producer
is closed to catch up on any final logs.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
906b979b88 daemon/logger: remove ProducerGone from LogWatcher
Whether or not the logger has been closed is a property of the logger,
and only of concern to its log reading implementation, not log watchers.
The loggers and their reader implementations can communicate as they see
fit. A single channel per logger which is closed when the logger is
closed is plenty sufficient to broadcast the state to log readers, with
no extra bookeeping or synchronization required.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
ae5f664f4e daemon/logger: open log reader synchronously
The asynchronous startup of the log-reading goroutine made the
follow-tail tests nondeterministic. The Log calls in the tests which
were supposed to happen after the reader started reading would sometimes
execute before the reader, throwing off the counts. Tweak the ReadLogs
implementation so that the order of operations is deterministic.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:22 -04:00
Cory Snider
9aa9d6fafc daemon/logger: add test suite for LogReaders
Add an extensive test suite for validating the behavior of any
LogReader. Test the current LogFile-based implementations against it.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:21 -04:00
Cory Snider
961d32868c daemon/logger: improve jsonfilelog read benchmark
The jsonfilelog read benchmark was incorrectly reusing the same message
pointer in the producer loop. The message value would be reset after the
first call to jsonlogger.Log, resulting in all subsequent calls logging
a zero-valued message. This is not a representative workload for
benchmarking and throws off the throughput metric.

Reduce variation between benchmark runs by using a constant timestamp.

Write to the producer goroutine's error channel only on a non-nil error
to eliminate spurious synchronization between producer and consumer
goroutines external to the logger being benchmarked.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-05-19 15:22:21 -04:00
Sebastiaan van Stijn
517afce0c4
Merge pull request #43557 from neersighted/overlay2-report-metacopy
[v2] overlay2: test for and report metacopy status
2022-05-19 21:16:40 +02:00
Nicolas De Loof
304fbf0804
introduce CreateMountpoint for parity between binds and mounts
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
2022-05-19 16:43:06 +02:00
Paweł Gronowski
85a7f5a09a daemon/linux: Set console size on creation
On Linux the daemon was not respecting the HostConfig.ConsoleSize
property and relied on cli initializing the tty size after the container
was created. This caused a delay between container creation and
the tty actually being resized.

This is also a small change to the api description, because
HostConfig.ConsoleSize is no longer Windows-only.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2022-05-19 07:57:27 +02:00
Bjorn Neergaard
ce3e2d1955 overlay2: account for UserNS/userxattr in metacopy test
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
2022-05-17 06:58:50 -06:00
Brian Goff
4e025b54d5 Remove mount spec backport
This was added in 1.13 to "upgrade" old mount specs to the new format.
This is no longer needed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2022-05-13 23:14:43 +00:00
Bjorn Neergaard
2c3d1f7b4b overlay2: test for and report metacopy status
This is a first, naive implementation, that does not account for
userxattr/UserNS.

Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
2022-05-13 07:37:20 -06:00
Samuel Karp
a75620086f
Merge pull request #43580 from thaJeztah/remove_initlayer_stub 2022-05-13 01:09:01 -07:00
Brian Goff
f32b304a8f
Merge pull request #42501 from tianon/always-seccomp
Remove "seccomp" build tag
2022-05-12 19:12:15 -07:00
Sebastiaan van Stijn
34e02d9b04
Merge pull request #43524 from thaJeztah/daemon_fix_hosts_validation_step2
opts: ParseTCPAddr(): extract parsing logic, consistent errors
2022-05-13 02:42:40 +02:00
Drew Erny
240a9fcb83
Add Swarm cluster volume supports
Adds code to support Cluster Volumes in Swarm using CSI drivers.

Signed-off-by: Drew Erny <derny@mirantis.com>
2022-05-13 00:55:44 +02:00
Tianon Gravi
c9e19a2aa1 Remove "seccomp" build tag
Similar to the (now removed) `apparmor` build tag, this build-time toggle existed for users who needed to build without the `libseccomp` library.  That's no longer necessary, and given the importance of seccomp to the overall default security profile of Docker containers, it makes sense that any binary built for Linux should support (and use by default) seccomp if the underlying host does.

Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
2022-05-12 14:48:35 -07:00
Nicolas De Loof
af5d83a641
Make it explicit raw|multiplexed stream implementation being used
fix #35761

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
2022-05-12 11:36:31 +02:00
Sebastiaan van Stijn
61fec7b36e
daemon/initlayer: Init(): remove unused stub for Windows
This package is only used in unix/linux files, so we don't
need a stub for Windows.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-11 01:27:47 +02:00
Sebastiaan van Stijn
3228dbaaa9
Merge pull request #43555 from thaJeztah/separate_engine_id
daemon: separate daemon ID from trust-key, and disable generating
2022-05-10 14:27:42 +02:00
Eng Zer Jun
7873c27cfb
all: replace strings.Replace with strings.ReplaceAll
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-05-09 19:45:40 +08:00
Sebastiaan van Stijn
6b4696e18d
Merge pull request #43544 from thaJeztah/daemon_fix_hosts_validation_step1h
daemon/config: remove uses of pointers for ints
2022-05-06 17:52:52 +02:00
Sebastiaan van Stijn
d6115b8f40
daemon: fix some minor nits
- remove isErrNoSuchProcess() in favor of a plain errors.As()
- errNoSuchProcess.Error(): remove punctuation

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 11:27:59 +02:00
Sebastiaan van Stijn
d733481399
daemon: daemon.ContainerKill() accept stop-signal as string
This allows the postContainersKill() handler to pass values as-is. As part of
the rewrite, I also moved the daemon.GetContainer(name) call later in the
function, so that we can fail early if an invalid signal is passed, before
doing the (heavier) fetching of the container.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 11:27:47 +02:00
Sebastiaan van Stijn
21df9a04e0
container: StopSignal(): return syscall.Signal
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 00:53:53 +02:00
Sebastiaan van Stijn
ea1eb449b7
daemon: killWithSignal, killPossiblyDeadProcess: accept syscall.Signal
This helps reducing some type-juggling / conversions further up
the stack.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 00:53:52 +02:00
Sebastiaan van Stijn
2ec2b65e45
libcontainerd: SignalProcess(): accept syscall.Signal
This helps reducing some type-juggling / conversions further up
the stack.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-05 00:53:49 +02:00
Sebastiaan van Stijn
070da63310
daemon: only create trust-key if DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE is set
The libtrust trust-key is only used for pushing legacy image manifests;
pushing these images has been deprecated, and we only need to be able
to push them in our CI.

This patch disables generating the trust-key (and related paths) unless
the DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE env-var is set (which we do in
our CI).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-04 20:18:08 +02:00
Sebastiaan van Stijn
bb1208639b
daemon: separate daemon ID from trust-key
This change is in preparation of deprecating support for old manifests.
Currently the daemon's ID is based on the trust-key ID, which will be
removed once we fully deprecate support for old manifests (the trust
key is currently only used in tests).

This patch:

- looks if a trust-key is present; if so, it migrates the trust-key
  ID to the new "engine-id" file within the daemon's root.
- if no trust-key is present (so in case it's a "fresh" install), we
  generate a UUID instead and use that as ID.

The migration is to prevent engines from getting a new ID on upgrades;
while we don't provide any guarantees on the engine's ID, users may
expect the ID to be "stable" (not change) between upgrades.

A test has been added, which can be ran with;

    make DOCKER_GRAPHDRIVER=vfs TEST_FILTER='TestConfigDaemonID' test-integration

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-04 20:17:18 +02:00
Sebastiaan van Stijn
a3ae9a5956
opts: ParseTCPAddr(): extract parsing logic, consistent errors
Make sure we validate the default address given before using it, and
combine the parsing/validation logic so that it can be reused.

This patch also makes the errors more consistent, and uses pkg/errors
for generating them.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-05-01 19:53:40 +02:00
Sebastiaan van Stijn
545cf195e2
Merge pull request #43480 from corhere/mitigate-slow-health-check-start
Mitigate the impact of slow exec starts on health checks
2022-04-29 15:07:31 +02:00
Sebastiaan van Stijn
5486146943
Merge pull request #43525 from thaJeztah/daemon_fix_hosts_validation_step1e
daemon: daemon.initNetworkController(): dont return the controller
2022-04-29 14:12:56 +02:00
Sebastiaan van Stijn
bf04690bbc
Merge pull request #43530 from thaJeztah/api_cleanup_definitions
api/types: cleanup to use more idiomatic names
2022-04-29 11:35:43 +02:00
Sebastiaan van Stijn
e62382d014
daemon/config: remove uses of pointers for ints
Use the default (0) value to indicate "not set", which simplifies
working with these configuration options, preventing the need to
use intermediate variables etc.

While changing this code, also making some small cleanups, such
as replacing "fmt.Sprintf()" for "strconv" variants.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-29 09:39:34 +02:00
Sebastiaan van Stijn
4d22584432
Merge pull request #43536 from thaJeztah/daemon_fix_hosts_validation_step1g
daemon: improvements to config (re)loading
2022-04-29 09:39:11 +02:00
Sebastiaan van Stijn
dbd575ef91
daemon: daemon.initNetworkController(): dont return the controller
This method returned the network controller, only to set it on the daemon.

While making this change, also;

- update some error messages to be in the correct format
- use errors.Wrap() where possible
- extract configuring networks into a separate function to make the flow
  slightly easier to follow.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-29 09:08:49 +02:00
Cory Snider
bdc6473d2d health: Start probe timeout after exec starts
Starting an exec can take a significant amount of time while under heavy
container operation load. In extreme cases the time to start the process
can take upwards of a second, which is a significant fraction of the
default health probe timeout (30s). With a shorter timeout, the exec
start delay could make the difference between a successful probe and a
probe timeout! Mitigate the impact of excessive exec start latencies by
only starting the probe timeout timer after the exec'ed process has
started.

Add a metric to sample the latency of starting health-check exec probes.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-04-28 17:21:03 -04:00
Sebastiaan van Stijn
41b96bff55
update uses of container.ContainerCreateCreatedBody to CreateResponse
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-28 22:39:20 +02:00
Sebastiaan van Stijn
64e96932bd
api: rename volume.VolumeCreateBody to volume.CreateOptions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-28 22:39:14 +02:00
Sebastiaan van Stijn
3cae9fef16
imports: remove "volumetypes" aliases for api/types/volume
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-28 22:39:04 +02:00
Brian Goff
b3332b851a
Merge pull request #43517 from Juneezee/test/t.Setenv
test: use `T.Setenv` to set env vars in tests
2022-04-28 12:02:01 -07:00
Sebastiaan van Stijn
647aede6ad
Merge pull request #43515 from corhere/swarmkit-v2
Bump swarmkit to v2
2022-04-28 20:08:42 +02:00