These HostConfig properties were not validated until the OCI spec for the container
was created, which meant that `container run` and `docker create` would accept
invalid values, and the invalid value would not be detected until `start` was
called, returning a 500 "internal server error", as well as errors from containerd
("cleanup: failed to delete container from containerd: no such container") in the
daemon logs.
As a result, a faulty container was created, and the container state remained
in the `created` state.
This patch:
- Updates `oci.WithNamespaces()` to return the correct `errdefs.InvalidParameter`
- Updates `verifyPlatformContainerSettings()` to validate these settings, so that
an error is returned when _creating_ the container.
Before this patch:
docker run -dit --ipc=shared --name foo busybox
2a00d74e9fbb7960c4718def8f6c74fa8ee754030eeb93ee26a516e27d4d029f
docker: Error response from daemon: Invalid IPC mode: shared.
docker ps -a --filter name=foo
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2a00d74e9fbb busybox "sh" About a minute ago Created foo
After this patch:
docker run -dit --ipc=shared --name foo busybox
docker: Error response from daemon: invalid IPC mode: shared.
docker ps -a --filter name=foo
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
An integration test was added to verify the new validation, which can be run with:
make BIND_DIR=. TEST_FILTER=TestCreateInvalidHostConfig DOCKER_GRAPHDRIVER=vfs test-integration
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Reduce the amount of time ReadLogs holds the LogFile fsop lock by
releasing it as soon as all the files are opened, before parsing the
compressed file headers.
Signed-off-by: Cory Snider <csnider@mirantis.com>
File watches have been a source of complexity and unreliability in the
LogFile follow implementation, especially when combined with file
rotation. File change events can be unreliably delivered, especially on
Windows, and the polling fallback adds latency. Following across
rotations has never worked reliably on Windows. Without synchronization
between the log writer and readers, race conditions abound: readers can
read from the file while a log entry is only partially written, leading
to decode errors and necessitating retries.
In addition to the complexities stemming from file watches, the LogFile
follow implementation had complexity from needing to handle file
truncations, and (due to a now-fixed bug in the polling file watcher
implementation) evictions to unlock the log file so it could be rotated.
Log files are now always rotated, never truncated, so these situations
no longer need to be handled by the follow code.
Rewrite the LogFile follow implementation in terms of waiting until
LogFile notifies it that a new message has been written to the log file.
The LogFile informs the follower of the file offset of the last complete
write so that the follower knows not to read past that, preventing it
from attempting to decode partial messages and making retries
unnecessary. Synchronization between LogFile and its followers is used
at critical points to prevent missed notifications of writes and races
between file rotations and the follower opening files for read.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The refCounter used for sharing temporary decompressed log files and
tracking when the files can be deleted is keyed off the source file's
path. But the path of a log file is not stable: it is renamed on each
rotation. Consequently, when logging is configured with both rotation
and compression, multiple concurrent readers of a container's logs could
read logs out of order, see duplicates or decompress a log file which
has already been decompressed.
Replace refCounter with a new implementation, sharedTempFileConverter,
which is agnostic to the file path, keying off the source file's
identity instead. Additionally, sharedTempFileConverter handles the full
lifecycle of the temporary file, from creation to deletion. This is all
abstracted from the consumer: all the bookkeeping and cleanup is handled
behind the scenes when Close() is called on the returned reader value.
Only one file descriptor is used per temporary file, which is shared by
all readers.
A channel is used for concurrency control so that the lock can be
acquired inside a select statement. While not currently utilized, this
makes it possible to add support for cancellation to
sharedTempFileConverter in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Truncating the current log file while a reader is still reading through
it results in log lines getting missed. In contrast, rotating the file
allows readers who have the file open can continue to read from it
undisturbed. Rotating frees up the file name for the logger to create a
new file in its place. This remains true even when max-file=1; the
current log file is "rotated" from its name without giving it a new one.
On POSIXy filesystem APIs, rotating the last file is straightforward:
unlink()ing a file name immediately deletes the name from the filesystem
and makes it available for reuse, even if processes have the file open
at the time. Windows on the other hand only makes the name available
for reuse once the file itself is deleted, which only happens when no
processes have it open. To reuse the file name while the file is still
in use, the file needs to be renamed. So that's what we have to do:
rotate the file to a temporary name before marking it for deletion.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The json-file driver appends a newline character to log messages with
PLogMetaData.Last set, but the local driver did not. Alter the behavior
of the local driver to match that of the json-file driver.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The LogFile follower would stop immediately upon the producer closing.
The close signal would race the file watcher; if a message were to be
logged and the logger immediately closed, the follower could miss that
last message if the close signal (formerly ProducerGone) was to win the
race. Add logic to perform one more round of reading when the producer
is closed to catch up on any final logs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Whether or not the logger has been closed is a property of the logger,
and only of concern to its log reading implementation, not log watchers.
The loggers and their reader implementations can communicate as they see
fit. A single channel per logger which is closed when the logger is
closed is plenty sufficient to broadcast the state to log readers, with
no extra bookeeping or synchronization required.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The asynchronous startup of the log-reading goroutine made the
follow-tail tests nondeterministic. The Log calls in the tests which
were supposed to happen after the reader started reading would sometimes
execute before the reader, throwing off the counts. Tweak the ReadLogs
implementation so that the order of operations is deterministic.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an extensive test suite for validating the behavior of any
LogReader. Test the current LogFile-based implementations against it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The jsonfilelog read benchmark was incorrectly reusing the same message
pointer in the producer loop. The message value would be reset after the
first call to jsonlogger.Log, resulting in all subsequent calls logging
a zero-valued message. This is not a representative workload for
benchmarking and throws off the throughput metric.
Reduce variation between benchmark runs by using a constant timestamp.
Write to the producer goroutine's error channel only on a non-nil error
to eliminate spurious synchronization between producer and consumer
goroutines external to the logger being benchmarked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
On Linux the daemon was not respecting the HostConfig.ConsoleSize
property and relied on cli initializing the tty size after the container
was created. This caused a delay between container creation and
the tty actually being resized.
This is also a small change to the api description, because
HostConfig.ConsoleSize is no longer Windows-only.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Similar to the (now removed) `apparmor` build tag, this build-time toggle existed for users who needed to build without the `libseccomp` library. That's no longer necessary, and given the importance of seccomp to the overall default security profile of Docker containers, it makes sense that any binary built for Linux should support (and use by default) seccomp if the underlying host does.
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
- remove isErrNoSuchProcess() in favor of a plain errors.As()
- errNoSuchProcess.Error(): remove punctuation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows the postContainersKill() handler to pass values as-is. As part of
the rewrite, I also moved the daemon.GetContainer(name) call later in the
function, so that we can fail early if an invalid signal is passed, before
doing the (heavier) fetching of the container.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The libtrust trust-key is only used for pushing legacy image manifests;
pushing these images has been deprecated, and we only need to be able
to push them in our CI.
This patch disables generating the trust-key (and related paths) unless
the DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE env-var is set (which we do in
our CI).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change is in preparation of deprecating support for old manifests.
Currently the daemon's ID is based on the trust-key ID, which will be
removed once we fully deprecate support for old manifests (the trust
key is currently only used in tests).
This patch:
- looks if a trust-key is present; if so, it migrates the trust-key
ID to the new "engine-id" file within the daemon's root.
- if no trust-key is present (so in case it's a "fresh" install), we
generate a UUID instead and use that as ID.
The migration is to prevent engines from getting a new ID on upgrades;
while we don't provide any guarantees on the engine's ID, users may
expect the ID to be "stable" (not change) between upgrades.
A test has been added, which can be ran with;
make DOCKER_GRAPHDRIVER=vfs TEST_FILTER='TestConfigDaemonID' test-integration
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure we validate the default address given before using it, and
combine the parsing/validation logic so that it can be reused.
This patch also makes the errors more consistent, and uses pkg/errors
for generating them.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the default (0) value to indicate "not set", which simplifies
working with these configuration options, preventing the need to
use intermediate variables etc.
While changing this code, also making some small cleanups, such
as replacing "fmt.Sprintf()" for "strconv" variants.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This method returned the network controller, only to set it on the daemon.
While making this change, also;
- update some error messages to be in the correct format
- use errors.Wrap() where possible
- extract configuring networks into a separate function to make the flow
slightly easier to follow.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Starting an exec can take a significant amount of time while under heavy
container operation load. In extreme cases the time to start the process
can take upwards of a second, which is a significant fraction of the
default health probe timeout (30s). With a shorter timeout, the exec
start delay could make the difference between a successful probe and a
probe timeout! Mitigate the impact of excessive exec start latencies by
only starting the probe timeout timer after the exec'ed process has
started.
Add a metric to sample the latency of starting health-check exec probes.
Signed-off-by: Cory Snider <csnider@mirantis.com>