Prevent new health check probes from racing the task deletion. This may
have been a root cause of containers taking so long to stop on Windows.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We have integration tests which assert the invariant that a
GET /containers/{id}/json response lists only IDs of execs which are in
the Running state, according to GET /exec/{id}/json. The invariant could
be violated if those requests were to race the handling of the exec's
task-exit event. The coarse-grained locking of the container ExecStore
when starting an exec task was accidentally synchronizing
(*Daemon).ProcessEvent and (*Daemon).ContainerExecInspect to it just
enough to make it improbable for the integration tests to catch the
invariant violation on execs which exit immediately. Removing the
unnecessary locking made the underlying race condition more likely for
the tests to hit.
Maintain the invariant by deleting the exec from its container's
ExecCommands before clearing its Running flag. Additionally, fix other
potential data races with execs by ensuring that the ExecConfig lock is
held whenever a mutable field is read from or written to.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing logic to handle container ID conflicts when attempting to
create a plugin container is not nearly as robust as the implementation
in daemon for user containers. Extract and refine the logic from daemon
and use it in the plugin executor.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.
Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The OOMKilled flag on a container's state has historically behaved
rather unintuitively: it is updated on container exit to reflect whether
or not any process within the container has been OOM-killed during the
preceding run of the container. The OOMKilled flag would be set to true
when the container exits if any process within the container---including
execs---was OOM-killed at any time while the container was running,
whether or not the OOM-kill was the cause of the container exiting. The
flag is "sticky," persisting through the next start of the container;
only being cleared once the container exits without any processes having
been OOM-killed that run.
Alter the behavior of the OOMKilled flag such that it signals whether
any process in the container had been OOM-killed since the most recent
start of the container. Set the flag immediately upon any process being
OOM-killed, and clear it when the container transitions to the "running"
state.
There is an ulterior motive for this change. It reduces the amount of
state the libcontainerd client needs to keep track of and clean up on
container exit. It's one less place the client could leak memory if a
container was to be deleted without going through libcontainerd.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The daemon.containerd.Exec call does not access or mutate the
container's ExecCommands store in any way, and locking the exec config
is sufficient to synchronize with the event-processing loop. Locking
the ExecCommands store while starting the exec process only serves to
block unrelated operations on the container for an extended period of
time.
Convert the Store struct's mutex to an unexported field to prevent this
from regressing in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an integration test to verify that health checks are killed on
timeout and that the output is captured.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just a minor cleanup; I went looking where these options were used, and
these occurrences set then to their default value. Removing these
assignments makes it easier to find where they're actually used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Terminating the exec process when the context is canceled has been
broken since Docker v17.11 so nobody has been able to depend upon that
behaviour in five years of releases. We are thus free from backwards-
compatibility constraints.
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Treat (storage/graph)Driver as snapshotter
Also moved some layerStore related initialization to the non-c8d case
because otherwise they get treated as a graphdriver plugins.
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since runtimes can now just be containerd shims, we need to check if the
reference is possibly a containerd shim.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The `-g` / `--graph` options were soft deprecated in favor of `--data-root` in
261ef1fa27 (v17.05.0) and at the time considered
to not be removed. However, with the move towards containerd snapshotters, having
these options around adds additional complexity to handle fallbacks for deprecated
(and hidden) flags, so completing the deprecation.
With this patch:
dockerd --graph=/var/lib/docker --validate
Flag --graph has been deprecated, Use --data-root instead
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
mkdir -p /etc/docker
echo '{"graph":"/var/lib/docker"}' > /etc/docker/daemon.json
dockerd --validate
unable to configure the Docker daemon with file /etc/docker/daemon.json: merged configuration validation from file and command line flags failed: the "graph" config file option is deprecated; use "data-root" instead
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was only used as an intermediate variable to store what's returned
by layerstore.DriverName() / ImageService.StorageDriver()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the function name more generic, as it's no longer used only
for graphdrivers but also for snapshotters.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Makes sure that tests use a config struct that's more representative
to how it's used in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This centralizes more defaults, to be part of the config struct that's
created, instead of interweaving the defaults with other code in various
places.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the information stored as part of the container for the error-message,
instead of querying the current storage driver from the daemon.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The check was accounting for old containers that did not have a storage-driver
set in their config, and was added in 4908d7f81d
for docker v0.7.0-rc6 - nearly 9 Years ago, so very likely nobody is still
depending on this ;-)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was added in 0cba7740d4, as part of
the LCOW implementation. LCOW support has been removed, so we can remove
this check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Currently only provides the existing "platform" option, but more
options will be added in follow-ups.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
full diff: 6068d1894d...48dd89375d
Finishes off the work to change references to cluster volumes in the API
from using "csi" as the magic word to "cluster". This reflects that the
volumes are "cluster volumes", not "csi volumes".
Notably, there is no change to the plugin definitions being "csinode"
and "csicontroller". This terminology is appropriate with regards to
plugins because it accurates reflects what the plugin is.
Signed-off-by: Drew Erny <derny@mirantis.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
rename some variables that collided with imports or (upcoming)
changes, e.g. `ctx`, which is commonly used for `context.Context`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>