The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.
Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.
Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.
Signed-off-by: Cory Snider <csnider@mirantis.com>
After moving libnetwork to this repo, we need to update all the import
paths for libnetwork to point to docker/docker/libnetwork instead of
docker/libnetwork.
This change implements that.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This patch adds a new "prune" event type to indicate that pruning of a resource
type completed.
This event-type can be used on systems that want to perform actions after
resources have been cleaned up. For example, Docker Desktop performs an fstrim
after resources are deleted (https://github.com/linuxkit/linuxkit/tree/v0.7/pkg/trim-after-delete).
While the current (remove, destroy) events can provide information on _most_
resources, there is currently no event triggered after the BuildKit build-cache
is cleaned.
Prune events have a `reclaimed` attribute, indicating the amount of space that
was reclaimed (in bytes). The attribute can be used, for example, to use as a
threshold for performing fstrim actions. Reclaimed space for `network` events
will always be 0, but the field is added to be consistent with prune events for
other resources.
To test this patch:
Create some resources:
for i in foo bar baz; do \
docker network create network_$i \
&& docker volume create volume_$i \
&& docker run -d --name container_$i -v volume_$i:/volume busybox sh -c 'truncate -s 5M somefile; truncate -s 5M /volume/file' \
&& docker tag busybox:latest image_$i; \
done;
docker pull alpine
docker pull nginx:alpine
echo -e "FROM busybox\nRUN truncate -s 50M bigfile" | DOCKER_BUILDKIT=1 docker build -
Start listening for "prune" events in another shell:
docker events --filter event=prune
Prune containers, networks, volumes, and build-cache:
docker system prune -af --volumes
See the events that are returned:
docker events --filter event=prune
2020-07-25T12:12:09.268491000Z container prune (reclaimed=15728640)
2020-07-25T12:12:09.447890400Z network prune (reclaimed=0)
2020-07-25T12:12:09.452323000Z volume prune (reclaimed=15728640)
2020-07-25T12:12:09.517236200Z image prune (reclaimed=21568540)
2020-07-25T12:12:09.566662600Z builder prune (reclaimed=52428841)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This stuff doesn't belong here and is causing imports of libnetwork into
the router, which is not what we want.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since Go 1.7, context is a standard package. Since Go 1.9, everything
that is provided by "x/net/context" is a couple of type aliases to
types in "context".
Many vendored packages still use x/net/context, so vendor entry remains
for now.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since the volume store already provides this functionality, we should
just use it rather than duplicating it.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The imageRefs map was being popualted with containerID, and accessed
with an imageID which would never match.
Remove this broken code because: 1) it hasn't ever worked so isn't
necessary, and 2) because at best it would be racy
ImageDelete() should already handle preventing of removal of used
images.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
The re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
- Call the function that create an event entry while volumes are
pruning.
- Pass volume.Volume type on volumeRm instead of a name. Volume lookup is done
on the exported VolumeRm function.
- Skip volume deletion when force option used and it does not exists.
Signed-off-by: Nicolas Sterchele <sterchele.nicolas@gmail.com>
The `filters.Include()` method was deprecated in favor of `filters.Contains()`
in 065118390a, but still used in various
locations.
This patch replaces uses of `filters.Include()` with `filters.Contains()`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- They are configuration-only networks which
can be used to supply the configuration
when creating regular networks.
- They do not get allocated and do net get plumbed.
Drivers do not get to know about them.
- They can be removed, once no other network is
using them.
- When user creates a network specifying a
configuration network for the config, no
other network specific configuration field
is are accepted. User can only specify
network operator fields (attachable, internal,...)
Signed-off-by: Alessandro Boch <aboch@docker.com>
This fix tries to address the issue raised in 29999 where it was not
possible to mask these items (like important non-removable stuff)
from `docker system prune`.
This fix adds `label` and `label!` field for `--filter` in `system prune`,
so that it is possible to selectively prune items like:
```
$ docker container prune --filter label=foo
$ docker container prune --filter label!=bar
```
Additional unit tests and integration tests have been added.
This fix fixes 29999.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Remove forked reference package. Use normalized named values
everywhere and familiar functions to convert back to familiar
strings for UX and storage compatibility.
Enforce that the source repository in the distribution metadata
is always a normalized string, ignore invalid values which are not.
Update distribution tests to use normalized values.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
`NetworksPrune()` is designed to ignore errors
encountered during removal of networks, and only
print them as warnings.
However, the last error encountered was returned
by the function, resulting in the prune command
to be reported as "failing" wheras it did not.
In addition, in situations where a network
failed to be removed, the networks that
_were_ succesfully removed were not reported
back.
This patch changes the function to not return
the error, and to return the list of networks
that were succesfully removed at all times.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `digest` data type, used throughout docker for image verification
and identity, has been broken out into `opencontainers/go-digest`. This
PR updates the dependencies and moves uses over to the new type.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This fix is a follow up for comment
https://github.com/docker/docker/pull/28535#issuecomment-263215225
This fix provides `--filter until=<timestamp>` for `docker container/image prune`.
This fix adds `--filter until=<timestamp>` to `docker container/image prune`
so that it is possible to specify a timestamp and prune those containers/images
that are earlier than the timestamp.
Related docs has been updated
Several integration tests have been added to cover changes.
This fix fixes#28497.
This fix is related to #28535.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This fix convert DanglingOnly in ImagesPruneConfig to Filters,
so that it is possible to maintain API compatibility in the future.
Several integration tests have been added to cover changes.
This fix is related to 28497.
A follow up to this PR will be done once this PR is merged.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
`docker network prune` prunes unused networks, including overlay ones.
`docker system prune` also prunes unused networks.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>