The most notable change here is that the OCI's type uses a pointer for `Created`, which we probably should've been too, so most of these changes are accounting for that (and embedding our `Equal` implementation in the one single place it was used).
Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.
Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.
Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.
Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Passing around a bare pointer to the map of configured features in order
to propagate to consumers changes to the configuration across reloads is
dangerous. Map operations are not atomic, so concurrently reading from
the map while it is being updated is a data race as there is no
synchronization. Use a getter function to retrieve the current features
map so the features can be retrieved race-free.
Remove the unused features argument from the build router.
Signed-off-by: Cory Snider <csnider@mirantis.com>
`docker run -v /foo:/foo:ro` is now recursively read-only on kernel >= 5.12.
Automatically falls back to the legacy non-recursively read-only mount mode on kernel < 5.12.
Use `ro-non-recursive` to disable RRO.
Use `ro-force-recursive` or `rro` to explicitly enable RRO. (Fails on kernel < 5.12)
Fix issue 44978
Fix docker/for-linux issue 788
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The error returned by DecodeConfig was changed in
b6d58d749c and caused this to regress.
Allow empty request bodies for this endpoint once again.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This was deprecated in dbb48e4b29, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was deprecated in 818ee96219, which
is part of the v24.0.0 release, so we can remove it from master.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This field is deprecated since 1261fe69a3,
and will now be omitted on API v1.44 and up for the `GET /images/json`,
`GET /images/{id}/json`, and `GET /system/df` endpoints.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 24.0 branch was created, so changes in master/main should now be
targeting the next version of the API (1.44).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- forward-port changes from 0ffaa6c785 to api/swagger.yaml (v1.44-dev)
- backports the changes to v1.43;
- Update container OOMKilled flag immediately 57d2d6ef62
- Add no-new-privileges to SecurityOptions returned by /info eb7738221c
- API: deprecate VirtualSize field for /images/json and /images/{id}/json 1261fe69a3
- api/types/container: create type for changes endpoint dbb48e4b29
- builder-next/prune: Handle "until" filter timestamps 54a125f677
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As of Go 1.8, "net/http".Server provides facilities to close all
listeners, making the same facilities in server.Server redundant.
http.Server also improves upon server.Server by additionally providing a
facility to also wait for outstanding requests to complete after closing
all listeners. Leverage those facilities to give in-flight requests up
to five seconds to finish up after all containers have been shut down.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The image store sends events when a new image is created/tagged, using
it instead of the reference store makes sure we send the "tag" event
when a new image is built using buildx.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Fixes `docker system prune --filter until=<timestamp>`.
`docker system prune` claims to support "until" filter for timestamps,
but it doesn't work because builder "until" filter only supports
duration.
Use the same filter parsing logic and then convert the timestamp to a
relative "keep-duration" supported by buildkit.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
In versions of Docker before v1.10, this field was calculated from
the image itself and all of its parent images. Images are now stored
self-contained, and no longer use a parent-chain, making this field
an equivalent of the Size field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The signatures of functions in containerd's errdefs packages are very
similar to those in our own, and it's easy to accidentally use the wrong
package.
This patch uses a consistent alias for all occurrences of this import.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Push the reference parsing from repo and tag names into the api and pass
a reference object to the ImageService.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Commit 3991faf464 moved search into the registry
package, which also made the `dockerversion` package a dependency for registry,
which brings additional (indirect) dependencies, such as `pkg/parsers/kernel`,
and `golang.org/x/sys/windows/registry`.
Client code, such as used in docker/cli may depend on the `registry` package,
but should not depend on those additional dependencies.
This patch moves setting the userAgent to the API router, and instead of
passing it as a separate argument, includes it into the "headers".
As these headers now not only contain the `X-Meta-...` headers, the variables
were renamed accordingly.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the utility introduced in 1bd486666b to
share the same implementation as similar options. The IPCModeContainer const
is left for now, but we need to consider what to do with these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit e7d75c8db7 fixed validation of "host"
mode values, but also introduced a regression for validating "container:"
mode PID-modes.
PID-mode implemented a stricter validation than the other options and, unlike
the other options, did not accept an empty container name/ID. This feature was
originally implemented in fb43ef649b, added some
some integration tests (but no coverage for this case), and the related changes
in the API types did not have unit-tests.
While a later change (d4aec5f0a6) added a test
for the `--pid=container:` (empty name) case, that test was later migrated to
the CLI repository, as it covered parsing the flag (and validating the result).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit 1bd486666b refactored this code, but
it looks like I removed some changes in this part of the code when extracting
these changes from a branch I was working on, and the behavior did not match
the function's description (name to be empty if there is no "container:" prefix
Unfortunately, there was no test coverage for this in this repository, so we
didn't catch this.
This patch:
- fixes containerID() to not return a name/ID if no container: prefix is present
- adds test-coverage for TestCgroupSpec
- adds test-coverage for NetworkMode.ConnectedContainer
- updates some test-tables to remove duplicates, defaults, and use similar cases
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make if more explicit which test-cases should be valid, and make it the
first field, because the "valid" field is shared among all test-cases in
the test-table, and making it the first field makes it slightly easier
to distinguish valid from invalid cases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit 3246db3755 added handling for removing
cluster volumes, but in some conditions, this resulted in errors not being
returned if the volume was in use;
docker swarm init
docker volume create foo
docker create -v foo:/foo busybox top
docker volume rm foo
This patch changes the logic for ignoring "local" volume errors if swarm
is enabled (and cluster volumes supported).
While working on this fix, I also discovered that Cluster.RemoveVolume()
did not handle the "force" option correctly; while swarm correctly handled
these, the cluster backend performs a lookup of the volume first (to obtain
its ID), which would fail if the volume didn't exist.
Before this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
volume_test.go:122: assertion failed: error is nil, not errdefs.IsConflict
volume_test.go:123: assertion failed: expected an error, got nil
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
volume_test.go:143: assertion failed: error is not nil: Error response from daemon: volume no_such_volume not found
--- FAIL: TestVolumesRemoveSwarmEnabled (1.57s)
--- FAIL: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- FAIL: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
FAIL
With this patch:
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration
...
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled
...
=== RUN TestVolumesRemoveSwarmEnabled
=== PAUSE TestVolumesRemoveSwarmEnabled
=== CONT TestVolumesRemoveSwarmEnabled
=== RUN TestVolumesRemoveSwarmEnabled/volume_in_use
=== RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume
=== RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force
--- PASS: TestVolumesRemoveSwarmEnabled (1.53s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s)
--- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
SearchRegistryForImages does not make sense as part of the image
service interface. The implementation just wraps the search API of the
registry service to filter the results client-side. It has nothing to do
with local image storage, and the implementation of search does not need
to change when changing which backend (graph driver vs. containerd
snapshotter) is used for local image storage.
Filtering of the search results is an implementation detail: the
consumer of the results does not care which actor does the filtering so
long as the results are filtered as requested. Move filtering into the
exported API of the registry service to hide the implementation details.
Only one thing---the registry service implementation---would need to
change in order to support server-side filtering of search results if
Docker Hub or other registry servers were to add support for it to their
APIs.
Use a fake registry server in the search unit tests to avoid having to
mock out the registry API client.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Co-authored-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>