Commit graph

236 commits

Author SHA1 Message Date
Albin Kerouanton
91eee33f62
api: ContainerCreate: return an error when config is nil
The same error is already returned by `(*Daemon).containerCreate()` but
since this function is also called by the cluster executor, the error
has to be duplicated.

Doing that allows to remove a nil check on container config in
`postContainersCreate`.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-10-25 21:25:17 +02:00
Sebastiaan van Stijn
cff4f20c44
migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
4dbfe7e17e
Merge pull request #46502 from rumpl/c8d-fix-diff
c8d: Fix `docker diff`
2023-09-20 21:16:08 +02:00
Djordje Lukic
207c4d537c c8d: Fix docker diff
Diffing a container yielded some extra changes that come from the
files/directories that we mount inside the container (/etc/resolv.conf
for example). To avoid that we create an intermediate snapshot that has
these files, with this we can now diff the container fs with its parent
and only get the differences that were made inside the container.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2023-09-20 14:16:22 +02:00
Albin Kerouanton
4bd0553274
daemon: Return all validation errors for NetworkingConfig and EndpointSettings
Thus far, validation code would stop as soon as a bad value was found.
Now, we try to validate as much as we can, to return all errors to the
API client.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-18 17:25:06 +02:00
Albin Kerouanton
ff503882f7
daemon: Improve NetworkingConfig & EndpointSettings validation
So far, only a subset of NetworkingConfig was validated when calling
ContainerCreate. Other parameters would be validated when the container
was started. And the same goes for EndpointSettings on NetworkConnect.

This commit adds two validation steps:

1. Check if the IP addresses set in endpoint's IPAMConfig are valid,
   when ContainerCreate and ConnectToNetwork is called ;
2. Check if the network allows static IP addresses, only on
   ConnectToNetwork as we need the libnetwork's Network for that and it
   might not exist until NetworkAttachment requests are sent to the
   Swarm leader (which happens only when starting the container) ;

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-18 17:21:06 +02:00
Albin Kerouanton
bbcd662532
api: Allow ContainerCreate to take several EndpointsConfig for >= 1.44
The API endpoint `/containers/create` accepts several EndpointsConfig
since v1.22 but the daemon would error out in such case. This check is
moved from the daemon to the api and is now applied only for API < 1.44,
effectively allowing the daemon to create containers connected to
several networks.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-09-15 10:07:29 +02:00
Sebastiaan van Stijn
0f871f8cb7
api/types/events: define "Action" type and consts
Define consts for the Actions we use for events, instead of "ad-hoc" strings.
Having these consts makes it easier to find where specific events are triggered,
makes the events less error-prone, and allows documenting each Action (if needed).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-29 00:38:08 +02:00
Sebastiaan van Stijn
2be118379e
api/types/container: add RestartPolicyMode type and enum
Also move the validation function to live with the type definition,
which allows it to be used outside of the daemon as well.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-22 16:40:57 +02:00
Albin Kerouanton
38e26c4717
daemon/create.go: Fix error capitalization
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-08-10 12:15:34 +02:00
Albin Kerouanton
742475bc8d
daemon/create.go: Supersede github.com/pkg/errors
Will make it possible to use `errors.Join()` in that file.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-08-10 01:30:10 +02:00
Albin Kerouanton
f3e62199ea
daemon: Remove unneeded error wrapping in verifyNetworkingConfig
This function is called by `daemon.containerCreate()` which is already
wrapping errors coming from `verifyNetworkingConfig()` with
`errdefs.InvalidParameter()`. So `verifyNetworkingConfig()` should only
return standard errors.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
2023-08-03 11:21:52 +02:00
Sebastiaan van Stijn
210932b3bf
daemon: format code with gofumpt
Formatting the code with https://github.com/mvdan/gofumpt

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-29 00:33:03 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Jeyanthinath Muthuram
307b09e7eb
fixing consistent aliases for OCI spec imports
Signed-off-by: Jeyanthinath Muthuram <jeyanthinath10@gmail.com>
2023-05-08 15:27:52 +05:30
Laura Brehm
a34060cdb4
Resolve and store manifest when creating container
This addresses the previous issue with the containerd store where, after a container is created, we can't deterministically resolve which image variant was used to run it (since we also don't store what platform the image was fetched for).

This is required for things like `docker commit`, and computing the containers layer size later, since we need to resolve the specific image variant.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
2023-03-06 15:13:36 +01:00
Djordje Lukic
0137446248 Implement run using the containerd snapshotter
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>

c8d/daemon: Mount root and fill BaseFS

This fixes things that were broken due to nil BaseFS like `docker cp`
and running a container with workdir override.

This is more of a temporary hack than a real solution.
The correct fix would be to refactor the code to make BaseFS and LayerRW
an implementation detail of the old image store implementation and use
the temporary mounts for the c8d implementation instead.
That requires more work though.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>

daemon/images: Don't unset BaseFS

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2023-02-06 18:21:50 +01:00
Nicolas De Loof
def549c8f6
imageservice: Add context to various methods
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2022-11-03 12:22:40 +01:00
Sebastiaan van Stijn
96355b4f1c
Merge pull request #44016 from thaJeztah/dont_set_ignoreImagesArgsEscaped
daemon: don't set ignoreImagesArgsEscaped, managed where not needed
2022-09-27 17:59:23 +02:00
Djordje Lukic
1a3d8019d1 Remove the OS check when creating a container
Now that we can pass any custom containerd shim to dockerd there is need
for this check. Without this it becomes possible to use wasm shims for
example with images that have "wasi" as the OS.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2022-09-22 17:27:10 +02:00
Sebastiaan van Stijn
779a5b3029 ImageService.GetImage(): pass context
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2022-09-07 16:53:45 +02:00
Sebastiaan van Stijn
ba138d6201
daemon: don't set ignoreImagesArgsEscaped, managed where not needed
Just a minor cleanup; I went looking where these options were used, and
these occurrences set then to their default value. Removing these
assignments makes it easier to find where they're actually used.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-23 23:54:56 +02:00
Nicolas De Loof
9a849cc83a
introduce GetImageOpts to manage image inspect data in backend
Currently only provides the existing "platform" option, but more
options will be added in follow-ups.

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-16 16:49:46 +02:00
Sebastiaan van Stijn
41b96bff55
update uses of container.ContainerCreateCreatedBody to CreateResponse
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-28 22:39:20 +02:00
Sebastiaan van Stijn
f3bce92a24
daemon: cleanupContainer(): pass ContainerRmConfig as parameter
We already have this config, so might as well pass it, instead of passing
each option as a separate argument.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-20 21:27:24 +02:00
Tonis Tiigi
482d1d15bf distribution: use the maximum compatible platform by default
When no specific platform is set, pull the platform that
most matches the current host.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2022-03-31 15:20:59 -07:00
Sebastiaan van Stijn
e30a4a438b
daemon: remove leftover LCOW platform checks
This removes some of the checks that were added in 0cba7740d4,
but should no longer be needed.

- `Daemon.create()`: fix the error message, which assumed it could only occur on Windows.
- `Daemon.cleanupContainer()`: no need to validate container platform to delete it.
- `Daemon.containerExport`: if a container was created, we should be able to
  export it; no need to validate.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-01-25 15:23:18 +01:00
Brian Goff
03f1c3d78f
Lock down docker root dir perms.
Do not use 0701 perms.
0701 dir perms allows anyone to traverse the docker dir.
It happens to allow any user to execute, as an example, suid binaries
from image rootfs dirs because it allows traversal AND critically
container users need to be able to do execute things.

0701 on lower directories also happens to allow any user to modify
     things in, for instance, the overlay upper dir which neccessarily
     has 0755 permissions.

This changes to use 0710 which allows users in the group to traverse.
In userns mode the UID owner is (real) root and the GID is the remapped
root's GID.

This prevents anyone but the remapped root to traverse our directories
(which is required for userns with runc).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit ef7237442147441a7cadcda0600be1186d81ac73)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 93ac040bf0)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-10-05 09:57:00 +02:00
Sebastiaan van Stijn
0c84c322ae
daemon, oci: remove LCOW bits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-27 13:35:59 +02:00
Sebastiaan van Stijn
300c11c7c9
volume/mounts: remove "containerOS" argument from NewParser (LCOW code)
This changes mounts.NewParser() to create a parser for the current operatingsystem,
instead of one specific to a (possibly non-matching, in case of LCOW) OS.

With the OS-specific handling being removed, the "OS" parameter is also removed
from `daemon.verifyContainerSettings()`, and various other container-related
functions.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-02 13:51:55 +02:00
Sebastiaan van Stijn
e047d984dc
Remove LCOW code (step 1)
The LCOW implementation in dockerd has been deprecated in favor of re-implementation
in containerd (in progress). Microsoft started removing the LCOW V1 code from the
build dependencies we use in Microsoft/opengcs (soon to be part of Microsoft/hcshhim),
which means that we need to start removing this code.

This first step removes the lcow graphdriver, the LCOW initialization code, and
some LCOW-related utilities.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-06-03 21:16:21 +02:00
Brian Goff
50f39e7247 Move cpu variant checks into platform matcher
Wrap platforms.Only and fallback to our ignore mismatches due to  empty
CPU variants. This just cleans things up and makes the logic re-usable
in other places.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-02-18 16:58:48 +00:00
Brian Goff
7f5e39bd4f
Use real root with 0701 perms
Various dirs in /var/lib/docker contain data that needs to be mounted
into a container. For this reason, these dirs are set to be owned by the
remapped root user, otherwise there can be permissions issues.
However, this uneccessarily exposes these dirs to an unprivileged user
on the host.

Instead, set the ownership of these dirs to the real root (or rather the
UID/GID of dockerd) with 0701 permissions, which allows the remapped
root to enter the directories but not read/write to them.
The remapped root needs to enter these dirs so the container's rootfs
can be configured... e.g. to mount /etc/resolve.conf.

This prevents an unprivileged user from having read/write access to
these dirs on the host.
The flip side of this is now any user can enter these directories.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit e908cc3901)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-02 13:01:25 +01:00
Paul "TBBle" Hampson
db7b7f6df9 Parse storage-opt in GraphDriver init on Windows
This ensures the storage-opts applies to all operations by the graph
drivers, replacing the merging of storage-opts into container storage
config at container-creation time, and hence applying storage-opts to
non-container operations like `COPY` and `ADD` in the builder.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2020-11-10 19:51:46 +11:00
Brian Goff
88c0271605 Don't set default platform on container create
This fixes a regression based on expectations of the runtime:

```
docker pull arm32v7/alpine
docker run arm32v7/alpine
```

Without this change, the `docker run` will fail due to platform
matching on non-arm32v7 systems, even though the image could run
(assuming the system is setup correctly).

This also emits a warning to make sure that the user is aware that a
platform that does not match the default platform of the system is being
run, for the cases like:

```
docker pull --platform armhf busybox
docker run busybox
```

Not typically an issue if the requests are done together like that, but
if the image was already there and someone did `docker run` without an
explicit `--platform`, they may very well be expecting to run a native
version of the image instead of the armhf one.

This warning does add some extra noise in the case of platform specific
images being run, such as `arm32v7/alpine`, but this can be supressed by
explicitly setting the platform.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-10-20 20:17:23 +00:00
Tibor Vass
5c10ea6ae8
Merge pull request #40725 from cpuguy83/check_img_platform
Accept platform spec on container create
2020-05-21 11:33:27 -07:00
Sebastiaan van Stijn
a8216806ce
vendor: opencontainers/selinux v1.5.1
full diff: https://github.com/opencontainers/selinux/compare/v1.3.3...v1.5.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-05 20:33:06 +02:00
Sebastiaan van Stijn
eb14d936bf
daemon: rename variables that collide with imported package names
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:23 +02:00
Brian Goff
7a9cb29fb9 Accept platform spec on container create
This enables image lookup when creating a container to fail when the
reference exists but it is for the wrong platform. This prevents trying
to run an image for the wrong platform, as can be the case with, for
example binfmt_misc+qemu.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-03-20 16:10:36 -07:00
Sebastiaan van Stijn
05469b5fa2
daemon: add "isWindows" const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-17 23:49:43 +02:00
Brian Goff
ab47e16cc5
Merge pull request #38918 from thaJeztah/bump_selinux
bump opencontainers/selinux to v1.2
2019-03-28 17:27:03 -07:00
sh7dm
8f303bd848 ContainerCreate shouldn't return warnings=nil
Fixes #38222
Closes #38614 (carried)

Signed-off-by: Tibor Vass <tibor@docker.com>
2019-03-21 21:20:31 +00:00
Sebastiaan van Stijn
f43826c433
bump opencontainers/selinux to v1.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-03-21 10:10:05 +01:00
Sebastiaan van Stijn
c7105e3c99
Simplify verifyNetworkingConfig()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-03-20 18:46:56 +01:00
John Howard
20833b06a0 Windows: (WCOW) Generate OCI spec that remote runtime can escape
Signed-off-by: John Howard <jhoward@microsoft.com>

Also fixes https://github.com/moby/moby/issues/22874

This commit is a pre-requisite to moving moby/moby on Windows to using
Containerd for its runtime.

The reason for this is that the interface between moby and containerd
for the runtime is an OCI spec which must be unambigious.

It is the responsibility of the runtime (runhcs in the case of
containerd on Windows) to ensure that arguments are escaped prior
to calling into HCS and onwards to the Win32 CreateProcess call.

Previously, the builder was always escaping arguments which has
led to several bugs in moby. Because the local runtime in
libcontainerd had context of whether or not arguments were escaped,
it was possible to hack around in daemon/oci_windows.go with
knowledge of the context of the call (from builder or not).

With a remote runtime, this is not possible as there's rightly
no context of the caller passed across in the OCI spec. Put another
way, as I put above, the OCI spec must be unambigious.

The other previous limitation (which leads to various subtle bugs)
is that moby is coded entirely from a Linux-centric point of view.

Unfortunately, Windows != Linux. Windows CreateProcess uses a
command line, not an array of arguments. And it has very specific
rules about how to escape a command line. Some interesting reading
links about this are:

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
https://stackoverflow.com/questions/31838469/how-do-i-convert-argv-to-lpcommandline-parameter-of-createprocess
https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017

For this reason, the OCI spec has recently been updated to cater
for more natural syntax by including a CommandLine option in
Process.

What does this commit do?

Primary objective is to ensure that the built OCI spec is unambigious.

It changes the builder so that `ArgsEscaped` as commited in a
layer is only controlled by the use of CMD or ENTRYPOINT.

Subsequently, when calling in to create a container from the builder,
if follows a different path to both `docker run` and `docker create`
using the added `ContainerCreateIgnoreImagesArgsEscaped`. This allows
a RUN from the builder to control how to escape in the OCI spec.

It changes the builder so that when shell form is used for RUN,
CMD or ENTRYPOINT, it builds (for WCOW) a more natural command line
using the original as put by the user in the dockerfile, not
the parsed version as a set of args which loses fidelity.
This command line is put into args[0] and `ArgsEscaped` is set
to true for CMD or ENTRYPOINT. A RUN statement does not commit
`ArgsEscaped` to the commited layer regardless or whether shell
or exec form were used.
2019-03-12 18:41:55 -07:00
Salahuddin Khan
763d839261 Add ADD/COPY --chown flag support to Windows
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.

NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.

The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>

On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.

Signed-off-by: Salahuddin Khan <salah@docker.com>
2018-08-13 21:59:11 -07:00
Brian Goff
e4b6adc88e Extract volume interaction to a volumes service
This cleans up some of the package API's used for interacting with
volumes, and simplifies management.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-05-25 14:21:07 -04:00
Daniel Nephin
2b1a2b10af Move ImageService to new package
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-26 16:49:37 -05:00