Commit graph

120 commits

Author SHA1 Message Date
Sebastiaan van Stijn
a3a42c459e
api/types/image: move GetImageOpts to api/types/backend
The `GetImageOpts` struct is used for options to be passed to the backend,
and are not used in client code. This struct currently is intended for internal
use only.

This patch moves the `GetImageOpts` struct to the backend package to prevent
it being imported in the client, and to make it more clear that this is part
of internal APIs, and not public-facing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-01-22 20:45:21 +01:00
Paweł Gronowski
f07387466a
daemon/oci: Extract side effects from withMounts
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-19 17:27:16 +01:00
Paul "TBBle" Hampson
e8f4bfb374
Root.Path for a process-isolated WCOW container must be the Volume GUID
The actual divergence is due to differences in the snapshotter and
graphfilter mount behaviour on Windows, but the snapshotter behaviour is
better, so we deal with it here rather than changing the snapshotter
behaviour.

We're relying on the internals of containerd's Windows mount
implementation here. Unless this code flow is replaced, future work is
to move getBackingDeviceForContainerdMount into containerd's mount
implementation.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-17 16:29:32 +01:00
Paul "TBBle" Hampson
66325f7271
Implement GetLayerFolders for the containerd image store
The existing API ImageService.GetLayerFolders didn't have access to the
ID of the container, and once we have that, the snapshotter Mounts API
provides all the information we need here.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2024-01-17 16:29:28 +01:00
Sebastiaan van Stijn
cff4f20c44
migrate to github.com/containerd/log v0.1.0
The github.com/containerd/containerd/log package was moved to a separate
module, which will also be used by upcoming (patch) releases of containerd.

This patch moves our own uses of the package to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-10-11 17:52:23 +02:00
Sebastiaan van Stijn
a3c97beee0
image: implement CheckOS, deprecate pkg/system IsOSSupported
Implement a function that returns an error to replace existing uses of
the IsOSSupported utility, where callers had to produce the error after
checking.

The IsOSSupported function was used in combination with images, so implementing
a utility in "image" to prevent having to import pkg/system (which contains many
unrelated functions)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-07 22:14:44 +02:00
Sebastiaan van Stijn
150b1c8c73
daemon: daemon.createSpec: remove uses of logrus
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-06 13:30:33 +02:00
Sebastiaan van Stijn
2cbed9d2a8
daemon: inline Daemon.getDNSSearchSettings
This function was created as a "method", but didn't use the Daemon in any
way, and all other options were checked inline, so let's not pretend this
function is more "special" than the other checks, and inline the code.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-02 16:14:15 +02:00
Sebastiaan van Stijn
210932b3bf
daemon: format code with gofumpt
Formatting the code with https://github.com/mvdan/gofumpt

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-29 00:33:03 +02:00
Brian Goff
74da6a6363 Switch all logging to use containerd log pkg
This unifies our logging and allows us to propagate logging and trace
contexts together.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2023-06-24 00:23:44 +00:00
Cory Snider
d222bf097c daemon: reload runtimes w/o breaking containers
The existing runtimes reload logic went to great lengths to replace the
directory containing runtime wrapper scripts as atomically as possible
within the limitations of the Linux filesystem ABI. Trouble is,
atomically swapping the wrapper scripts directory solves the wrong
problem! The runtime configuration is "locked in" when a container is
started, including the path to the runC binary. If a container is
started with a runtime which requires a daemon-managed wrapper script
and then the daemon is reloaded with a config which no longer requires
the wrapper script (i.e. some args -> no args, or the runtime is dropped
from the config), that container would become unmanageable. Any attempts
to stop, exec or otherwise perform lifecycle management operations on
the container are likely to fail due to the wrapper script no longer
existing at its original path.

Atomically swapping the wrapper scripts is also incompatible with the
read-copy-update paradigm for reloading configuration. A handler in the
daemon could retain a reference to the pre-reload configuration for an
indeterminate amount of time after the daemon configuration has been
reloaded and updated. It is possible for the daemon to attempt to start
a container using a deleted wrapper script if a request to run a
container races a reload.

Solve the problem of deleting referenced wrapper scripts by ensuring
that all wrapper scripts are *immutable* for the lifetime of the daemon
process. Any given runtime wrapper script must always exist with the
same contents, no matter how many times the daemon config is reloaded,
or what changes are made to the config. This is accomplished by using
everyone's favourite design pattern: content-addressable storage. Each
wrapper script file name is suffixed with the SHA-256 digest of its
contents to (probabilistically) guarantee immutability without needing
any concurrency control. Stale runtime wrapper scripts are only cleaned
up on the next daemon restart.

Split the derived runtimes configuration from the user-supplied
configuration to have a place to store derived state without mutating
the user-supplied configuration or exposing daemon internals in API
struct types. Hold the derived state and the user-supplied configuration
in a single struct value so that they can be updated as an atomic unit.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:25 -04:00
Cory Snider
0b592467d9 daemon: read-copy-update the daemon config
Ensure data-race-free access to the daemon configuration without
locking by mutating a deep copy of the config and atomically storing
a pointer to the copy into the daemon-wide configStore value. Any
operations which need to read from the daemon config must capture the
configStore value only once and pass it around to guarantee a consistent
view of the config.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-01 14:45:24 -04:00
Cory Snider
0ffaa6c785 daemon: add annotations to container HostConfig
Allow clients to set annotations on a container which will applied to
the container's OCI spec.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-02-23 18:59:00 -05:00
Sebastiaan van Stijn
f95e9b68d6
daemon: use strings.Cut() and cleanup error messages
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-12-21 11:09:03 +01:00
Nicolas De Loof
def549c8f6
imageservice: Add context to various methods
Co-authored-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
2022-11-03 12:22:40 +01:00
Cory Snider
e332c41e9d pkg/containerfs: alias ContainerFS to string
Drop the constructor and redundant string() type-casts.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-09-23 16:56:52 -04:00
Cory Snider
95824f2b5f pkg/containerfs: simplify ContainerFS type
Iterate towards dropping the type entirely.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2022-09-23 16:56:49 -04:00
Sebastiaan van Stijn
779a5b3029 ImageService.GetImage(): pass context
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
2022-09-07 16:53:45 +02:00
Nicolas De Loof
9a849cc83a
introduce GetImageOpts to manage image inspect data in backend
Currently only provides the existing "platform" option, but more
options will be added in follow-ups.

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-08-16 16:49:46 +02:00
Paul "TBBle" Hampson
cb07afa3cc Implement :// separator for arbitrary Windows Device IDTypes
Arbitrary here does not include '', best to catch that one early as it's
almost certainly a mistake (possibly an attempt to pass a POSIX path
through this API)

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2022-03-27 13:26:47 +11:00
Paul "TBBle" Hampson
92f13bad88 Allow Windows Devices to be activated for HyperV Isolation
If not using the containerd backend, this will still fail, but later.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2022-03-27 13:26:41 +11:00
Paul "TBBle" Hampson
c60f70f112 Break out setupWindowsDevices and add tests
Since this function is about to get more complicated, and change
behaviour, this establishes tests for the existing implementation.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2022-03-27 13:23:48 +11:00
Sebastiaan van Stijn
1b3fef5333
Windows: require Windows Server RS5 / ltsc2019 (build 17763) as minimum
Windows Server 2016 (RS1) reached end of support, and Docker Desktop requires
Windows 10 V19H2 (version 1909, build 18363) as a minimum.

This patch makes Windows Server RS5 /  ltsc2019 (build 17763) the minimum version
to run the daemon, and removes some hacks for older versions of Windows.

There is one check remaining that checks for Windows RS3 for a workaround
on older versions, but recent changes in Windows seemed to have regressed
on the same issue, so I kept that code for now to check if we may need that
workaround (again);

085c6a98d5/daemon/graphdriver/windows/windows.go (L319-L341)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-18 22:58:28 +01:00
Eng Zer Jun
c55a4ac779
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated in Go 1.16. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-08-27 14:56:57 +08:00
Sebastiaan van Stijn
0c84c322ae
daemon, oci: remove LCOW bits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-07-27 13:35:59 +02:00
Brian Goff
24f173a003 Replace service "Capabilities" w/ add/drop API
After dicussing with maintainers, it was decided putting the burden of
providing the full cap list on the client is not a good design.
Instead we decided to follow along with the container API and use cap
add/drop.

This brings in the changes already merged into swarmkit.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-07-27 10:09:42 -07:00
Brian Goff
7a9cb29fb9 Accept platform spec on container create
This enables image lookup when creating a container to fail when the
reference exists but it is for the wrong platform. This prevents trying
to run an image for the wrong platform, as can be the case with, for
example binfmt_misc+qemu.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-03-20 16:10:36 -07:00
Olli Janatuinen
1308a3a99f Move DefaultCapabilities() to caps package
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2019-11-14 21:13:16 +02:00
Sebastiaan van Stijn
6b91ceff74
Use hcsshim osversion package for Windows versions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-22 02:53:00 +02:00
John Howard
8988448729 Remove refs to jhowardmsft from .go code
Signed-off-by: John Howard <jhoward@microsoft.com>
2019-09-25 10:51:18 -07:00
Sebastiaan van Stijn
07ff4f1de8
goimports: fix imports
Format the source according to latest goimports.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:56:54 +02:00
John Howard
a3eda72f71
Merge pull request #38541 from Microsoft/jjh/containerd
Windows: Experimental: ContainerD runtime
2019-03-19 21:09:19 -07:00
Jean Rouge
7fdac7eb0f Making it possible to pass Windows credential specs directly to the engine
Instead of having to go through files or registry values as is currently the
case.

While adding GMSA support to Kubernetes (https://github.com/kubernetes/kubernetes/pull/73726)
I stumbled upon the fact that Docker currently only allows passing Windows
credential specs through files or registry values, forcing the Kubelet
to perform a rather awkward dance of writing-then-deleting to either the
disk or the registry to be able to create a Windows container with cred
specs.

This patch solves this problem by making it possible to directly pass
whole base64-encoded cred specs to the engine's API. I took the opportunity
to slightly refactor the method responsible for Windows cred spec as it
seemed hard to read to me.

Added some unit tests on Windows credential specs handling, as there were
previously none.

Added/amended the relevant integration tests.

I have also tested it manually: given a Windows container using a cred spec
that you would normally start with e.g.
```powershell
docker run --rm --security-opt "credentialspec=file://win.json" mcr.microsoft.com/windows/servercore:ltsc2019 nltest /parentdomain
# output:
# my.ad.domain.com. (1)
# The command completed successfully
```
can now equivalently be started with
```powershell
$rawCredSpec = & cat 'C:\ProgramData\docker\credentialspecs\win.json'
$escaped = $rawCredSpec.Replace('"', '\"')
docker run --rm --security-opt "credentialspec=raw://$escaped" mcr.microsoft.com/windows/servercore:ltsc2019 nltest /parentdomain
# same output!
```

I'll do another PR on Swarmkit after this is merged to allow services to use
the same option.

(It's worth noting that @dperny faced the same problem adding GMSA support
to Swarmkit, to which he came up with an interesting solution - see
https://github.com/moby/moby/pull/38632 - but alas these tricks are not
available to the Kubelet.)

Signed-off-by: Jean Rouge <rougej+github@gmail.com>
2019-03-15 19:20:19 -07:00
John Howard
d4ceb61f2b LCOW:Reworking spec builder
Signed-off-by: John Howard <jhoward@microsoft.com>
2019-03-12 18:41:55 -07:00
John Howard
20833b06a0 Windows: (WCOW) Generate OCI spec that remote runtime can escape
Signed-off-by: John Howard <jhoward@microsoft.com>

Also fixes https://github.com/moby/moby/issues/22874

This commit is a pre-requisite to moving moby/moby on Windows to using
Containerd for its runtime.

The reason for this is that the interface between moby and containerd
for the runtime is an OCI spec which must be unambigious.

It is the responsibility of the runtime (runhcs in the case of
containerd on Windows) to ensure that arguments are escaped prior
to calling into HCS and onwards to the Win32 CreateProcess call.

Previously, the builder was always escaping arguments which has
led to several bugs in moby. Because the local runtime in
libcontainerd had context of whether or not arguments were escaped,
it was possible to hack around in daemon/oci_windows.go with
knowledge of the context of the call (from builder or not).

With a remote runtime, this is not possible as there's rightly
no context of the caller passed across in the OCI spec. Put another
way, as I put above, the OCI spec must be unambigious.

The other previous limitation (which leads to various subtle bugs)
is that moby is coded entirely from a Linux-centric point of view.

Unfortunately, Windows != Linux. Windows CreateProcess uses a
command line, not an array of arguments. And it has very specific
rules about how to escape a command line. Some interesting reading
links about this are:

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
https://stackoverflow.com/questions/31838469/how-do-i-convert-argv-to-lpcommandline-parameter-of-createprocess
https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017

For this reason, the OCI spec has recently been updated to cater
for more natural syntax by including a CommandLine option in
Process.

What does this commit do?

Primary objective is to ensure that the built OCI spec is unambigious.

It changes the builder so that `ArgsEscaped` as commited in a
layer is only controlled by the use of CMD or ENTRYPOINT.

Subsequently, when calling in to create a container from the builder,
if follows a different path to both `docker run` and `docker create`
using the added `ContainerCreateIgnoreImagesArgsEscaped`. This allows
a RUN from the builder to control how to escape in the OCI spec.

It changes the builder so that when shell form is used for RUN,
CMD or ENTRYPOINT, it builds (for WCOW) a more natural command line
using the original as put by the user in the dockerfile, not
the parsed version as a set of args which loses fidelity.
This command line is put into args[0] and `ArgsEscaped` is set
to true for CMD or ENTRYPOINT. A RUN statement does not commit
`ArgsEscaped` to the commited layer regardless or whether shell
or exec form were used.
2019-03-12 18:41:55 -07:00
John Howard
85ad4b16c1 Windows: Experimental: Allow containerd for runtime
Signed-off-by: John Howard <jhoward@microsoft.com>

This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model in Linux, this adds the option to enable it for runtime.
It does not switch the graphdriver to containerd snapshotters.

 - Refactors libcontainerd to a series of subpackages so that either a
  "local" containerd (1) or a "remote" (2) containerd can be loaded as opposed
  to conditional compile as "local" for Windows and "remote" for Linux.

 - Updates libcontainerd such that Windows has an option to allow the use of a
   "remote" containerd. Here, it communicates over a named pipe using GRPC.
   This is currently guarded behind the experimental flag, an environment variable,
   and the providing of a pipename to connect to containerd.

 - Infrastructure pieces such as under pkg/system to have helper functions for
   determining whether containerd is being used.

(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.

(2) "remote" containerd is what docker on Linux uses for it's runtime. It means
that there is a separate containerd service running, and docker communicates over
GRPC to it.

To try this out, you will need to start with something like the following:

Window 1:
	containerd --log-level debug

Window 2:
	$env:DOCKER_WINDOWS_CONTAINERD=1
	dockerd --experimental -D --containerd \\.\pipe\containerd-containerd

You will need the following binary from github.com/containerd/containerd in your path:
 - containerd.exe

You will need the following binaries from github.com/Microsoft/hcsshim in your path:
 - runhcs.exe
 - containerd-shim-runhcs-v1.exe

For LCOW, it will require and initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different to the current requirements. However, you may need updated binaries,
particularly initrd.img built from Microsoft/opengcs as (at the time of writing), Linuxkit
binaries are somewhat out of date.

Note that containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. This will come in time - this is a baby (although large)
step to migrating Docker on Windows to containerd.

Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
HCS v1 APIs as the v2 APIs were not fully developed enough on these builds to be usable.
This abstraction is done in HCSShim. (Referring specifically to runtime)

Note the LCOW graphdriver still uses HCS v1 APIs regardless.

Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
2019-03-12 18:41:55 -07:00
Drew Erny
6f1d7ddfa4 Use Runtime target
The Swarmkit api specifies a target for configs called called "Runtime"
which indicates that the config is not mounted into the container but
has some other use. This commit updates the Docker api to reflect this.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-02-19 13:14:17 -06:00
Drew Erny
04995fa7c7 Add CredentialSpec from configs support
Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-02-04 14:52:01 -06:00
Yusuf Tarık Günaydın
86bd2e9864 Implemented memory and CPU limits for LCOW.
Signed-off-by: Yusuf Tarık Günaydın <yusuf_tarik@hotmail.com>
2019-02-02 13:02:23 +03:00
Olli Janatuinen
80d7bfd54d Capabilities refactor
- Add support for exact list of capabilities, support only OCI model
- Support OCI model on CapAdd and CapDrop but remain backward compatibility
- Create variable locally instead of declaring it at the top
- Use const for magic "ALL" value
- Rename `cap` variable as it overlaps with `cap()` built-in
- Normalize and validate capabilities before use
- Move validation for conflicting options to validateHostConfig()
- TweakCapabilities: simplify logic to calculate capabilities

Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-22 21:50:41 +02:00
Michael Crosby
b940cc5cff Move caps and device spec utils to oci pkg
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-12-11 10:20:25 -05:00
Justin Terry (VM)
b2d99865ea Add --device support for Windows
Implements the --device forwarding for Windows daemons. This maps the physical
device into the container at runtime.

Ex:

docker run --device="class/<clsid>" <image> <cmd>

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
2018-11-21 15:31:17 -08:00
John Starks
e9268d9642 lcow: Allow the client to add device cgroup rules
Signed-off-by: John Starks <jostarks@microsoft.com>
2018-06-15 16:14:17 -07:00
John Starks
349aeeab7c lcow: Allow the client to add or remove capabilities
Signed-off-by: John Starks <jostarks@microsoft.com>
2018-06-15 16:03:33 -07:00
Brian Goff
cc8f358c23 Move network operations out of container package
These network operations really don't have anything to do with the
container but rather are setting up the networking.

Ideally these wouldn't get shoved into the daemon package, but doing
something else (e.g. extract a network service into a new package) but
there's a lot more work to do in that regard.
In reality, this probably simplifies some of that work as it moves all
the network operations to the same place.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-05-10 17:16:00 -04:00
John Howard
0f5fe3f9cf Windows: Fix Hyper-V containers regression from 36586
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-03-15 15:36:36 -07:00
Kir Kolyshkin
d6ea46ceda container.BaseFS: check for nil before deref
Commit 7a7357dae1 ("LCOW: Implemented support for docker cp + build")
changed `container.BaseFS` from being a string (that could be empty but
can't lead to nil pointer dereference) to containerfs.ContainerFS,
which could be be `nil` and so nil dereference is at least theoretically
possible, which leads to panic (i.e. engine crashes).

Such a panic can be avoided by carefully analysing the source code in all
the places that dereference a variable, to make the variable can't be nil.
Practically, this analisys are impossible as code is constantly
evolving.

Still, we need to avoid panics and crashes. A good way to do so is to
explicitly check that a variable is non-nil, returning an error
otherwise. Even in case such a check looks absolutely redundant,
further changes to the code might make it useful, and having an
extra check is not a big price to pay to avoid a panic.

This commit adds such checks for all the places where it is not obvious
that container.BaseFS is not nil (which in this case means we do not
call daemon.Mount() a few lines earlier).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-03-13 21:24:48 -07:00
Sebastiaan van Stijn
1346a2c89a
Merge pull request #36267 from Microsoft/jjh/removeservicing
Windows: Remove servicing mode
2018-02-28 01:15:03 +01:00
John Howard
d4f37c0885 Windows: Remove servicing mode
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-02-27 08:48:31 -08:00
Daniel Nephin
2b1a2b10af Move ImageService to new package
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-26 16:49:37 -05:00