Commit graph

362 commits

Author SHA1 Message Date
Cam
80a5df9c49
Added container ID to containerd task delete event messages
Signed-off-by: Cam <gh@sparr.email>
2020-10-30 20:58:57 -07:00
Tibor Vass
cf867587b9
Merge pull request #41527 from thaJeztah/no_oom_score_adj
daemon: don't adjust oom-score if score is 0
2020-10-15 15:00:18 -07:00
Brian Goff
f14aea63c9 "Fix" checkpoint on v2 runtime
Checkpoint/Restore is horribly broken all around.
But on the, now default, v2 runtime it's even more broken.

This at least makes checkpoint equally broken on both runtimes.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-10-12 22:35:37 +00:00
Sebastiaan van Stijn
cf7a5be0f2
daemon: don't adjust oom-score if score is 0
This patch makes two changes if --oom-score-adj is set to 0

- do not adjust the oom-score-adjust cgroup for dockerd
- do not set the hard-coded -999 score for containerd if
  containerd is running as child process

Before this change:

oom-score-adj | dockerd       | containerd as child-process
--------------|---------------|----------------------------
-             | -500          | -500 (same as dockerd)
-100          | -100          | -100 (same as dockerd)
 0            |  0            | -999 (hard-coded default)

With this change:

oom-score-adj | dockerd       | containerd as child-process
--------------|---------------|----------------------------
-             | -500          | -500 (same as dockerd)
-100          | -100          | -100 (same as dockerd)
0             | not adjusted  | not adjusted

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-10-05 19:52:02 +02:00
Brian Goff
906007f6c1 libcontainerd: use cancellable context for events
The event subscriber can only be cancelled by cancelling the context.
In the case where we have to restart event processing we are never
cancelling the old subscribiption.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-08-12 17:09:21 +00:00
Brian Goff
60d7265803 Use IsServing to determine if c8d client is ready
Instead of sleeping an arbitrary amount of time, using the client to
tell us when it's ready so we can start processing events sooner.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-08-12 17:09:21 +00:00
Sebastiaan van Stijn
bf7fd015f7
Remove unused useShimV2()
This function was removed in the Linux code as part of
f63f73a4a8, but was not removed in
the Windows code.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-07-15 14:28:48 +02:00
Brian Goff
f63f73a4a8 Configure shims from runtime config
In dockerd we already have a concept of a "runtime", which specifies the
OCI runtime to use (e.g. runc).
This PR extends that config to add containerd shim configuration.
This option is only exposed within the daemon itself (cannot be
configured in daemon.json).
This is due to issues in supporting unknown shims which will require
more design work.

What this change allows us to do is keep all the runtime config in one
place.

So the default "runc" runtime will just have it's already existing shim
config codified within the runtime config alone.
I've also added 2 more "stock" runtimes which are basically runc+shimv1
and runc+shimv2.
These new runtime configurations are:

- io.containerd.runtime.v1.linux - runc + v1 shim using the V1 shim API
- io.containerd.runc.v2 - runc + shim v2

These names coincide with the actual names of the containerd shims.

This allows the user to essentially control what shim is going to be
used by either specifying these as a `--runtime` on container create or
by setting `--default-runtime` on the daemon.

For custom/user-specified runtimes, the default shim config (currently
shim v1) is used.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-07-13 14:18:02 -07:00
Akihiro Suda
3802830989 cgroup2: implement docker stats
The following fields are unsupported:

* BlkioStats: all fields other than IoServiceBytesRecursive
* CPUStats: CPUUsage.PercpuUsage
* MemoryStats: MaxUsage and Failcnt

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-02 17:51:34 +09:00
Sebastiaan van Stijn
9f0b3f5609
bump gotest.tools v3.0.1 for compatibility with Go 1.14
full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 00:06:42 +01:00
Akihiro Suda
612343618d cgroup2: use shim V2
* Requires containerd binaries from containerd/containerd#3799 . Metrics are unimplemented yet.
* Works with crun v0.10.4, but `--security-opt seccomp=unconfined` is needed unless using master version of libseccomp
  ( containers/crun#156, seccomp/libseccomp#177 )
* Doesn't work with master runc yet
* Resource limitations are unimplemented

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-01 02:58:40 +09:00
Brian Goff
36cf709abd
Merge pull request #40146 from thaJeztah/move_hcsshim
libcontainerd: move hcsshim import to windows-only file
2019-12-19 11:55:24 -08:00
Sebastiaan van Stijn
5bb4f4818b
libcontainerd: move hcsshim import to windows-only file
This reduces the dependency-graph when building packages for
Linux only.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-12-10 10:58:14 +01:00
Sebastiaan van Stijn
d29f420424
libcontainerd: normalize comment formatting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-27 15:44:10 +01:00
Sebastiaan van Stijn
9a7e96b5b7
Rename "v1" to "statsV1"
follow-up to 27552ceb15, where this
was left as a review comment, but the PR was already merged.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-01 16:18:06 +01:00
Sebastiaan van Stijn
27552ceb15
bump containerd/cgroups 5fbad35c2a7e855762d3c60f2e474ffcad0d470a
full diff: c4b9ac5c76...5fbad35c2a

- containerd/cgroups#82 Add go module support
- containerd/cgroups#96 Move metrics proto package to stats/v1
- containerd/cgroups#97 Allow overriding the default /proc folder in blkioController
- containerd/cgroups#98 Allows ignoring memory modules
- containerd/cgroups#99 Add Go 1.13 to Travis
- containerd/cgroups#100 stats/v1: export per-cgroup stats

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-31 01:09:12 +01:00
Sebastiaan van Stijn
6b91ceff74
Use hcsshim osversion package for Windows versions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-22 02:53:00 +02:00
Brian Goff
bef73d8b07 Wait for c8d process exit instead of polling API
In the containerd supervisor, instead of polling the healthcheck API
every 500 milliseconds we can just wait for the process to exit.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-10-16 12:23:10 -07:00
Akihiro Suda
de5a67156b
Merge pull request #39082 from ehazlett/opts-for-create
Add NewContainerOpts to libcontainerd.Create
2019-10-04 08:20:47 +09:00
Evan Hazlett
35ac4be5d5 add NewContainerOpts to libcontainerd.Create
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2019-10-03 11:45:41 -04:00
John Howard
8988448729 Remove refs to jhowardmsft from .go code
Signed-off-by: John Howard <jhoward@microsoft.com>
2019-09-25 10:51:18 -07:00
Sebastiaan van Stijn
07ff4f1de8
goimports: fix imports
Format the source according to latest goimports.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:56:54 +02:00
Sebastiaan van Stijn
e554ab5589
Allow system.MkDirAll() to be used as drop-in for os.MkDirAll()
also renamed the non-windows variant of this file to be
consistent with other files in this package

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-08-08 15:05:49 +02:00
Brian Goff
1acaf2aabe Sleep before restarting event processing
This prevents restarting event processing in a tight loop.
You can see this with the following steps:

```terminal
$ containerd &
$ dockerd --containerd=/run/containerd/containerd.sock &
$ pkill -9 containerd
```

At this point you will be spammed with logs such as:

```
ERRO[2019-07-12T22:29:37.318761400Z] failed to get event                           error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\"" module=libcontainerd namespace=plugins.moby
```

Without this change you can quickly end up with gigabytes of log data.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-07-12 15:42:19 -07:00
Michael Crosby
b5f28865ef Handle blocked I/O of exec'd processes
This is the second part to
https://github.com/containerd/containerd/pull/3361 and will help process
delete not block forever when the process exists but the I/O was
inherited by a subprocess that lives on.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-06-21 12:02:15 -04:00
Sebastiaan van Stijn
c85fe2d224
Merge pull request #38522 from cpuguy83/fix_timers
Make sure timers are stopped after use.
2019-06-07 13:16:46 +02:00
Sebastiaan van Stijn
539e72f75b
Fix typo retreive -> retrieve
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-06-04 17:33:04 +02:00
Sebastiaan van Stijn
c030885e7a
Windows: fix error-type for starting a running container
Trying to start a container that is already running is not an
error condition, so a `304 Not Modified` should be returned instead
of a `409 Conflict`.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-05-22 13:27:55 +02:00
Michael Crosby
b9b5dc37e3 Remove inmemory container map
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-04-05 15:48:07 -04:00
Michael Crosby
adb15c2899 Export WithBundle code
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-04-05 08:41:48 -04:00
Michael Crosby
45e328b0ac Remove libcontainerd status type
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-04-04 15:17:13 -04:00
John Howard
2f27332836 Windows: Implement docker top for containerd
Signed-off-by: John Howard <jhoward@microsoft.com>
2019-03-12 18:41:55 -07:00
John Howard
8de5db1c00 Remove unsupported lcow.vhdx option
Signed-off-by: John Howard <jhoward@microsoft.com>

This was only experimental and removed from opengcs. Making same
change in docker.
2019-03-12 18:41:55 -07:00
John Howard
afa3aec024 Windows: Don't shadow err variable
Signed-off-by: John Howard <jhoward@microsoft.com>
2019-03-12 18:41:55 -07:00
John Howard
32acc76b1a Windows: Fix handle leaks/logging if init proc start fails
Signed-off-by: John Howard <jhoward@microsoft.com>

Fixes #38719

Fixes some subtle bugs on Windows

 - Fixes https://github.com/moby/moby/issues/38719. This one is the most important
   as failure to start the init process in a Windows container will cause leaked
   handles. (ie where the `ctr.hcsContainer.CreateProcess(...)` call fails).
   The solution to the leak is to split out the `reapContainer` part of `reapProcess`
   into a separate function. This ensures HCS resources are cleaned up correctly and
   not leaked.

 - Ensuring the reapProcess goroutine is started immediately the process
   is actually started, so we don't leak in the case of failures such as
   from `newIOFromProcess` or `attachStdio`

 - libcontainerd on Windows (local, not containerd) was not sending the EventCreate
   back to the monitor on Windows. Just LCOW. This was just an oversight from
   refactoring a couple of years ago by Mikael as far as I can tell. Technically
   not needed for functionality except for the logging being missing, but is correct.
2019-03-12 18:41:55 -07:00
John Howard
20833b06a0 Windows: (WCOW) Generate OCI spec that remote runtime can escape
Signed-off-by: John Howard <jhoward@microsoft.com>

Also fixes https://github.com/moby/moby/issues/22874

This commit is a pre-requisite to moving moby/moby on Windows to using
Containerd for its runtime.

The reason for this is that the interface between moby and containerd
for the runtime is an OCI spec which must be unambigious.

It is the responsibility of the runtime (runhcs in the case of
containerd on Windows) to ensure that arguments are escaped prior
to calling into HCS and onwards to the Win32 CreateProcess call.

Previously, the builder was always escaping arguments which has
led to several bugs in moby. Because the local runtime in
libcontainerd had context of whether or not arguments were escaped,
it was possible to hack around in daemon/oci_windows.go with
knowledge of the context of the call (from builder or not).

With a remote runtime, this is not possible as there's rightly
no context of the caller passed across in the OCI spec. Put another
way, as I put above, the OCI spec must be unambigious.

The other previous limitation (which leads to various subtle bugs)
is that moby is coded entirely from a Linux-centric point of view.

Unfortunately, Windows != Linux. Windows CreateProcess uses a
command line, not an array of arguments. And it has very specific
rules about how to escape a command line. Some interesting reading
links about this are:

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
https://stackoverflow.com/questions/31838469/how-do-i-convert-argv-to-lpcommandline-parameter-of-createprocess
https://docs.microsoft.com/en-us/cpp/cpp/parsing-cpp-command-line-arguments?view=vs-2017

For this reason, the OCI spec has recently been updated to cater
for more natural syntax by including a CommandLine option in
Process.

What does this commit do?

Primary objective is to ensure that the built OCI spec is unambigious.

It changes the builder so that `ArgsEscaped` as commited in a
layer is only controlled by the use of CMD or ENTRYPOINT.

Subsequently, when calling in to create a container from the builder,
if follows a different path to both `docker run` and `docker create`
using the added `ContainerCreateIgnoreImagesArgsEscaped`. This allows
a RUN from the builder to control how to escape in the OCI spec.

It changes the builder so that when shell form is used for RUN,
CMD or ENTRYPOINT, it builds (for WCOW) a more natural command line
using the original as put by the user in the dockerfile, not
the parsed version as a set of args which loses fidelity.
This command line is put into args[0] and `ArgsEscaped` is set
to true for CMD or ENTRYPOINT. A RUN statement does not commit
`ArgsEscaped` to the commited layer regardless or whether shell
or exec form were used.
2019-03-12 18:41:55 -07:00
John Howard
85ad4b16c1 Windows: Experimental: Allow containerd for runtime
Signed-off-by: John Howard <jhoward@microsoft.com>

This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model in Linux, this adds the option to enable it for runtime.
It does not switch the graphdriver to containerd snapshotters.

 - Refactors libcontainerd to a series of subpackages so that either a
  "local" containerd (1) or a "remote" (2) containerd can be loaded as opposed
  to conditional compile as "local" for Windows and "remote" for Linux.

 - Updates libcontainerd such that Windows has an option to allow the use of a
   "remote" containerd. Here, it communicates over a named pipe using GRPC.
   This is currently guarded behind the experimental flag, an environment variable,
   and the providing of a pipename to connect to containerd.

 - Infrastructure pieces such as under pkg/system to have helper functions for
   determining whether containerd is being used.

(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.

(2) "remote" containerd is what docker on Linux uses for it's runtime. It means
that there is a separate containerd service running, and docker communicates over
GRPC to it.

To try this out, you will need to start with something like the following:

Window 1:
	containerd --log-level debug

Window 2:
	$env:DOCKER_WINDOWS_CONTAINERD=1
	dockerd --experimental -D --containerd \\.\pipe\containerd-containerd

You will need the following binary from github.com/containerd/containerd in your path:
 - containerd.exe

You will need the following binaries from github.com/Microsoft/hcsshim in your path:
 - runhcs.exe
 - containerd-shim-runhcs-v1.exe

For LCOW, it will require and initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different to the current requirements. However, you may need updated binaries,
particularly initrd.img built from Microsoft/opengcs as (at the time of writing), Linuxkit
binaries are somewhat out of date.

Note that containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. This will come in time - this is a baby (although large)
step to migrating Docker on Windows to containerd.

Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
HCS v1 APIs as the v2 APIs were not fully developed enough on these builds to be usable.
This abstraction is done in HCSShim. (Referring specifically to runtime)

Note the LCOW graphdriver still uses HCS v1 APIs regardless.

Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
2019-03-12 18:41:55 -07:00
Yusuf Tarık Günaydın
86bd2e9864 Implemented memory and CPU limits for LCOW.
Signed-off-by: Yusuf Tarık Günaydın <yusuf_tarik@hotmail.com>
2019-02-02 13:02:23 +03:00
Simão Reis
3134161be3 Fix nil pointer derefence on failure to connect to containerd
Signed-off-by: Simão Reis <smnrsti@gmail.com>
2019-01-30 12:41:54 -01:00
Brian Goff
eaad3ee3cf Make sure timers are stopped after use.
`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.

Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-01-16 14:32:53 -08:00
Tonis Tiigi
332f134890 libcontainerd: prevent exec delete locking
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-12-17 12:22:37 +02:00
Justin Terry (VM)
b2d99865ea Add --device support for Windows
Implements the --device forwarding for Windows daemons. This maps the physical
device into the container at runtime.

Ex:

docker run --device="class/<clsid>" <image> <cmd>

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
2018-11-21 15:31:17 -08:00
Sebastiaan van Stijn
dd7799afd4
update containerd client and dependencies to v1.2.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-11-05 18:46:26 +01:00
Wei Fu
c7890f25a9 bugfix: wait for stdin creation before CloseIO
The stdin fifo of exec process is created in containerd side after
client calls Start. If the client calls CloseIO before Start call, the
stdin of exec process is still opened and wait for close.

For this case, client closes stdinCloseSync channel after Start.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2018-10-10 19:59:01 +08:00
Tibor Vass
34eede0296 Remove 'docker-' prefix for containerd and runc binaries
This allows to run the daemon in environments that have upstream containerd installed.

Signed-off-by: Tibor Vass <tibor@docker.com>
2018-09-24 21:49:03 +00:00
Sebastiaan van Stijn
06b9588c2d
Merge pull request #37759 from dmcgowan/fix-libcontainerd-startup-error
Add fail fast path when containerd fails on startup
2018-09-14 15:15:38 +02:00
Derek McGowan
ce0b0b72bc
Add fail fast path when containerd fails on startup
Prevents looping of startup errors such as containerd
not being found on the path.

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2018-09-13 17:34:52 -07:00
Kir Kolyshkin
9b0097a699 Format code with gofmt -s from go-1.11beta1
This should eliminate a bunch of new (go-1.11 related) validation
errors telling that the code is not formatted with `gofmt -s`.

No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).

Patch generated with:

> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 15:24:16 -07:00
Derek McGowan
c3e3293843
Fix supervisor healthcheck throttling
Fix default case causing the throttling to not be used.
Ensure that nil client condition is handled.

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2018-09-04 11:00:28 -07:00
John Howard
5accd82634 Add containerd.WithTimeout(60*time.Second) to match old calls
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-08-23 12:03:43 -07:00