Commit graph

162 commits

Author SHA1 Message Date
Brian Goff
611eb6ffb3 buildkit: Apply apparmor profile
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-01-28 21:33:12 +00:00
Sebastiaan van Stijn
f3d0f7054d
cmd/dockerd: sd_notify STOPPING=1 when shutting down
Signal systemd when we start shutting down to complement the "READY" notify
that was originally implemented in 97088ebef7

From [sd_notify(3)](https://www.freedesktop.org/software/systemd/man/sd_notify.html#STOPPING=1)

> STOPPING=1
> Tells the service manager that the service is beginning its shutdown. This is useful
> to allow the service manager to track the service's internal state, and present it to
> the user.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-12-22 10:51:17 +01:00
Brian Goff
816fbcd306
Merge pull request #41072 from AkihiroSuda/fix-41071
cgroup2: unshare cgroupns by default regardless to API version
2020-10-01 11:56:00 -07:00
Brian Goff
5f5285a6e2 Sterner warnings for unathenticated tcp
People keep doing this and getting pwned because they accidentally left
it exposed to the internet.

The warning about doing this has been there forever.
This introduces a sleep after warning.
To disable the extra sleep users must explicitly specify `--tls=false`
or `--tlsverify=false`

Warning also specifies this sleep will be removed in the next release
where the flag will be required if running unauthenticated.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-09-25 00:21:54 +00:00
Sebastiaan van Stijn
51c7992928
API: add "prune" events
This patch adds a new "prune" event type to indicate that pruning of a resource
type completed.

This event-type can be used on systems that want to perform actions after
resources have been cleaned up. For example, Docker Desktop performs an fstrim
after resources are deleted (https://github.com/linuxkit/linuxkit/tree/v0.7/pkg/trim-after-delete).

While the current (remove, destroy) events can provide information on _most_
resources, there is currently no event triggered after the BuildKit build-cache
is cleaned.

Prune events have a `reclaimed` attribute, indicating the amount of space that
was reclaimed (in bytes). The attribute can be used, for example, to use as a
threshold for performing fstrim actions. Reclaimed space for `network` events
will always be 0, but the field is added to be consistent with prune events for
other resources.

To test this patch:

Create some resources:

    for i in foo bar baz; do \
        docker network create network_$i \
        && docker volume create volume_$i \
        && docker run -d --name container_$i -v volume_$i:/volume busybox sh -c 'truncate -s 5M somefile; truncate -s 5M /volume/file' \
        && docker tag busybox:latest image_$i; \
    done;

    docker pull alpine
    docker pull nginx:alpine

    echo -e "FROM busybox\nRUN truncate -s 50M bigfile" | DOCKER_BUILDKIT=1 docker build -

Start listening for "prune" events in another shell:

    docker events --filter event=prune

Prune containers, networks, volumes, and build-cache:

    docker system prune -af --volumes

See the events that are returned:

    docker events --filter event=prune
    2020-07-25T12:12:09.268491000Z container prune  (reclaimed=15728640)
    2020-07-25T12:12:09.447890400Z network prune  (reclaimed=0)
    2020-07-25T12:12:09.452323000Z volume prune  (reclaimed=15728640)
    2020-07-25T12:12:09.517236200Z image prune  (reclaimed=21568540)
    2020-07-25T12:12:09.566662600Z builder prune  (reclaimed=52428841)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-07-28 12:41:14 +02:00
Akihiro Suda
79cfcba76c
cgroup2: unshare cgroupns by default regardless to API version
Fix #41071

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-06-15 16:11:32 +09:00
Tibor Vass
298ba5b131
Merge pull request #40427 from thaJeztah/prometheus_remove_experimental
Do not require "experimental" for metrics API
2020-05-08 11:10:53 -07:00
Sebastiaan van Stijn
c3b3aedfa4
Merge pull request #40662 from AkihiroSuda/cgroup2-dockerinfo
cgroup2: implement `docker info`
2020-04-29 22:57:00 +02:00
Sebastiaan van Stijn
f337a8d21d
Do not require "experimental" for metrics API
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-20 22:19:00 +02:00
Akihiro Suda
f350b53241 cgroup2: implement docker info
ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-17 07:20:01 +09:00
Sebastiaan van Stijn
7400375526
daemon: remove distribution/uuid package
This appeared to be unused because we no longer generate
a uuid using this package.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-16 09:16:38 +02:00
Tonis Tiigi
0cdf6ba9c8 vendor: update buildkit to ae7ff174
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2020-04-14 08:26:07 -07:00
Akihiro Suda
5ca47f5179 rootless: graduate from experimental
Close #40484

Note that the support for cgroup v2 isn't ready for production yet,
regardless to rootful or rootless.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-07 00:59:15 +09:00
Brian Goff
bc1c0c7a8a
Merge pull request #40510 from aiordache/moby_cluster_flags_deprecate
Deprecate '--cluster-xx' options and add warning
2020-02-27 11:25:31 -08:00
Anca Iordache
1470697b67 Deprecate '--cluster-xx' options and add warning
Co-authored-by: Yves Brissaud <yves.brissaud@gmail.com>

Signed-off-by: Anca Iordache <anca.iordache@docker.com>
2020-02-12 18:33:23 +01:00
Jintao Zhang
35d6c1870f enforce reserve internal labels.
The namespaces com.docker.*, io.docker.*, org.dockerproject.*
have been documented to be reserved for Docker's internal use.

Co-Authored-By: Sebastiaan van Stijn <thaJeztah@users.noreply.github.com>
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
2020-02-12 12:03:35 +08:00
Tibor Vass
6ca3ec88ae builder: remove legacy build's session handling
This feature was used by docker build --stream and it was kept experimental.

Users of this endpoint should enable BuildKit anyway by setting Version to BuilderBuildKit.

Signed-off-by: Tibor Vass <tibor@docker.com>
2019-10-02 20:29:15 +00:00
HuanHuan Ye
88c554f950 DaemonCli: Move check into startMetricsServer
Fix TODO: move into startMetricsServer()
Fix errors.Wrap return nil when passed err is nil

Co-Authored-By: Sebastiaan van Stijn <thaJeztah@users.noreply.github.com>
Signed-off-by: HuanHuan Ye <logindaveye@gmail.com>
2019-09-12 15:18:05 +08:00
Sebastiaan van Stijn
e554ab5589
Allow system.MkDirAll() to be used as drop-in for os.MkDirAll()
also renamed the non-windows variant of this file to be
consistent with other files in this package

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-08-08 15:05:49 +02:00
Sebastiaan van Stijn
f6b1f01de3
Remove hack MalformedHostHeaderOverride
This hack was added to fix a compatibility with clients
that were built using Go 1.5 and older (added in 3d6f5984f5)

This hack causes some problems with current clients; with Go 1.5 and older
no longer being supported for some time, and being several years old, it
should now be ok to remove this hack altogether.

People using tools that are built with those versions of Go wouldn't have
updated those for years, and are probably out of date anyway; that's not
something we can continue taking into account.

This will affect docker clients (the docker cli) for docker 1.12 and older.
Those versions have reached EOL a long time ago (and have known unpatched
vulnerabilities), so should no longer be used anyway, but We should add
a nebtuib in the release notes, just in case someone, somewhere, still
has such old tools.

For those affected, using a more recent client (and if needed, setting
the DOCKER_API_VERSION environment variable to the needed API version)
should provide a way out.

This reverts the changes originally made in; #22000 and #22888,
which were to address #20865.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-07-18 21:25:04 +02:00
Sebastiaan van Stijn
c7bbb1c5a1
Merge pull request #39329 from tiborvass/buildkit-honor-daemon-dnsconfig
build: buildkit now honors daemon's DNS config
2019-07-16 16:19:20 +02:00
Sebastiaan van Stijn
d470252e87
daemon: don't listen on the same address multiple times
Before this change:

    dockerd -H unix:///run/docker.sock -H unix:///run/docker.sock -H unix:///run/docker.sock
    ...
    INFO[2019-07-13T00:02:36.195090937Z] Daemon has completed initialization
    INFO[2019-07-13T00:02:36.215940441Z] API listen on /run/docker.sock
    INFO[2019-07-13T00:02:36.215933172Z] API listen on /run/docker.sock
    INFO[2019-07-13T00:02:36.215990566Z] API listen on /run/docker.sock

After this change:

    dockerd -H unix:///run/docker.sock -H unix:///run/docker.sock -H unix:///run/docker.sock
    ...
    INFO[2019-07-13T00:01:37.533579874Z] Daemon has completed initialization
    INFO[2019-07-13T00:01:37.567045771Z] API listen on /run/docker.sock

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-07-13 13:21:08 +02:00
Tibor Vass
a1cdd4bfcc build: buildkit now honors daemon's DNS config
Signed-off-by: Tibor Vass <tibor@docker.com>
2019-07-10 00:26:03 +00:00
Tibor Vass
53dad9f027 Remove v1 manifest code
Signed-off-by: Tibor Vass <tibor@docker.com>
2019-06-18 01:40:25 +00:00
Tibor Vass
f695e98cb7 Revert "Remove the rest of v1 manifest support"
This reverts commit 98fc09128b in order to
keep registry v2 schema1 handling and libtrust-key-based engine ID.

Because registry v2 schema1 was not officially deprecated and
registries are still relying on it, this patch puts its logic back.

However, registry v1 relics are not added back since v1 logic has been
removed a while ago.

This also fixes an engine upgrade issue in a swarm cluster. It was relying
on the Engine ID to be the same upon upgrade, but the mentioned commit
modified the logic to use UUID and from a different file.

Since the libtrust key is always needed to support v2 schema1 pushes,
that the old engine ID is based on the libtrust key, and that the engine ID
needs to be conserved across upgrades, adding a UUID-based engine ID logic
seems to add more complexity than it solves the problems.

Hence reverting the engine ID changes as well.

Signed-off-by: Tibor Vass <tibor@docker.com>
2019-06-18 00:36:01 +00:00
Tonis Tiigi
07b3aac902 builder-next: userns remap support
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2019-06-10 21:49:17 -07:00
Sebastiaan van Stijn
c85fe2d224
Merge pull request #38522 from cpuguy83/fix_timers
Make sure timers are stopped after use.
2019-06-07 13:16:46 +02:00
Brian Goff
595987fd08 Add log entries for daemon startup/shutdown
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-05-06 10:36:05 -07:00
Akihiro Suda
3518383ed9 dockerd: fix rootless detection (alternative to #39024)
The `--rootless` flag had a couple of issues:
* #38702: euid=0, $USER="root" but no access to cgroup ("rootful" Docker in rootless Docker)
* #39009: euid=0 but $USER="docker" (rootful boot2docker)

To fix #38702, XDG dirs are ignored as in rootful Docker, unless the
dockerd is directly running under RootlessKit namespaces.

RootlessKit detection is implemented by checking whether `$ROOTLESSKIT_STATE_DIR` is set.

To fix #39009, the non-robust `$USER` check is now completely removed.

The entire logic can be illustrated as follows:

```
withRootlessKit := getenv("ROOTLESSKIT_STATE_DIR")
rootlessMode := withRootlessKit || cliFlag("--rootless")
honorXDG := withRootlessKit
useRootlessKitDockerProxy := withRootlessKit
removeCgroupSpec := rootlessMode
adjustOOMScoreAdj := rootlessMode
```

Close #39024
Fix #38702 #39009

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-04-25 16:47:01 +09:00
Akihiro Suda
3bc02fc040 fix containerd WaitTimeout
`defer r.WaitTimeout(10s)` was in a wrong place and had caused the
daemon to hang for 10 seconds.

Fix #39025

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2019-04-08 18:44:14 +09:00
Tibor Vass
05c5d20a2c grpc: register BuildKit controller to /grpc
Signed-off-by: Tibor Vass <tibor@docker.com>
2019-04-02 19:57:59 +00:00
John Howard
a3eda72f71
Merge pull request #38541 from Microsoft/jjh/containerd
Windows: Experimental: ContainerD runtime
2019-03-19 21:09:19 -07:00
John Howard
85ad4b16c1 Windows: Experimental: Allow containerd for runtime
Signed-off-by: John Howard <jhoward@microsoft.com>

This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model in Linux, this adds the option to enable it for runtime.
It does not switch the graphdriver to containerd snapshotters.

 - Refactors libcontainerd to a series of subpackages so that either a
  "local" containerd (1) or a "remote" (2) containerd can be loaded as opposed
  to conditional compile as "local" for Windows and "remote" for Linux.

 - Updates libcontainerd such that Windows has an option to allow the use of a
   "remote" containerd. Here, it communicates over a named pipe using GRPC.
   This is currently guarded behind the experimental flag, an environment variable,
   and the providing of a pipename to connect to containerd.

 - Infrastructure pieces such as under pkg/system to have helper functions for
   determining whether containerd is being used.

(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.

(2) "remote" containerd is what docker on Linux uses for it's runtime. It means
that there is a separate containerd service running, and docker communicates over
GRPC to it.

To try this out, you will need to start with something like the following:

Window 1:
	containerd --log-level debug

Window 2:
	$env:DOCKER_WINDOWS_CONTAINERD=1
	dockerd --experimental -D --containerd \\.\pipe\containerd-containerd

You will need the following binary from github.com/containerd/containerd in your path:
 - containerd.exe

You will need the following binaries from github.com/Microsoft/hcsshim in your path:
 - runhcs.exe
 - containerd-shim-runhcs-v1.exe

For LCOW, it will require and initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different to the current requirements. However, you may need updated binaries,
particularly initrd.img built from Microsoft/opengcs as (at the time of writing), Linuxkit
binaries are somewhat out of date.

Note that containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. This will come in time - this is a baby (although large)
step to migrating Docker on Windows to containerd.

Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
HCS v1 APIs as the v2 APIs were not fully developed enough on these builds to be usable.
This abstraction is done in HCSShim. (Referring specifically to runtime)

Note the LCOW graphdriver still uses HCS v1 APIs regardless.

Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
2019-03-12 18:41:55 -07:00
Justin Cormack
98fc09128b Remove the rest of v1 manifest support
As people are using the UUID in `docker info` that was based on the v1 manifest signing key, replace
with a UUID instead.

Remove deprecated `--disable-legacy-registry` option that was scheduled to be removed in 18.03.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2019-03-02 10:46:37 -08:00
Tonis Tiigi
f9b9d5f584 builder-next: fixes for rootless mode
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2019-02-28 10:44:21 -08:00
Akihiro Suda
56bea903ef dockerd: call StickRuntimeDirContents only in rootless mode
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2019-02-14 12:48:41 +09:00
Yong Tang
86312a4732 Fix go-vet issue
This fix fixes the following issue with `go vet`:
```
$ go tool vet cmd/dockerd/daemon.go
cmd/dockerd/daemon.go:163: the cancel function is not used on all paths (possible context leak)
cmd/dockerd/daemon.go:167: this return statement may be reached without using the cancel var defined on line 163
```

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2019-02-06 23:30:28 +00:00
Akihiro Suda
ec87479b7e allow running dockerd in an unprivileged user namespace (rootless mode)
Please refer to `docs/rootless.md`.

TLDR:
 * Make sure `/etc/subuid` and `/etc/subgid` contain the entry for you
 * `dockerd-rootless.sh --experimental`
 * `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2019-02-04 00:24:27 +09:00
Brian Goff
eaad3ee3cf Make sure timers are stopped after use.
`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.

Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-01-16 14:32:53 -08:00
Sebastiaan van Stijn
1edf943dc7
Configure log-format earlier, and small refactor
Some messages are logged before the logrus format was set,
therefore resulting in inconsistent log-message formatting
during startup;

Before this patch;

```
dockerd --experimental
WARN[0000] Running experimental build
INFO[2018-11-24T11:24:05.615249610Z] libcontainerd: started new containerd process  pid=132
INFO[2018-11-24T11:24:05.615348322Z] parsed scheme: "unix"                         module=grpc
...
```

With this patch applied;

```
dockerd --experimental
WARN[2018-11-24T13:41:51.199057259Z] Running experimental build
INFO[2018-11-24T13:41:51.200412645Z] libcontainerd: started new containerd process  pid=293
INFO[2018-11-24T13:41:51.200523051Z] parsed scheme: "unix"                         module=grpc
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 18:53:18 +01:00
Tibor Vass
34eede0296 Remove 'docker-' prefix for containerd and runc binaries
This allows to run the daemon in environments that have upstream containerd installed.

Signed-off-by: Tibor Vass <tibor@docker.com>
2018-09-24 21:49:03 +00:00
Tibor Vass
4a776d0ca7 builder: use buildkit's GC for build cache
This allows users to configure the buildkit GC.

The following enables the default GC:
```
{
  "builder": {
    "gc": {
      "enabled": true
    }
  }
}
```

The default GC policy has a simple config:
```
{
  "builder": {
    "gc": {
      "enabled": true,
      "defaultKeepStorage": "30GB"
    }
  }
}
```

A custom GC policy can be used instead by specifying a list of cache prune rules:
```
{
  "builder": {
    "gc": {
      "enabled": true,
      "policy": [
        {"keepStorage": "512MB", "filter": ["unused-for=1400h"]]},
        {"keepStorage": "30GB", "all": true}
      ]
    }
  }
}
```

Signed-off-by: Tibor Vass <tibor@docker.com>
2018-09-21 22:06:00 +00:00
Anda Xu
171d51c861 add support of registry-mirrors and insecure-registries to buildkit
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-09-20 11:53:02 -07:00
Anda Xu
66ac92cdc6 create newBuildKit function separately in daemon_unix.go and daemon_windows.go for cross platform build
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-09-11 11:22:48 -07:00
Anda Xu
54b3af4c7d update vendor
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-09-07 17:48:41 -07:00
Anda Xu
d52485c2f9 propagate the dockerd cgroup-parent config to buildkitd
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-09-07 17:48:41 -07:00
Xiaoxi He
5c0d2a0932 Fix some typos
Signed-off-by: Xiaoxi He <xxhe@alauda.io>
2018-09-07 13:13:47 +08:00
Tõnis Tiigi
4842f7a867
Merge pull request #37738 from tiborvass/remove-unused-field-in-builder
builder: remove unused netnsRoot field in builder-next
2018-09-06 13:33:35 -07:00
Anda Xu
58a75cebdd allow features option live reloadable
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-08-31 12:43:04 -07:00
Tibor Vass
8ab9e78ee4 builder: remove unused netnsRoot field in builder-next
Signed-off-by: Tibor Vass <tibor@docker.com>
2018-08-31 19:09:52 +00:00