Commit graph

6673 commits

Author SHA1 Message Date
Sebastiaan van Stijn
4534a7afc3
daemon: use containerd/sys to detect UserNamespaces
The implementation in libcontainer/system is quite complicated,
and we only use it to detect if user-namespaces are enabled.

In addition, the implementation in containerd uses a sync.Once,
so that detection (and reading/parsing `/proc/self/uid_map`) is
only performed once.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-06-15 13:06:08 +02:00
Brian Goff
d984d3053b
Merge pull request #41075 from wangyumu/fix-syslog-empty-lines
Fixes #41010 skip empty lines
2020-06-11 12:53:37 -07:00
Brian Goff
7fa2026620
Merge pull request #40938 from thaJeztah/move_pidslimit
API: swarm: move PidsLimit to TaskTemplate.Resources
2020-06-11 12:04:44 -07:00
Sebastiaan van Stijn
d378625554
info: add warnings about missing blkio cgroup support
These warnings were only logged, and could therefore be overlooked
by users. This patch makes these more visible by returning them as
warnings in the API response.

We should probably consider adding "boolean" (?) fields for these
as well, so that they can be consumed in other ways. In addition,
some of these warnings could potentially be grouped to reduce the
number of warnings that are printed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-06-08 17:16:44 +02:00
Sebastiaan van Stijn
3aac5f0bbb
Merge pull request #41018 from akhilerm/identity-mapping
remove group name from identity mapping
2020-06-08 15:15:05 +02:00
Wang Yumu
96556854a7 Fixes #41010 skip empty lines
Signed-off-by: Wang Yumu <37442693@qq.com>
2020-06-06 12:36:50 +08:00
Tibor Vass
5ffd677824
Merge pull request #41020 from thaJeztah/fix_sandbox_cleanup
allocateNetwork: fix network sandbox not cleaned up on failure
2020-06-05 09:55:54 -07:00
Sebastiaan van Stijn
687bdc7c71
API: swarm: move PidsLimit to TaskTemplate.Resources
The initial implementation followed the Swarm API, where
PidsLimit is located in ContainerSpec. This is not the
desired place for this property, so moving the field to
TaskTemplate.Resources in our API.

A similar change should be made in the SwarmKit API (likely
keeping the old field for backward compatibility, because
it was merged some releases back)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-06-05 12:50:38 +02:00
Tibor Vass
fa38a6cd21
Merge pull request #40937 from thaJeztah/split_resource_types
API: split types for Resources Reservations and Limits
2020-06-04 17:33:47 -07:00
Sebastiaan van Stijn
888da28d42
Merge pull request #41030 from justincormack/default-sysctls
Add default sysctls to allow ping sockets and privileged ports with no capabilities
2020-06-04 22:31:51 +02:00
Justin Cormack
dae652e2e5
Add default sysctls to allow ping sockets and privileged ports with no capabilities
Currently default capability CAP_NET_RAW allows users to open ICMP echo
sockets, and CAP_NET_BIND_SERVICE allows binding to ports under 1024.
Both of these are safe operations, and Linux now provides ways that
these can be set, per container, to be allowed without any capabilties
for non root users. Enable these by default. Users can revert to the
previous behaviour by overriding the sysctl values explicitly.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2020-06-04 18:11:08 +01:00
Akhil Mohan
7ad0da7051
remove group name from identity mapping
NewIdentityMapping took group name as an argument, and used
the group name also to parse the /etc/sub{uid,gui}. But as per
linux man pages, the sub{uid,gid} file maps username or uid,
not a group name.

Therefore, all occurrences where mapping is used need to
consider only username and uid. Code trying to map using gid
and group name in the daemon is also removed.

Signed-off-by: Akhil Mohan <akhil.mohan@mayadata.io>
2020-06-03 20:04:42 +05:30
Brian Goff
89382f2f20
Merge pull request #41023 from thaJeztah/better_logs
daemon.allocateNetwork: include original error in logs
2020-05-28 13:42:42 -07:00
Brian Goff
763f9e799b
Merge pull request #40846 from AkihiroSuda/cgroup2-use-systemd-by-default
cgroup2: use "systemd" cgroup driver by default when available
2020-05-28 11:37:39 -07:00
Sebastiaan van Stijn
a5324d6950
Better selection of DNS server
Commit e353e7e3f0 updated selection of the
`resolv.conf` file to use in situations where systemd-resolvd is used as
a resolver.

If a host uses `systemd-resolvd`, the system's `/etc/resolv.conf` file is
updated to set `127.0.0.53` as DNS, which is the local IP address for
systemd-resolvd. The DNS servers that are configured by the user will now
be stored in `/run/systemd/resolve/resolv.conf`, and systemd-resolvd acts
as a forwarding DNS for those.

Originally, Docker copied the DNS servers as configured in `/etc/resolv.conf`
as default DNS servers in containers, which failed to work if systemd-resolvd
is used (as `127.0.0.53` is not available inside the container's networking
namespace). To resolve this, e353e7e3f0 instead
detected if systemd-resolvd is in use, and in that case copied the "upstream"
DNS servers from the `/run/systemd/resolve/resolv.conf` configuration.

While this worked for most situations, it had some downsides, among which:

- we're skipping systemd-resolvd altogether, which means that we cannot take
  advantage of addition functionality provided by it (such as per-interface
  DNS servers)
- when updating DNS servers in the system's configuration, those changes were
  not reflected in the container configuration, which could be problematic in
  "developer" scenarios, when switching between networks.

This patch changes the way we select which resolv.conf to use as template
for the container's resolv.conf;

- in situations where a custom network is attached to the container, and the
  embedded DNS is available, we use `/etc/resolv.conf` unconditionally. If
  systemd-resolvd is used, the embedded DNS forwards external DNS lookups to
  systemd-resolvd, which in turn is responsible for forwarding requests to
  the external DNS servers configured by the user.
- if the container is running in "host mode" networking, we also use the
  DNS server that's configured in `/etc/resolv.conf`. In this situation, no
  embedded DNS server is available, but the container runs in the host's
  networking namespace, and can use the same DNS servers as the host (which
  could be systemd-resolvd or DNSMasq
- if the container uses the default (bridge) network, no embedded DNS is
  available, and the container has its own networking namespace. In this
  situation we check if systemd-resolvd is used, in which case we skip
  systemd-resolvd, and configure the upstream DNS servers as DNS for the
  container. This situation is the same as is used currently, which means
  that dynamically switching DNS servers won't be supported for these
  containers.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-25 18:20:56 +02:00
Sebastiaan van Stijn
288ed93dc5
daemon.allocateNetwork: include original error in logs
When failing to destroy a stale sandbox, we logged that the removal
failed, but omitted the original error message.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-25 18:10:58 +02:00
Sebastiaan van Stijn
84ef60cba2
allocateNetwork: don't assign unneeded variables
allocateNetwork() can return early, in which case these variables were unused.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-25 14:12:33 +02:00
Sebastiaan van Stijn
b98b8df886
allocateNetwork: fix network sandbox not cleaned up on failure
The defer function was checking for the local `err` variable, not
on the error that was returned by the function. As a result, the
sandbox would never be cleaned up for containers that used "none"
networking, and a failiure occured during setup.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-25 14:10:48 +02:00
Tibor Vass
5c10ea6ae8
Merge pull request #40725 from cpuguy83/check_img_platform
Accept platform spec on container create
2020-05-21 11:33:27 -07:00
Akihiro Suda
50867791d6
Merge pull request #40967 from tonistiigi/tls-fix
registry: fix mtls config dir passing
2020-05-20 00:26:13 +09:00
Akihiro Suda
225d64ebf1
Merge pull request #40969 from cpuguy83/fix_flakey_log_rotate_test
Fix flakey test for log file rotate.
2020-05-20 00:24:58 +09:00
Brian Goff
5ea5c02c88 Fix flakey test for log file rotate.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-05-18 10:27:53 -07:00
Sebastiaan van Stijn
84748c7d4e
API: split types for Resources Reservations and Limits
This introduces A new type (`Limit`), which allows Limits
and "Reservations" to have different options, as it's not
possible to make "Reservations" for some kind of limits.

The `GenericResources` have been removed from the new type;
the API did not handle specifying `GenericResources` as a
_Limit_ (only as _Reservations_), and this field would
therefore always be empty (omitted) in the `Limits` case.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-18 14:21:23 +02:00
Jaime Cepeda
f48b7d66f3
Fix filter on expose and publish
- Add tests to ensure it's working
- Rename variables for better clarification
- Fix validation test
- Remove wrong filter assertion based on publish filter
- Change port on test

Signed-off-by: Jaime Cepeda <jcepedavillamayor@gmail.com>
2020-05-15 11:12:03 +02:00
Tonis Tiigi
fdb71e410c registry: fix mtls config dir passing
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2020-05-14 12:02:09 -07:00
Tibor Vass
298ba5b131
Merge pull request #40427 from thaJeztah/prometheus_remove_experimental
Do not require "experimental" for metrics API
2020-05-08 11:10:53 -07:00
Brian Goff
75d655320e
Merge pull request #40920 from cpuguy83/log_rotate_error_handling
logfile: Check if log is closed on close error during rotate
2020-05-07 14:45:42 -07:00
Sebastiaan van Stijn
b453b64d04
Merge pull request #40845 from AkihiroSuda/allow-privileged-cgroupns-private-on-cgroup-v1
support `--privileged --cgroupns=private` on cgroup v1
2020-05-07 21:11:42 +02:00
Brian Goff
47d9489e7c
Merge pull request #40907 from thaJeztah/bump_selinux
vendor: opencontainers/selinux v1.5.1
2020-05-07 11:51:08 -07:00
Brian Goff
3989f91075 logfile: Check if log is closed on close error during rotate
This prevents getting into a situation where a container log cannot make
progress because we tried to rotate a file, got an error, and now the
file is closed. The next time we try to write a log entry it will try
and rotate again but error that the file is already closed.

I wonder if there is more we can do to beef up this rotation logic.
Found this issue while investigating missing logs with errors in the
docker daemon logs like:

```
Failed to log message for json-file: error closing file: close <file>:
file already closed
```

I'm not sure why the original rotation failed since the data was no
longer available.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-05-07 11:37:06 -07:00
Brian Goff
232ebf7fa4
Merge pull request #40868 from thaJeztah/what_is_the_cause
Replace errors.Cause() with errors.Is() / errors.As()
2020-05-07 11:33:26 -07:00
Sebastiaan van Stijn
a8216806ce
vendor: opencontainers/selinux v1.5.1
full diff: https://github.com/opencontainers/selinux/compare/v1.3.3...v1.5.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-05 20:33:06 +02:00
Brian Goff
f6163d3f7a
Merge pull request #40673 from kolyshkin/scan
Simplify daemon.overlaySupportsSelinux(), fix use of bufio.Scanner.Err()
2020-04-29 17:18:37 -07:00
Sebastiaan van Stijn
c3b3aedfa4
Merge pull request #40662 from AkihiroSuda/cgroup2-dockerinfo
cgroup2: implement `docker info`
2020-04-29 22:57:00 +02:00
Sebastiaan van Stijn
07d60bc257
Replace errors.Cause() with errors.Is() / errors.As()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-29 00:28:41 +02:00
Wilson Júnior
964731e1d3
Improve error feedback when plugin does not implement desired interface
Signed-off-by: Wilson Júnior <wilsonpjunior@gmail.com>
2020-04-21 18:06:24 -03:00
Akihiro Suda
4714ab5d6c cgroup2: use "systemd" cgroup driver by default when available
The "systemd" cgroup driver is always preferred over "cgroupfs" on
systemd-based hosts.

This commit does not affect cgroup v1 hosts.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-22 05:13:37 +09:00
Sebastiaan van Stijn
8312004f41
remove uses of deprecated pkg/term
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-21 16:29:27 +02:00
Akihiro Suda
33ee7941d4 support --privileged --cgroupns=private on cgroup v1
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-21 23:11:32 +09:00
Sebastiaan van Stijn
f337a8d21d
Do not require "experimental" for metrics API
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-20 22:19:00 +02:00
Akihiro Suda
55e6d7d36f
Merge pull request #37867 from mountkin/fix-ov2
enhance storage-opt validation logic in overlay2 driver
2020-04-19 23:02:19 +09:00
Akihiro Suda
f350b53241 cgroup2: implement docker info
ref: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-17 07:20:01 +09:00
Brian Goff
2200d938a2
Merge pull request #40796 from cpuguy83/log_reads_allocs
Reduce allocations for logfile reader
2020-04-16 12:09:41 -07:00
Sebastiaan van Stijn
c8e31dc2f2
Merge pull request #39882 from thaJeztah/swarm_pids_limit
Add API support for PidsLimit on services
2020-04-16 21:02:30 +02:00
Brian Goff
620bce847d
Merge pull request #40749 from DanielQujun/zombie_check_for_container
add zombie check for container when killing it
2020-04-16 12:01:58 -07:00
Sebastiaan van Stijn
54d88a7cd3
Merge pull request #40478 from cpuguy83/dont-prime-the-stats
Add stats options to not prime the stats
2020-04-16 20:57:06 +02:00
Sebastiaan van Stijn
157c53c8e0
Add API support for PidsLimit on services
Support for PidsLimit was added to SwarmKit in docker/swarmkit/pull/2415,
but never exposed through the Docker remove API.

This patch exposes the feature in the repote API.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-15 22:37:42 +02:00
Sebastiaan van Stijn
db669cd117
Merge pull request #40814 from tonistiigi/buildkit-update
vendor: update buildkit to ae7ff174
2020-04-15 21:18:51 +02:00
Akihiro Suda
2e5923c547
Merge pull request #39705 from thaJeztah/daemon_nits
daemon: various nits and small fixes
2020-04-15 09:11:25 +09:00
Tonis Tiigi
0cdf6ba9c8 vendor: update buildkit to ae7ff174
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2020-04-14 08:26:07 -07:00
Sebastiaan van Stijn
eb14d936bf
daemon: rename variables that collide with imported package names
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:23 +02:00
Sebastiaan van Stijn
797ec8e913
daemon: rename all receivers to "daemon"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:21 +02:00
Sebastiaan van Stijn
5d040cbd16
daemon: fix capitalization of some functions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:19 +02:00
Sebastiaan van Stijn
eeef12f469
daemon: address some minor linting issues and nits
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-14 17:22:17 +02:00
Brian Goff
ced91bee4b On startup, actually shutdown the container.
When a container is left running after the daemon exits (e.g. the daemon
is SIGKILL'd or crashes), it should stop any running containers when the
daemon starts back up.

What actually happens is the daemon only sends the container's
configured stop signal and does not check if it has exited.
If the container does not actually exit then it is left running.

This fixes this unexpected behavior by calling the same function to shut
down the container that the daemon shutdown process does.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-04-13 14:20:12 -07:00
Shijiang Wei
e9d785ce3f enhance storage-opt validation logic in overlay2 driver
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
2020-04-12 23:22:30 +08:00
屈骏
f3c1eec99e add zombie check for container when killing it, alernative fix for #40735.
Signed-off-by: 屈骏 <qujun@tiduyun.com>
2020-04-10 16:46:31 +08:00
Brian Goff
933a87236f Reduce allocations for logfile reader
Before this change, the log decoder function provided by the log driver
to logfile would not be able to re-use buffers, causing undeeded
allocations and memory bloat for dockerd.

This change introduces an interface that allows the log driver to manage
it's memory usge more effectively.
This only affects json-file and local log drivers.

`json-file` still is not great just because of how the json decoder in the
stdlib works.
`local` is significantly improved.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-04-08 12:24:31 -07:00
Carlos de Paula
7ac638f86a Add support to riscv64 to the build scripts
Added riscv64 architecture support to the scripts used to build Docker
and it's dependencies.

Signed-off-by: Carlos de Paula <me@carlosedp.com>
2020-04-03 14:33:32 -03:00
Sebastiaan van Stijn
af0415257e
Merge pull request #40694 from kolyshkin/moby-sys-mount-part-II
switch to moby/sys/{mount,mountinfo} part II
2020-04-02 21:52:21 +02:00
Sebastiaan van Stijn
b6f07fad51
Merge pull request #40712 from vboulineau/bugfix-windows-user-cpu
Fix CPU Stat value UsageInUsermode on Windows
2020-04-02 21:37:57 +02:00
Akihiro Suda
3802830989 cgroup2: implement docker stats
The following fields are unsupported:

* BlkioStats: all fields other than IoServiceBytesRecursive
* CPUStats: CPUUsage.PercpuUsage
* MemoryStats: MaxUsage and Failcnt

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-04-02 17:51:34 +09:00
Kir Kolyshkin
5b658a0348 daemon.overlaySupportsSelinux: simplify check
1. Sscanf is very slow, and we don't use the first two fields -- get rid of it.

2. Since the field we search for is at the end of line and prepended by
   a space, we can just use strings.HaveSuffix.

3. Error checking for bufio.Scanner should be done after the Scan()
   loop, not inside it.

Fixes: 885b29df09
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-03-31 14:32:42 -07:00
vboulineau
ec16053ccf
Fix UsageInUsermode value on Windows
Looks like a wrong copy-paste using `RuntimeKernel100ns` twice instead of `RuntimeUser100ns`

Signed-off-by: Vincent Boulineau <vincent.boulineau@datadoghq.com>
2020-03-27 16:43:22 +01:00
Brian Goff
7a9cb29fb9 Accept platform spec on container create
This enables image lookup when creating a container to fail when the
reference exists but it is for the wrong platform. This prevents trying
to run an image for the wrong platform, as can be the case with, for
example binfmt_misc+qemu.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-03-20 16:10:36 -07:00
Kir Kolyshkin
39048cf656 Really switch to moby/sys/mount*
Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential
outside users.

This commit was generated by the following bash script:

```
set -e -u -o pipefail

for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do
	sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \
		-e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \
		$file
	goimports -w $file
done
```

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-03-20 09:46:25 -07:00
Brian Goff
714cba6740
Merge pull request #38788 from AkihiroSuda/bind-nonrecursive-swarm
service: support --mount type=bind,bind-nonrecursive
2020-03-13 15:42:46 -07:00
Sebastiaan van Stijn
1e078e1ac5
Merge pull request #40666 from AkihiroSuda/prohibit-rootless-systemd-cgroup1
daemon: fail early if rootless && cgroupdriver == "systemd" && cgroup v1
2020-03-13 22:29:04 +01:00
Akihiro Suda
745fa04e52 service: support --mount type=bind,bind-nonrecursive
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-13 19:01:28 +09:00
Akihiro Suda
f073ed4187
Merge pull request #39816 from tao12345666333/rm-systeminfo-error-handler
Remove `SystemInfo()` error handling.
2020-03-13 01:59:13 +09:00
Brian Goff
470d32d32a
Merge pull request #40483 from AkihiroSuda/fuse-overlayfs
new storage driver: fuse-overlayfs
2020-03-11 11:32:37 -07:00
Akihiro Suda
92e7f8f67c daemon: fail early if rootless && cgroupdriver == "systemd" && cgroup v1
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-11 12:49:03 +09:00
Jintao Zhang
18c22f5bc1 fix backingFs assignment
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
2020-03-10 00:03:59 +08:00
Akihiro Suda
5ddbe511a1
Merge pull request #40614 from thaJeztah/more_deprecating
Add warnings for deprecated "cluster" functions, and deprecate API fields
2020-03-06 16:35:42 +09:00
Sebastiaan van Stijn
616e64b42f
API: deprecate /info "ClusterStore" and "ClusterAdvertise" fields
These fields will now be omitted when empty.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-03-03 18:10:47 +01:00
Sebastiaan van Stijn
a5538c06f9
Add warning about deprecated "cluster" options to "docker info"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-03-03 18:10:31 +01:00
Akihiro Suda
9a82a9a8ea vendor containerd, BuildKit, protobuf, grpc, and golang.org/x
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-03 10:25:20 +09:00
Brian Goff
76e3a49933
Merge pull request #40486 from AkihiroSuda/rootless-cgroup2-systemd
rootless: support `--exec-opt native.cgroupdriver=systemd`
2020-03-02 16:11:21 -08:00
Brian Goff
ce1ceeb257 Add stats options to not prime the stats
Metrics collectors generally don't need the daemon to prime the stats
with something to compare since they already have something to compare
with.
Before this change, the API does 2 collection cycles (which takes
roughly 2s) in order to provide comparison for CPU usage over 1s. This
was primarily added so that `docker stats --no-stream` had something to
compare against.

Really the CLI should have just made a 2nd call and done the comparison
itself rather than forcing it on all API consumers.
That ship has long sailed, though.

With this change, clients can set an option to just pull a single stat,
which is *at least* a full second faster:

Old:
```
time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=false > /dev/null
2>&1

real0m1.864s
user0m0.005s
sys0m0.007s

time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=false > /dev/null
2>&1

real0m1.173s
user0m0.010s
sys0m0.006s
```

New:
```
time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=true > /dev/null
2>&1
real0m0.680s
user0m0.008s
sys0m0.004s

time curl --unix-socket
/go/src/github.com/docker/docker/bundles/test-integration-shell/docker.sock
http://./containers/test/stats?stream=false\&one-shot=true > /dev/null
2>&1

real0m0.156s
user0m0.007s
sys0m0.007s
```

This fixes issues with downstreams ability to use the stats API to
collect metrics.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-02-28 09:54:37 -08:00
Brian Goff
40b2b4b083
Merge pull request #40594 from sfzhu93/Mis_Unlock
daemon/cluster: add a missing Unlock
2020-02-28 09:47:53 -08:00
Sebastiaan van Stijn
39679991f4
Merge pull request #40543 from SamWhited/upstream_logging
Upstream logging changes from Enterprise Edition
2020-02-27 13:54:14 +01:00
Ziheng Liu
83c0bedba9 daemon/cluster: add a missing Unlock
Signed-off-by: Ziheng Liu <lzhfromustc@gmail.com>
2020-02-25 13:55:11 -05:00
Brian Goff
62bd5a33f7
Merge pull request #40137 from fuweid/me-wait-for-remote-containerd-before-reload
daemon: add grpc.WithBlock option
2020-02-21 10:11:10 -08:00
Akihiro Suda
0c6b85717b
Merge pull request #40481 from cpuguy83/stats_use_cond_var
Use condition variable to wake stats collector.
2020-02-21 13:37:14 +09:00
Brian Goff
750f0d1648 Support configuration of log cacher.
Configuration over the API per container is intentionally left out for
the time being, but is supported to configure the default from the
daemon config.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit cbecf48bc352e680a5390a7ca9cff53098cd16d7)
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2020-02-19 17:02:34 -05:00
Brian Goff
e2ceb83a53 Support reads for all log drivers.
This supplements any log driver which does not support reads with a
custom read implementation that uses a local file cache.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit d675e2bf2b75865915c7a4552e00802feeb0847f)
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2020-02-19 17:01:44 -05:00
Akihiro Suda
ca4b51868a rootless: support --exec-opt native.cgroupdriver=systemd
Support cgroup as in Rootless Podman.

Requires cgroup v2 host with crun.
Tested with Ubuntu 19.10 (kernel 5.3, systemd 242), crun v0.12.1.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-02-14 15:32:31 +09:00
Sebastiaan van Stijn
3af8d484b1
Merge pull request #40394 from tao12345666333/reserved-namespace-labels
enforce reserve internal labels.
2020-02-13 01:47:05 +01:00
Sebastiaan van Stijn
58c2615208
Merge pull request #40497 from arkodg/fix-bip-subnet-config
Set the bip network value as the subnet
2020-02-12 12:41:29 +01:00
Jintao Zhang
35d6c1870f enforce reserve internal labels.
The namespaces com.docker.*, io.docker.*, org.dockerproject.*
have been documented to be reserved for Docker's internal use.

Co-Authored-By: Sebastiaan van Stijn <thaJeztah@users.noreply.github.com>
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
2020-02-12 12:03:35 +08:00
Sebastiaan van Stijn
562880b276
Fix more goimports
```
daemon/logger/splunk/splunk_test.go:33: File is not `goimports`-ed (goimports)
        envKey:      "a",
        envRegexKey: "^foo",
        labelsKey:   "b",
        tagKey:      "c",
integration/build/build_test.go:41: File is not `goimports`-ed (goimports)
            rm:      false,
            forceRm: false,
integration/image/remove_unix_test.go:49: File is not `goimports`-ed (goimports)
        Root: d.Root,
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 18:56:25 +01:00
Arko Dasgupta
f800d5f786 Set the bip network value as the subnet
Dont assign the --bip value directly to the subnet
for the default bridge. Instead use the network value
from the ParseCIDR output

Addresses: https://github.com/moby/moby/issues/40392

Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
2020-02-10 17:38:54 -08:00
Sebastiaan van Stijn
008fc67974
Fluentd: add fluentd-request-ack option
This adds a new `fluentd-request-ack` logging option for the Fluentd
logging driver. If enabled, the server will respond with an acknowledgement.
This option improves the reliability of the message transmission. This
change is not versioned, and affects all API versions.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 02:13:24 +01:00
Sebastiaan van Stijn
cc1f3c750e
Fluentd: add fluentd-async option, deprecate fluentd-async-connect
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 02:13:22 +01:00
Sebastiaan van Stijn
a1d4a081dd
Fluentd: extract parsing config, and validate early
This extracts parsing the driver's configuration to a
function, and uses the same function both when initializing
the driver, and when validating logging options.

Doing so allows validating if the provided options are in
the correct format when calling `ValidateOpts`, instead
of resulting in an error when initializing the logging driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 02:13:20 +01:00
Sebastiaan van Stijn
8bd4aedb02
Fluentd: sort consts alphabetically
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 02:13:18 +01:00
Sebastiaan van Stijn
ad13a2a4ba
Fluentd: return "invalid parameter" for invalid config options
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 02:13:15 +01:00
Sebastiaan van Stijn
9f0b3f5609
bump gotest.tools v3.0.1 for compatibility with Go 1.14
full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 00:06:42 +01:00
Akihiro Suda
7418745001 new storage driver: fuse-overlayfs
`fuse-overlayfs` provides rootless overlayfs functionality without depending
on any kernel patch.

Aside from rootless, `fuse-overlayfs` could be potentially used for eliminating
`chown()` calls that happen in userns-remap mode, because `fuse-overlayfs` also
provides shiftfs functionality.

System requirements:
* fuse-overlayfs needs to be installed. Tested with 0.7.6.
* kernel >= 4.18

Unit test: `go test -exec sudo -v ./daemon/graphdriver/fuse-overlayfs`

The implementation is based on Podman's `overlay` driver which supports
both kernel-mode overlayfs and fuse-overlayfs in the single driver instance:
https://github.com/containers/storage/blob/39a8d5ed/drivers/overlay/overlay.go

However, Moby's implementation aims to decouple `fuse-overlayfs` driver from the
kernel-mode driver (`overlay2`) for simplicity.

Fix #40218

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-02-10 23:48:52 +09:00
Brian Goff
e75e6b0e31 Use condition variable to wake stats collector.
Before the collection goroutine wakes up every 1 second (as configured).
This sleep interval is in case there are no stats to collect we don't
end up in a tight loop.

Instead use a condition variable to signal that a collection is needed.
This prevents us from waking the goroutine needlessly when there is no
one looking for stats.

For now I've kept the sleep just moved it to the end of the loop, which
gives some space between collections.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-02-08 11:06:34 -08:00
Brian Goff
f464c31668 Check tmpfs mounts before create anon volume
This makes sure that things like `--tmpfs` mounts over an anonymous
volume don't create volumes uneccessarily.
One method only checks mountpoints, the other checks both mountpoints
and tmpfs... the usage of these should likely be consolidated.

Ideally, processing for `--tmpfs` mounts would get merged in with the
rest of the mount parsing. I opted not to do that for this change so the
fix is minimal and can potentially be backported with fewer changes of
breaking things.
Merging the mount processing for tmpfs can be handled in a followup.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-02-04 10:12:05 -08:00
Sebastiaan van Stijn
ca20bc4214
Merge pull request #40007 from arkodg/add-host-docker-internal
Support host.docker.internal in dockerd on Linux
2020-01-27 13:42:26 +01:00
Arko Dasgupta
92e809a680 Support host.docker.internal in dockerd on Linux
Docker Desktop (on MAC and Windows hosts) allows containers
running inside a Linux VM to connect to the host using
the host.docker.internal DNS name, which is implemented by
VPNkit (DNS proxy on the host)

This PR allows containers to connect to Linux hosts
by appending a special string "host-gateway" to --add-host
e.g. "--add-host=host.docker.internal:host-gateway" which adds
host.docker.internal DNS entry in /etc/hosts and maps it to host-gateway-ip

This PR also add a daemon flag call host-gateway-ip which defaults to
the default bridge IP
Docker Desktop will need to set this field to the Host Proxy IP
so DNS requests for host.docker.internal can be routed to VPNkit

Addresses: https://github.com/docker/for-linux/issues/264

Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
2020-01-22 13:30:00 -08:00
Akihiro Suda
f9d136b6c6
Merge pull request #40307 from dperny/swarm-jobs
Add support for swarm jobs
2020-01-20 12:57:05 +09:00
Sebastiaan van Stijn
be095a1859
Merge pull request #40366 from arkodg/check-cidr-ipv6
Handle the error case when fixed-cidr-ipv6 is empty and ipv6 is enabled
2020-01-14 13:53:45 +01:00
Drew Erny
30d9fe30b1 Add swarm jobs
Adds support for ReplicatedJob and GlobalJob service modes. These modes
allow running service which execute tasks that exit upon success,
instead of daemon-type tasks.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2020-01-13 13:21:12 -06:00
Arko Dasgupta
bdad16b0ee Handle error case when fixed-cidr-ipv6 is empty
When IPv6 is enabled, make sure fixed-cidr-ipv6 is set
by the user since there is no default IPv6 local subnet
in the IPAM

Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
2020-01-13 09:56:41 -08:00
Sebastiaan van Stijn
339fb74cbc
prevent panic if TINI_COMMIT isn't set during build
If TINI_COMMIT isn't set, .go-autogen sets an empty value
as the "expected" commit. Attempting to truncate the value
caused a panic in that situation.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-01-10 15:27:20 +01:00
Sebastiaan van Stijn
e6c1820ef5
Merge pull request #40174 from AkihiroSuda/cgroup2
support cgroup2
2020-01-09 20:09:11 +01:00
Brian Goff
03163f6825
Merge pull request #40291 from akhilerm/privileged-device
35991- make `--device` works at privileged mode
2020-01-02 10:09:31 -08:00
Akhil Mohan
86ebbe16de
remove host directory check
Signed-off-by: Akhil Mohan <akhil.mohan@mayadata.io>
2020-01-02 14:28:51 +05:30
Akihiro Suda
491531c12b cgroup2: mark cpu-rt-{period,runtime} unimplemented
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-01 02:58:40 +09:00
Akihiro Suda
19baeaca26 cgroup2: enable cgroup namespace by default
For cgroup v1, we were unable to change the default because of
compatibility issue.

For cgroup v2, we should change the default right now because switching
to cgroup v2 is already breaking change.

See also containers/libpod#4363 containers/libpod#4374

Privileged containers also use cgroupns=private by default.
https://github.com/containers/libpod/pull/4374#issuecomment-549776387

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-01 02:58:40 +09:00
Akihiro Suda
612343618d cgroup2: use shim V2
* Requires containerd binaries from containerd/containerd#3799 . Metrics are unimplemented yet.
* Works with crun v0.10.4, but `--security-opt seccomp=unconfined` is needed unless using master version of libseccomp
  ( containers/crun#156, seccomp/libseccomp#177 )
* Doesn't work with master runc yet
* Resource limitations are unimplemented

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-01 02:58:40 +09:00
Brian Goff
8d2456d1e6
Merge pull request #40246 from thaJeztah/system_windows_cleanup
pkg/system: minor cleanups and remove use of deprecated system.GetOSVersion()
2019-12-19 11:36:29 -08:00
Brian Goff
b95fad8e51
Merge pull request #40263 from thaJeztah/normalize_comments
Normalize comment formatting
2019-12-12 12:06:22 -08:00
Akhil Mohan
35b9e6989f
Make --device flag work in privileged mode
When a container is started in privileged mode, the device mappings
provided by `--device` flag was ignored. Now the device mappings will
be considered even in privileged mode.

Signed-off-by: Akhil Mohan <akhil.mohan@mayadata.io>
2019-12-06 18:43:56 +05:30
wenlxie
03b3ec1dd5
make --device works at privileged mode
Signed-off-by: wenlxie <wenlxie@ebay.com>
2019-12-06 18:17:03 +05:30
Brian Goff
e25754b80c
Merge pull request #40186 from pradipd/default-nat-subnet
Dockerd won't start if a network with the default subnet prefix already exists in HNS.
2019-12-03 09:31:29 -08:00
Sebastiaan van Stijn
f4f56b1197
daemon: normalize comment formatting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-27 15:43:53 +01:00
Sebastiaan van Stijn
ec4bc83258
daemon/graphdriver: normalize comment formatting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-27 15:43:23 +01:00
Sebastiaan van Stijn
6625fa6103
daemon/cluster: normalize comment formatting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-27 15:42:53 +01:00
Sebastiaan van Stijn
ba6bbca89a
daemon/logger: normalize comment formatting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-27 15:42:27 +01:00
Akihiro Suda
a4a0429eec
Merge pull request #40234 from Jonas-Heinrich/40232-comply-with-gelf-spec
Change gelf logger to comply with gelf spec
2019-11-26 12:06:47 +09:00
Akihiro Suda
7bd40b56ef
Merge pull request #40249 from thaJeztah/os_seek_deprecated
remove use of deprecated os.SEEK_END
2019-11-26 12:06:29 +09:00
Akihiro Suda
d2d8e96f51
Merge pull request #39940 from carlosedp/runtime-version
Change version parsing to support alternate runtimes
2019-11-26 09:55:53 +09:00
Sebastiaan van Stijn
6ee536b4a0
daemon: remove use of deprecated os.SEEK_END
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-25 18:53:03 +01:00
Sebastiaan van Stijn
044b74e33b
daemon: remove use of deprecated system.GetOSVersion()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-25 13:39:50 +01:00
Jonas Heinrich
5c6b913ff1
logger/gelf: Skip empty lines to comply with spec
The [gelf payload specification](http://docs.graylog.org/en/2.4/pages/gelf.html#gelf-payload-specification)
demands that the field `short_message` *MUST* be set by the client library.
Since docker logging via the gelf driver sends messages line by line, it can happen that messages with an empty
`short_message` are passed on. This causes strict downstream processors (like graylog) to raise an exception.

The logger now skips messages with an empty line.

Resolves: #40232
See also: #37572

Signed-off-by: Jonas Heinrich <Jonas@JonasHeinrich.com>
2019-11-25 11:55:15 +01:00
Tõnis Tiigi
d1d5f64766
Merge pull request #40021 from thaJeztah/carry_40017
Use newer x/sys/windows SecurityAttributes struct (carry 40017)
2019-11-21 08:57:22 -08:00
Sebastiaan van Stijn
1086671441
Merge pull request #40212 from olljanat/default-capabilities-to-caps
Move DefaultCapabilities() to caps package
2019-11-17 14:00:06 +01:00
Akihiro Suda
4124e78d57
Merge pull request #40100 from thaJeztah/test_fixes
Some small (test) fixes/improvements
2019-11-16 14:13:35 +09:00
Akihiro Suda
4f8070c84d
Merge pull request #40210 from kolyshkin/ovr-rm-checks
overlay[2]: rm extra checks in init
2019-11-15 22:43:52 +09:00
Olli Janatuinen
1308a3a99f Move DefaultCapabilities() to caps package
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2019-11-14 21:13:16 +02:00
Kir Kolyshkin
e226aea280 overlay[2]: rm fs checks
Now that we do check if overlay is working by performing an actual
overlayfs mount, there's no need in extra checks for the kernel version
or the filesystem type. Actual mount check is sufficient.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-11-14 08:13:12 -08:00
Olli Janatuinen
447a840254 Windows: Use system specific parallelism value on containers restart
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2019-11-11 15:44:47 +02:00
Kir Kolyshkin
649e4c8889 Fix/improve overlay support check
Before this commit, overlay check was performed by looking for
`overlay` in /proc/filesystem. This obviously might not work
for rootless Docker (fs is there, but one can't use it as non-root).

This commit changes the check to perform the actual mount, by reusing
the code previously written to check for multiple lower dirs support.

The old check is removed from both drivers, as well as the additional
check for the multiple lower dirs support in overlay2 since it's now
a part of the main check.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-11-08 11:49:39 -08:00
Kir Kolyshkin
d5687079ad overlay: move supportsMultipleLowerDir to utils
This moves supportsMultipleLowerDir() to overlayutils
so it can be used from both overlay and overlay2.

The only changes made were:
 * replace logger with logrus
 * don't use workDirName mergedDirName constants
 * add mnt var to improve readability a bit

This is a preparation for the next commit.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-11-08 11:48:47 -08:00
Yong Tang
f09dc2f4fc Fix docker crash when creating namespaces with UID in /etc/subuid and /etc/subgid
This fix tries to address the issue raised in 39353 where
docker crash when creating namespaces with UID in /etc/subuid and /etc/subgid.

The issue was that, mapping to `/etc/sub[u,g]id` in docker does not
allow numeric ID.

This fix fixes the issue by probing other combinations (uid:groupname, username:gid, uid:gid)
when normal username:groupname fails.

This fix fixes 39353.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2019-11-07 20:17:11 +00:00
Pradip Dhara
89c6febfc2 Dockerd won't start if a network with the default subnet prefix already exists in HNS.
Signed-off-by: Pradip Dhara <pradipd@microsoft.com>
2019-11-06 10:54:28 -08:00
Brian Goff
47c5c67ed8
Merge pull request #40032 from jmartin84/fix-grpc-withdialer-deprecation-warning
Fix grpc withdialer deprecation warning
2019-11-05 12:20:33 -08:00
Sebastiaan van Stijn
9a7e96b5b7
Rename "v1" to "statsV1"
follow-up to 27552ceb15, where this
was left as a review comment, but the PR was already merged.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-01 16:18:06 +01:00
Tõnis Tiigi
64fd3dc0d5
Merge pull request #40157 from lzhfromustc/GL_2test
awslogs & archive: prevent 2 goroutine leaks in test functions
2019-10-31 11:24:41 -07:00
Sebastiaan van Stijn
27552ceb15
bump containerd/cgroups 5fbad35c2a7e855762d3c60f2e474ffcad0d470a
full diff: c4b9ac5c76...5fbad35c2a

- containerd/cgroups#82 Add go module support
- containerd/cgroups#96 Move metrics proto package to stats/v1
- containerd/cgroups#97 Allow overriding the default /proc folder in blkioController
- containerd/cgroups#98 Allows ignoring memory modules
- containerd/cgroups#99 Add Go 1.13 to Travis
- containerd/cgroups#100 stats/v1: export per-cgroup stats

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-31 01:09:12 +01:00
Ziheng Liu
d7bc994a08 awslogs & archive: prevent 2 goroutine leaks in test functions
Signed-off-by: Ziheng Liu <lzhfromustc@gmail.com>
2019-10-29 17:03:38 -04:00
Wei Fu
9f73396dab daemon: add grpc.WithBlock option
WithBlock makes sure that the following containerd request is reliable.

In one edge case with high load pressure, kernel kills dockerd, containerd
and containerd-shims caused by OOM. When both dockerd and containerd
restart, but containerd will take time to recover all the existing
containers. Before containerd serving, dockerd will failed with gRPC
error. That bad thing is that restore action will still ignore the
any non-NotFound errors and returns running state for
already stopped container. It is unexpected behavior. And
we need to restart dockerd to make sure that anything is OK.

It is painful. Add WithBlock can prevent the edge case. And
n common case, the containerd will be serving in shortly.
It is not harm to add WithBlock for containerd connection.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2019-10-25 12:19:35 +08:00
Sebastiaan van Stijn
6b91ceff74
Use hcsshim osversion package for Windows versions
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-22 02:53:00 +02:00
Sebastiaan van Stijn
9d726f1c18
Add GoDoc to fix linting validation
The validate step in CI was broken, due to a combination of
086b4541cf, fbdd437d29,
and 85733620eb being merged to master.

```
api/types/filters/parse.go:39:1: exported method `Args.Keys` should have comment or be unexported (golint)
func (args Args) Keys() []string {
^
daemon/config/builder.go:19:6: exported type `BuilderGCFilter` should have comment or be unexported (golint)
type BuilderGCFilter filters.Args
     ^
daemon/config/builder.go:21:1: exported method `BuilderGCFilter.MarshalJSON` should have comment or be unexported (golint)
func (x *BuilderGCFilter) MarshalJSON() ([]byte, error) {
^
daemon/config/builder.go:35:1: exported method `BuilderGCFilter.UnmarshalJSON` should have comment or be unexported (golint)
func (x *BuilderGCFilter) UnmarshalJSON(data []byte) error {
^
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-21 21:36:22 +02:00
Tibor Vass
c4cf72bad3
Merge pull request #39964 from thaJeztah/bump_golangci_lint
bump golangci-lint v1.20.0
2019-10-21 10:57:02 -07:00
Sebastiaan van Stijn
5003128854
Merge pull request #40082 from thaJeztah/is_windows_const
daemon: add "isWindows" const
2019-10-19 01:17:24 +02:00