Commit graph

6229 commits

Author SHA1 Message Date
akolomentsev
e017717d96 keep old network ids
for windows all networks are re-populated in the store during network controller initialization. In current version it also regenerate network Ids which may be referenced by other components and it may cause broken references to a networks. This commit avoids regeneration of network ids.

Signed-off-by: Andrey Kolomentsev <andrey.kolomentsev@docker.com>
2019-01-23 14:53:27 -08:00
Olli Janatuinen
80d7bfd54d Capabilities refactor
- Add support for exact list of capabilities, support only OCI model
- Support OCI model on CapAdd and CapDrop but remain backward compatibility
- Create variable locally instead of declaring it at the top
- Use const for magic "ALL" value
- Rename `cap` variable as it overlaps with `cap()` built-in
- Normalize and validate capabilities before use
- Move validation for conflicting options to validateHostConfig()
- TweakCapabilities: simplify logic to calculate capabilities

Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-22 21:50:41 +02:00
Sebastiaan van Stijn
3449b12cc7
Use assert.NilError() instead of assert.Assert()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-21 13:16:02 +01:00
Sebastiaan van Stijn
2137b8ccf2
Add containerd, runc, and docker-init versions to /version
This patch adds version information about the containerd,
runc, and docker-init components to the /version endpoint.

With this patch applied, running:

```
curl --unix-socket /var/run/docker.sock http://localhost/version | jq .
```

Will produce this response:

```json
{
  "Platform": {
    "Name": ""
  },
  "Components": [
    {
      "Name": "Engine",
      "Version": "dev",
      "Details": {
        "ApiVersion": "1.40",
        "Arch": "amd64",
        "BuildTime": "2018-11-08T10:23:42.000000000+00:00",
        "Experimental": "false",
        "GitCommit": "7d02782d2f",
        "GoVersion": "go1.11.2",
        "KernelVersion": "4.9.93-linuxkit-aufs",
        "MinAPIVersion": "1.12",
        "Os": "linux"
      }
    },
    {
      "Name": "containerd",
      "Version": "v1.1.4",
      "Details": {
        "GitCommit": "9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b"
      }
    },
    {
      "Name": "runc",
      "Version": "1.0.0-rc5+dev",
      "Details": {
        "GitCommit": "a00bf0190895aa465a5fbed0268888e2c8ddfe85"
      }
    },
    {
      "Name": "docker-init",
      "Version": "0.18.0",
      "Details": {
        "GitCommit": "fec3683"
      }
    }
  ],
  "Version": "dev",
  "ApiVersion": "1.40",
  "MinAPIVersion": "1.12",
  "GitCommit": "7d02782d2f",
  "GoVersion": "go1.11.2",
  "Os": "linux",
  "Arch": "amd64",
  "KernelVersion": "4.9.93-linuxkit-aufs",
  "BuildTime": "2018-11-08T10:23:42.000000000+00:00"
}
```

When using a recent version of the CLI, that information is included in the
output of `docker version`:

```
Client: Docker Engine - Community
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:46:51 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          dev
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.11.2
  Git commit:       7d02782d2f
  Built:            Thu Nov  8 10:23:42 2018
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.1.4
  GitCommit:        9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b
 runc:
  Version:          1.0.0-rc5+dev
  GitCommit:        a00bf0190895aa465a5fbed0268888e2c8ddfe85
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-14 23:27:05 +01:00
Sebastiaan van Stijn
8364d1c9d5
Fix: network=host using wrong resolv.conf with systemd-resolved
When running a container in the host's network namespace, the container
gets a copy of the host's resolv.conf (copied to `/etc/resolv.conf` inside
the container).

The current code always used the default (`/etc/resolv.conf`) path on the
host, irregardless if `systemd-resolved` was used or not.

This patch uses the correct file if `systemd-resolved` was detected
to be running.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-10 22:58:55 +01:00
Sascha Grunert
7f3910c92e
Fix possible segfault in config reload
This commit fixes two possible crashes in the `*Daemon` bound method
`reloadMaxConcurrentDownloadsAndUploads()`.

The first fixed issue is when `daemon.imageService` is `nil`. The second
panic can occur if the provided `*config.Config` is incomplete and the
fields `conf.MaxConcurrentDownloads` or `conf.MaxConcurrentUploads` are
`nil`.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-01-10 15:34:02 +01:00
zhangyue
c6894aa492 fix: simplify code logic
Signed-off-by: zhangyue <zy675793960@yeah.net>
2019-01-08 11:00:34 +08:00
Yong Tang
7315a2bb11 Fix go vet issue in daemon/daemon.go
This fix fixes go vet issue:
```
daemon/daemon.go:273: loop variable id captured by func literal
daemon/daemon.go:280: loop variable id captured by func literal
```

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2019-01-06 00:18:29 +00:00
Yong Tang
0104abf0d6
Merge pull request #38409 from innovimax/patch-1
fix typo
2019-01-04 23:35:09 -08:00
Brian Goff
6825db8c94
Merge pull request #38450 from thaJeztah/remove_deprecated_grpc_functions
Replace deprecated grpc.ErrorDesc() and grpc.Code() calls
2019-01-04 16:46:49 -08:00
Yong Tang
0281db99a9 Follow up to PR 38407
This fix is a follow up to PR 38407 to use assert.Error
and assert.NilError when appropriate

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2019-01-03 01:23:24 +00:00
Yong Tang
626022d0f6
Merge pull request #38407 from maximilianomaccanti/master
Add two configurable options to awslogs driver
2019-01-03 08:48:25 +08:00
Sebastiaan van Stijn
72b0b0387d
Replace deprecated grpc.ErrorDesc() and grpc.Code() calls
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-30 12:34:28 +01:00
Radostin Stoyanov
64b3b13576 Enable checkpoint/restore of containers with tty
CRIU supports checkpoint and restore of tty devices since version 2.12
which was released on 8th of March 2017. Support for this functionality
was implemented with opencontainers/runc@1c43d09 (checkpoint: add
support for containers with terminals) and containerd/containerd@60daa41
(Allow to checkpoint and restore a container with console).

Therefore, we can enable the support in moby/docker.

Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
2018-12-30 07:52:37 +00:00
Innovimax
31348a2936 fix typo
Signed-off-by: innovimax <innovimax@gmail.com>
2018-12-28 01:30:31 +01:00
Joao Trindade
a7020454ca
Add options validation to syslog logger test
Adds the following validations to the syslog logger test:

 1. Only supported options are valid
 2. Log option syslog-address has to be a valid URI
 3. Log option syslog-address if is file has to exist
 4. Log option syslog-address if udp/tcp scheme, default to port 513
 5. Log-option syslog-facility has to be a valid facility
 6. Log-option syslog-format has to be a valid format

Signed-off-by: Joao Trindade <trindade.joao@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-24 20:43:41 +01:00
Sebastiaan van Stijn
8fbf2598f5
Merge pull request #37940 from olljanat/replicas-max-per-node
Added support for maximum replicas per node
2018-12-24 19:29:53 +01:00
Olli Janatuinen
153171e9dd Added support for maximum replicas per node to services
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2018-12-24 02:04:15 +02:00
Maximiliano Maccanti
ad8a8e8a9e NewStreamConfig UTest fixes
Signed-off-by: Maximiliano Maccanti <maccanti@amazon.com>
2018-12-21 22:24:40 +00:00
Maximiliano Maccanti
687cbfa739 Split StreamConfig from New, Utest table driven
Signed-off-by: Maximiliano Maccanti <maccanti@amazon.com>
2018-12-21 20:45:11 +00:00
Maximiliano Maccanti
512ac778bf Add two configurable options to awslogs driver
Add awslogs-force-flush-interval-seconds and awslogs-max-buffered-events configurable options to aswlogs driver to replace hardcoded values of repsectively 5 seconds and 4K.

Signed-off-by: Maximiliano Maccanti <maccanti@amazon.com>
2018-12-21 20:45:11 +00:00
Akihiro Suda
2cb26cfe9c
Merge pull request #38301 from cyphar/waitgroup-limits
daemon: switch to semaphore-gated WaitGroup for startup tasks
2018-12-22 00:07:55 +09:00
Aleksa Sarai
5a52917e4d
daemon: switch to semaphore-gated WaitGroup for startup tasks
Many startup tasks have to run for each container, and thus using a
WaitGroup (which doesn't have a limit to the number of parallel tasks)
can result in Docker exceeding the NOFILE limit quite trivially. A more
optimal solution is to have a parallelism limit by using a semaphore.

In addition, several startup tasks were not parallelised previously
which resulted in very long startup times. According to my testing, 20K
dead containers resulted in ~6 minute startup times (during which time
Docker is completely unusable).

This patch fixes both issues, and the parallelStartupTimes factor chosen
(128 * NumCPU) is based on my own significant testing of the 20K
container case. This patch (on my machines) reduces the startup time
from 6 minutes to less than a minute (ideally this could be further
reduced by removing the need to scan all dead containers on startup --
but that's beyond the scope of this patchset).

In order to avoid the NOFILE limit problem, we also detect this
on-startup and if NOFILE < 2*128*NumCPU we will reduce the parallelism
factor to avoid hitting NOFILE limits (but also emit a warning since
this is almost certainly a mis-configuration).

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2018-12-21 21:51:02 +11:00
Vincent Demeester
bcd817ee6b
Merge pull request #38393 from thaJeztah/refactor_container_validation
Refactor container validation
2018-12-20 14:20:01 +01:00
Vincent Demeester
b33dc72523
Merge pull request #38335 from yongtang/38258-syslog-rfc5424
Add zero padding for RFC5424 syslog format
2018-12-20 08:30:22 +01:00
Sebastiaan van Stijn
f6002117a4
Extract container-config and container-hostconfig validation
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 13:09:12 +01:00
Sebastiaan van Stijn
5fc0f03426
Extract workingdir validation/conversion to a function
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 10:24:39 +01:00
Sebastiaan van Stijn
c0697c27aa
Extract port-mapping validation to a function
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 10:24:33 +01:00
Sebastiaan van Stijn
e1809510ca
Extract restart-policy-validation to a function
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 10:24:28 +01:00
Sebastiaan van Stijn
6a7da0b31b
Extract healthcheck-validation to a function
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 10:24:23 +01:00
Sebastiaan van Stijn
b6e373c525
Rename verifyContainerResources to verifyPlatformContainerResources
This validation function is platform-specific; rename it to be
more explicit.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 10:24:09 +01:00
Sebastiaan van Stijn
e278678705
Remove unused argument from verifyPlatformContainerSettings
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 09:23:09 +01:00
Sebastiaan van Stijn
10c97b9357
Unify logging container validation warnings
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-19 09:15:21 +01:00
Sebastiaan van Stijn
2e23ef5350
Move port-publishing check to linux platform-check
Windows does not have host-mode networking, so on Windows, this
check was a no-op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-18 22:46:05 +01:00
Sebastiaan van Stijn
57f1305e74
Move "OOM Kill disable" warning to the daemon
Disabling the oom-killer for a container without setting a memory limit
is dangerous, as it can result in the container consuming unlimited memory,
without the kernel being able to kill it. A check for this situation is curently
done in the CLI, but other consumers of the API won't receive this warning.

This patch adds a check for this situation to the daemon, so that all consumers
of the API will receive this warning.

This patch will have one side-effect; docker cli's that also perform this check
client-side will print the warning twice; this can be addressed by disabling
the cli-side check for newer API versions, but will generate a bit of extra
noise when using an older CLI.

With this patch applied (and a cli that does not take the new warning into account);

```
docker create --oom-kill-disable busybox
WARNING: OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.
669933b9b237fa27da699483b5cf15355a9027050825146587a0e5be0d848adf

docker run --rm --oom-kill-disable busybox
WARNING: Disabling the OOM killer on containers without setting a '-m/--memory' limit may be dangerous.
WARNING: OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-18 22:30:56 +01:00
Vincent Demeester
56cc56b0fa
Merge pull request #38126 from mjameswh/fix-1715
Use idtools.LookupGroup instead of parsing /etc/group file for docker.sock ownership
2018-12-12 17:29:28 +01:00
Vincent Demeester
d4a6e1c44f
Merge pull request #38068 from kolyshkin/err
More context for errors
2018-12-12 09:02:37 +01:00
Akihiro Suda
62d80835ab
Merge pull request #38342 from crosbymichael/oci-refactor
Move caps and device spec utils to `oci` pkg
2018-12-11 13:48:38 -08:00
Vincent Demeester
510805655b
Merge pull request #38265 from AkihiroSuda/remove-migrate-v1
Remove v1.10 migrator
2018-12-11 16:21:09 +01:00
Michael Crosby
b940cc5cff Move caps and device spec utils to oci pkg
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-12-11 10:20:25 -05:00
Kir Kolyshkin
6533136961 pkg/mount: wrap mount/umount errors
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.

Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:

> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted

or

> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted

Before this patch, it was just

> operation not permitted

[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-12-10 20:07:02 -08:00
Kir Kolyshkin
2f98b5f51f aufs: get rid of mount()
The function is not needed as it's just a shallow wrapper around
unix.Mount().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-12-10 20:06:10 -08:00
Kir Kolyshkin
77bc327e24 UnmountIpcMount: simplify
As standard mount.Unmount does what we need, let's use it.

In addition, this adds ignoring "not mounted" condition, which
was previously implemented (see PR#33329, commit cfa2591d3f)
via a very expensive call to mount.Mounted().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-12-10 20:06:10 -08:00
Tibor Vass
6e3113f700
Merge pull request #38327 from andrewhsu/ctrd
update containerd to v1.2.1
2018-12-10 17:28:50 +01:00
Yong Tang
1082d1edf2 go vet fix for TestfillLicense
This small fix renames `TestfillLicense` to `TestFillLicense`
as otherwise go vet reports:
```
$ go tool vet daemon/licensing_test.go
daemon/licensing_test.go:11: TestfillLicense has malformed name: first letter after 'Test' must not be lowercase
```

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-12-09 00:51:37 +00:00
Yong Tang
fa6dabf876 Add zero padding for RFC5424 syslog format
This fix tries to address the issue raised in 38258
where current RFC5424 sys log format does not zero pad
the time (trailing zeros are removed)

This fix apply the patch to fix the issue. This fix fixes 38258.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-12-08 22:40:02 +00:00
Andrew Hsu
78045a5419
use empty string as cgroup path to grab first find
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-07 18:44:00 +01:00
Yong Tang
65d9a5dde5
Merge pull request #38267 from thaJeztah/wrap_errors
Use errors.Wrap() in daemon/config
2018-12-03 08:30:40 -08:00
Aleksa Sarai
f38ac72bca
oci: add integration tests for kernel.domainname configuration
This also includes a few refactors of oci_linux_test.go.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2018-11-30 19:44:50 +11:00
Aleksa Sarai
7417f50575
oci: include the domainname in "kernel.domainname"
The OCI doesn't have a specific field for an NIS domainname[1] (mainly
because FreeBSD and Solaris appear to have a similar concept but it is
configured entirely differently).

However, on Linux, the NIS domainname can be configured through both the
setdomainname(2) syscall but also through the "kernel.domainname"
sysctl. Since the OCI has a way of injecting sysctls this means we don't
need to have any OCI changes to support NIS domainnames (and we can
always switch if the OCI picks up such support in the future).

It should be noted that because we have to generate this each spec
creation we also have to make sure that it's not clobbered by the
HostConfig. I'm pretty sure making this change generic (so that
HostConfig will not clobber any pre-set sysctls) will not cause other
issues to crop up.

[1]: https://github.com/opencontainers/runtime-spec/issues/592

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2018-11-30 17:31:38 +11:00
Sebastiaan van Stijn
a8d2b29e8d
Use errors.Wrap() in daemon/config
using `errors.Wrap()` preserves the original error

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-11-30 01:27:47 +01:00
Sebastiaan van Stijn
813a7da526
Revert "Add limit to page size used by overlay2 driver"
This reverts commit 520034e35b.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-11-29 23:02:18 +01:00
James Watkins-Harvey
a2e384682b Use idtools.LookupGroup instead of parsing /etc/group file for docker.sock ownership
Signed-off-by: James Watkins-Harvey <jwatkins@progi-media.com>
2018-11-29 16:24:42 -05:00
Sebastiaan van Stijn
6f70946a27
Add warning to /info if KernelMemoryTCP is not supported
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-11-27 22:47:39 +01:00
Sebastiaan van Stijn
d3e75e4220
Merge pull request #37043 from yongtang/37038-kernelTCP
Add memory.kernelTCP support for linux
2018-11-27 22:36:10 +01:00
Vincent Demeester
6fa149805c
Merge pull request #37638 from jterry75/devices_windows
Add --device support for Windows
2018-11-27 15:03:17 +01:00
Yong Tang
f023816608 Add memory.kernelTCP support for linux
This fix tries to address the issue raised in 37038 where
there were no memory.kernelTCP support for linux.

This fix add MemoryKernelTCP to HostConfig, and pass
the config to runtime-spec.

Additional test case has been added.

This fix fixes 37038.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-11-26 21:03:08 +00:00
Sebastiaan van Stijn
0b7cb16dde
Merge pull request #38102 from selansen/master
VXLAN UDP Port configuration support
2018-11-24 11:50:10 +01:00
Akihiro Suda
1fea38856a Remove v1.10 migrator
The v1.10 layout and the migrator was added in 2015 via #17924.

Although the migrator is not marked as "deprecated" explicitly in
cli/docs/deprecated.md, I suppose people should have already migrated
from pre-v1.10 and they no longer need the migrator, because pre-v1.10
version do not support schema2 images (and these versions no longer
receives security updates).

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2018-11-24 17:45:13 +09:00
selansen
32180ac0c7 VXLAN UDP Port configuration support
This commit contains changes to configure DataPathPort
option. By default we use 4789 port number. But this commit
will allow user to configure port number during swarm init.
DataPathPort can't be modified after swarm init.
Signed-off-by: selansen <elango.siva@docker.com>
2018-11-22 17:35:02 -05:00
zhangyue
5007c36d71 cli: fix images filter when use multi reference filter
Signed-off-by: zhangyue <zy675793960@yeah.net>
2018-11-22 10:33:45 +08:00
Justin Terry (VM)
b2d99865ea Add --device support for Windows
Implements the --device forwarding for Windows daemons. This maps the physical
device into the container at runtime.

Ex:

docker run --device="class/<clsid>" <image> <cmd>

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
2018-11-21 15:31:17 -08:00
Sebastiaan van Stijn
758255791e
Merge pull request #38177 from mooncak/fix_duplicate
Cleanup duplication in daemon files
2018-11-13 09:55:51 +01:00
mooncake
345d1fd089 Cleanup duplication in daemon files
Signed-off-by: Bily Zhang <xcoder@tenxcloud.com>
2018-11-13 10:42:57 +08:00
Akihiro Suda
596cdffb9f mount: add BindOptions.NonRecursive (API v1.40)
This allows non-recursive bind-mount, i.e. mount(2) with "bind" rather than "rbind".

Swarm-mode will be supported in a separate PR because of mutual vendoring.

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2018-11-06 17:51:58 +09:00
Sebastiaan van Stijn
12bba16306
Merge pull request #38029 from lifubang/checkpointrm
fixes checkpoint rm fail
2018-11-05 19:09:35 +01:00
Tonis Tiigi
489b8eda66 cluster: set bigger grpc limit for array requests
4MB client side limit was introduced in vendoring go-grpc#1165 (v1.4.0)
making these requests likely to produce errors

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-10-30 16:02:34 -07:00
Sebastiaan van Stijn
13ef0ebd2b
Deprecate AuFS storage driver, and add warning
The `aufs` storage driver is deprecated in favor of `overlay2`, and will
be removed in a future release. Users of the `aufs` storage driver are
recommended to migrate to a different storage driver, such as `overlay2`, which
is now the default storage driver.

The `aufs` storage driver facilitates running Docker on distros that have no
support for OverlayFS, such as Ubuntu 14.04 LTS, which originally shipped with
a 3.14 kernel.

Now that Ubuntu 14.04 is no longer a supported distro for Docker, and `overlay2`
is available to all supported distros (as they are either on kernel 4.x, or have
support for multiple lowerdirs backported), there is no reason to continue
maintenance of the `aufs` storage driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-26 18:41:46 +02:00
Sebastiaan van Stijn
1527a67212
Merge pull request #37999 from Microsoft/jjh/tar2vhd
LCOW: ApplyDiff() use tar2ext4, not SVM
2018-10-24 22:35:34 +02:00
Sebastiaan van Stijn
b48bf39a79
Merge pull request #37944 from IRCody/awslogs_error_context
Return more context on awslogs create failure
2018-10-24 21:00:15 +02:00
Anusha Ragunathan
6611ab1c6f
Merge pull request #37986 from samuelkarp/moby/moby-37747
awslogs: account for UTF-8 normalization in limits
2018-10-18 10:17:24 -07:00
Yong Tang
533e07afbe
Merge pull request #38032 from RohitK89/21497-log-image-name
Add IMAGE_NAME attribute to journald log events
2018-10-17 12:18:05 -07:00
Colin Panisset
5cd2bb315a Only add CONTAINER_PARTIAL_MESSAGE if not the last partial
Addresses #38045

Signed-off-by: Colin Panisset <colin.panisset@cevo.com.au>
2018-10-17 07:51:59 +11:00
Cody Roseborough
7a5c813d9c Return more context on awslogs create failure
Signed-off-by: Cody Roseborough <crrosebo@amazon.com>
2018-10-16 11:36:52 -07:00
Yong Tang
ee6fc90b2c
Merge pull request #37993 from kolyshkin/ovr2-index
overlay2: use index=off if possible (fix EBUSY on mount)
2018-10-13 08:28:10 -07:00
Yong Tang
9d4ac4b8d2
Merge pull request #38019 from thaJeztah/skip_deprecated_drivers_in_autoselect
Skip deprecated storage-drivers in auto-selection
2018-10-13 08:26:03 -07:00
Rohit Kapur
5f7e102df7 Add IMAGE_NAME as a key to journald log messages
Signed-off-by: Rohit Kapur <rkapur@flatiron.com>
2018-10-12 16:16:31 -04:00
Vincent Demeester
10ebe6381e
Merge pull request #38025 from thaJeztah/itsy_bitsy_teeny_weeny
Remove redundant nil checks
2018-10-12 18:43:11 +02:00
Lifubang
99a7a4dcd0 checkpoint rm fail
Signed-off-by: Lifubang <lifubang@acmcoder.com>
2018-10-12 19:08:28 +08:00
Kir Kolyshkin
16d822bba8 btrfs: ensure graphdriver home is bind mount
For some reason, shared mount propagation between the host
and a container does not work for btrfs, unless container
root directory (i.e. graphdriver home) is a bind mount.

The above issue was reproduced on SLES 12sp3 + btrfs using
the following script:

	#!/bin/bash
	set -eux -o pipefail

	# DIR should not be under a subvolume
	DIR=${DIR:-/lib}
	MNT=$DIR/my-mnt
	FILE=$MNT/file

	ID=$(docker run -d --privileged -v $DIR:$DIR:rshared ubuntu sleep 24h)
	docker exec $ID mkdir -p $MNT
	docker exec $ID mount -t tmpfs tmpfs $MNT
	docker exec $ID touch $FILE
	ls -l $FILE
	umount $MNT
	docker rm -f $ID

which fails this way:

	+ ls -l /lib/my-mnt/file
	ls: cannot access '/lib/my-mnt/file': No such file or directory

meaning the mount performed inside a priviledged container is not
propagated back to the host (even if all the mounts have "shared"
propagation mode).

The remedy to the above is to make graphdriver home a bind mount.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-10-11 23:45:00 -07:00
Sebastiaan van Stijn
de1094bc95
Remove redundant nil checks
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-11 23:19:01 +02:00
Kir Kolyshkin
8422d85087 overlay2: use index=off if possible
As pointed out in https://github.com/moby/moby/issues/37970,
Docker overlay driver can't work with index=on feature of
the Linux kernel "overlay" filesystem. In case the global
default is set to "yes", Docker will fail with EBUSY when
trying to mount, like this:

> error creating overlay mount to ...../merged: device or resource busy

and the kernel log should contain something like:

> overlayfs: upperdir is in-use by another mount, mount with
> '-o index=off' to override exclusive upperdir protection.

A workaround is to set index=off in overlay kernel module
parameters, or even recompile the kernel with
CONFIG_OVERLAY_FS_INDEX=n in .config. Surely this is not
always practical or even possible.

The solution, as pointed out my Amir Goldstein (as well as
the above kernel message:) is to use 'index=off' option
when mounting.

NOTE since older (< 4.13rc1) kernels do not support "index="
overlayfs parameter, try to figure out whether the option
is supported. In case it's not possible to figure out,
assume it is not.

NOTE the default can be changed anytime (by writing to
/sys/module/overlay/parameters/index) so we need to always
use index=off.

[v2: move the detection code to Init()]
[v3: don't set index=off if stat() failed]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-10-11 12:52:57 -07:00
Kir Kolyshkin
a55d32546a overlay2: use global logger instance
This simplifies the code a lot.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-10-11 12:50:45 -07:00
Sebastiaan van Stijn
b72db8b82c
Skip deprecated storage-drivers in auto-selection
Discourage users from using deprecated storage-drivers
by skipping them when automatically selecting a storage-
driver.

This change does not affect existing installations, because
existing state will take precedence.

Users can still use deprecated drivers by manually configuring
the daemon to use a specific driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-11 15:52:19 +02:00
Sebastiaan van Stijn
31be4e0ba1
Deprecate legacy overlay storage driver, and add warning
The `overlay` storage driver is deprecated in favor of the `overlay2` storage
driver, which has all the benefits of `overlay`, without its limitations (excessive
inode consumption). The legacy `overlay` storage driver will be removed in a future
release. Users of the `overlay` storage driver should migrate to the `overlay2`
storage driver.

The legacy `overlay` storage driver allowed using overlayFS-backed filesystems
on pre 4.x kernels. Now that all supported distributions are able to run `overlay2`
(as they are either on kernel 4.x, or have support for multiple lowerdirs
backported), there is no reason to keep maintaining the `overlay` storage driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-11 15:49:15 +02:00
Sebastiaan van Stijn
06fcabbaa0
Deprecate "devicemapper" storage driver, and add warning
The `devicemapper` storage driver is deprecated in favor of `overlay2`, and will
be removed in a future release. Users of the `devicemapper` storage driver are
recommended to migrate to a different storage driver, such as `overlay2`, which
is now the default storage driver.

The `devicemapper` storage driver facilitates running Docker on older (3.x) kernels
that have no support for other storage drivers (such as overlay2, or AUFS).

Now that support for `overlay2` is added to all supported distros (as they are
either on kernel 4.x, or have support for multiple lowerdirs backported), there
is no reason to continue maintenance of the `devicemapper` storage driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-11 15:46:26 +02:00
Samuel Karp
1e8ef38627 awslogs: account for UTF-8 normalization in limits
The CloudWatch Logs API defines its limits in terms of bytes, but its
inputs in terms of UTF-8 encoded strings.  Byte-sequences which are not
valid UTF-8 encodings are normalized to the Unicode replacement
character U+FFFD, which is a 3-byte sequence in UTF-8.  This replacement
can cause the input to grow, exceeding the API limit and causing failed
API calls.

This commit adds logic for counting the effective byte length after
normalization and splitting input without splitting valid UTF-8
byte-sequences into two invalid byte-sequences.

Fixes https://github.com/moby/moby/issues/37747

Signed-off-by: Samuel Karp <skarp@amazon.com>
2018-10-10 14:45:06 -07:00
Sebastiaan van Stijn
6efa2767d4
Merge pull request #38000 from Microsoft/jjh/processandiot
Windows: Client: Allow process isolation [RS5+]
2018-10-10 19:29:23 +02:00
John Howard
bde9996065 LCOW: ApplyDiff() use tar2ext4, not SVM
Signed-off-by: John Howard <jhoward@microsoft.com>

This removes the need for an SVM in the LCOW driver to ApplyDiff.

This change relates to a fix for https://github.com/moby/moby/issues/36353

However, it found another issue, tracked by https://github.com/moby/moby/issues/37955
2018-10-09 16:10:46 -07:00
Yong Tang
82a4797499
Merge pull request #37988 from mirake/fix-typos
Fix typo: adapater -> adapter
2018-10-09 12:47:18 -07:00
John Howard
c907c2486c Windows:Allow process isolation
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-10-09 11:58:26 -07:00
Yong Tang
2cc338c100
Merge pull request #37967 from thaJeztah/upstream_dos_fix
Fix denial of service with large numbers in cpuset-cpus and cpuset-mems
2018-10-08 13:23:03 +00:00
Sebastiaan van Stijn
fddefa72c4
Merge pull request #37983 from IRCody/tar_id_logging
Add layer id to NaiveDiffDriver untar timing log
2018-10-08 14:17:41 +02:00
Rui Cao
d3e155d926 Fix typo: adapater -> adapter
Signed-off-by: Rui Cao <ruicao@alauda.io>
2018-10-08 19:15:38 +08:00
Vincent Demeester
2bbd0bd8ef
Merge pull request #37802 from Microsoft/jjh/37687-docker-system-df
Fix docker system df when LCOW and WCOW images loaded
2018-10-08 09:26:35 +02:00
Akihiro Suda
b5ed4ebe06
Merge pull request #36537 from Microsoft/jjh/lcow-log-stderr
LCOW: Log stderr on failures to ease diagnosis
2018-10-06 11:05:55 +09:00
Cody Roseborough
3b4df3d146 Add layer id to NaiveDiffDriver untar timing log
Signed-off-by: Cody Roseborough <crrosebo@amazon.com>
2018-10-05 16:28:40 -07:00
mooncake
ea60a87fcf Remove duplicated words in daemon files
Signed-off-by: mooncake <xcoder@tenxcloud.com>
2018-10-06 00:06:38 +08:00
Justin Cormack
f8e876d761
Fix denial of service with large numbers in cpuset-cpus and cpuset-mems
Using a value such as `--cpuset-mems=1-9223372036854775807` would cause
`dockerd` to run out of memory allocating a map of the values in the
validation code. Set limits to the normal limit of the number of CPUs,
and improve the error handling.

Reported by Huawei PSIRT.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-05 15:09:02 +02:00
Vincent Demeester
e3b712152d
Merge pull request #37968 from thaJeztah/no_more_version_mismatch
Remove version-checks for containerd and runc
2018-10-05 12:07:44 +02:00
Sebastiaan van Stijn
c65f0bd13c
Remove version-checks for containerd and runc
With containerd reaching 1.0, the runtime now
has a stable API, so there's no need to do a check
if the installed version matches the expected version.

Current versions of Docker now also package containerd
and runc separately, and can be _updated_ separately.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-10-04 23:17:13 +02:00
Sebastiaan van Stijn
192ff56d87
Merge pull request #37949 from selansen/master
Fix for default-addr-pool-mask-length param max value check
2018-10-04 22:00:42 +02:00
Sebastiaan van Stijn
9c4982685e
Merge pull request #37934 from dani-docker/esc-879
Masking proxy credentials from URL when displayed in system info
2018-10-04 19:37:58 +02:00
selansen
d25c5df80e Fix for default-addr-pool-mask-length param max value check
We check for max value for -default-addr-pool-mask-length param as 32.
But There won't be enough addresses on the  overlay network. Hence we are
keeping it 29 so that we would be having atleast 8 addresses in /29 network.

Signed-off-by: selansen <elango.siva@docker.com>
2018-10-04 00:30:22 -04:00
Sebastiaan van Stijn
f4d74d3802
Merge pull request #37774 from simonferquel/windows-network-plugin-miss-fix
Fix long startup on windows, with non-hns governed Hyper-V networks
2018-10-03 19:26:19 +02:00
Kir Kolyshkin
c378fb774e gd/dm: fix error message
The parameter name was wrong, which may mislead a user.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-10-02 16:19:08 -07:00
Dani Louca
78fd978454 Masking credentials from proxy URL
Signed-off-by: Dani Louca <dani.louca@docker.com>
2018-10-01 14:06:00 -04:00
Brian Goff
299015de40
Merge pull request #37888 from lifubang/renameimprove
oldName release too early when docker rename
2018-10-01 09:00:09 -07:00
Sebastiaan van Stijn
deac65c929
Merge pull request #37850 from AkihiroSuda/propagate-exec-root-to-libnetwork
daemon: propagate exec-root to libnetwork-setkey
2018-09-28 15:20:37 +02:00
John Howard
63f9c7784b LCOW: Log stderr on failures
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-09-26 13:23:04 -07:00
Yong Tang
7bfec8cd80
Merge pull request #37400 from olljanat/34795-allow-npipe
Allow mount type npipe on service/stack
2018-09-26 09:54:42 -07:00
Yong Tang
472a52861c
Merge pull request #37907 from tiborvass/remove-docker-prefix-containerd
Remove 'docker-' prefix for containerd and runc binaries
2018-09-26 02:52:31 -07:00
Vincent Demeester
9f296d1e6f
Merge pull request #37701 from dperny/add-swarmkit-sysctl-support
Add support for sysctl options in services
2018-09-26 09:06:22 +02:00
Brian Goff
12d5eb8e22
Merge pull request #37703 from kolyshkin/rm-dead-code
daemon/setMounts(): remove dead code
2018-09-25 16:07:15 -07:00
Tibor Vass
34eede0296 Remove 'docker-' prefix for containerd and runc binaries
This allows to run the daemon in environments that have upstream containerd installed.

Signed-off-by: Tibor Vass <tibor@docker.com>
2018-09-24 21:49:03 +00:00
Sebastiaan van Stijn
2a63a5f7a5
Merge pull request #37891 from mirake/fix-typos
Fix some typos
2018-09-24 17:22:34 +02:00
Tibor Vass
4a776d0ca7 builder: use buildkit's GC for build cache
This allows users to configure the buildkit GC.

The following enables the default GC:
```
{
  "builder": {
    "gc": {
      "enabled": true
    }
  }
}
```

The default GC policy has a simple config:
```
{
  "builder": {
    "gc": {
      "enabled": true,
      "defaultKeepStorage": "30GB"
    }
  }
}
```

A custom GC policy can be used instead by specifying a list of cache prune rules:
```
{
  "builder": {
    "gc": {
      "enabled": true,
      "policy": [
        {"keepStorage": "512MB", "filter": ["unused-for=1400h"]]},
        {"keepStorage": "30GB", "all": true}
      ]
    }
  }
}
```

Signed-off-by: Tibor Vass <tibor@docker.com>
2018-09-21 22:06:00 +00:00
Tõnis Tiigi
b1116479b2
Merge pull request #37852 from AntaresS/patch-buildkit
add support for "registry-mirrors" and "insecure-registries" to buildkit
2018-09-21 09:42:35 -07:00
Lifubang
73cf7dfe17 docker rename enhancement
Signed-off-by: Lifubang <lifubang@acmcoder.com>
2018-09-21 09:43:06 +08:00
Anda Xu
171d51c861 add support of registry-mirrors and insecure-registries to buildkit
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-09-20 11:53:02 -07:00
Drew Erny
14da20f5e7 Add support for sysctl options in services
Adds support for sysctl options in docker services.

* Adds API plumbing for creating services with sysctl options set.
* Adds swagger.yaml documentation for new API field.
* Updates the API version history document.
* Changes executor package to make use of the Sysctls field on objects
* Includes integration test to verify that new behavior works.

Essentially, everything needed to support the equivalent of docker run's
`--sysctl` option except the CLI.

Includes a vendoring of swarmkit for proto changes to support the new
behavior.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2018-09-20 10:51:56 -05:00
Rui Cao
3f02d91ef8 Fix some typos
Signed-off-by: Rui Cao <ruicao@alauda.io>
2018-09-20 20:00:35 +08:00
Sebastiaan van Stijn
dc26e1e7b7
Merge pull request #37871 from AntaresS/fix-config-conflicts
fix daemon won't start bug caused by daemon.json and cli flags duplications
2018-09-19 20:19:54 +02:00
Sebastiaan van Stijn
d6a7c22f7b
Merge pull request #37861 from TinySong/fix-typo
fix typos in service.go and plugin.go
2018-09-18 12:48:37 +02:00
Vincent Demeester
8efb908581
Merge pull request #37847 from thaJeztah/more_permissive_daeon_conf_dir
Loosen permissions on /etc/docker directory
2018-09-18 08:49:38 +02:00
song
c80e20f93f fix typos in service.go and plugin.go
Signed-off-by: song <tinysong1226@gmail.com>
Signed-off-by: Rongxiang Song <tinysong1226@gmail.com>
2018-09-18 10:48:39 +08:00
Anda Xu
8392d0930b fixed the dockerd won't start bug when 'runtimes' field is defined in both daemon config file and cli flags
Signed-off-by: Anda Xu <anda.xu@docker.com>
2018-09-17 16:12:04 -07:00
Olli Janatuinen
83d9b9e4d9 Allow mount type npipe on Windows
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2018-09-16 06:57:38 +00:00
Akihiro Suda
40385208cb daemon: propagate exec-root to libnetwork-setkey
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2018-09-15 13:49:30 +09:00
Tibor Vass
5aa222d0fe daemon/images: removed "found leaked image layer" warning, because it is expected now with buildkit
Signed-off-by: Tibor Vass <tibor@docker.com>
2018-09-15 01:14:00 +00:00
Sebastiaan van Stijn
9299561bd3
Merge pull request #37736 from selansen/master
Global Default AddressPool - Update
2018-09-14 18:47:42 +02:00
Sebastiaan van Stijn
cecd981717
Loosen permissions on /etc/docker directory
The `/etc/docker` directory is used both by the dockerd daemon
and the docker cli (if installed on the saem host as the daemon).

In situations where the `/etc/docker` directory does not exist,
and an initial `key.json` (legacy trust key) is generated (at the
default location), the `/etc/docker/` directory was created with
0700 permissions, making the directory only accessible by `root`.

Given that the `0600` permissions on the key itself already protect
it from being used by other users, the permissions of `/etc/docker`
can be less restrictive.

This patch changes the permissions for the directory to `0755`, so
that the CLI (if executed as non-root) can also access this directory.

> **NOTE**: "strictly", this patch is only needed for situations where no _custom_
> location for the trustkey is specified (not overridden with `--deprecated-key-path`),
> but setting the permissions only for the "default" case would make
> this more complicated.

```bash
make binary shell

make install

ls -la /etc/ | grep docker

dockerd
^C

ls -la /etc/ | grep docker
drwxr-xr-x 2 root root    4096 Sep 14 12:11 docker
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-09-14 14:14:39 +02:00
selansen
148ff00a0a Global Default AddressPool - Update
Addressing few review comments as part of code refactoring.
Also moved validation logic from CLI to Moby.

Signed-off-by: selansen <elango.siva@docker.com>
2018-09-11 19:02:54 -04:00
Kir Kolyshkin
b2b169f13f daemon/logger/journald: simplify readers field
As in other similar drivers (jsonlog, local), use a set
(i.e. `map[whatever]struct{}`), making the code simpler.

While at it, make sure we remove the reader from the set
after calling `ProducerGone()` on it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-10 16:17:05 -07:00
John Howard
98380b1791 Fix docker system df when LCOW and WCOW images loaded
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-09-10 10:56:56 -07:00
Sebastiaan van Stijn
77faf158f5
Merge pull request #37576 from kolyshkin/logs-f-leak
daemon.ContainerLogs(): fix resource leak on follow
2018-09-10 14:26:43 +02:00
Yong Tang
703a04ebc9
Merge pull request #37712 from Microsoft/jjh/detach
Windows: Try to detach sandbox on cleanup to avoid permission denied
2018-09-09 16:50:27 -07:00
Sebastiaan van Stijn
3d9adede13
Merge pull request #37782 from jianliao82/patch-1
fix a couple of typo
2018-09-08 09:44:00 +02:00
jliao
7427fe12d8 fix typo
fix typo

Signed-off-by: jian liao <jliao@alauda.io>
2018-09-08 08:13:30 +08:00
Sebastiaan van Stijn
e33ea4fbde
Merge pull request #37794 from Lihua93/fixtypos
Fix typos in comment
2018-09-07 16:46:46 +02:00
Doug Davis
9c3c3537ec
Merge pull request #37796 from mirake/fix-typo
Typo fix: retore -> restore
2018-09-07 06:24:35 -04:00
ruicao
1ca3ea121e Typo fix: retore -> restore
Signed-off-by: ruicao <ruicao@alauda.io>
2018-09-07 13:55:31 +08:00
Lihua Tang
8df0b2de54 Fix typos in comment
Signed-off-by: Lihua Tang <lhtang@alauda.io>
2018-09-07 13:17:42 +08:00
Kir Kolyshkin
9b0097a699 Format code with gofmt -s from go-1.11beta1
This should eliminate a bunch of new (go-1.11 related) validation
errors telling that the code is not formatted with `gofmt -s`.

No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).

Patch generated with:

> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 15:24:16 -07:00
John Howard
efdad53744 Windows: DetachVhd attempt in cleanup
Signed-off-by: John Howard <jhoward@microsoft.com>

This is a fix for a few related scenarios where it's impossible to remove layers or containers
until the host is rebooted. Generally (or at least easiest to repro) through a forced daemon kill
while a container is running.

Possibly slightly worse than that, as following a host reboot, the scratch layer would possibly be leaked and
left on disk under the dataroot\windowsfilter directory after the container is removed.

One such example of a failure:

1. run a long running container with the --rm flag
docker run --rm -d --name test microsoft/windowsservercore powershell sleep 30
2. Force kill the daemon not allowing it to cleanup. Simulates a crash or a host power-cycle.
3. (re-)Start daemon
4. docker ps -a
PS C:\control> docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                PORTS               NAMES
7aff773d782b        malloc              "powershell start-sl…"   11 seconds ago      Removal In Progress                       malloc
5. Try to remove
PS C:\control> docker rm 7aff
Error response from daemon: container 7aff773d782bbf35d95095369ffcb170b7b8f0e6f8f65d5aff42abf61234855d: driver "windowsfilter" failed to remove root filesystem: rename C:\control\windowsfilter\7aff773d782bbf35d95095369ffcb170b7b8f0e6f8f65d5aff42abf61234855d C:\control\windowsfilter\7aff773d782bbf35d95095369ffcb170b7b8f0e6f8f65d5aff42abf61234855d-removing: Access is denied.
PS C:\control>

Step 5 fails.
2018-09-06 13:17:50 -07:00
Kir Kolyshkin
f845d76d04 TestFollowLogsProducerGone: add
This should test that
 - all the messages produced are delivered (i.e. not lost)
 - followLogs() exits

Loosely based on the test having the same name by Brian Goff, see
https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:48:37 -07:00
Kir Kolyshkin
916eabd459 daemon.ContainerLogs(): fix resource leak on follow
When daemon.ContainerLogs() is called with options.follow=true
(as in "docker logs --follow"), the "loggerutils.followLogs()"
function never returns (even then the logs consumer is gone).
As a result, all the resources associated with it (including
an opened file descriptor for the log file being read, two FDs
for a pipe, and two FDs for inotify watch) are never released.

If this is repeated (such as by running "docker logs --follow"
and pressing Ctrl-C a few times), this results in DoS caused by
either hitting the limit of inotify watches, or the limit of
opened files. The only cure is daemon restart.

Apparently, what happens is:

1. logs producer (a container) is gone, calling (*LogWatcher).Close()
for all its readers (daemon/logger/jsonfilelog/jsonfilelog.go:175).

2. WatchClose() is properly handled by a dedicated goroutine in
followLogs(), cancelling the context.

3. Upon receiving the ctx.Done(), the code in followLogs()
(daemon/logger/loggerutils/logfile.go#L626-L638) keeps to
send messages _synchronously_ (which is OK for now).

4. Logs consumer is gone (Ctrl-C is pressed on a terminal running
"docker logs --follow"). Method (*LogWatcher).Close() is properly
called (see daemon/logs.go:114). Since it was called before and
due to to once.Do(), nothing happens (which is kinda good, as
otherwise it will panic on closing a closed channel).

5. A goroutine (see item 3 above) keeps sending log messages
synchronously to the logWatcher.Msg channel. Since the
channel reader is gone, the channel send operation blocks forever,
and resource cleanup set up in defer statements at the beginning
of followLogs() never happens.

Alas, the fix is somewhat complicated:

1. Distinguish between close from logs producer and logs consumer.
To that effect,
 - yet another channel is added to LogWatcher();
 - {Watch,}Close() are renamed to {Watch,}ProducerGone();
 - {Watch,}ConsumerGone() are added;

*NOTE* that ProducerGone()/WatchProducerGone() pair is ONLY needed
in order to stop ConsumerLogs(follow=true) when a container is stopped;
otherwise we're not interested in it. In other words, we're only
using it in followLogs().

2. Code that was doing (logWatcher*).Close() is modified to either call
ProducerGone() or ConsumerGone(), depending on the context.

3. Code that was waiting for WatchClose() is modified to wait for
either ConsumerGone() or ProducerGone(), or both, depending on the
context.

4. followLogs() are modified accordingly:
 - context cancellation is happening on WatchProducerGone(),
and once it's received the FileWatcher is closed and waitRead()
returns errDone on EOF (i.e. log rotation handling logic is disabled);
 - due to this, code that was writing synchronously to logWatcher.Msg
can be and is removed as the code above it handles this case;
 - function returns once ConsumerGone is received, freeing all the
resources -- this is the bugfix itself.

While at it,

1. Let's also remove the ctx usage to simplify the code a bit.
It was introduced by commit a69a59ffc7 ("Decouple removing the
fileWatcher from reading") in order to fix a bug. The bug was actually
a deadlock in fsnotify, and the fix was just a workaround. Since then
the fsnofify bug has been fixed, and a new fsnotify was vendored in.
For more details, please see
https://github.com/moby/moby/pull/27782#issuecomment-416794490

2. Since `(*filePoller).Close()` is fixed to remove all the files
being watched, there is no need to explicitly call
fileWatcher.Remove(name) anymore, so get rid of the extra code.

Should fix https://github.com/moby/moby/issues/37391

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:47:42 -07:00
Brian Goff
d37a11bfba daemon/logger/loggerutils: add TestFollowLogsClose
This test case checks that followLogs() exits once the reader is gone.
Currently it does not (i.e. this test is supposed to fail) due to #37391.

[kolyshkin@: test case Brian Goff, changelog and all bugs are by me]
Source: https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:46:34 -07:00
Kir Kolyshkin
2e4c2a6bf9 daemon.ContainerLogs: minor debug logging cleanup
This code has many return statements, for some of them the
"end logs" or "end stream" message was not printed, giving
the impression that this "for" loop never ended.

Make sure that "begin logs" is to be followed by "end logs".

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:45:50 -07:00
Sebastiaan van Stijn
7129bebe0a
Merge pull request #37665 from kolyshkin/dev-init
Fix docker --init with /dev bind mount
2018-09-06 13:16:10 +01:00