WithBlock makes sure that subsequent containerd requests are reliable.
In one edge case, under high load pressure, the kernel kills dockerd,
containerd and the containerd-shims because of OOM. When both dockerd and
containerd restart, containerd takes some time to recover all the existing
containers, and until containerd is serving, dockerd's requests fail with
gRPC errors. Worse, the restore action still ignores any non-NotFound
errors and reports a running state for containers that have already
stopped. That is unexpected behavior, and dockerd has to be restarted to
get everything back to a consistent state, which is painful.
Adding WithBlock prevents this edge case. In the common case, containerd
is serving shortly after startup, so there is no harm in adding WithBlock
to the containerd connection.
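As a rough sketch (not the exact moby code), a blocking dial with a timeout could look like the following; the socket address and the timeout value are assumptions:
```
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
)

func main() {
	// Block until the connection to containerd is actually established
	// (or the timeout expires), so that requests issued right after
	// dialing do not fail while containerd is still recovering.
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()
	conn, err := grpc.DialContext(ctx, "unix:///run/containerd/containerd.sock",
		grpc.WithInsecure(), // local socket, no TLS (assumption)
		grpc.WithBlock(),
	)
	if err != nil {
		log.Fatalf("containerd is not serving yet: %v", err)
	}
	defer conn.Close()
}
```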
Signed-off-by: Wei Fu <fuweid89@gmail.com>
(cherry picked from commit 9f73396dab)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
This allows our tests, which all share a containerd instance, to be a
bit more isolated by setting the containerd namespaces to the generated
daemon IDs rather than the default namespaces.
This came about because I found that in some cases we had test daemons
failing to start (really, very slow to start) because they were (seemingly)
processing events from other tests.
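For illustration only, scoping a containerd client context to a per-daemon namespace could look roughly like this (the namespace value is a hypothetical daemon ID):
```
package main

import (
	"context"
	"fmt"

	"github.com/containerd/containerd/namespaces"
)

func main() {
	// Use the generated daemon ID as the containerd namespace so that this
	// daemon's containers and events are isolated from other test daemons.
	ctx := namespaces.WithNamespace(context.Background(), "dabcdef123456") // hypothetical daemon ID
	if ns, ok := namespaces.Namespace(ctx); ok {
		fmt.Println("containerd namespace:", ns)
	}
}
```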
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 24ad2f486d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This reverts commit 98fc09128b in order to
keep registry v2 schema1 handling and libtrust-key-based engine ID.
Because registry v2 schema1 was not officially deprecated and
registries are still relying on it, this patch puts its logic back.
However, registry v1 relics are not added back, since the v1 logic was
removed a while ago.
This also fixes an engine upgrade issue in a swarm cluster: the cluster
relies on the Engine ID staying the same across upgrades, but the
mentioned commit changed the logic to use a UUID read from a different file.
Since the libtrust key is still needed to support v2 schema1 pushes, the
old engine ID is derived from that libtrust key, and the engine ID needs
to be preserved across upgrades, adding UUID-based engine ID logic adds
more complexity than it solves.
Hence reverting the engine ID changes as well.
Signed-off-by: Tibor Vass <tibor@docker.com>
(cherry picked from commit f695e98cb7)
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model on Linux, this adds the option to enable containerd as the runtime.
It does not switch the graphdriver to containerd snapshotters.
- Refactors libcontainerd into a series of subpackages so that either a
"local" containerd (1) or a "remote" (2) containerd can be loaded, as opposed
to being conditionally compiled as "local" for Windows and "remote" for Linux.
- Updates libcontainerd so that Windows has the option to use a
"remote" containerd, communicating over a named pipe using gRPC.
This is currently guarded behind the experimental flag, an environment variable,
and providing a pipe name to connect to containerd.
- Adds infrastructure pieces, such as helper functions under pkg/system, for
determining whether containerd is being used.
(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.
(2) "remote" containerd is what docker on Linux uses for it's runtime. It means
that there is a separate containerd service running, and docker communicates over
GRPC to it.
To try this out, you will need to start with something like the following:
Window 1:
containerd --log-level debug
Window 2:
$env:DOCKER_WINDOWS_CONTAINERD=1
dockerd --experimental -D --containerd \\.\pipe\containerd-containerd
You will need the following binary from github.com/containerd/containerd in your path:
- containerd.exe
You will need the following binaries from github.com/Microsoft/hcsshim in your path:
- runhcs.exe
- containerd-shim-runhcs-v1.exe
For LCOW, it requires an initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different from the current requirements. However, you may need updated binaries,
particularly an initrd.img built from Microsoft/opengcs, as (at the time of writing) the
LinuxKit binaries are somewhat out of date.
Note that containerd and hcsshim for the HCS v2 APIs do not yet support all the
functionality needed for docker. This will come in time - this is a baby (although large)
step toward migrating Docker on Windows to containerd.
Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
the HCS v1 APIs, as the v2 APIs were not sufficiently developed on those builds to be usable.
This abstraction is done in hcsshim (referring specifically to the runtime).
Note the LCOW graphdriver still uses HCS v1 APIs regardless.
Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
As people are using the ID shown in `docker info`, which was based on the v1 manifest
signing key, replace it with a UUID instead.
Remove the deprecated `--disable-legacy-registry` option, which was scheduled to be removed in 18.03.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Please refer to `docs/rootless.md`.
TLDR:
* Make sure `/etc/subuid` and `/etc/subgid` contain an entry for your user (see the example below)
* `dockerd-rootless.sh --experimental`
* `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`
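For example, the subordinate ID entries typically look like this (the username and ID ranges below are placeholders):
```
$ cat /etc/subuid
testuser:231072:65536
$ cat /etc/subgid
testuser:231072:65536
```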
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This fix addresses the following go vet issues:
```
daemon/daemon.go:273: loop variable id captured by func literal
daemon/daemon.go:280: loop variable id captured by func literal
```
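The general shape of the issue and its fix is sketched below (simplified, not the actual daemon code):
```
package main

import (
	"fmt"
	"sync"
)

func main() {
	ids := []string{"a", "b", "c"}
	var wg sync.WaitGroup

	// Buggy (pre-Go 1.22): every goroutine captures the same loop variable.
	// for _, id := range ids {
	// 	go func() { fmt.Println(id) }()
	// }

	// Fixed: pass the loop variable as an argument so each goroutine
	// gets its own copy.
	for _, id := range ids {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			fmt.Println(id)
		}(id)
	}
	wg.Wait()
}
```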
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Many startup tasks have to run for each container, and thus using a
WaitGroup (which doesn't have a limit to the number of parallel tasks)
can result in Docker exceeding the NOFILE limit quite trivially. A better
solution is to impose a parallelism limit by using a semaphore.
In addition, several startup tasks were not parallelised previously
which resulted in very long startup times. According to my testing, 20K
dead containers resulted in ~6 minute startup times (during which time
Docker is completely unusable).
This patch fixes both issues, and the parallelStartupTimes factor chosen
(128 * NumCPU) is based on my own significant testing of the 20K
container case. This patch (on my machines) reduces the startup time
from 6 minutes to less than a minute (ideally this could be further
reduced by removing the need to scan all dead containers on startup --
but that's beyond the scope of this patchset).
In order to avoid the NOFILE limit problem, we also detect this
on-startup and if NOFILE < 2*128*NumCPU we will reduce the parallelism
factor to avoid hitting NOFILE limits (but also emit a warning since
this is almost certainly a mis-configuration).
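A minimal sketch of such a semaphore-based limit, using golang.org/x/sync/semaphore (the limit value and task body here are simplified assumptions, not the actual daemon code):
```
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"

	"golang.org/x/sync/semaphore"
)

func main() {
	limit := int64(128 * runtime.NumCPU())
	sem := semaphore.NewWeighted(limit)
	var wg sync.WaitGroup

	containers := []string{"c1", "c2", "c3"} // placeholder container IDs
	for _, id := range containers {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			// Block here if `limit` tasks are already running, keeping the
			// number of parallel restores (and open files) bounded.
			if err := sem.Acquire(context.Background(), 1); err != nil {
				return
			}
			defer sem.Release(1)
			fmt.Println("restoring", id)
		}(id)
	}
	wg.Wait()
}
```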
Signed-off-by: Aleksa Sarai <asarai@suse.de>
The v1.10 layout and the migrator was added in 2015 via #17924.
Although the migrator is not marked as "deprecated" explicitly in
cli/docs/deprecated.md, I suppose people should have already migrated
from pre-v1.10 and no longer need the migrator, because pre-v1.10
versions do not support schema2 images (and those versions no longer
receive security updates).
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This should eliminate a bunch of new (go-1.11 related) validation
errors reporting that the code is not formatted with `gofmt -s`.
No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).
Patch generated with:
> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Adds a supervisor package for starting and monitoring containerd.
Separates the gRPC connection, allowing access from the daemon.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Handle the case of systemd-resolved and, if it is in use,
read resolv.conf from a different source.
Set the corresponding option on libnetwork appropriately.
Move Unix-specific code to container_operation_unix.
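A rough sketch of the detection (the paths are the conventional systemd-resolved ones; the exact moby logic may differ):
```
package main

import (
	"fmt"
	"os"
)

// resolvConfPath returns the resolv.conf file to use as the source for
// container DNS configuration. When systemd-resolved is in place,
// /etc/resolv.conf only points at the 127.0.0.53 stub resolver, so the
// real upstream servers are read from /run/systemd/resolve/resolv.conf.
func resolvConfPath() string {
	const resolvedPath = "/run/systemd/resolve/resolv.conf"
	if _, err := os.Stat(resolvedPath); err == nil {
		return resolvedPath
	}
	return "/etc/resolv.conf"
}

func main() {
	fmt.Println("using", resolvConfPath())
}
```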
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In particular, these two:
> daemon/daemon_unix.go:1129: Wrapf format %v reads arg #1, but call has 0 args
> daemon/kill.go:111: Warn call has possible formatting directive %s
and a few more.
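For reference, the shape of those fixes is roughly the following (illustrative call sites, not the actual daemon code):
```
package main

import (
	"errors"
	"fmt"

	pkgerrors "github.com/pkg/errors"
	"github.com/sirupsen/logrus"
)

func example(err error, id string) error {
	// Before: pkgerrors.Wrapf(err, "mount failed: %v")   // %v has no argument
	// After: drop the directive (or supply the missing argument):
	wrapped := pkgerrors.Wrap(err, "mount failed")

	// Before: logrus.Warn("failed to kill container %s", id) // Warn does not format
	// After: use the formatting variant:
	logrus.Warnf("failed to kill container %s", id)

	return wrapped
}

func main() {
	fmt.Println(example(errors.New("boom"), "abc123"))
}
```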
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Instead of using a global store for volume drivers, scope the driver
store to the caller (e.g. the volume store). This makes testing much
simpler.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since commit e9b9e4ace2 landed, there is a chance that
container.RWLayer is nil (due to some half-removed container). Let's
check the pointer before use to avoid any potential nil pointer
dereference, which would result in a daemon crash.
Note that even without the abovementioned commit, it's better to perform
an extra check (even if it's totally redundant) than to risk a daemon
crash. In other words, better safe than sorry.
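The defensive check is along these lines (a simplified sketch with stand-in types, not the literal daemon code):
```
package main

import "fmt"

// Minimal stand-ins for the real types, just to show the shape of the check.
type RWLayer struct{}

func (l *RWLayer) Size() (int64, error) { return 0, nil }

type Container struct {
	ID      string
	RWLayer *RWLayer
}

func getSize(c *Container) (int64, error) {
	if c.RWLayer == nil {
		// Half-removed or dead container: return a clear error instead of
		// letting a nil dereference crash the daemon.
		return 0, fmt.Errorf("container %s: RW layer is unavailable", c.ID)
	}
	return c.RWLayer.Size()
}

func main() {
	if _, err := getSize(&Container{ID: "abc123"}); err != nil {
		fmt.Println(err)
	}
}
```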
[v2: add a test case for daemon.getInspectData]
[v3: add a check for container.Dead and a special error for the case]
Fixes: e9b9e4ace2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Attachable networks are networks created on the cluster which can then
be attached to by non-swarm containers. These networks are lazily
created on the node that wants to attach to that network.
When no container is currently attached to one of these networks on a
node, and then multiple containers which want that network are started
concurrently, this can cause a race condition in the network attachment
where essentially we try to attach the same network to the node twice.
To easily reproduce this issue you must use a multi-node cluster with a
worker node that has lots of CPUs (I used a 36 CPU node).
Repro steps:
1. On manager, `docker network create -d overlay --attachable test`
2. On worker, `docker create --restart=always --network test busybox
top`, many times... 200 is a good number (but not much more due to
subnet size restrictions)
3. Restart the daemon
When the daemon restarts, it will attempt to start all those containers
simultaneously. Note that you could try to do this yourself over the API,
but it's harder to trigger due to the added latency from going over
the API.
The error is produced when the daemon tries to start the container,
upon allocating the network resources:
```
attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
```
What happens here is the worker makes a network attachment request to
the manager. This is an async call which, in the happy case, causes a
task to be placed on the node; the worker waits for that task to get
the network configuration.
In the case of this race, the error occurs on the manager like this:
```
task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e
```
The task is not created and the worker times out waiting for the task.
---
The mitigation for this is to make sure that only one attachment request
is in flight for a given network at a time *when the network doesn't
already exist on the node*. If the network already exists on the node,
there is no need for synchronization, because the network is already
allocated on the node and there is no need to request it from the
manager.
This basically comes down to a race with `Find(network) ||
Create(network)` without any sort of synchronization.
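A minimal sketch of that per-network synchronization (the type and method names here are illustrative, not the actual moby functions):
```
package main

import (
	"fmt"
	"sync"
)

type attacher struct {
	mu       sync.Mutex
	inFlight map[string]*sync.Mutex // one lock per network being attached
}

// attach ensures only one attachment request per network is in flight.
// If the network already exists locally, no manager round-trip is needed.
func (a *attacher) attach(network string) error {
	a.mu.Lock()
	lock, ok := a.inFlight[network]
	if !ok {
		lock = &sync.Mutex{}
		a.inFlight[network] = lock
	}
	a.mu.Unlock()

	lock.Lock()
	defer lock.Unlock()

	if a.findLocal(network) {
		return nil // already allocated on this node
	}
	return a.requestFromManager(network) // placeholder for the real async call
}

func (a *attacher) findLocal(string) bool           { return false }
func (a *attacher) requestFromManager(string) error { return nil }

func main() {
	a := &attacher{inFlight: map[string]*sync.Mutex{}}
	fmt.Println(a.attach("test"))
}
```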
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
This re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
When the containerd 1.0 runtime changes were made, we inadvertently
removed the functionality where any running containers are killed on
startup when not using live-restore.
This change restores that behavior.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>