0ct0pu5/moby

Author	SHA1	Message	Date
Brian Goff	eaad3ee3cf	Make sure timers are stopped after use. `time.After` keeps a timer running until the specified duration is completed. It also allocates a new timer on each call. This can wind up leaving lots of unnecessary timers running in the background that are not needed and consume resources. Instead of `time.After`, use `time.NewTimer` so the timer can actually be stopped. In some of these cases it's not a big deal since the duration is really short, but in others it is much worse. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2019-01-16 14:32:53 -08:00
Yong Tang	7315a2bb11	Fix go vet issue in daemon/daemon.go This fix fixes go vet issue: ``` daemon/daemon.go:273: loop variable id captured by func literal daemon/daemon.go:280: loop variable id captured by func literal ``` Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2019-01-06 00:18:29 +00:00
Akihiro Suda	2cb26cfe9c	Merge pull request #38301 from cyphar/waitgroup-limits daemon: switch to semaphore-gated WaitGroup for startup tasks	2018-12-22 00:07:55 +09:00
Aleksa Sarai	5a52917e4d	daemon: switch to semaphore-gated WaitGroup for startup tasks Many startup tasks have to run for each container, and thus using a WaitGroup (which doesn't have a limit to the number of parallel tasks) can result in Docker exceeding the NOFILE limit quite trivially. A more optimal solution is to have a parallelism limit by using a semaphore. In addition, several startup tasks were not parallelised previously which resulted in very long startup times. According to my testing, 20K dead containers resulted in ~6 minute startup times (during which time Docker is completely unusable). This patch fixes both issues, and the parallelStartupTimes factor chosen (128 * NumCPU) is based on my own significant testing of the 20K container case. This patch (on my machines) reduces the startup time from 6 minutes to less than a minute (ideally this could be further reduced by removing the need to scan all dead containers on startup -- but that's beyond the scope of this patchset). In order to avoid the NOFILE limit problem, we also detect this on-startup and if NOFILE < 2 * 128 * NumCPU we will reduce the parallelism factor to avoid hitting NOFILE limits (but also emit a warning since this is almost certainly a mis-configuration). Signed-off-by: Aleksa Sarai <asarai@suse.de>	2018-12-21 21:51:02 +11:00
Akihiro Suda	1fea38856a	Remove v1.10 migrator The v1.10 layout and the migrator were added in 2015 via #17924. Although the migrator is not marked as "deprecated" explicitly in cli/docs/deprecated.md, I suppose people should have already migrated from pre-v1.10 and they no longer need the migrator, because pre-v1.10 versions do not support schema2 images (and these versions no longer receive security updates). Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2018-11-24 17:45:13 +09:00
Anda Xu	171d51c861	add support of registry-mirrors and insecure-registries to buildkit Signed-off-by: Anda Xu <anda.xu@docker.com>	2018-09-20 11:53:02 -07:00
Kir Kolyshkin	9b0097a699	Format code with gofmt -s from go-1.11beta1 This should eliminate a bunch of new (go-1.11 related) validation errors telling that the code is not formatted with `gofmt -s`. No functional change, just whitespace (i.e. `git show --ignore-space-change` shows nothing). Patch generated with: > git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-09-06 15:24:16 -07:00
Anda Xu	58a75cebdd	allow features option live reloadable Signed-off-by: Anda Xu <anda.xu@docker.com>	2018-08-31 12:43:04 -07:00
John Howard	5accd82634	Add containerd.WithTimeout(60*time.Second) to match old calls Signed-off-by: John Howard <jhoward@microsoft.com>	2018-08-23 12:03:43 -07:00
John Stephens	b3e9f7b13b	Merge pull request #35521 from salah-khan/35507 Add --chown flag support for ADD/COPY commands for Windows	2018-08-17 11:31:16 -07:00
Salahuddin Khan	763d839261	Add ADD/COPY --chown flag support to Windows This implements chown support on Windows. Built-in accounts as well as accounts included in the SAM database of the container are supported. NOTE: IDPair is now named Identity and IDMappings is now named IdentityMapping. The following are valid examples: ADD --chown=Guest . <some directory> COPY --chown=Administrator . <some directory> COPY --chown=Guests . <some directory> COPY --chown=ContainerUser . <some directory> On Windows an owner is only granted the permission to read the security descriptor and read/write the discretionary access control list. This fix also grants read/write and execute permissions to the owner. Signed-off-by: Salahuddin Khan <salah@docker.com>	2018-08-13 21:59:11 -07:00
Derek McGowan	dd2e19ebd5	libcontainerd: split client and supervisor Adds a supervisor package for starting and monitoring containerd. Separates grpc connection allowing access from daemon. Signed-off-by: Derek McGowan <derek@mcgstyle.net>	2018-08-06 10:23:04 -07:00
Flavio Crisciani	e353e7e3f0	Fixes for resolv.conf Handle the case of systemd-resolved, and if in place use a different resolv.conf source. Set appropriately the option on libnetwork. Move unix specific code to container_operation_unix Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>	2018-07-26 11:17:56 -07:00
Sebastiaan van Stijn	3737194b9f	daemon/*.go: fix some Wrap[f]/Warn[f] errors In particular, these two: > daemon/daemon_unix.go:1129: Wrapf format %v reads arg #1, but call has 0 args > daemon/kill.go:111: Warn call has possible formatting directive %s and a few more. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2018-07-11 15:51:51 +02:00
Tonis Tiigi	157b0b30db	builder: lint fixes Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>	2018-06-10 10:05:29 -07:00
Tonis Tiigi	ea36c3cbaf	daemon: access to distribution internals Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>	2018-06-10 10:05:26 -07:00
Brian Goff	e4b6adc88e	Extract volume interaction to a volumes service This cleans up some of the package API's used for interacting with volumes, and simplifies management. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-05-25 14:21:07 -04:00
Brian Goff	82d9185470	Merge pull request #36396 from selansen/master Allow user to specify default address pools for docker networks	2018-05-03 06:34:14 -04:00
Alessandro Boch	173b3c364e	Allow user to control the default address pools - Via daemon flag --default-address-pools base=<CIDR>,size=<int> Signed-off-by: Elango Siva <elango@docker.com>	2018-04-30 11:14:08 -04:00
Antonio Murdaca	75d3214934	restartmanager: do not apply restart policy on created containers Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2018-04-24 11:41:09 +02:00
Brian Goff	977109d808	Remove use of global volume driver store Instead of using a global store for volume drivers, scope the driver store to the caller (e.g. the volume store). This makes testing much simpler. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-04-17 14:07:08 -04:00
Brian Goff	0023abbad3	Remove old/unneeded volume migration from vers 1.7 Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-04-17 14:06:53 -04:00
Daniel Nephin	2b1a2b10af	Move ImageService to new package Signed-off-by: Daniel Nephin <dnephin@docker.com>	2018-02-26 16:49:37 -05:00
Daniel Nephin	0dab53ff3c	Move all daemon image methods into imageService imageService provides the backend for the image API and handles the imageStore, and referenceStore. Signed-off-by: Daniel Nephin <dnephin@docker.com>	2018-02-26 16:48:29 -05:00
Tibor Vass	747c163a65	Merge pull request #36303 from dnephin/cleanup-in-daemon-unix Cleanup unnecessary and duplicate functions in `daemon_unix.go`	2018-02-16 14:55:18 -08:00
Brian Goff	b0b9a25e7e	Move log validator logic after plugins are loaded This ensures that all log plugins are registered when the log validator is run. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-02-15 11:53:11 -05:00
Daniel Nephin	c502bcff33	Remove unnecessary getLayerInit Signed-off-by: Daniel Nephin <dnephin@docker.com>	2018-02-14 11:59:10 -05:00
Kir Kolyshkin	195893d381	c.RWLayer: check for nil before use Since commit `e9b9e4ace2` has landed, there is a chance that container.RWLayer is nil (due to some half-removed container). Let's check the pointer before use to avoid any potential nil pointer dereferences, resulting in a daemon crash. Note that even without the abovementioned commit, it's better to perform an extra check (even if it's totally redundant) rather than to have a possibility of a daemon crash. In other words, better be safe than sorry. [v2: add a test case for daemon.getInspectData] [v3: add a check for container.Dead and a special error for the case] Fixes: `e9b9e4ace2` Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-02-09 11:24:09 -08:00
Daniel Nephin	4f0d95fa6e	Add canonical import comment Signed-off-by: Daniel Nephin <dnephin@docker.com>	2018-02-05 16:51:57 -05:00
Brian Goff	c379d2681f	Fix race in attachable network attachment Attachable networks are networks created on the cluster which can then be attached to by non-swarm containers. These networks are lazily created on the node that wants to attach to that network. When no container is currently attached to one of these networks on a node, and then multiple containers which want that network are started concurrently, this can cause a race condition in the network attachment where essentially we try to attach the same network to the node twice. To easily reproduce this issue you must use a multi-node cluster with a worker node that has lots of CPUs (I used a 36 CPU node). Repro steps: 1. On manager, `docker network create -d overlay --attachable test` 2. On worker, `docker create --restart=always --network test busybox top`, many times... 200 is a good number (but not much more due to subnet size restrictions) 3. Restart the daemon When the daemon restarts, it will attempt to start all those containers simultaneously. Note that you could try to do this yourself over the API, but it's harder to trigger due to the added latency from going over the API. The error produced happens when the daemon tries to start the container upon allocating the network resources: ``` attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded ``` What happens here is the worker makes a network attachment request to the manager. This is an async call which in the happy case would cause a task to be placed on the node, which the worker is waiting for to get the network configuration. 
In the case of this race, the error occurs on the manager like this: ``` task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e ``` The task is not created and the worker times out waiting for the task. --- The mitigation for this is to make sure that only one attachment request is in flight for a given network at a time when the network doesn't already exist on the node. If the network already exists on the node there is no need for synchronization because the network is already allocated and on the node so there is no need to request it from the manager. This basically comes down to a race with `Find(network) || Create(network)` without any sort of synchronization. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-02-02 13:46:23 -05:00
Allen Sun	de68ac8393	Simplify codes on calculating shutdown timeout Signed-off-by: Allen Sun <shlallen1990@gmail.com> Signed-off-by: Vincent Demeester <vincent@sbr.pm>	2018-01-26 09:18:07 -08:00
John Howard	0cba7740d4	Address feedback from Tonis Signed-off-by: John Howard <jhoward@microsoft.com>	2018-01-18 12:30:39 -08:00
John Howard	afd305c4b5	LCOW: Refactor to multiple layer-stores based on feedback Signed-off-by: John Howard <jhoward@microsoft.com>	2018-01-18 08:31:05 -08:00
John Howard	ce8e529e18	LCOW: Re-coalesce stores Signed-off-by: John Howard <jhoward@microsoft.com> This re-coalesces the daemon stores which were split as part of the original LCOW implementation. This is part of the work discussed in https://github.com/moby/moby/issues/34617, in particular see the document linked to in that issue.	2018-01-18 08:29:19 -08:00
Yong Tang	c36274da83	Merge pull request #35638 from cpuguy83/error_helpers2 Add helpers to create errdef errors	2018-01-15 10:56:46 -08:00
Sebastiaan van Stijn	b4a6313969	Golint: remove redundant ifs Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2018-01-15 00:42:25 +01:00
Brian Goff	d453fe35b9	Move api/errdefs to errdefs Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-01-11 21:21:43 -05:00
Victor Vieux	745278d242	Merge pull request #35812 from stevvooe/follow-conventions daemon, plugin: follow containerd namespace conventions	2017-12-19 15:55:39 -08:00
Brian Goff	e69127bd5b	Ensure containers are stopped on daemon startup When the containerd 1.0 runtime changes were made, we inadvertently removed the functionality where any running containers are killed on startup when not using live-restore. This change restores that behavior. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2017-12-18 14:33:45 -05:00
Stephen J Day	521e7eba86	daemon, plugin: follow containerd namespace conventions Follow the conventions for namespace naming set out by other projects, such as linuxkit and cri-containerd. Typically, they are some sort of host name, with a subdomain describing functionality of the namespace. In the case of linuxkit, services are launched in `services.linuxkit`. In cri-containerd, pods are launched in `k8s.io`, making it clear that these are from kubernetes. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-12-15 17:20:42 -08:00
Kir Kolyshkin	516010e92d	Simplify/fix MkdirAll usage This subtle bug keeps lurking in because error checking for `Mkdir()` and `MkdirAll()` is slightly different wrt `EEXIST`/`IsExist`: - for `Mkdir()`, `IsExist` error should (usually) be ignored (unless you want to make sure directory was not there before) as it means "the destination directory was already there" - for `MkdirAll()`, `IsExist` error should NEVER be ignored. Mostly, this commit just removes ignoring the IsExist error, as it should not be ignored. Also, there are a couple of cases when IsExist is handled as "directory already exist" which is wrong. As a result, some code that never worked as intended is now removed. NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()` rather than `os.Mkdir()` -- so its description is amended accordingly, and its usage is handled as such (i.e. IsExist error is not ignored). For more details, a quote from my runc commit 6f82d4b (July 2015): TL;DR: check for IsExist(err) after a failed MkdirAll() is both redundant and wrong -- so two reasons to remove it. Quoting MkdirAll documentation: > MkdirAll creates a directory named path, along with any necessary > parents, and returns nil, or else returns an error. If path > is already a directory, MkdirAll does nothing and returns nil. This means two things: 1. If a directory to be created already exists, no error is returned. 2. If the error returned is IsExist (EEXIST), it means there exists a non-directory with the same name as MkdirAll needs to use for the directory. Example: we want to MkdirAll("a/b"), but file "a" (or "a/b") already exists, so MkdirAll fails. The above is a theory, based on quoted documentation and my UNIX knowledge. 3. In practice, though, current MkdirAll implementation [1] returns ENOTDIR in most of the cases described in #2, with the exception when there is a race between MkdirAll and someone else creating the last component of MkdirAll argument as a file. 
In this very case MkdirAll() will indeed return EEXIST. Because of #1, IsExist check after MkdirAll is not needed. Because of #2 and #3, ignoring IsExist error is just plain wrong, as directory we require is not created. It's cleaner to report the error now. Note this error is all over the tree, I guess due to copy-paste, or trying to follow the same usage pattern as for Mkdir(), or some not quite correct examples on the Internet. [1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2017-11-27 17:32:12 -08:00
Darren Stahl	ed74ee127f	Increase container default shutdown timeout on Windows The shutdown timeout for containers is insufficient on Windows. If the daemon is shutting down, and a container takes longer than expected to shut down, this can cause the container to remain in a bad state after restart, and never be able to start again. Increasing the timeout makes this less likely to occur. Signed-off-by: Darren Stahl <darst@microsoft.com>	2017-10-23 10:31:31 -07:00
Brian Goff	402540708c	Merge pull request #34895 from mlaventure/containerd-1.0-client Containerd 1.0 client	2017-10-23 10:38:03 -04:00
Yong Tang	ab0eb8fcf6	Merge pull request #35077 from ryansimmen/35076-WindowsDaemonTmpDir Windows Daemon should respect DOCKER_TMPDIR	2017-10-20 08:40:43 -07:00
Kenfe-Mickael Laventure	ddae20c032	Update libcontainerd to use containerd 1.0 Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>	2017-10-20 07:11:37 -07:00
Ryan Simmen	5611f127a7	Windows Daemon should respect DOCKER_TMPDIR Signed-off-by: Ryan Simmen <ryan.simmen@gmail.com>	2017-10-19 10:47:46 -04:00
John Howard	0380fbff37	LCOW: API: Add platform to /images/create and /build Signed-off-by: John Howard <jhoward@microsoft.com> This PR has the API changes described in https://github.com/moby/moby/issues/34617. Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded OCI Image-spec `Platform` structure. In addition, it renames (almost all) uses of a string variable platform (and associated) methods/functions to os. This makes it much clearer to disambiguate with the swarm "platform" which is really os/arch. This is a stepping stone to getting the daemon towards fully multi-platform/arch-aware, and makes it clear when "operating system" is being referred to rather than "platform" which is misleadingly used - sometimes in the swarm meaning, but more often as just the operating system.	2017-10-06 11:44:18 -07:00
Pradip Dhara	d00a07b1e6	Updating moby to correspond to naming convention used in https://github.com/docker/swarmkit/pull/2385 Signed-off-by: Pradip Dhara <pradipd@microsoft.com>	2017-09-26 22:08:10 +00:00
Victor Vieux	a971f9c9d7	Merge pull request #34911 from dnephin/new-ci-entrypoint Add a new entrypoint for CI	2017-09-26 11:50:44 -07:00
Sebastiaan van Stijn	2b50b14aeb	Suppress warning for renaming missing tmp directory When starting `dockerd` on a host that has no `/var/lib/docker/tmp` directory, a warning was printed in the logs: $ dockerd --data-root=/no-such-directory ... WARN[2017-09-26T09:37:00.045153377Z] failed to rename /no-such-directory/tmp for background deletion: rename /no-such-directory/tmp /no-such-directory/tmp-old: no such file or directory. Deleting synchronously Although harmless, the warning does not show any useful information, so can be skipped. This patch checks the type of error, so that warning is not printed. Other errors will still show up: $ touch /i-am-a-file $ dockerd --data-root=/i-am-a-file Unable to get the full path to root (/i-am-a-file): canonical path points to a file '/i-am-a-file' Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2017-09-26 12:04:30 +02:00


922 commits