Moby works well when you have a good and stable internet connection. When operating
in areas where internet connectivity is likely to be lost at undetermined intervals,
such as over a satellite connection or 4G/LTE in rural areas, pulling a new image
can become a problem. When the connection is lost while image layers are being
pulled, Moby will try to reconnect up to 5 times. If this fails, the incompletely
downloaded layers are lost and will need to be downloaded again in full during the
next pull request. This means that we are using more data than we might have to.
Pulling a layer multiple times from the start can become costly over a satellite
or 4G/LTE connection. As these technologies (especially 4G) are quite common in IoT,
and Moby is used to run Azure IoT Edge devices, I would like to add a settable
maximum number of download attempts. The maximum is currently a constant set to 5
(distribution/xfer/download.go). I would like to change this constant into a variable
that the user can set. The default will still be 5, so nothing changes from the
current behaviour unless it is specified when starting the daemon, either with the
added flag or in the config file.
I added a default value of 5 for DefaultMaxDownloadAttempts and a settable
max-download-attempts option in the daemon config file. It is also added to the
config of dockerd so it can be set with a flag when starting the daemon. This value
is stored in the imageService of the daemon when it is initialized and is passed
to NewLayerDownloadManager as a parameter, which stores it in the
LayerDownloadManager. This lets us set the maximum number of retries in
makeDownloadFunc equal to the configured maximum download attempts.
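The flow is roughly as sketched below; this is a minimal, self-contained illustration of the idea, and the identifiers (other than the ones named above) are not the actual distribution/xfer ones.
```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// downloadManager stands in for the LayerDownloadManager; the real one carries
// much more state. maxDownloadAttempts replaces the former hard-coded constant of 5.
type downloadManager struct {
	maxDownloadAttempts int
}

// download retries a single layer download up to maxDownloadAttempts times.
func (m *downloadManager) download(doAttempt func() error) error {
	var err error
	for attempt := 1; attempt <= m.maxDownloadAttempts; attempt++ {
		if err = doAttempt(); err == nil {
			return nil
		}
		time.Sleep(time.Second) // back off before the next attempt
	}
	return fmt.Errorf("download failed after %d attempts: %w", m.maxDownloadAttempts, err)
}

func main() {
	// The value would come from --max-download-attempts or daemon.json.
	m := &downloadManager{maxDownloadAttempts: 3}
	err := m.download(func() error { return errors.New("connection lost") })
	fmt.Println(err)
}
```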
I also added some tests, based on the maxConcurrentDownloads/maxConcurrentUploads tests.
You can pull this version and test it in a development container. Either create a config
file `/etc/docker/daemon.json` containing `{"max-download-attempts": 3}`, or use
`dockerd --max-download-attempts=3 -D &` to start dockerd. Start pulling an image and
disconnect from the internet while the download is in progress. The result is that it
stops pulling after three attempts.
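For reference, a minimal `/etc/docker/daemon.json` using the value from the example above would be:
```
{
    "max-download-attempts": 3
}
```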
Signed-off-by: Lukas Heeren <lukas-heeren@hotmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also renamed the non-Windows variant of this file to be
consistent with other files in this package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fix was added in 8e71b1e210 to work around
a Go issue (https://github.com/golang/go/issues/20506).
That issue was fixed in
66c03d39f3,
which is part of Go 1.10 and up. This reverts the changes that were made in
8e71b1e210, which are no longer needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows our tests, which all share a containerd instance, to be a
bit more isolated by setting the containerd namespaces to the generated
daemon IDs rather than the default namespaces.
This came about because I found that, in some cases, we had test daemons
failing to start (really, very slow to start) because they were (seemingly)
processing events from other tests.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This reverts commit 98fc09128b in order to
keep registry v2 schema1 handling and libtrust-key-based engine ID.
Because registry v2 schema1 was not officially deprecated and
registries are still relying on it, this patch puts its logic back.
However, registry v1 relics are not added back, since the v1 logic was
removed a while ago.
This also fixes an engine upgrade issue in a swarm cluster. Swarm relies
on the Engine ID staying the same across upgrades, but the mentioned commit
changed the logic to use a UUID, stored in a different file.
Given that the libtrust key is still needed to support v2 schema1 pushes,
that the old engine ID is based on the libtrust key, and that the engine ID
needs to be preserved across upgrades, adding UUID-based engine ID logic
seems to add more complexity than the problems it solves.
Hence the engine ID changes are reverted as well.
Signed-off-by: Tibor Vass <tibor@docker.com>
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
https://github.com/docker/swarmkit/issues/2770).
The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.
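As an illustration of the Linux side, a minimal sketch of reading a version from `os-release` could look like the following (the field choice and helper name are illustrative, not the exact implementation):
```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// osVersionFromOSRelease extracts VERSION_ID from an os-release style file.
func osVersionFromOSRelease(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	for s.Scan() {
		line := s.Text()
		if strings.HasPrefix(line, "VERSION_ID=") {
			// Values may be quoted, e.g. VERSION_ID="18.04".
			return strings.Trim(strings.TrimPrefix(line, "VERSION_ID="), `"`), nil
		}
	}
	return "", s.Err()
}

func main() {
	v, err := osVersionFromOSRelease("/etc/os-release")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("OS version:", v)
}
```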
Added unit tests when possible, as well as Prometheus metrics.
Signed-off-by: Jean Rouge <rougej+github@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model on Linux, this adds the option to enable it for the runtime.
It does not switch the graphdriver to containerd snapshotters.
- Refactors libcontainerd into a series of subpackages so that either a
"local" containerd (1) or a "remote" (2) containerd can be loaded, as opposed
to conditionally compiling it as "local" for Windows and "remote" for Linux.
- Updates libcontainerd so that Windows has the option to use a
"remote" containerd. Here, it communicates over a named pipe using GRPC.
This is currently guarded behind the experimental flag, an environment variable,
and providing a pipe name to connect to containerd.
- Adds infrastructure pieces, such as helper functions under pkg/system, for
determining whether containerd is being used.
(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.
(2) "remote" containerd is what docker on Linux uses for its runtime. It means
that there is a separate containerd service running, and docker communicates with
it over GRPC.
To try this out, you will need to start with something like the following:
Window 1:
containerd --log-level debug
Window 2:
$env:DOCKER_WINDOWS_CONTAINERD=1
dockerd --experimental -D --containerd \\.\pipe\containerd-containerd
You will need the following binary from github.com/containerd/containerd in your path:
- containerd.exe
You will need the following binaries from github.com/Microsoft/hcsshim in your path:
- runhcs.exe
- containerd-shim-runhcs-v1.exe
For LCOW, it requires an initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different from the current requirements. However, you may need updated binaries,
particularly an initrd.img built from Microsoft/opengcs, as (at the time of writing) the Linuxkit
binaries are somewhat out of date.
Note that containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. This will come in time - this is a baby (although large)
step toward migrating Docker on Windows to containerd.
Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
HCS v1 APIs, as the v2 APIs were not developed far enough on these builds to be usable.
This abstraction is done in HCSShim (referring specifically to the runtime).
Note that the LCOW graphdriver still uses HCS v1 APIs regardless.
Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
As people are using the ID in `docker info`, which was based on the v1 manifest signing key,
replace it with a UUID instead.
Remove deprecated `--disable-legacy-registry` option that was scheduled to be removed in 18.03.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Please refer to `docs/rootless.md`.
TLDR:
* Make sure `/etc/subuid` and `/etc/subgid` contain an entry for your user
* `dockerd-rootless.sh --experimental`
* `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`
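For example, the entries have the usual `user:start:count` shape (the username and ranges below are placeholders):
```
$ cat /etc/subuid
penguin:100000:65536
$ cat /etc/subgid
penguin:100000:65536
```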
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
`time.After` keeps a timer running until the specified duration has
elapsed. It also allocates a new timer on each call. This can wind up
leaving lots of unnecessary timers running in the background that are
not needed and consume resources.
Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal, since the duration is really
short, but in others it is much worse.
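A minimal sketch of the pattern (not one of the exact call sites touched here):
```go
package main

import (
	"fmt"
	"time"
)

func waitWithTimeout(done <-chan struct{}, timeout time.Duration) error {
	// time.NewTimer instead of time.After: the timer can be stopped when the
	// work finishes first, instead of lingering until it fires.
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-done:
		return nil
	case <-timer.C:
		return fmt.Errorf("timed out after %v", timeout)
	}
}

func main() {
	done := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(done)
	}()
	fmt.Println(waitWithTimeout(done, time.Second)) // <nil>
}
```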
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fixes the following go vet issues:
```
daemon/daemon.go:273: loop variable id captured by func literal
daemon/daemon.go:280: loop variable id captured by func literal
```
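The usual fix for this class of warning is to re-declare the loop variable (or pass it as an argument) so each closure captures its own copy; a minimal illustration, not the daemon's actual code:
```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	ids := []string{"a", "b", "c"}
	var wg sync.WaitGroup
	for _, id := range ids {
		id := id // shadow the loop variable so each goroutine gets its own copy
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println(id)
		}()
	}
	wg.Wait()
}
```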
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Many startup tasks have to run for each container, and thus using a
WaitGroup (which doesn't have a limit to the number of parallel tasks)
can result in Docker exceeding the NOFILE limit quite trivially. A more
optimal solution is to have a parallelism limit by using a semaphore.
In addition, several startup tasks were not parallelised previously
which resulted in very long startup times. According to my testing, 20K
dead containers resulted in ~6 minute startup times (during which time
Docker is completely unusable).
This patch fixes both issues, and the parallelStartupTimes factor chosen
(128 * NumCPU) is based on my own significant testing of the 20K
container case. This patch (on my machines) reduces the startup time
from 6 minutes to less than a minute (ideally this could be further
reduced by removing the need to scan all dead containers on startup --
but that's beyond the scope of this patchset).
In order to avoid the NOFILE limit problem, we also detect this
on-startup and if NOFILE < 2*128*NumCPU we will reduce the parallelism
factor to avoid hitting NOFILE limits (but also emit a warning since
this is almost certainly a mis-configuration).
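A minimal sketch of the pattern described above, using a buffered channel as the counting semaphore (the real patch may use a different primitive, and the per-container restore work is elided):
```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	containers := make([]int, 500) // stand-ins for the containers to restore

	limit := 128 * runtime.NumCPU() // parallelism factor mentioned above
	sem := make(chan struct{}, limit)

	var wg sync.WaitGroup
	for i := range containers {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot; blocks once `limit` tasks are in flight
			defer func() { <-sem }() // release the slot when done
			_ = i                    // per-container startup work would happen here
		}(i)
	}
	wg.Wait()
	fmt.Println("all startup tasks finished")
}
```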
Signed-off-by: Aleksa Sarai <asarai@suse.de>
The v1.10 layout and the migrator were added in 2015 via #17924.
Although the migrator is not explicitly marked as "deprecated" in
cli/docs/deprecated.md, I suppose people should have already migrated
from pre-v1.10 and no longer need the migrator, because pre-v1.10
versions do not support schema2 images (and those versions no longer
receive security updates).
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This should eliminate a bunch of new (go-1.11 related) validation
errors complaining that the code is not formatted with `gofmt -s`.
No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).
Patch generated with:
> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Adds a supervisor package for starting and monitoring containerd.
Separates the grpc connection, allowing access from the daemon.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Handle the case of systemd-resolved and, if it is in use,
use a different resolv.conf source.
Set the option appropriately on libnetwork.
Move unix-specific code to container_operation_unix.
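A hedged sketch of the detection idea (the paths are the standard systemd-resolved ones; the actual libnetwork logic differs in detail):
```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolvConfPath returns the resolv.conf to use as the source for containers.
// When /etc/resolv.conf only points at the systemd-resolved stub resolver
// (127.0.0.53), the upstream servers live in /run/systemd/resolve/resolv.conf.
func resolvConfPath() string {
	data, err := os.ReadFile("/etc/resolv.conf")
	if err == nil && strings.Contains(string(data), "127.0.0.53") {
		if _, err := os.Stat("/run/systemd/resolve/resolv.conf"); err == nil {
			return "/run/systemd/resolve/resolv.conf"
		}
	}
	return "/etc/resolv.conf"
}

func main() {
	fmt.Println("using", resolvConfPath())
}
```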
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In particular, these two:
> daemon/daemon_unix.go:1129: Wrapf format %v reads arg #1, but call has 0 args
> daemon/kill.go:111: Warn call has possible formatting directive %s
and a few more.
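The actual calls are `errors.Wrapf` and `logrus.Warn`, but the standard-library analogue below shows the same two classes of fix (supply the missing argument, or switch to the formatting variant):
```go
package main

import (
	"fmt"
	"log"
)

func main() {
	id := "abc123"

	// "call has possible formatting directive %s": a non-formatting call was
	// given a format string; use the *f variant instead.
	log.Printf("unable to kill %s", id) // was effectively: log.Print("unable to kill %s", ...)

	// "format %v reads arg #1, but call has 0 args": the verb had no matching
	// argument; supply it (or drop the verb).
	err := fmt.Errorf("container %v is not running", id) // was effectively missing the argument
	fmt.Println(err)
}
```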
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Instead of using a global store for volume drivers, scope the driver
store to the caller (e.g. the volume store). This makes testing much
simpler.
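A minimal sketch of the shape of the change (identifiers are illustrative, not the volume package's actual ones):
```go
package main

import "fmt"

// driverStore stands in for the volume driver registry that used to be a
// package-level global.
type driverStore struct {
	drivers map[string]struct{}
}

func newDriverStore() *driverStore {
	return &driverStore{drivers: make(map[string]struct{})}
}

// volumeStore now receives its driver store from the caller, so tests can
// construct an isolated one instead of sharing global state.
type volumeStore struct {
	drivers *driverStore
}

func newVolumeStore(ds *driverStore) *volumeStore {
	return &volumeStore{drivers: ds}
}

func main() {
	ds := newDriverStore()
	vs := newVolumeStore(ds)
	fmt.Println(len(vs.drivers.drivers)) // 0: a fresh, caller-scoped store
}
```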
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since commit e9b9e4ace2 has landed, there is a chance that
container.RWLayer is nil (due to some half-removed container). Let's
check the pointer before use to avoid a potential nil pointer
dereference, which would result in a daemon crash.
Note that even without the above-mentioned commit, it's better to perform
an extra check (even if it's totally redundant) than to risk
a daemon crash. In other words, better safe than sorry.
[v2: add a test case for daemon.getInspectData]
[v3: add a check for container.Dead and a special error for the case]
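A minimal, self-contained illustration of the check (types and messages are stand-ins, not the daemon's):
```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative stand-ins; not the daemon's actual types.
type container struct {
	ID      string
	Dead    bool
	RWLayer *struct{}
}

func getInspectData(c *container) error {
	// Guard against a half-removed container whose RWLayer is already gone,
	// instead of dereferencing a nil pointer and crashing the daemon.
	if c.RWLayer == nil {
		if c.Dead {
			return fmt.Errorf("container %s is marked for removal and cannot be inspected", c.ID)
		}
		return errors.New("RWLayer is unexpectedly nil")
	}
	return nil
}

func main() {
	fmt.Println(getInspectData(&container{ID: "abc", Dead: true}))
}
```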
Fixes: e9b9e4ace2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Attachable networks are networks created on the cluster which can then
be attached to by non-swarm containers. These networks are lazily
created on the node that wants to attach to that network.
When no container is currently attached to one of these networks on a
node, and then multiple containers which want that network are started
concurrently, this can cause a race condition in the network attachment
where essentially we try to attach the same network to the node twice.
To easily reproduce this issue you must use a multi-node cluster with a
worker node that has lots of CPUs (I used a 36 CPU node).
Repro steps:
1. On manager, `docker network create -d overlay --attachable test`
2. On worker, `docker create --restart=always --network test busybox
top`, many times... 200 is a good number (but not much more due to
subnet size restrictions)
3. Restart the daemon
When the daemon restarts, it will attempt to start all those containers
simultaneously. Note that you could try to do this yourself over the API,
but it's harder to trigger due to the added latency from going over
the API.
The error produced happens when the daemon tries to start the container
upon allocating the network resources:
```
attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
```
What happens here is the worker makes a network attachment request to
the manager. This is an async call which in the happy case would cause a
task to be placed on the node, which the worker is waiting for to get
the network configuration.
In the case of this race, the error occurs on the manager like this:
```
task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e
```
The task is not created and the worker times out waiting for the task.
---
The mitigation for this is to make sure that only one attachment request
is in flight for a given network at a time *when the network doesn't
already exist on the node*. If the network already exists on the node
there is no need for synchronization because the network is already
allocated and on the node so there is no need to request it from the
manager.
This basically comes down to a race with `Find(network) ||
Create(network)` without any sort of synchronization.
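A hedged sketch of the kind of per-network serialization this implies (identifiers are illustrative; the daemon's actual implementation differs):
```go
package main

import (
	"fmt"
	"sync"
)

// attacher serializes lazy network creation per network ID so that concurrent
// container starts do not send duplicate attachment requests to the manager.
type attacher struct {
	mu       sync.Mutex
	attachMu map[string]*sync.Mutex // one lock per network being attached
	networks map[string]bool        // networks already present on the node
}

func (a *attacher) ensureNetwork(id string, create func() error) error {
	a.mu.Lock()
	if a.networks[id] {
		a.mu.Unlock()
		return nil // already on the node: no synchronization needed
	}
	l, ok := a.attachMu[id]
	if !ok {
		l = &sync.Mutex{}
		a.attachMu[id] = l
	}
	a.mu.Unlock()

	l.Lock() // only one attachment request in flight per network
	defer l.Unlock()

	a.mu.Lock()
	already := a.networks[id]
	a.mu.Unlock()
	if already {
		return nil // another caller attached it while we waited
	}
	if err := create(); err != nil {
		return err
	}
	a.mu.Lock()
	a.networks[id] = true
	a.mu.Unlock()
	return nil
}

func main() {
	a := &attacher{attachMu: map[string]*sync.Mutex{}, networks: map[string]bool{}}
	var created int
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = a.ensureNetwork("test", func() error { created++; return nil })
		}()
	}
	wg.Wait()
	fmt.Println("create called", created, "time(s)") // 1
}
```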
Signed-off-by: Brian Goff <cpuguy83@gmail.com>