Moby works well when you have a good and stable internet connection. When operating
in areas where internet connectivity is likely to be lost at undetermined intervals,
such as over a satellite connection or 4G/LTE in rural areas, pulling a new image
can become a problem. When the connection is lost while image layers are being
pulled, Moby will try to reconnect up to 5 times. If this fails, the incompletely
downloaded layers are lost and will need to be downloaded again in full during the
next pull request. This means that we are using more data than we might have to.
Pulling a layer multiple times from the start can become costly over a satellite
or 4G/LTE connection. As these technologies (especially 4G) are quite common in IoT,
and Moby is used to run Azure IoT Edge devices, I would like to add a settable
maximum number of download attempts. The maximum is currently a constant set to 5
(distribution/xfer/download.go). I would like to change this constant into a variable
that the user can set. The default will still be 5, so nothing changes from the
current behaviour unless it is specified when starting the daemon, either with the
added flag or in the config file.
I added a default value of 5 for DefaultMaxDownloadAttempts and a settable
max-download-attempts option in the daemon config file. It is also added to the
config of dockerd so it can be set with a flag when starting the daemon. This value
is stored in the imageService of the daemon when it is initialized and is passed
to NewLayerDownloadManager as a parameter, which stores it in the
LayerDownloadManager. This lets us set the maximum number of retries in
makeDownloadFunc equal to the configured maximum download attempts.
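The flow is roughly as sketched below; this is a minimal, self-contained illustration of the idea, and the identifiers (other than the ones named above) are not the actual distribution/xfer ones.
```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// downloadManager stands in for the LayerDownloadManager; the real one carries
// much more state. maxDownloadAttempts replaces the former hard-coded constant of 5.
type downloadManager struct {
	maxDownloadAttempts int
}

// download retries a single layer download up to maxDownloadAttempts times.
func (m *downloadManager) download(doAttempt func() error) error {
	var err error
	for attempt := 1; attempt <= m.maxDownloadAttempts; attempt++ {
		if err = doAttempt(); err == nil {
			return nil
		}
		time.Sleep(time.Second) // back off before the next attempt
	}
	return fmt.Errorf("download failed after %d attempts: %w", m.maxDownloadAttempts, err)
}

func main() {
	// The value would come from --max-download-attempts or daemon.json.
	m := &downloadManager{maxDownloadAttempts: 3}
	err := m.download(func() error { return errors.New("connection lost") })
	fmt.Println(err)
}
```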
I also added some tests, based on the maxConcurrentDownloads/maxConcurrentUploads tests.
You can pull this version and test it in a development container. Either create a config
file `/etc/docker/daemon.json` containing `{"max-download-attempts": 3}`, or use
`dockerd --max-download-attempts=3 -D &` to start dockerd. Start pulling an image and
disconnect from the internet while the download is in progress. The result is that it
stops pulling after three attempts.
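For reference, a minimal `/etc/docker/daemon.json` using the value from the example above would be:
```
{
    "max-download-attempts": 3
}
```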
Signed-off-by: Lukas Heeren <lukas-heeren@hotmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Also renamed the non-Windows variant of this file to be
consistent with other files in this package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This fix was added in 8e71b1e210 to work around
a Go issue (https://github.com/golang/go/issues/20506).
That issue was fixed in
66c03d39f3,
which is part of Go 1.10 and up. This reverts the changes that were made in
8e71b1e210, which are no longer needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows our tests, which all share a containerd instance, to be a
bit more isolated by setting the containerd namespaces to the generated
daemon IDs rather than the default namespaces.
This came about because I found that, in some cases, we had test daemons
failing to start (really, very slow to start) because they were (seemingly)
processing events from other tests.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This reverts commit 98fc09128b in order to
keep registry v2 schema1 handling and libtrust-key-based engine ID.
Because registry v2 schema1 was not officially deprecated and
registries are still relying on it, this patch puts its logic back.
However, registry v1 relics are not added back, since the v1 logic was
removed a while ago.
This also fixes an engine upgrade issue in a swarm cluster. Swarm relies
on the Engine ID staying the same across upgrades, but the mentioned commit
changed the logic to use a UUID, stored in a different file.
Given that the libtrust key is still needed to support v2 schema1 pushes,
that the old engine ID is based on the libtrust key, and that the engine ID
needs to be preserved across upgrades, adding UUID-based engine ID logic
seems to add more complexity than the problems it solves.
Hence the engine ID changes are reverted as well.
Signed-off-by: Tibor Vass <tibor@docker.com>
This is needed so that we can add OS version constraints in Swarmkit, which
does require the engine to report its host's OS version (see
https://github.com/docker/swarmkit/issues/2770).
The OS version is parsed from the `os-release` file on Linux, and from the
`ReleaseId` string value of the `SOFTWARE\Microsoft\Windows NT\CurrentVersion`
registry key on Windows.
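As an illustration of the Linux side, a minimal sketch of reading a version from `os-release` could look like the following (the field choice and helper name are illustrative, not the exact implementation):
```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// osVersionFromOSRelease extracts VERSION_ID from an os-release style file.
func osVersionFromOSRelease(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	for s.Scan() {
		line := s.Text()
		if strings.HasPrefix(line, "VERSION_ID=") {
			// Values may be quoted, e.g. VERSION_ID="18.04".
			return strings.Trim(strings.TrimPrefix(line, "VERSION_ID="), `"`), nil
		}
	}
	return "", s.Err()
}

func main() {
	v, err := osVersionFromOSRelease("/etc/os-release")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("OS version:", v)
}
```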
Added unit tests when possible, as well as Prometheus metrics.
Signed-off-by: Jean Rouge <rougej+github@gmail.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
This is the first step in refactoring moby (dockerd) to use containerd on Windows.
Similar to the current model on Linux, this adds the option to enable it for the runtime.
It does not switch the graphdriver to containerd snapshotters.
- Refactors libcontainerd into a series of subpackages so that either a
"local" containerd (1) or a "remote" (2) containerd can be loaded, as opposed
to conditionally compiling it as "local" for Windows and "remote" for Linux.
- Updates libcontainerd so that Windows has the option to use a
"remote" containerd. Here, it communicates over a named pipe using GRPC.
This is currently guarded behind the experimental flag, an environment variable,
and providing a pipe name to connect to containerd.
- Adds infrastructure pieces, such as helper functions under pkg/system, for
determining whether containerd is being used.
(1) "local" containerd is what the daemon on Windows has used since inception.
It's not really containerd at all - it's simply local invocation of HCS APIs
directly in-process from the daemon through the Microsoft/hcsshim library.
(2) "remote" containerd is what docker on Linux uses for its runtime. It means
that there is a separate containerd service running, and docker communicates with
it over GRPC.
To try this out, you will need to start with something like the following:
Window 1:
containerd --log-level debug
Window 2:
$env:DOCKER_WINDOWS_CONTAINERD=1
dockerd --experimental -D --containerd \\.\pipe\containerd-containerd
You will need the following binary from github.com/containerd/containerd in your path:
- containerd.exe
You will need the following binaries from github.com/Microsoft/hcsshim in your path:
- runhcs.exe
- containerd-shim-runhcs-v1.exe
For LCOW, it requires an initrd.img and kernel in `C:\Program Files\Linux Containers`.
This is no different from the current requirements. However, you may need updated binaries,
particularly an initrd.img built from Microsoft/opengcs, as (at the time of writing) the Linuxkit
binaries are somewhat out of date.
Note that containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. This will come in time - this is a baby (although large)
step toward migrating Docker on Windows to containerd.
Note that the HCS v2 APIs are only called on RS5+ builds. RS1..RS4 will still use
HCS v1 APIs, as the v2 APIs were not developed far enough on these builds to be usable.
This abstraction is done in HCSShim (referring specifically to the runtime).
Note that the LCOW graphdriver still uses HCS v1 APIs regardless.
Note also that this does not migrate docker to use containerd snapshotters
rather than graphdrivers. This needs to be done in conjunction with Linux also
doing the same switch.
As people are using the ID in `docker info`, which was based on the v1 manifest signing key,
replace it with a UUID instead.
Remove deprecated `--disable-legacy-registry` option that was scheduled to be removed in 18.03.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Please refer to `docs/rootless.md`.
TLDR:
* Make sure `/etc/subuid` and `/etc/subgid` contain an entry for your user
* `dockerd-rootless.sh --experimental`
* `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`
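For example, the entries have the usual `user:start:count` shape (the username and ranges below are placeholders):
```
$ cat /etc/subuid
penguin:100000:65536
$ cat /etc/subgid
penguin:100000:65536
```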
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
`time.After` keeps a timer running until the specified duration has
elapsed. It also allocates a new timer on each call. This can wind up
leaving lots of unnecessary timers running in the background that are
not needed and consume resources.
Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal, since the duration is really
short, but in others it is much worse.
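A minimal sketch of the pattern (not one of the exact call sites touched here):
```go
package main

import (
	"fmt"
	"time"
)

func waitWithTimeout(done <-chan struct{}, timeout time.Duration) error {
	// time.NewTimer instead of time.After: the timer can be stopped when the
	// work finishes first, instead of lingering until it fires.
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-done:
		return nil
	case <-timer.C:
		return fmt.Errorf("timed out after %v", timeout)
	}
}

func main() {
	done := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(done)
	}()
	fmt.Println(waitWithTimeout(done, time.Second)) // <nil>
}
```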
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fixes the following go vet issues:
```
daemon/daemon.go:273: loop variable id captured by func literal
daemon/daemon.go:280: loop variable id captured by func literal
```
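The usual fix for this class of warning is to re-declare the loop variable (or pass it as an argument) so each closure captures its own copy; a minimal illustration, not the daemon's actual code:
```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	ids := []string{"a", "b", "c"}
	var wg sync.WaitGroup
	for _, id := range ids {
		id := id // shadow the loop variable so each goroutine gets its own copy
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println(id)
		}()
	}
	wg.Wait()
}
```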
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Many startup tasks have to run for each container, and thus using a
WaitGroup (which doesn't have a limit to the number of parallel tasks)
can result in Docker exceeding the NOFILE limit quite trivially. A more
optimal solution is to have a parallelism limit by using a semaphore.
In addition, several startup tasks were not parallelised previously
which resulted in very long startup times. According to my testing, 20K
dead containers resulted in ~6 minute startup times (during which time
Docker is completely unusable).
This patch fixes both issues, and the parallelStartupTimes factor chosen
(128 * NumCPU) is based on my own significant testing of the 20K
container case. This patch (on my machines) reduces the startup time
from 6 minutes to less than a minute (ideally this could be further
reduced by removing the need to scan all dead containers on startup --
but that's beyond the scope of this patchset).
In order to avoid the NOFILE limit problem, we also detect this
on-startup and if NOFILE < 2*128*NumCPU we will reduce the parallelism
factor to avoid hitting NOFILE limits (but also emit a warning since
this is almost certainly a mis-configuration).
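A minimal sketch of the pattern described above, using a buffered channel as the counting semaphore (the real patch may use a different primitive, and the per-container restore work is elided):
```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	containers := make([]int, 500) // stand-ins for the containers to restore

	limit := 128 * runtime.NumCPU() // parallelism factor mentioned above
	sem := make(chan struct{}, limit)

	var wg sync.WaitGroup
	for i := range containers {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot; blocks once `limit` tasks are in flight
			defer func() { <-sem }() // release the slot when done
			_ = i                    // per-container startup work would happen here
		}(i)
	}
	wg.Wait()
	fmt.Println("all startup tasks finished")
}
```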
Signed-off-by: Aleksa Sarai <asarai@suse.de>
The v1.10 layout and the migrator were added in 2015 via #17924.
Although the migrator is not explicitly marked as "deprecated" in
cli/docs/deprecated.md, I suppose people should have already migrated
from pre-v1.10 and no longer need the migrator, because pre-v1.10
versions do not support schema2 images (and those versions no longer
receive security updates).
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This should eliminate a bunch of new (go-1.11 related) validation
errors complaining that the code is not formatted with `gofmt -s`.
No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).
Patch generated with:
> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This implements chown support on Windows. Built-in accounts as well
as accounts included in the SAM database of the container are supported.
NOTE: IDPair is now named Identity and IDMappings is now named
IdentityMapping.
The following are valid examples:
ADD --chown=Guest . <some directory>
COPY --chown=Administrator . <some directory>
COPY --chown=Guests . <some directory>
COPY --chown=ContainerUser . <some directory>
On Windows an owner is only granted the permission to read the security
descriptor and read/write the discretionary access control list. This
fix also grants read/write and execute permissions to the owner.
Signed-off-by: Salahuddin Khan <salah@docker.com>
Adds a supervisor package for starting and monitoring containerd.
Separates the grpc connection, allowing access from the daemon.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Handle the case of systemd-resolved and, if it is in use,
use a different resolv.conf source.
Set the option appropriately on libnetwork.
Move unix-specific code to container_operation_unix.
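A hedged sketch of the detection idea (the paths are the standard systemd-resolved ones; the actual libnetwork logic differs in detail):
```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolvConfPath returns the resolv.conf to use as the source for containers.
// When /etc/resolv.conf only points at the systemd-resolved stub resolver
// (127.0.0.53), the upstream servers live in /run/systemd/resolve/resolv.conf.
func resolvConfPath() string {
	data, err := os.ReadFile("/etc/resolv.conf")
	if err == nil && strings.Contains(string(data), "127.0.0.53") {
		if _, err := os.Stat("/run/systemd/resolve/resolv.conf"); err == nil {
			return "/run/systemd/resolve/resolv.conf"
		}
	}
	return "/etc/resolv.conf"
}

func main() {
	fmt.Println("using", resolvConfPath())
}
```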
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
In particular, these two:
> daemon/daemon_unix.go:1129: Wrapf format %v reads arg #1, but call has 0 args
> daemon/kill.go:111: Warn call has possible formatting directive %s
and a few more.
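The actual calls are `errors.Wrapf` and `logrus.Warn`, but the standard-library analogue below shows the same two classes of fix (supply the missing argument, or switch to the formatting variant):
```go
package main

import (
	"fmt"
	"log"
)

func main() {
	id := "abc123"

	// "call has possible formatting directive %s": a non-formatting call was
	// given a format string; use the *f variant instead.
	log.Printf("unable to kill %s", id) // was effectively: log.Print("unable to kill %s", ...)

	// "format %v reads arg #1, but call has 0 args": the verb had no matching
	// argument; supply it (or drop the verb).
	err := fmt.Errorf("container %v is not running", id) // was effectively missing the argument
	fmt.Println(err)
}
```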
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Instead of using a global store for volume drivers, scope the driver
store to the caller (e.g. the volume store). This makes testing much
simpler.
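A minimal sketch of the shape of the change (identifiers are illustrative, not the volume package's actual ones):
```go
package main

import "fmt"

// driverStore stands in for the volume driver registry that used to be a
// package-level global.
type driverStore struct {
	drivers map[string]struct{}
}

func newDriverStore() *driverStore {
	return &driverStore{drivers: make(map[string]struct{})}
}

// volumeStore now receives its driver store from the caller, so tests can
// construct an isolated one instead of sharing global state.
type volumeStore struct {
	drivers *driverStore
}

func newVolumeStore(ds *driverStore) *volumeStore {
	return &volumeStore{drivers: ds}
}

func main() {
	ds := newDriverStore()
	vs := newVolumeStore(ds)
	fmt.Println(len(vs.drivers.drivers)) // 0: a fresh, caller-scoped store
}
```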
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Since commit e9b9e4ace2 has landed, there is a chance that
container.RWLayer is nil (due to some half-removed container). Let's
check the pointer before use to avoid a potential nil pointer
dereference, which would result in a daemon crash.
Note that even without the above-mentioned commit, it's better to perform
an extra check (even if it's totally redundant) than to risk
a daemon crash. In other words, better safe than sorry.
[v2: add a test case for daemon.getInspectData]
[v3: add a check for container.Dead and a special error for the case]
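A minimal, self-contained illustration of the check (types and messages are stand-ins, not the daemon's):
```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative stand-ins; not the daemon's actual types.
type container struct {
	ID      string
	Dead    bool
	RWLayer *struct{}
}

func getInspectData(c *container) error {
	// Guard against a half-removed container whose RWLayer is already gone,
	// instead of dereferencing a nil pointer and crashing the daemon.
	if c.RWLayer == nil {
		if c.Dead {
			return fmt.Errorf("container %s is marked for removal and cannot be inspected", c.ID)
		}
		return errors.New("RWLayer is unexpectedly nil")
	}
	return nil
}

func main() {
	fmt.Println(getInspectData(&container{ID: "abc", Dead: true}))
}
```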
Fixes: e9b9e4ace2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Attachable networks are networks created on the cluster which can then
be attached to by non-swarm containers. These networks are lazily
created on the node that wants to attach to that network.
When no container is currently attached to one of these networks on a
node, and then multiple containers which want that network are started
concurrently, this can cause a race condition in the network attachment
where essentially we try to attach the same network to the node twice.
To easily reproduce this issue you must use a multi-node cluster with a
worker node that has lots of CPUs (I used a 36 CPU node).
Repro steps:
1. On manager, `docker network create -d overlay --attachable test`
2. On worker, `docker create --restart=always --network test busybox
top`, many times... 200 is a good number (but not much more due to
subnet size restrictions)
3. Restart the daemon
When the daemon restarts, it will attempt to start all those containers
simultaneously. Note that you could try to do this yourself over the API,
but it's harder to trigger due to the added latency from going over
the API.
The error produced happens when the daemon tries to start the container
upon allocating the network resources:
```
attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
```
What happens here is the worker makes a network attachment request to
the manager. This is an async call which in the happy case would cause a
task to be placed on the node, which the worker is waiting for to get
the network configuration.
In the case of this race, the error occurs on the manager like this:
```
task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e
```
The task is not created and the worker times out waiting for the task.
---
The mitigation for this is to make sure that only one attachment request
is in flight for a given network at a time *when the network doesn't
already exist on the node*. If the network already exists on the node
there is no need for synchronization because the network is already
allocated and on the node so there is no need to request it from the
manager.
This basically comes down to a race with `Find(network) ||
Create(network)` without any sort of synchronization.
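A hedged sketch of the kind of per-network serialization this implies (identifiers are illustrative; the daemon's actual implementation differs):
```go
package main

import (
	"fmt"
	"sync"
)

// attacher serializes lazy network creation per network ID so that concurrent
// container starts do not send duplicate attachment requests to the manager.
type attacher struct {
	mu       sync.Mutex
	attachMu map[string]*sync.Mutex // one lock per network being attached
	networks map[string]bool        // networks already present on the node
}

func (a *attacher) ensureNetwork(id string, create func() error) error {
	a.mu.Lock()
	if a.networks[id] {
		a.mu.Unlock()
		return nil // already on the node: no synchronization needed
	}
	l, ok := a.attachMu[id]
	if !ok {
		l = &sync.Mutex{}
		a.attachMu[id] = l
	}
	a.mu.Unlock()

	l.Lock() // only one attachment request in flight per network
	defer l.Unlock()

	a.mu.Lock()
	already := a.networks[id]
	a.mu.Unlock()
	if already {
		return nil // another caller attached it while we waited
	}
	if err := create(); err != nil {
		return err
	}
	a.mu.Lock()
	a.networks[id] = true
	a.mu.Unlock()
	return nil
}

func main() {
	a := &attacher{attachMu: map[string]*sync.Mutex{}, networks: map[string]bool{}}
	var created int
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = a.ensureNetwork("test", func() error { created++; return nil })
		}()
	}
	wg.Wait()
	fmt.Println("create called", created, "time(s)") // 1
}
```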
Signed-off-by: Brian Goff <cpuguy83@gmail.com>