In cases there are failures in task start, swarmkit might be trying to
restart the task again in the same node which might keep failing. This
creates a race where when a failed task is getting removed it might
remove the associated network while another task for the same service
or a different service but connected to the same network is proceeding
with starting the container knowing that the network is still
present. Fix this by reacting to `ErrNoSuchNetwork` error during
container start by trying to recreate the managed networks. If they
have been removed it will be recreated. If they are already present
nothing bad will happen.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
(cherry picked from commit 117cef5e97)
Signed-off-by: Tibor Vass <tibor@docker.com>
Unlike `docker run -v..`, `docker service create --mount`
does not allow bind-mounting non-existing host paths.
This adds validation for the specified `source`, and
produces an error if the path is not found on the
host.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 84d5ab96ef)
Signed-off-by: Tibor Vass <tibor@docker.com>
This fix tries to address the issue raised in 25374 where the
output of `docker ps --filter` is in random order and
not deterministic.
This fix sorts the list of containers by creation time so that the
output is deterministic.
An integration test has been added.
This fix fixes 25374.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
(cherry picked from commit 3f97133546)
Signed-off-by: Tibor Vass <tibor@docker.com>
This is intended as a minor fix for 1.12.1 so that task creation doesn't
do unexpected things when the user supplies erroneous paths.
In particular, because we're currently using hostConfig.Binds to setup
mounts, if a user uses an absolute path for a volume mount source, or a
non-absolute path for a bind mount source, the engine will do the
opposite of what the user requested since all absolute paths are
treated as binds and all non-absolute paths are treated as named
volumes.
Fixes#25253
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 38f8b0eb10)
Signed-off-by: Tibor Vass <tibor@docker.com>
This fix tries to improve error messages when IP address
autodetection fails, as is specified in 25141.
Previously, error messages only indicate that multiple IPs
exist when autodetection fails. In this fix, if one
interface consists of multiple addresses or multiple
interfaces consist of addresses, the error messages output
the address names and interface names so that end user could
take notice.
This fix is verified manually.
When multiple addresses exist on multiple interfaces:
```
$ sudo docker swarm init
Error response from daemon: could not choose an IP address
to advertise since this system has multiple addresses on different
interfaces (192.168.186.128 on ens33 and 192.168.100.199 on eth10)
- specify one with --advertise-addr
```
When multiple addresses exist on single interface:
```
$ sudo docker swarm init
Error response from daemon: could not choose an IP address
to advertise since this system has multiple addresses
on interface ens33 (192.168.186.128 and 192.168.55.199)
- specify one with --advertise-addr
```
This fix fixes 25141.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
(cherry picked from commit 59db01049a)
Signed-off-by: Tibor Vass <tibor@docker.com>
Remove the swarm inspect command and use docker info instead to display
swarm information if the current node is a manager.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
(cherry picked from commit e6923f6d75)
Signed-off-by: Tibor Vass <tibor@docker.com>
Context cancellations were previously causing `Prepare` to fail
completely on re-entrant calls. To prevent this, we filtered out cancels
and deadline errors. While this allowed the service to proceed without
errors, it had the possibility of interrupting long pulls, causing the
pull to happen twice.
This PR forks the context of the pull to match the lifetime of
`Controller`, ensuring that for each task, the pull is only performed
once. It also ensures that multiple calls to `Prepare` are re-entrant,
ensuring that the pull resumes from its original position.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
(cherry picked from commit d8d71ad5b9)
Signed-off-by: Tibor Vass <tibor@docker.com>
With digests being added by default, all images have multiple references.
The check for whether force is required to remove the reference should use the new check for single reference which accounts for digest references.
This change restores pre-1.12 behavior and ensures images are not accidentally left dangling while a container is running.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
(cherry picked from commit 1f7a9b1ab3)
Signed-off-by: Tibor Vass <tibor@docker.com>
Ensure that cancellation of a pull propagates rather than continuing to
container creation. This ensures that the `Prepare` method is properly
re-entrant.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
(cherry picked from commit d99c6b837f)
Signed-off-by: Tibor Vass <tibor@docker.com>
Instead reserve exit code 2 to be future proof, document that it should
not be used. Implementation-wise, it is considered as unhealthy, but
users should not rely on this as it may change in the future.
Signed-off-by: Tibor Vass <tibor@docker.com>
(cherry picked from commit 91e9f38313)
Signed-off-by: Tibor Vass <tibor@docker.com>
This changes the default behavior so that rolling updates will not
proceed once an updated task fails to start, or stops running during the
update. Users can use docker service inspect --pretty servicename to see
the update status, and if it pauses due to a failure, it will explain
that the update is paused, and show the task ID that caused it to pause.
It also shows the time since the update started.
A new --update-on-failure=(pause|continue) flag selects the
behavior. Pause means the update stops once a task fails, continue means
the old behavior of continuing the update anyway.
In the future this will be extended with additional behaviors like
automatic rollback, and flags controlling parameters like how many tasks
need to fail for the update to stop proceeding. This is a minimal
solution for 1.12.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
(cherry picked from commit 57ae29aa74)
Signed-off-by: Tibor Vass <tibor@docker.com>
When daemon has liveRestore set, daemon shutdown should not shutdown
plugins. Fixes#24759
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
(cherry picked from commit 4a44cf1d4c)
Signed-off-by: Tibor Vass <tibor@docker.com>
Truncated dir name can't give any useful information, print whole dir
name will.
Bad debug log is like this:
```
DEBU[2449] aufs error unmounting /var/lib/doc: no such file or directory
```
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
(cherry picked from commit af8359562c)
Signed-off-by: Tibor Vass <tibor@docker.com>
This is required to make the libnetwork's namespace mgmt
directory configurable
Signed-off-by: Madhu Venugopal <madhu@docker.com>
(cherry picked from commit d3af5e3d4b)
Signed-off-by: Tibor Vass <tibor@docker.com>
Hostnames are not supported for now because libnetwork can't use them
for overlay networking yet.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
(cherry picked from commit fca0b18dcb)
Signed-off-by: Tibor Vass <tibor@docker.com>
There are currently problems with "swarm init" and "swarm join" when an
explicit --listen-addr flag is not provided. swarmkit defaults to
finding the IP address associated with the default route, and in cloud
setups this is often the wrong choice.
Introduce a notion of "advertised address", with the client flag
--advertise-addr, and the daemon flag --swarm-default-advertise-addr to
provide a default. The default listening address is now 0.0.0.0, but a
valid advertised address must be detected or specified.
If no explicit advertised address is specified, error out if there is
more than one usable candidate IP address on the system. This requires a
user to explicitly choose instead of letting swarmkit make the wrong
choice. For the purposes of this autodetection, we ignore certain
interfaces that are unlikely to be relevant (currently docker*).
The user is also required to choose a listen address on swarm init if
they specify an explicit advertise address that is a hostname or an IP
address that's not local to the system. This is a requirement for
overlay networking.
Also support specifying interface names to --listen-addr,
--advertise-addr, and the daemon flag --swarm-default-advertise-addr.
This will fail if the interface has multiple IP addresses (unless it has
a single IPv4 address and a single IPv6 address - then we resolve the
tie in favor of IPv4).
This change also exposes the node's externally-reachable address in
docker info, as requested by #24017.
Make corresponding API and CLI docs changes.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
(cherry picked from commit a0ccd0d42f)
Signed-off-by: Tibor Vass <tibor@docker.com>
In 24823, `swarm join` has been updated to take a `--token`
flag and flag `--manager` has been removed. Though in errNoManager()
the error message still use the old description.
This fix update the error message in errNoManager() and conforms
to the current available flags.
This fix is related to 24823.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
(cherry picked from commit 3d30155735)
Signed-off-by: Tibor Vass <tibor@docker.com>
Implement the proposal from
https://github.com/docker/docker/issues/24430#issuecomment-233100121
Removes acceptance policy and secret in favor of an automatically
generated join token that combines the secret, CA hash, and
manager/worker role into a single opaque string.
Adds a docker swarm join-token subcommand to inspect and rotate the
tokens.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
(cherry picked from commit 2cc5bd33ee)
Signed-off-by: Tibor Vass <tibor@docker.com>
This fix is an extension to last commit to expand the partial
filter to node and task searches.
Additional integration tests have been added to cover the changes.
This fix fixes 24270.
This fix fixes 24112.
Note: A separate pull request will be opened on swarmkit.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
(cherry picked from commit e734fa58ea)
Signed-off-by: Tibor Vass <tibor@docker.com>
This fix tries to address the issue raised in 24270 where it was
not possible to have a partial name match when list services
with name filter.
This fix updates swarmkit and allows prefix search when name is
provided as the filter for listing services.
An additional integration test is added to cover the changes.
This fix fixes 24270.
Note: A separate pull request will be opened on swarmkit.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
(cherry picked from commit 1d600ebcb5)
Signed-off-by: Tibor Vass <tibor@docker.com>
This version introduces the following:
- uses nanosecond timestamps for event
- ensure events are sent once their effect is "live"
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 29b2714580)
Signed-off-by: Tibor Vass <tibor@docker.com>
Adds log driver support for service creation and update. Add flags
`--log-driver` and `--log-opt` to match `docker run`. Log drivers are
configured per service.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
(cherry picked from commit e778ba2d5b)
Signed-off-by: Tibor Vass <tibor@docker.com>
This adds the `--live-restore` option to the documentation.
Also synched usage description in the documentation
with the actual description, and re-phrased some
flag descriptions to be a bit more consistent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 64a8317a5a)
Signed-off-by: Tibor Vass <tibor@docker.com>
This adds an `--oom-score-adjust` flag to the daemon so that the value
provided can be set for the docker daemon's process. The default value
for the flag is -500. This will allow the docker daemon to have a
less chance of being killed before containers do. The default value for
processes is 0 with a min/max of -1000/1000.
-500 is a good middle ground because it is less than the default for
most processes and still not -1000 which basically means never kill this
process in an OOM condition on the host machine. The only processes on
my machine that have a score less than -500 are dbus at -900 and sshd
and xfce( my window manager ) at -1000. I don't think docker should be
set lower, by default, than dbus or sshd so that is why I chose -500.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
(cherry picked from commit a894aec8d8)
Signed-off-by: Tibor Vass <tibor@docker.com>
Add option to skip kernel check for older kernels which have been patched to support multiple lower directories in overlayfs.
Fixes#24023
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
(cherry picked from commit ff98da0607)
Signed-off-by: Tibor Vass <tibor@docker.com>
For any operation that involves netwoks (other than network create),
swarmkit expects the target as network-id. Service upate was using
network-name as the target and that caused the issue.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
(cherry picked from commit b32cfb32a3)
Signed-off-by: Tibor Vass <tibor@docker.com>