Commit graph

545 commits

Author SHA1 Message Date
Drew Erny
89edb68e89
Fix possible overlapping IPs
A node is no longer using its load balancer IP address when it no longer
has tasks that use the network that requires that load balancer. When
this occurs, the swarmkit manager will free that IP in IPAM, and may
reaassign it.

When a task shuts down cleanly, it attempts removal of the networks it
uses, and if it is the last task using those networks, this removal
succeeds, and the load balancer IP is freed.

However, this behavior is absent if the container fails. Removal of the
networks is never attempted.

To address this issue, I amend the executor. Whenever a node load
balancer IP is removed or changed, that information is passedd to the
executor by way of the Configure method. By keeping track of the set of
node NetworkAttachments from the previous call to Configure, we can
determine which, if any, have been removed or changed.

At first, this seems to create a race, by which a task can be attempting
to start and the network is removed right out from under it. However,
this is already addressed in the controller. The controller will attempt
to recreate missing networks before starting a task.

Signed-off-by: Drew Erny <derny@mirantis.com>
(cherry picked from commit 0d9b0ed678)
Signed-off-by: Ameya Gawde <agawde@mirantis.com>
2021-06-18 10:13:59 -07:00
Drew Erny
295fb1c35e Fix jobs mode filter spelling
Oops.

Signed-off-by: Drew Erny <derny@mirantis.com>
2020-12-15 14:45:05 -06:00
Drew Erny
dd752ec87a Fix jobs-related bug in task conversion
While working on some other code, noticed a bug in the jobs code. We're
adding job version after we're checking if there are port configs.
Before, if there were no port configs, the job version would be missing,
because we would return before trying to convert.

This moves the jobs version conversion above that code, so we don't
accidentally return before it.

Signed-off-by: Drew Erny <derny@mirantis.com>
2020-12-02 12:27:23 -06:00
Albin Kerouanton
c76f380bea
Add ulimits support to services
Add Ulimits field to the ContainerSpec API type and wire it to Swarmkit.

This is related to #40639.

Signed-off-by: Albin Kerouanton <albin@akerouanton.name>
2020-07-29 02:09:06 +02:00
Brian Goff
24f173a003 Replace service "Capabilities" w/ add/drop API
After dicussing with maintainers, it was decided putting the burden of
providing the full cap list on the client is not a good design.
Instead we decided to follow along with the container API and use cap
add/drop.

This brings in the changes already merged into swarmkit.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-07-27 10:09:42 -07:00
Sebastiaan van Stijn
687bdc7c71
API: swarm: move PidsLimit to TaskTemplate.Resources
The initial implementation followed the Swarm API, where
PidsLimit is located in ContainerSpec. This is not the
desired place for this property, so moving the field to
TaskTemplate.Resources in our API.

A similar change should be made in the SwarmKit API (likely
keeping the old field for backward compatibility, because
it was merged some releases back)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-06-05 12:50:38 +02:00
Sebastiaan van Stijn
84748c7d4e
API: split types for Resources Reservations and Limits
This introduces A new type (`Limit`), which allows Limits
and "Reservations" to have different options, as it's not
possible to make "Reservations" for some kind of limits.

The `GenericResources` have been removed from the new type;
the API did not handle specifying `GenericResources` as a
_Limit_ (only as _Reservations_), and this field would
therefore always be empty (omitted) in the `Limits` case.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-18 14:21:23 +02:00
Sebastiaan van Stijn
07d60bc257
Replace errors.Cause() with errors.Is() / errors.As()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-29 00:28:41 +02:00
Sebastiaan van Stijn
157c53c8e0
Add API support for PidsLimit on services
Support for PidsLimit was added to SwarmKit in docker/swarmkit/pull/2415,
but never exposed through the Docker remove API.

This patch exposes the feature in the repote API.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-04-15 22:37:42 +02:00
Akihiro Suda
745fa04e52 service: support --mount type=bind,bind-nonrecursive
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-13 19:01:28 +09:00
Akihiro Suda
f073ed4187
Merge pull request #39816 from tao12345666333/rm-systeminfo-error-handler
Remove `SystemInfo()` error handling.
2020-03-13 01:59:13 +09:00
Ziheng Liu
83c0bedba9 daemon/cluster: add a missing Unlock
Signed-off-by: Ziheng Liu <lzhfromustc@gmail.com>
2020-02-25 13:55:11 -05:00
Sebastiaan van Stijn
9f0b3f5609
bump gotest.tools v3.0.1 for compatibility with Go 1.14
full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-11 00:06:42 +01:00
Drew Erny
30d9fe30b1 Add swarm jobs
Adds support for ReplicatedJob and GlobalJob service modes. These modes
allow running service which execute tasks that exit upon success,
instead of daemon-type tasks.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2020-01-13 13:21:12 -06:00
Sebastiaan van Stijn
6625fa6103
daemon/cluster: normalize comment formatting
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-11-27 15:42:53 +01:00
Sebastiaan van Stijn
05469b5fa2
daemon: add "isWindows" const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-10-17 23:49:43 +02:00
Drew Erny
f36042d259 Add support for sending down service Running and Desired task counts
Adds a new ServiceStatus field to the Service object, which includes the
running and desired task counts. This new field is gated behind a
"status" query parameter.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-10-14 10:43:00 -05:00
John Howard
8988448729 Remove refs to jhowardmsft from .go code
Signed-off-by: John Howard <jhoward@microsoft.com>
2019-09-25 10:51:18 -07:00
Sebastiaan van Stijn
bd7180fcf9
cluster/controllers/plugin: remove unused Controller.taskID (unused)
```
daemon/cluster/controllers/plugin/controller.go:37:2: U1000: field `taskID` is unused (unused)
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:57:49 +02:00
Kir Kolyshkin
7b0e0335bc
Fix some inefassign warnings
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-09-18 12:57:29 +02:00
Sebastiaan van Stijn
07ff4f1de8
goimports: fix imports
Format the source according to latest goimports.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:56:54 +02:00
Sebastiaan van Stijn
56e690f340
cluster/executor: remove unused containerConfig.endpoint()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:55:48 +02:00
Sebastiaan van Stijn
5ded7886c3
daemon/cluster: fix unused context (staticcheck)
```
daemon/cluster/nodes.go:69:36: SA4009: argument ctx is overwritten before first use (staticcheck)
13:06:14 	return c.lockedManagerAction(func(ctx context.Context, state nodeState) error {
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:55:42 +02:00
Jintao Zhang
9134130b39 Remove SystemInfo() error handling.
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
2019-08-29 07:44:39 +08:00
Michael Crosby
fb459f6671
Merge pull request #38441 from sirlatrom/swarm_plugin_env
Allow specifying environment variables when installing an engine plugin as a Swarm service
2019-07-08 15:26:55 -04:00
Sebastiaan van Stijn
77657ea737
Merge pull request #39346 from dperny/fix-more-grpc-sizes
Fix more grpc list message sizes
2019-07-02 23:07:53 +02:00
Drew Erny
a84a78e976 Fix more grpc list message sizes
There are a few more places, apparently, that List operations against
Swarm exist, besides just in the List methods. This increases the max
received message size in those places.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-06-13 12:01:49 -05:00
Ethan Mosbaugh
50c6a5fb07 Fix rate limiting for logger, increase refill rate
Signed-off-by: Ethan Mosbaugh <ethan@replicated.com>
2019-06-12 13:48:36 -07:00
Sebastiaan van Stijn
c85fe2d224
Merge pull request #38522 from cpuguy83/fix_timers
Make sure timers are stopped after use.
2019-06-07 13:16:46 +02:00
Kirill Kolyshkin
1d5748d975
Merge pull request #39173 from olljanat/25885-capabilities-swarm
Add support for capabilities options in services
2019-06-06 15:03:46 -07:00
Drew Erny
a0903e1fa3 Increase max recv gRPC message size for nodes and secrets
Increases the max recieved gRPC message size for Node and Secret list
operations. This has already been done for the other swarm types, but
was not done for these.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-06-03 11:42:31 -05:00
Olli Janatuinen
f787b235de Add support capabilities list on services
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2019-05-28 19:52:36 +03:00
Arko Dasgupta
70fa7b6a3f Network not deleted after stack is removed
Make sure adapter.removeNetworks executes during task Remove
adapter.removeNetworks was being skipped for cases when
isUnknownContainer(err) was true after adapter.remove was executed

This fix eliminates the nil return case forcing the function
to continue executing unless there is a true error

Fixes https://github.com/moby/moby/issues/39225

Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
2019-05-23 12:37:17 -07:00
Brian Goff
03a03c6c32
Merge pull request #39190 from ollypom/swarmnanocpu
Switch Swarm Mode services to NanoCpu
2019-05-20 10:14:07 -07:00
Olly Pomeroy
8a60a1e14a Switch swarmmode services to NanoCpu
Today `$ docker service create --limit-cpu` configures a containers
`CpuPeriod` and `CpuQuota` variables, this commit switches this to
configure a containers `NanoCpu` variable instead.

Signed-off-by: Olly Pomeroy <olly@docker.com>
2019-05-08 14:04:24 +00:00
Arko Dasgupta
680d0ba4ab Remove a network during task SHUTDOWN instead of REMOVE to
make sure the LB sandbox is removed when a service is updated
with a --network-rm option

Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
2019-05-06 20:26:59 -07:00
Sven Dowideit
29ad9379f4 I can lose a screw if its on too loose
Signed-off-by: Sven Dowideit <SvenDowideit@home.org.au>
2019-04-23 11:36:31 +10:00
Sune Keller
fca5ee3bd5 Support environment vars in Swarm plugins services
Allow specifying environment variables when installing an engine plugin
as a Swarm service. Invalid environment variable entries (without an
equals (`=`) char) will be ignored.

Signed-off-by: Sune Keller <absukl@almbrand.dk>
2019-04-07 09:48:19 +02:00
Tibor Vass
06c9ae1327
Merge pull request #38906 from thaJeztah/carry_38304_fix_swarm_leave_hanging
Fix for situation where swarm leave causes wait forever for agent to stop
2019-03-21 14:12:41 -07:00
Kyle Wuolle
e65c680394
Fix for situation where swarm leave causes wait forever for agent to stop
In this case the message to stop the agent is never actually sent
because the swarm node is nil

Signed-off-by: Kyle Wuolle <kyle.wuolle@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-03-19 18:45:14 +01:00
Sebastiaan van Stijn
81eef17e38
Return a warning when running in a two-manager setup
Running a cluster in a two-manager configuration effectively *doubles*
the chance of loosing control over the cluster (compared to running
in a single-manager setup). Users may have the assumption that having
two managers provides fault tolerance, so it's best to warn them if
they're using this configuration.

This patch adds a warning to the `info` response if Swarm is configured
with two managers:

    WARNING: Running Swarm in a two-manager configuration. This configuration provides
             no fault tolerance, and poses a high risk to loose control over the cluster.
             Refer to https://docs.docker.com/engine/swarm/admin_guide/ to configure the
             Swarm for fault-tolerance.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-03-18 14:36:00 +01:00
Brian Goff
fa9df85c6a Had HasExperimental() to cluster backend
It's already defined on the daemon. This allows us to not call
`SystemInfo` which is failry heavy and potentially can even error.

Takes care of todo item from Derek's containerd integration PR.
51c412f26e/daemon/cluster/services.go (L148-L149)

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-02-28 16:52:30 -08:00
Dani Louca
3fbbeb703c set bigger grpc limit for GetConfigs api
Signed-off-by: Dani Louca <dani.louca@docker.com>
2019-02-26 11:09:25 -05:00
Drew Erny
6f1d7ddfa4 Use Runtime target
The Swarmkit api specifies a target for configs called called "Runtime"
which indicates that the config is not mounted into the container but
has some other use. This commit updates the Docker api to reflect this.

Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-02-19 13:14:17 -06:00
Sebastiaan van Stijn
20383d504b Add support for using Configs as CredentialSpecs in services
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-02-04 15:29:33 -06:00
Drew Erny
04995fa7c7 Add CredentialSpec from configs support
Signed-off-by: Drew Erny <drew.erny@docker.com>
2019-02-04 14:52:01 -06:00
Brian Goff
eaad3ee3cf Make sure timers are stopped after use.
`time.After` keeps a timer running until the specified duration is
completed. It also allocates a new timer on each call. This can wind up
leaving lots of uneccessary timers running in the background that are
not needed and consume resources.

Instead of `time.After`, use `time.NewTimer` so the timer can actually
be stopped.
In some of these cases it's not a big deal since the duraiton is really
short, but in others it is much worse.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2019-01-16 14:32:53 -08:00
Olli Janatuinen
153171e9dd Added support for maximum replicas per node to services
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2018-12-24 02:04:15 +02:00
selansen
32180ac0c7 VXLAN UDP Port configuration support
This commit contains changes to configure DataPathPort
option. By default we use 4789 port number. But this commit
will allow user to configure port number during swarm init.
DataPathPort can't be modified after swarm init.
Signed-off-by: selansen <elango.siva@docker.com>
2018-11-22 17:35:02 -05:00
Akihiro Suda
596cdffb9f mount: add BindOptions.NonRecursive (API v1.40)
This allows non-recursive bind-mount, i.e. mount(2) with "bind" rather than "rbind".

Swarm-mode will be supported in a separate PR because of mutual vendoring.

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2018-11-06 17:51:58 +09:00