120 commits

Author SHA1 Message Date
Aaron Lehmann
c27603238c Fix incorrect assumption in TestAPISwarmRaftQuorum
This test shuts down two out of three managers and then asserts that the
swarm has a leader. A swarm that lost quorum won't necessarily have a
leader, and in this case only has one because the old leader is still
around. Soon SwarmKit will be changed so the leader gives up leadership
when quorum is lost. This will avoid confusing situations, like
read-only APIs succeeding, while ones that write to Raft hang.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-04-25 12:10:12 -07:00
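The quorum arithmetic behind this fix can be sketched as follows. This is a minimal illustration, assuming the usual Raft rule that a strict majority of managers must be reachable for writes to commit; the names `quorum` and `hasQuorum` are hypothetical and are not taken from SwarmKit or the test suite.

```go
package main

// quorum returns the minimum number of reachable managers that a Raft
// group of the given size needs in order to commit writes (a strict
// majority). Hypothetical helper, for illustration only.
func quorum(managers int) int {
	return managers/2 + 1
}

// hasQuorum reports whether a group of `managers` members can still
// commit writes when only `reachable` of them are up.
func hasQuorum(managers, reachable int) bool {
	return reachable >= quorum(managers)
}
```

With three managers, `quorum(3)` is 2, so `hasQuorum(3, 1)` is false: once two managers are shut down, the remaining one cannot commit writes to Raft, whether or not it still considers itself the leader.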
Ying Li
9b96b2d276 Add tests to ensure we can add an external CA to the cluster without
error.

Signed-off-by: Ying Li <ying.li@docker.com>
2017-04-12 16:55:48 -07:00
Aaron Lehmann
6763641d69 Add integration test for START_FIRST update order
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-04-06 17:23:36 -07:00
Tõnis Tiigi
3fe2730ab3 Merge pull request #31535 from aaronlehmann/vendor-swarmkit-7fc7503
Vendor swarmkit d60ccf3
2017-03-08 09:52:28 -08:00
Aaron Lehmann
39433318fe Vendor swarmkit d60ccf3
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-03-07 19:09:21 -08:00
Aaron Lehmann
f9bd8ec8b2 Implement server-side rollback, for daemon versions that support this
Server-side rollback can take advantage of the rollback-specific update
parameters, instead of being treated as a normal update that happens to
go back to a previous version of the spec.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-03-03 16:33:34 -08:00
Sebastiaan van Stijn
3a5a1c3f3d Merge pull request #30725 from aaronlehmann/topology
Topology-aware scheduling
2017-03-03 15:01:12 +01:00
Aaron Lehmann
17288c611a Topology-aware scheduling
This adds support for placement preferences in Swarm services.

- Convert PlacementPreferences between GRPC API and HTTP API
- Add --placement-pref, --placement-pref-add and --placement-pref-rm to CLI
- Add support for placement preferences in service inspect --pretty
- Add integration test

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-02-27 13:29:54 -08:00
Aaron Lehmann
99119fcafa Vendor swarmkit 46bbd41
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-02-27 11:51:00 -08:00
allencloud
69afd30444 split docker_api_swarm_test.go into multiple files
Signed-off-by: allencloud <allen.sun@daocloud.io>
2017-02-11 00:18:01 +08:00
Yong Tang
8feb5c5a48 Fix issue where service healthcheck is {} in remote API
This fix tries to address the issue raised in 30178, where setting the
service healthcheck to `{}` in the remote API results in a
DNS resolution failure.

The reason was that when service healthcheck is `{}`,
service binding was not done.

This fix fixes the issue.

An integration test has been added.

This fix fixes 30178.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-01-27 15:43:44 -08:00
Yong Tang
05a831a775 Fix incorrect Scope in network ls/inspect with duplicate network names
This fix tries to address the issue raised in 30242, where the `Scope`
field always changed to `swarm` in the output of `docker network ls/inspect`
when duplicate network names exist.

The reason for the issue was that `buildNetworkResource()` used the network name
(which may not be unique) to check for the scope.

This fix fixes the issue by always using the network ID in `buildNetworkResource()`.

A test has been added. The test fails before the fix and passes after the fix.

This fix fixes 30242.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2017-01-25 09:39:55 -08:00
Tonis Tiigi
d377b074fd Add test for swarm error handling
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-01-05 15:46:07 -08:00
Vincent Demeester
33968e6c7d Remove pkg/integration and move it to testutil or integration-cli
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2016-12-30 18:26:34 +01:00
Tibor Vass
0d5a715931 Merge pull request #29470 from cyli/ask-for-unlock-key-only-if-locked
Check if a swarm is locked before asking a user to enter their unlock key
2016-12-20 13:21:47 -08:00
allencloud
29d4a7f512 update response status code for cluster request
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-12-19 10:21:10 +08:00
Ying Li
a6a0880a22 Before asking a user for the unlock key when they run docker swarm unlock, actually
check to see if the node is part of a swarm, and if so, if it is unlocked first.
If neither of these are true, abort the command.

Signed-off-by: Ying Li <ying.li@docker.com>
2016-12-16 17:16:55 -08:00
Vincent Demeester
c502fb49dc Use *check.C in StartWithBusybox, Start, Stop and Restart…
… to make sure it doesn't fail. It also introduces StartWithError,
StopWithError and RestartWithError in case we care about the
error (and want the error to happen).

This removes the need to check for errors and makes the intent
clearer: I want a daemon with busybox loaded on it; if an error occurs
it should fail the test, but it's not the test code that has the
responsibility to check that.

Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2016-12-12 09:46:47 +01:00
Vincent Demeester
48de91a33f Extract daemon to its own package
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2016-12-09 22:26:42 +01:00
Tonis Tiigi
b7ea1bdb02 Switch cluster locking strategy
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-11-30 14:59:12 -08:00
Evan Hazlett
e63dc5cde4 secrets: add secret create and delete integration tests
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2016-11-09 14:27:44 -05:00
Evan Hazlett
189f89301e more review updates
- use /secrets for swarm secret create route
- do not specify omitempty for secret and secret reference
- simplify lookup for secret ids
- do not use pointer for secret grpc conversion

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2016-11-09 14:27:43 -05:00
Evan Hazlett
857e60c2f9 review changes
- fix lint issues
- use errors pkg for wrapping errors
- cleanup on error when setting up secrets mount
- fix erroneous import
- remove unneeded switch for secret reference mode
- return single mount for secrets instead of slice

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2016-11-09 14:27:43 -05:00
Aaron Lehmann
073d811587 integration-cli: Fix style of swarm test name
A recent change fixed integration tests to use "API" in test names
instead of "Api". A new test was added in a PR opened before this
change, and didn't benefit from the cleanup. Fix its name.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-10-24 16:45:17 -07:00
Aaron Lehmann
6d4b527699 Service update failure thresholds and rollback
This adds support for two enhancements to swarm service rolling updates:

- Failure thresholds: In Docker 1.12, a service update could be set up
  to either pause or continue after a single failure occurs. This adds
  an --update-max-failure-ratio flag that controls how many tasks need to
  fail to update for the update as a whole to be considered a failure. A
  counterpart flag, --update-monitor, controls how long to monitor each
  task for a failure after starting it during the update.

- Rollback flag: service update --rollback reverts the service to its
  previous version. If a service update encounters task failures, or
  fails to function properly for some other reason, the user can roll back
  the update.

SwarmKit also has the ability to roll back updates automatically after
hitting the failure thresholds, but we've decided not to expose this in
the Docker API/CLI for now, favoring a workflow where the decision to
roll back is always made by an admin. Depending on user feedback, we may
add a "rollback" option to --update-failure-action in the future.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-10-18 10:09:50 -07:00
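The failure-threshold idea described above can be sketched as a simple ratio check. This is only an illustration: the function name `updateFailed` is hypothetical, and the strict "greater than" comparison is an assumption, not a statement of how SwarmKit evaluates `--update-max-failure-ratio`.

```go
package main

// updateFailed reports whether an update as a whole should be treated
// as failed. Assumption (not confirmed by the commit message): the
// threshold is exceeded when the fraction of tasks that failed to
// update is strictly greater than maxFailureRatio.
func updateFailed(failedTasks, totalTasks int, maxFailureRatio float64) bool {
	if totalTasks == 0 {
		// Nothing was updated, so nothing can have failed.
		return false
	}
	return float64(failedTasks)/float64(totalTasks) > maxFailureRatio
}
```

Under this assumption, a ratio of 0.2 tolerates one failure out of ten tasks but not three, while a ratio of 0 treats any single failure as a failed update.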
Tonis Tiigi
da9ef68f06 Add requirements for tests that require network
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-10-12 11:11:23 -07:00
Akihiro Suda
7fb7a477d7 [nit] integration-cli: obey Go's naming convention
No substantial code change.

 - Api         --> API
 - Cli         --> CLI
 - Http, Https --> HTTP, HTTPS
 - Id          --> ID
 - Uid,Gid,Pid --> UID,GID,PID
 - Ipam        --> IPAM
 - Tls         --> TLS (TestDaemonNoTlsCliTlsVerifyWithEnv --> TestDaemonTLSVerifyIssue13964)

Didn't touch in this commit:
 - Git: because it is officially "Git": https://git-scm.com/
 - Tar: because it is officially "Tar": https://www.gnu.org/software/tar/
 - Cpu, Nat, Mac, Ipc, Shm: to stay consistent with existing production code (not changeable, for compatibility)

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2016-09-30 01:21:05 +00:00
Michael Crosby
91e197d614 Add engine-api types to docker
This moves the types for the `engine-api` repo to the existing types
package.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-09-07 11:05:58 -07:00
Yong Tang
80e3975117 Fix issue in API POST /services/(id or name)/update
This fix tries to address the issue raised in 26090 where
remote API `POST /services/(id or name)/update` cannot
use `name` to update. This is not consistent with the
documentation of the remote API.

This fix fixes this issue by performing a lookup with `getService`
in case `name` instead of `id` is used in API.

This fix adds an integration test to cover the changes.

This fix fixes 26090.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-08-29 21:13:53 -07:00
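The ID-or-name lookup this fix describes can be sketched as below. The `service` type and `findService` helper are hypothetical stand-ins; the daemon's real `getService` is not reproduced here, and the ID-before-name precedence is an assumption made for the example.

```go
package main

// service is a hypothetical, pared-down stand-in for a swarm service record.
type service struct {
	ID   string
	Name string
}

// findService resolves a reference that may be either an ID or a name.
// It tries an exact ID match first and falls back to an exact name
// match, so a caller can pass whichever identifier it has.
func findService(services []service, idOrName string) (service, bool) {
	for _, s := range services {
		if s.ID == idOrName {
			return s, true
		}
	}
	for _, s := range services {
		if s.Name == idOrName {
			return s, true
		}
	}
	return service{}, false
}
```

An update handler built this way would first resolve the reference to a concrete service, then perform the update against that service's ID.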
Vincent Demeester
4a94a6513b Merge pull request #25341 from tonistiigi/fix-pending-tests
Fix swarm pending state tests
2016-08-03 16:12:43 +02:00
Sebastiaan van Stijn
10ae908bfa Merge pull request #25159 from diogomonica/adding-force-to-node-remove
Adding force to node rm
2016-08-02 22:49:15 +02:00
Tonis Tiigi
fa3b5964b9 Fix swarm pending state tests
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-08-02 11:39:05 -07:00
Diogo Monica
a327c231b5 Add --force to node removal
Signed-off-by: Diogo Monica <diogo.monica@gmail.com>
2016-08-01 18:55:58 -07:00
Aaron Lehmann
f35c4343f3 Merge pull request #24878 from dongluochen/swarmConstraintTest
Add integration test for constraints
2016-08-01 17:45:23 -06:00
Yong Tang
85c9ef8a47 Remove testRequires(c, Network) from swarm integration tests
Since 24237 has been merged, it is not necessary to require network
for swarm integration tests (`integration-cli/docker_api_swarm_test.go`)
any more.

This fix removes testRequires(c, Network) from swarm integration
tests.

This fix could be verified by disabling networking and checking that all
related tests pass.

This fix is related to 24547, 24490, 24237.

This fix fixes 24547.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-07-30 09:59:30 -07:00
Alexander Morozov
307b7b0d15 integration: drain node before stop in TestApiSwarmForceNewCluster
It's too long to wait for reschedule.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2016-07-29 10:44:30 -07:00
Michael Crosby
262063531a Merge pull request #25077 from cpuguy83/fix_TestApiSwarmRestartCluster
Fix race in `TestApiSwarmRestartCluster`
2016-07-28 10:15:31 -07:00
Sebastiaan van Stijn
e07ff10f70 Merge pull request #25104 from cpuguy83/fix_TestApiSwarmRaftQuorum
fix race in TestApiSwarmRaftQuorum
2016-07-27 12:50:09 +02:00
Vincent Demeester
ef63637b99 Merge pull request #25107 from stevvooe/cleanup-leader-election-test
integration-cli: cleanup leader election tests
2016-07-27 12:47:33 +02:00
Stephen J Day
946e23776b integration-cli: cleanup leader election tests
Ensure convergence before moving on with testing leader election
conditions. This reduces the flakiness of this test when run in different
environments.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2016-07-26 19:12:27 -07:00
Brian Goff
4a856d7a87 fix race in TestApiSwarmRaftQuorum
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-07-26 21:32:56 -04:00
Aaron Lehmann
c93c649258 Specify a lower restart delay for swarm integration tests
If no restart delay is specified for a swarm service, the default
restart delay is 5 seconds. This is a reasonable value for actual
deployments - one example of where it's useful is that if a bad image is
specified, the orchestrator will wait 5 seconds between attempts to
restart it instead of restarting it in a tight loop.

In integration tests, this 5 second delay is dead time. The tests run
faster if the delay is reduced. Set it to 100 ms to avoid the waste of
time.

This appears to speed up a few tests:

DockerSwarmSuite.TestApiSwarmForceNewCluster 37.241s -> 34.323s
DockerSwarmSuite.TestApiSwarmRestartCluster  22.038s -> 15.545s
DockerSwarmSuite.TestApiSwarmServicesMultipleAgents 24.456s -> 19.853s
DockerSwarmSuite.TestApiSwarmServicesStateReporting 19.240s -> 10.049s

...a small step towards making the Swarm integration tests run in a
reasonable amount of time.

Also, change the update delay for the rolling update test from 8 seconds
to 4 seconds, which should be sufficient to differentiate between
batches of updated tasks. This reduces the runtime for
DockerSwarmSuite.TestApiSwarmServicesUpdate from 28s to 20s.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-07-26 12:12:43 -07:00
Brian Goff
fdcde8bb65 Fix race in TestApiSwarmRestartCluster
In `TestApiSwarmRestartCluster`, it's calling `checkClusterHealth`.
`checkClusterHealth` calls `d.info()`, which will return an error if
there is no cluster leader... problem is `checkClusterHealth` is doing a
nil error assertion w/o giving any time for a leader to be elected.

This moves the `d.info()` call into a `waitAndAssert` using the default
reconciliation timeout.

It also moves some other checks into a `waitAndAssert` to give the
cluster enough time to come back up.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-07-26 14:37:18 -04:00
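The race described above comes from asserting once, immediately, on a condition that only becomes true after leader election. A minimal sketch of the poll-until-timeout pattern follows. It is not the integration suite's `waitAndAssert`; `pollUntil` and `errNotReady` are hypothetical names, and the timeout and interval values are left to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical sentinel that example checks can return
// while the condition (for instance, "a leader exists") is not yet met.
var errNotReady = errors.New("not ready")

// pollUntil calls check repeatedly, sleeping interval between attempts,
// until check returns nil or timeout has elapsed. On timeout it returns
// an error that wraps the last error reported by check.
func pollUntil(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("condition not met after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}
```

A test would wrap the info call in `check`, so a temporarily leaderless cluster leads to a retry rather than an immediate failure.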
Aaron Lehmann
57ae29aa74 Add failure action for rolling updates
This changes the default behavior so that rolling updates will not
proceed once an updated task fails to start, or stops running during the
update. Users can use docker service inspect --pretty servicename to see
the update status, and if it pauses due to a failure, it will explain
that the update is paused, and show the task ID that caused it to pause.
It also shows the time since the update started.

A new --update-on-failure=(pause|continue) flag selects the
behavior. Pause means the update stops once a task fails, continue means
the old behavior of continuing the update anyway.

In the future this will be extended with additional behaviors like
automatic rollback, and flags controlling parameters like how many tasks
need to fail for the update to stop proceeding. This is a minimal
solution for 1.12.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-07-25 08:51:19 -07:00
Aaron Lehmann
a0ccd0d42f Split advertised address from listen address
There are currently problems with "swarm init" and "swarm join" when an
explicit --listen-addr flag is not provided. swarmkit defaults to
finding the IP address associated with the default route, and in cloud
setups this is often the wrong choice.

Introduce a notion of "advertised address", with the client flag
--advertise-addr, and the daemon flag --swarm-default-advertise-addr to
provide a default. The default listening address is now 0.0.0.0, but a
valid advertised address must be detected or specified.

If no explicit advertised address is specified, error out if there is
more than one usable candidate IP address on the system. This requires a
user to explicitly choose instead of letting swarmkit make the wrong
choice. For the purposes of this autodetection, we ignore certain
interfaces that are unlikely to be relevant (currently docker*).

The user is also required to choose a listen address on swarm init if
they specify an explicit advertise address that is a hostname or an IP
address that's not local to the system. This is a requirement for
overlay networking.

Also support specifying interface names to --listen-addr,
--advertise-addr, and the daemon flag --swarm-default-advertise-addr.
This will fail if the interface has multiple IP addresses (unless it has
a single IPv4 address and a single IPv6 address - then we resolve the
tie in favor of IPv4).

This change also exposes the node's externally-reachable address in
docker info, as requested by #24017.

Make corresponding API and CLI docs changes.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-07-24 09:23:07 -07:00
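The interface-name rule described above (fail on multiple addresses, except that one IPv4 plus one IPv6 resolves to IPv4) can be sketched as follows. This is an illustration under stated assumptions, not the daemon's code: `pickInterfaceAddr` is a hypothetical helper, and it takes the interface's addresses as strings instead of querying the system.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// pickInterfaceAddr chooses the single address to use from the IP
// addresses assigned to one interface. It fails when the choice is
// ambiguous, except that exactly one IPv4 address plus exactly one
// IPv6 address is resolved in favor of IPv4.
func pickInterfaceAddr(addrs []string) (string, error) {
	var v4, v6 []net.IP
	for _, s := range addrs {
		ip := net.ParseIP(s)
		if ip == nil {
			return "", fmt.Errorf("invalid IP address %q", s)
		}
		if ip.To4() != nil {
			v4 = append(v4, ip)
		} else {
			v6 = append(v6, ip)
		}
	}
	switch {
	case len(v4) == 1 && len(v6) <= 1:
		return v4[0].String(), nil // lone IPv4, or IPv4 wins the tie
	case len(v4) == 0 && len(v6) == 1:
		return v6[0].String(), nil
	case len(v4) == 0 && len(v6) == 0:
		return "", errors.New("interface has no IP addresses")
	default:
		return "", errors.New("interface has multiple IP addresses")
	}
}
```

The same "refuse to guess" principle applies to autodetection across the whole system: when more than one candidate remains after filtering out irrelevant interfaces, returning an error forces the user to choose explicitly.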
Dong Chen
1b1a7f29e5 Add integration test for constraints.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-07-21 18:08:49 -07:00
Aaron Lehmann
2cc5bd33ee Replace secrets with join tokens
Implement the proposal from
https://github.com/docker/docker/issues/24430#issuecomment-233100121

Removes acceptance policy and secret in favor of an automatically
generated join token that combines the secret, CA hash, and
manager/worker role into a single opaque string.

Adds a docker swarm join-token subcommand to inspect and rotate the
tokens.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2016-07-21 15:23:03 -07:00
Tõnis Tiigi
ea59668046 Merge pull request #24563 from dperny/test-leader-election
Added leader election test
2016-07-20 16:02:09 -07:00
Dong Chen
d327765a62 Test rolling update.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
2016-07-19 12:09:30 -07:00
Drew Erny
3489e76513 Added leader election test
Signed-off-by: Drew Erny <drew.erny@docker.com>
2016-07-19 11:29:27 -07:00