Commit graph

34457 commits

Author SHA1 Message Date
Daniel Nephin
075fd7a9be Fix vendor validation
Previously adding files to vendor/ without adding to vendor.conf would not fail the
validation.

Also be consistent with indentation and use tabs.

Signed-off-by: Daniel Nephin <dnephin@gmail.com>
2018-02-02 10:49:57 -08:00
Brian Goff
c379d2681f Fix race in attachable network attachment
Attachable networks are networks created on the cluster which can then
be attached to by non-swarm containers. These networks are lazily
created on the node that wants to attach to that network.

When no container is currently attached to one of these networks on a
node, and then multiple containers which want that network are started
concurrently, this can cause a race condition in the network attachment
where essentially we try to attach the same network to the node twice.

To easily reproduce this issue you must use a multi-node cluster with a
worker node that has lots of CPUs (I used a 36 CPU node).

Repro steps:

1. On manager, `docker network create -d overlay --attachable test`
2. On worker, `docker create --restart=always --network test busybox
top`, many times... 200 is a good number (but not much more due to
subnet size restrictions)
3. Restart the daemon

When the daemon restarts, it will attempt to start all those containers
simultaneously. Note that you could try to do this yourself over the API,
but it's harder to trigger due to the added latency from going over
the API.

The error produced happens when the daemon tries to start the container
upon allocating the network resources:

```
attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
```

What happens here is the worker makes a network attachment request to
the manager. This is an async call which in the happy case would cause a
task to be placed on the node, which the worker is waiting for to get
the network configuration.
In the case of this race, the error ocurrs on the manager like this:

```
task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e
```

The task is not created and the worker times out waiting for the task.

---

The mitigation for this is to make sure that only one attachment reuest
is in flight for a given network at a time *when the network doesn't
already exist on the node*. If the network already exists on the node
there is no need for synchronization because the network is already
allocated and on the node so there is no need to request it from the
manager.

This basically comes down to a race with `Find(network) ||
Create(network)` without any sort of syncronization.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-02-02 13:46:23 -05:00
Yong Tang
81e651c6d2
Merge pull request #36177 from yongtang/02012018-docker_api_resize_test
Migrate several resize tests from integration-cli to integration
2018-02-02 10:27:24 -08:00
Yong Tang
8f800c9415 Migrate several resize tests from integration-cli to integration
This fix migrates several resize tests from integration-cli to api tests.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-02-02 15:21:47 +00:00
Yong Tang
fd87db3769
Merge pull request #36178 from yongtang/01302018-TestAPIStatsContainerGetMemoryLimit
Migrate TestAPIStatsContainerGetMemoryLimit from integration-cli to api tests
2018-02-01 22:01:05 -08:00
Yong Tang
d5cbde514f Migrate TestAPIStatsContainerGetMemoryLimit from integration-cli to api tests
This fix migrates TestAPIStatsContainerGetMemoryLimit from
integration-cli to api test.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-02-01 22:07:06 +00:00
Yong Tang
e8d1f35718 Remove integration-cli/docker_api_resize_test.go
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-02-01 16:37:01 +00:00
Tibor Vass
53a58da551
Merge pull request #36160 from kolyshkin/layer-not-retained
daemon.cleanupContainer: nullify container RWLayer upon release
2018-01-31 15:13:00 -08:00
Vincent Demeester
5772c4b8a2
Merge pull request #36166 from yongtang/01312018-TestAPIUpdateContainer
Migrate TestAPIUpdateContainer from integration-cli to api tests
2018-01-31 15:02:00 -08:00
Yong Tang
490edd3582 Migrate TestAPIUpdateContainer from integration-cli to api tests
This fix migrates TestAPIUpdateContainer from integration-cli to api tests

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-31 20:02:55 +00:00
Daniel Nephin
231a4408f2
Merge pull request #36150 from yongtang/36139-ContainerStatus
Fix issue of ExitCode and PID not show up in Task.Status.ContainerStatus
2018-01-31 11:59:53 -08:00
Yong Tang
e1f98e8ab3
Merge pull request #36163 from thaJeztah/bump-go-connections
bump docker/go-connections to 98e7d807e5d804e4e42a98d74d1dd695321224ef
2018-01-31 09:20:10 -08:00
Yong Tang
8786b09c81 Remove docker_api_update_unix_test.go
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-31 16:10:34 +00:00
Yong Tang
9247e09944 Fix issue of ExitCode and PID not show up in Task.Status.ContainerStatus
This fix tries to address the issue raised in 36139 where
ExitCode and PID does not show up in Task.Status.ContainerStatus

The issue was caused by `json:",omitempty"` in PID and ExitCode
which interprate 0 as null.

This is confusion as ExitCode 0 does have a meaning.

This fix removes  `json:",omitempty"` in ExitCode and PID,
but changes ContainerStatus to pointer so that ContainerStatus
does not show up at all if no content. If ContainerStatus
does have a content, then ExitCode and PID will show up (even if
they are 0).

This fix fixes 36139.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-31 15:35:19 +00:00
Sebastiaan van Stijn
a6d35a822e
bump docker/go-connections to 98e7d807e5d804e4e42a98d74d1dd695321224ef
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-01-31 01:30:56 -08:00
Akihiro Suda
421664aba1
Merge pull request #36156 from r2d4/move-shell-parser
Move dockerfile ENV shell parser functions into subpackage
2018-01-31 14:48:01 +09:00
Kir Kolyshkin
e9b9e4ace2 daemon.cleanupContainer: nullify container RWLayer upon release
ReleaseRWLayer can and should only be called once (unless it returns
an error), but might be called twice in case of a failure from
`system.EnsureRemoveAll(container.Root)`. This results in the
following error:

> Error response from daemon: driver "XXX" failed to remove root filesystem for YYY: layer not retained

The obvious fix is to set container.RWLayer to nil as soon as
ReleaseRWLayer() succeeds.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-01-30 18:50:59 -08:00
Matt Rickard
a634526d14 Move builder shell parser into subpackage
Moves builder/shell_parser and into its own subpackage at builder/shell since it
has no dependencies other than the standard library. This will make it
much easier to vendor for downstream libraries, without pulling all the
dependencies of builder/.

Fixes #36154

Signed-off-by: Matt Rickard <mrick@google.com>
2018-01-30 17:54:39 -08:00
Sebastiaan van Stijn
a80cd04eb5
Merge pull request #36125 from thaJeztah/fix-plural-singular-node-generic-resources
Fix "--node-generic-resource" singular/plural
2018-01-30 14:38:50 -08:00
Yong Tang
5e7a245c3f
Merge pull request #36148 from yongtang/01302018-TestLinksEtcHostsContentMatch
Migrate some integration-cli test to api tests
2018-01-30 13:38:15 -08:00
Yong Tang
11ccbeb5c5
Merge pull request #36152 from yongtang/01302018-TestAuthAPI
Migrate TestAuthAPI from integration-cli to integration
2018-01-30 13:37:56 -08:00
Yong Tang
d924755789 Migrate TestAuthAPI from integration-cli to integration
This fix migrates TestAuthAPI from integration-cli to integration

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-30 19:15:06 +00:00
Yong Tang
e6bd20edcb Migrate some integration-cli test to api tests
This fix migrate  TestLinksEtcHostsContentMatch
to api test.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-30 18:52:48 +00:00
Yong Tang
a40687f5ac Add REMOVE and ORPHANED to TaskState
This fix tries to address the issue raised in 36142 where
there are discrepancies between Swarm API and swagger.yaml.

This fix adds two recently added state `REMOVE` and `ORPHANED` to TaskState.

This fix fixes 36142.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-30 16:46:05 +00:00
Sebastiaan van Stijn
40a9d5d24c
Merge pull request #35946 from joelwurtz/patch-2
Fix Volumes property definition in ContainerConfig
2018-01-29 20:57:09 -08:00
Yong Tang
d98cdc490c
Merge pull request #36141 from yongtang/01292018-stop_test
Migrate docker_cli_stop_test.go to api test
2018-01-29 19:31:01 -08:00
Yong Tang
4f378124ff Migrate docker_cli_stop_test.go to api test
This fix migrate docker_cli_stop_test.go to api test

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-30 00:55:18 +00:00
Yong Tang
9d61e5c8c1
Merge pull request #36124 from crosbymichael/exec
Use proc/exe for reexec
2018-01-29 11:41:08 -08:00
Brian Goff
9c5686fbd0
Merge pull request #36131 from yongtang/01282018-swarmkit
Update swarmkit to 68a376dc30d8c4001767c39456b990dbd821371b
2018-01-29 11:07:13 -08:00
Vincent Demeester
d093aa0ec3
Merge pull request #36130 from yongtang/36042-secret-config-mode
Fix secret and config mode issue
2018-01-29 10:37:24 -08:00
Yong Tang
03a1df9536
Merge pull request #36114 from Microsoft/jjh/fixdeadlock-lcowdriver-hotremove
LCOW: Graphdriver fix deadlock in hotRemoveVHDs
2018-01-29 09:57:43 -08:00
Akihiro Suda
cd3c0057ac
Merge pull request #34369 from cyphar/build-buildmode-pie
*: switch to -buildmode=pie
2018-01-29 23:54:03 +09:00
Sebastiaan van Stijn
4219df22fc
Merge pull request #36132 from yongtang/01282018-typo
Fix a typo in CONTRIBUTING.md
2018-01-28 12:21:46 -08:00
Yong Tang
65ee7fff02 Add test cases for file mode with secret and config.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-28 16:48:10 +00:00
Yong Tang
958200970f
Merge pull request #36074 from shutefan/master
Update API docs to show that /containers/{id}/kill returns HTTP 409
2018-01-28 08:32:35 -08:00
Yong Tang
3305221eef Fix secret and config mode issue
This fix tries to address the issue raised in 36042
where secret and config are not configured with the
specified file mode.

This fix update the file mode so that it is not impacted
with umask.

Additional tests have been added.

This fix fixes 36042.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-28 16:21:41 +00:00
Yong Tang
b9923d8530 Update swarmkit to 68a376dc30d8c4001767c39456b990dbd821371b
This fix updates swarmkit to 68a376dc30d8c4001767c39456b990dbd821371b:
```
-github.com/docker/swarmkit 713d79dc8799b33465c58ed120b870c52eb5eb4f
+github.com/docker/swarmkit 68a376dc30d8c4001767c39456b990dbd821371b
```

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-28 16:20:17 +00:00
Yong Tang
cce360c7b0 Fix a typo in CONTRIBUTING.md
`seperate` -> `separate`

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-01-28 16:19:51 +00:00
Yong Tang
9368e9dac3
Merge pull request #36099 from thaJeztah/bump-libnetwork3
bump libnetwork to 5ab4ab830062fe8a30a44b75b0bda6b1f4f166a4
2018-01-27 21:47:29 -08:00
Yong Tang
c9f1807abb
Merge pull request #34992 from allencloud/simplify-shutdowntimeout
simplify codes on calculating shutdown timeout
2018-01-27 18:26:54 -08:00
Sebastiaan van Stijn
924fb0e843
Merge pull request #36095 from yongtang/36083-network-inspect-created-time
Fix issue where network inspect does not show Created time for networks in swarm scope
2018-01-26 17:18:30 -08:00
Sebastiaan van Stijn
c41c80026b
Merge pull request #35911 from ASMfreaK/Russian-Scientsists-addition-to-names-generator
Add more russian scientists in names-generator.go
2018-01-26 15:48:58 -08:00
Sebastiaan van Stijn
6e7715d65b
Fix "--node-generic-resource" singular/plural
Daemon flags that can be specified multiple times use
singlar names for flags, but plural names for the configuration
file.

To make the daemon configuration know how to correlate
the flag with the corresponding configuration option,
`opt.NewNamedListOptsRef()` should be used instead of
`opt.NewListOptsRef()`.

Commit 6702ac590e attempted
to fix the daemon not corresponding the flag with the configuration
file option, but did so by changing the name of the flag
to plural.

This patch reverts that change, and uses `opt.NewNamedListOptsRef()`
instead.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-01-26 13:53:13 -08:00
Michael Crosby
59ec65cd8c Use proc/exe for reexec
You don't need to resolve the symlink for the exec as long as the
process is to keep running during execution.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-01-26 14:13:43 -05:00
Michael Crosby
2c05aefc99
Merge pull request #36047 from cpuguy83/graphdriver_improvements
Do not make graphdriver homes private mounts.
2018-01-26 13:54:30 -05:00
Allen Sun
de68ac8393
Simplify codes on calculating shutdown timeout
Signed-off-by: Allen Sun <shlallen1990@gmail.com>
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
2018-01-26 09:18:07 -08:00
John Howard
a44fcd3d27 LCOW: Graphdriver fix deadlock
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-26 08:57:52 -08:00
Vincent Demeester
6d4d3c52ae
Merge pull request #34372 from cpuguy83/more_error_handling_for_pluginrm
Ignore exist/not-exist errors on plugin remove
2018-01-25 20:45:17 -08:00
Yong Tang
3e416acf1d
Merge pull request #36108 from Microsoft/jjh/opengcsv0.3.6
Revendor Microsoft/opengcs @ v0.3.6
2018-01-25 16:42:35 -08:00
Sebastiaan van Stijn
a8d0e36d03
Merge pull request #36052 from Microsoft/jjh/no-overlay-off-only-one-disk
LCOW: Regular mount if only one layer
2018-01-25 15:46:16 -08:00