Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Add daemon config to allow the user to specify the MTU of the control plane network.
The first user of this new parameter is actually libnetwork that can seed the
gossip with the proper MTU value allowing to pack multiple messages per UDP packet sent.
If the value is not specified or is lower than 1500 the logic will set it to the default.
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Makes sure that debug endpoints are always available, which will aid in
debugging demon issues.
Wraps debug endpoints in the middleware chain so the can be blocked by
authz.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Due to the CL https://go-review.googlesource.com/c/39608/ in
x/sys/windows which changed the definitions of STD_INPUT_HANDLE,
STD_OUTPUT_HANDLE and STD_ERROR_HANDLE, we get the following failure
in cmd/dockerd/service_windows.go after re-vendoring x/sys/windows:
06:29:57 # github.com/docker/docker/cmd/dockerd
06:29:57 .\service_windows.go:400: cannot use sh (type int) as type uint32 in argument to windows.GetStdHandle
Fix it by adding an explicit type conversion when calling
windows.GetStdHandle.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
Enables other subsystems to watch actions for a plugin(s).
This will be used specifically for implementing plugins on swarm where a
swarm controller needs to watch the state of a plugin.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Deprecation of interacting with v1 registries was
started in docker 1.8.3, which added a `--disable-legacy-registry`
flag.
This option was anounced to be the default starting
with docker 17.06, and v1 registries completely
removed in docker 17.12.
This patch updates the default, and disables
interaction with v1 registres by default.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commit the rwLayer to get the correct DiffID
Refacator copy in thebuilder
move more code into exportImage
cleanup some windows tests
Release the newly commited layer.
Set the imageID on the buildStage after exporting a new image.
Move archiver to BuildManager.
Have ReleaseableLayer.Commit return a layer
and store the Image from exportImage in the local imageSources cache
Remove NewChild from image interface.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
- Moving the `common*.go` files in `cmd/dockerd` directly (it's the
only place it's getting used)
- Rename `cli/flags` to `cli/config` because it's the only thing left
in that package 👼
Now, `integration-cli` does *truly* not depend on `cobra` stuff.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
If a container mount the socket the daemon is listening on into
container while the daemon is being shutdown, the socket will
not exist on the host, then daemon will assume it's a directory
and create it on the host, this will cause the daemon can't start
next time.
fix issue https://github.com/moby/moby/issues/30348
To reproduce this issue, you can add following code
```
--- a/daemon/oci_linux.go
+++ b/daemon/oci_linux.go
@@ -8,6 +8,7 @@ import (
"sort"
"strconv"
"strings"
+ "time"
"github.com/Sirupsen/logrus"
"github.com/docker/docker/container"
@@ -666,7 +667,8 @@ func (daemon *Daemon) createSpec(c *container.Container) (*libcontainerd.Spec, e
if err := daemon.setupIpcDirs(c); err != nil {
return nil, err
}
-
+ fmt.Printf("===please stop the daemon===\n")
+ time.Sleep(time.Second * 2)
ms, err := daemon.setupMounts(c)
if err != nil {
return nil, err
```
step1 run a container which has `--restart always` and `-v /var/run/docker.sock:/sock`
```
$ docker run -ti --restart always -v /var/run/docker.sock:/sock busybox
/ #
```
step2 exit the the container
```
/ # exit
```
and kill the daemon when you see
```
===please stop the daemon===
```
in the daemon log
The daemon can't restart again and fail with `can't create unix socket /var/run/docker.sock: is a directory`.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
The --allow-nondistributable-artifacts daemon option specifies
registries to which foreign layers should be pushed. (By default,
foreign layers are not pushed to registries.)
Additionally, to make this option effective, foreign layers are now
pulled from the registry if possible, falling back to the URLs in the
image manifest otherwise.
This option is useful when pushing images containing foreign layers to a
registry on an air-gapped network so hosts on that network can pull the
images without connecting to another server.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
This commit in conjunction with a libnetwork side commit,
cleans up the libnetwork SetClusterProvider logic interaction.
The previous code was inducing libnetwork to spawn several go
routines that were racing between each other during the agent
init and close.
A test got added to verify that back to back swarm init and leave
are properly processed and not raise crashes
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Allows storing key under any directory. In the case where the
"/etc/docker" directory is not preserved, this file can be
specified to a location where it will be preserved to ensure
the ID does not change across restarts.
Note this key is currently only used today to generate the ID
used in Docker info and for manifest schema v1 pushes. The key
signature and finger on these manifests are not checked or
used any longer for security, deprecated by notary.
Removes old key migration from a pre-release of Docker which put
the key under the home directory and was used to preserve ID used
for swarm v1 after the file moved.
closes#32135
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Starting with this commit, integration tests should no longer rely on
the docker cli, they should be API tests instead. For the existing tests
the scripts will use a frozen version of the docker cli with a
DOCKER_API_VERSION frozen to 1.30, which should ensure that the CI remains
green at all times.
To help contributors develop and test manually with a modified docker
cli, this commit also adds a DOCKER_CLI_PATH environment variable to the
Makefile. This allows to set the path of a custom cli that will be
available inside the development container and used to run the
integration tests.
Signed-off-by: Arnaud Porterie (icecrime) <arnaud.porterie@docker.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
The daemon config for defaulting to no-new-privileges for containers was
added in d7fda019bb, but somehow we
managed to omit the flag itself, but also documented the flag.
This just adds the actual flag.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Remove pathCache and replace it with syncmap
Cleanup NewBuilder
Create an api/server/backend/build
Extract BuildTagger
Signed-off-by: Daniel Nephin <dnephin@docker.com>
I noticed that we're using a homegrown package for assertions. The
functions are extremely similar to testify, but with enough slight
differences to be confusing (for example, Equal takes its arguments in a
different order). We already vendor testify, and it's used in a few
places by tests.
I also found some problems with pkg/testutil/assert. For example, the
NotNil function seems to be broken. It checks the argument against
"nil", which only works for an interface. If you pass in a nil map or
slice, the equality check will fail.
In the interest of avoiding NIH, I'm proposing replacing
pkg/testutil/assert with testify. The test code looks almost the same,
but we avoid the confusion of having two similar but slightly different
assertion packages, and having to maintain our own package instead of
using a commonly-used one.
In the process, I found a few places where the tests should halt if an
assertion fails, so I've made those cases (that I noticed) use "require"
instead of "assert", and I've vendored the "require" package from
testify alongside the already-present "assert" package.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
None of the daemon flags use a constant for the
flag name.
This patch removes the constant for consistency
Also removes a FIXME, that was now in the wrong
location, and added a long time ago in
353b7c8ec7,
without a lot of context (and probably no longer really relevant).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Even though the flag `--api-enable-cors` is deprecated in favor of
`--api-cors-header`. Using only `--api-cors-header` does not enable
CORS.
Make changes to 'cmd/dockerd/daemon.go' to enable cors if either of
the above flags is set.
Signed-off-by: Karthik Nayak <Karthik.188@gmail.com>
When the daemon is configured to run with an authorization-plugin and if
the plugin is disabled, the daemon continues to send API requests to the
plugin and expect it to respond. But the plugin has been disabled. As a
result, all API requests are blocked. Fix this behavior by removing the
disabled plugin from the authz middleware chain.
Tested using riyaz/authz-no-volume-plugin and observed that after
disabling the plugin, API request/response is functional.
Fixes#31836
Signed-off-by: Anusha Ragunathan <anusha.ragunathan@docker.com>
This also moves some cli specific in `cmd/dockerd` as it does not
really belong to the `daemon/config` package.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Go style calls for mixed caps instead of all caps:
https://golang.org/doc/effective_go.html#mixed-caps
Change LOOKUP, ACQUIRE, and RELEASE to Lookup, Acquire, and Release.
This vendors a fork of libnetwork for now, to deal with a cyclic
dependency issue. The change will be upstream to libnetwork once this is
merged.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Move plugins to shared distribution stack with images.
Create immutable plugin config that matches schema2 requirements.
Ensure data being pushed is same as pulled/created.
Store distribution artifacts in a blobstore.
Run init layer setup for every plugin start.
Fix breakouts from unsafe file accesses.
Add support for `docker plugin install --alias`
Uses normalized references for default names to avoid collisions when using default hosts/tags.
Some refactoring of the plugin manager to support the change, like removing the singleton manager and adding manager config struct.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Instead of not adding experimental routes at all, fail with an explicit
message if the daemon is not running in experimental mode.
Added the `router.Experimental` which does this automatically.
Signed-off-by: Andrea Luzzardi <aluzzardi@gmail.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
update cobra and use Tags
Signed-off-by: Victor Vieux <vieux@docker.com>
allow client to talk to an older server
Signed-off-by: Victor Vieux <vieux@docker.com>
This adds a metrics packages that creates additional metrics. Add the
metrics endpoint to the docker api server under `/metrics`.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add metrics to daemon package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
api: use standard way for metrics route
Also add "type" query parameter
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Convert timers to ms
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
`docker network prune` prunes unused networks, including overlay ones.
`docker system prune` also prunes unused networks.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This fix tries to address the issue raised in 24392 where
labels with duplicate keys exist in `docker info`, which
contradicts with the specifications in the docs.
The reason for duplicate keys is that labels are stored as
slice of strings in the format of `A=B` (and the input/output).
This fix tries to address this issue by checking conflict
labels when daemon started, and remove duplicate labels (K-V).
The existing `/info` API has not been changed.
An additional integration test has been added to cover the
changes in this fix.
This fix fixes 24392.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
The swarm scope network connected containers with autostart enabled
there was a dependency problem with the cluster to be initialized before
we can autostart them. With the current container restart code happening
before cluster init, these containers were not getting autostarted
properly. Added a fix to delay the container start of those containers
which has atleast one swarm scope endpoint to until after the cluster is
initialized.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Support args to RunCommand
Fix docker help text test.
Fix for ipv6 tests.
Fix TLSverify option.
Fix TestDaemonDiscoveryBackendConfigReload
Use tempfile for another test.
Restore missing flag.
Fix tests for removal of shlex.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Also consolidate the leftover packages under cli.
Remove pkg/mflag.
Make manpage generation work with new cobra layout.
Remove remaining mflag and fix tests after rebase with master.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Cleanup cobra integration
Update windows files for cobra and pflags
Cleanup SetupRootcmd, and remove unnecessary SetFlagErrorFunc.
Use cobra command traversal
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Right now docker puts swarm's control socket into the docker root dir
(e.g. /var/lib/docker).
This can cause some nasty issues with path length being > 108
characters, especially in our CI environment.
Since we already have some other state going in the daemon's exec root
(libcontainerd and libnetwork), I think it makes sense to move the
control socket to this location, especially since there are other unix
sockets being created here by docker so it must always be at a path that
works.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Following #22729, enable to dynamically reload/remove the daemon
authorization plugins (via standard reloading mechanism).
https://docs.docker.com/engine/reference/commandline/daemon/#daemon-
configuration-file
Daemon must store a reference to the authorization middleware to refresh
the plugin on configuration changes.
Signed-off-by: Liron Levin <liron@twistlock.com>
There are currently problems with "swarm init" and "swarm join" when an
explicit --listen-addr flag is not provided. swarmkit defaults to
finding the IP address associated with the default route, and in cloud
setups this is often the wrong choice.
Introduce a notion of "advertised address", with the client flag
--advertise-addr, and the daemon flag --swarm-default-advertise-addr to
provide a default. The default listening address is now 0.0.0.0, but a
valid advertised address must be detected or specified.
If no explicit advertised address is specified, error out if there is
more than one usable candidate IP address on the system. This requires a
user to explicitly choose instead of letting swarmkit make the wrong
choice. For the purposes of this autodetection, we ignore certain
interfaces that are unlikely to be relevant (currently docker*).
The user is also required to choose a listen address on swarm init if
they specify an explicit advertise address that is a hostname or an IP
address that's not local to the system. This is a requirement for
overlay networking.
Also support specifying interface names to --listen-addr,
--advertise-addr, and the daemon flag --swarm-default-advertise-addr.
This will fail if the interface has multiple IP addresses (unless it has
a single IPv4 address and a single IPv6 address - then we resolve the
tie in favor of IPv4).
This change also exposes the node's externally-reachable address in
docker info, as requested by #24017.
Make corresponding API and CLI docs changes.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This fixes the hard coded restriction for non-linux platforms to v2 registries. Previously, the check was above the flag parsing, which would overwrite the hard coded value and prevent correct operation. This change also removes the related daemon flag from Windows to avoid confusion, as it has no meaning when the value is going to always be hard coded to true.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
This adds the `--live-restore` option to the documentation.
Also synched usage description in the documentation
with the actual description, and re-phrased some
flag descriptions to be a bit more consistent.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Unix sockets are limited to 108 bytes. As a result, we need to be
careful in not using exec-root as the parent directory for pluginID
(which is already 64 bytes), since it can result in socket path names
longer than 108 bytes. Use /tmp instead. Before this change, setting:
- dockerd --exec-root=/go/src/github.com/do passes
- dockerd --exec-root=/go/src/github.com/doc fails
After this change, there's no failure.
Also, write a volume plugins test to verify that the plugins socket
responds.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
This adds an `--oom-score-adjust` flag to the daemon so that the value
provided can be set for the docker daemon's process. The default value
for the flag is -500. This will allow the docker daemon to have a
less chance of being killed before containers do. The default value for
processes is 0 with a min/max of -1000/1000.
-500 is a good middle ground because it is less than the default for
most processes and still not -1000 which basically means never kill this
process in an OOM condition on the host machine. The only processes on
my machine that have a score less than -500 are dbus at -900 and sshd
and xfce( my window manager ) at -1000. I don't think docker should be
set lower, by default, than dbus or sshd so that is why I chose -500.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This ensures that:
- The in-memory plugin store is populated with all the plugins
- Plugins which were active before daemon restart are active after.
This utilizes the liverestore feature when available, otherwise it
manually starts the plugin.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This patch introduces a new experimental engine-level plugin management
with a new API and command line. Plugins can be distributed via a Docker
registry, and their lifecycle is managed by the engine.
This makes plugins a first-class construct.
For more background, have a look at issue #20363.
Documentation is in a separate commit. If you want to understand how the
new plugin system works, you can start by reading the documentation.
Note: backwards compatibility with existing plugins is maintained,
albeit they won't benefit from the advantages of the new system.
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
As described in our ROADMAP.md, introduce new Swarm management API
endpoints relying on swarmkit to deploy services. It currently vendors
docker/engine-api changes.
This PR is fully backward compatible (joining a Swarm is an optional
feature of the Engine, and existing commands are not impacted).
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
This flags enables full support of daemonless containers in docker. It
ensures that docker does not stop containers on shutdown or restore and
properly reconnects to the container when restarted.
This is not the default because of backwards compat but should be the
desired outcome for people running containers in prod.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
This fix fixes#23459.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
This works around golang/go#15286 by explicitly loading shell32.dll at
load time, ensuring that syscall can load it dynamically during process
startup.
Signed-off-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Using golang 1.6, is it now possible to ignore SIGPIPE events on
stdout/stderr. Previous versions of the golang library cached 10
events and then killed the process receiving the events.
systemd-journald sends SIGPIPE events when jounald is restarted and
the target of the unit file writes to stdout/stderr. Docker logs to stdout/stderr.
This patch silently ignores all SIGPIPE events.
Signed-off-by: Jhon Honce <jhonce@redhat.com>
This adds support for Windows dockerd to run as a Windows service, managed
by the service control manager. The log is written to the Windows event
log (and can be viewed in the event viewer or in PowerShell). If there is
a Go panic, the stack is written to a file panic.log in the Docker root.
Signed-off-by: John Starks <jostarks@microsoft.com>