Commit graph

63 commits

Author SHA1 Message Date
Vincent Demeester
f70c715be0
Merge pull request #35316 from kolyshkin/facepalm
Fix honoring tmpfs-size for user /dev/shm mount
2017-11-14 11:13:59 +01:00
Kir Kolyshkin
31d30a985d Fix user mount /dev/shm size
Commit 7120976d74 ("Implement none, private, and shareable ipc
modes") introduces a bug: if a user-specified mount for /dev/shm
is provided, its size is overriden by value of ShmSize.

A reproducer is simple:

 docker run --rm
	--mount type=tmpfs,dst=/dev/shm,tmpfs-size=100K \
	alpine df /dev/shm

This commit is an attempt to fix the bug, as well as optimize things
a but and make the code easier to read.

https://github.com/moby/moby/issues/35271

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-11-12 21:42:59 -08:00
Chao Wang
5c154cfac8 Copy Inslice() to those parts that use it
Signed-off-by: Chao Wang <wangchao.fnst@cn.fujitsu.com>
2017-11-10 13:42:38 +08:00
Justin Cormack
f5c70c5b75
Merge pull request #35365 from Microsoft/jjh/removeduplicateoomscoreadj
Remove duplicate redundant setting of OOMScoreAdj in OCI spec
2017-11-03 13:59:51 +00:00
Daniel J Walsh
5f3bd2473e /dev should not be readonly with --readonly flag
/dev is mounted on a tmpfs inside of a container.  Processes inside of containers
some times need to create devices nodes, or to setup a socket that listens on /dev/log
Allowing these containers to run with the --readonly flag makes sense.  Making a tmpfs
readonly does not add any security to the container, since there is plenty of places
where the container can write tmpfs content.

I have no idea why /dev was excluded.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-11-02 10:28:51 -04:00
John Howard
f0b44881b5 Remove dupl setting of OOMScoreAdj in OCI spec
Signed-off-by: John Howard <jhoward@microsoft.com>
2017-11-01 11:01:43 -07:00
Kenfe-Mickael Laventure
ddae20c032
Update libcontainerd to use containerd 1.0
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-10-20 07:11:37 -07:00
Aleksa Sarai
c0f883fdee
daemon: oci: obey CL_UNPRIVILEGED for user namespaced daemon
When runc is bind-mounting a particular path "with options", it has to
do so by first creating a bind-mount and the modifying the options of
said bind-mount via remount. However, in a user namespace, there are
restrictions on which flags you can change with a remount (due to
CL_UNPRIVILEGED being set in this instance). Docker historically has
ignored this, and as a result, internal Docker mounts (such as secrets)
haven't worked with --userns-remap. Fix this by preserving
CL_UNPRIVILEGED mount flags when Docker is spawning containers with user
namespaces enabled.

Ref: https://github.com/opencontainers/runc/pull/1603
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-10-16 02:52:56 +11:00
Victor Vieux
a5f9783c93 Merge pull request #34252 from Microsoft/akagup/lcow-remotefs-sandbox
LCOW: Support for docker cp, ADD/COPY on build
2017-09-15 16:49:48 -07:00
Simon Ferquel
e89b6e8c2d Volume refactoring for LCOW
Signed-off-by: Simon Ferquel <simon.ferquel@docker.com>
2017-09-14 12:33:31 -07:00
Akash Gupta
7a7357dae1 LCOW: Implemented support for docker cp + build
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.

Signed-off-by: Akash Gupta <akagup@microsoft.com>
2017-09-14 12:07:52 -07:00
Daniel Nephin
f7f101d57e Add gosimple linter
Update gometalinter

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-09-12 12:09:59 -04:00
Yong Tang
cb952bf006 Merge pull request #34625 from dnephin/more-linters
Add interfacer and unconvert linters
2017-09-01 08:46:08 -07:00
Daniel Nephin
2f5f0af3fd Add unconvert linter
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-08-24 15:08:31 -04:00
Kenfe-Mickael Laventure
45d85c9913
Update containerd to 06b9cb35161009dcb7123345749fef02f7cea8e0
This also update:
 - runc to 3f2f8b84a77f73d38244dd690525642a72156c64
 - runtime-specs to v1.0.0

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-21 12:04:07 -07:00
Daniel Nephin
9b47b7b151 Fix golint errors.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-08-18 14:23:44 -04:00
Brian Goff
ebcb7d6b40 Remove string checking in API error handling
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.

Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2017-08-15 16:01:11 -04:00
Kir Kolyshkin
7120976d74 Implement none, private, and shareable ipc modes
Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.

Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).

This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:

 - 'shareable':	enables sharing this container's IPC with others
		(this used to be the implicit default);

 - 'private':	disables sharing this container's IPC.

In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.

While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:

> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...

...so here's yet yet another mode:

 - 'none':	no /dev/shm mount inside the container (though it still
		has its own private IPC namespace).

Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.

Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).

Some other changes this patch introduces are:

1. A mount for /dev/shm is added to default OCI Linux spec.

2. IpcMode.Valid() is simplified to remove duplicated code that parsed
   'container:ID' form. Note the old version used to check that ID does
   not contain a semicolon -- this is no longer the case (tests are
   modified accordingly). The motivation is we should either do a
   proper check for container ID validity, or don't check it at all
   (since it is checked in other places anyway). I chose the latter.

3. IpcMode.Container() is modified to not return container ID if the
   mode value does not start with "container:", unifying the check to
   be the same as in IpcMode.IsContainer().

3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
   to add checks for newly added values.

[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
     container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-08-14 10:50:39 +03:00
Derek McGowan
1009e6a40b
Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-07-31 13:16:46 -07:00
Daniel Nephin
93fbdb69ac Remove error return from RootPair
There is no case which would resolve in this error. The root user always exists, and if the id maps are empty, the default value of 0 is correct.

Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-06-07 11:45:33 -04:00
Daniel Nephin
09cd96c5ad Partial refactor of UID/GID usage to use a unified struct.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2017-06-07 11:44:33 -04:00
Aaron Lehmann
9e9fc7b57c Add config support to executor backend
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-05-11 10:08:21 -07:00
Evan Hazlett
67d282a5c9 support custom paths for secrets
This adds support to specify custom container paths for secrets.

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2017-05-10 10:23:07 -07:00
Michael Crosby
005506d36c Update moby to runc and oci 1.0 runtime final rc
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-05-05 13:45:45 -07:00
Antonio Murdaca
a18d103b5e
remove --init-path from client
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-04-10 16:49:43 +02:00
Sunny Gogoi
17b1288760 Fix missing Init Binary in docker info output
- Moved DefaultInitBinary from daemon/daemon.go to
daemon/config/config.go since it's a daemon config and is referred in
config package files.
- Added condition in GetInitPath to check for any explicitly configured
DefaultInitBinary. If not, the default value of DefaultInitBinary is
returned.
- Changed all references of DefaultInitBinary to refer to the variable
from new location.
- Added TestCommonUnixGetInitPath to test for the various values of
GetInitPath.

Fixes #32314

Signed-off-by: Sunny Gogoi <indiasuny000@gmail.com>
2017-04-10 16:54:07 +05:30
Kenfe-Mickael Laventure
1756af6faf Allow adding rules to cgroup devices.allow on container create/run
This introduce a new `--device-cgroup-rule` flag that allow a user to
add one or more entry to the container cgroup device `devices.allow`

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-01-26 07:20:45 -08:00
Alexander Morozov
50a72c7467 Merge pull request #28454 from glensc/init-args
do not require custom build of tini
2017-01-20 10:03:58 -08:00
Aleksa Sarai
567ef8e785
daemon: switch to 'ensure' workflow for AppArmor profiles
In certain cases (unattended upgrades), system services can disable
loaded AppArmor profiles. However, since /etc being read-only is a
supported setup we cannot just write a copy of the profile to
/etc/apparmor.d.

Instead, dynamically load the docker-default AppArmor profile if a
container is started with that profile set. This code will short-cut if
the profile is already loaded.

Fixes: 2f7596aaef ("apparmor: do not save profile to /etc/apparmor.d")
Signed-off-by: Aleksa Sarai <asarai@suse.de>
2016-12-07 08:47:28 +11:00
Tibor Vass
6547609870 plugins: misc fixes
Rename variable to reflect manifest -> config renaming
Populate Description fields when computing privileges.
Refactor/reuse code from daemon/oci_linux.go

Signed-off-by: Tibor Vass <tibor@docker.com>
2016-11-22 14:32:07 -08:00
Tibor Vass
53b9b99e5c plugins: support for devices
Signed-off-by: Tibor Vass <tibor@docker.com>
2016-11-22 09:54:45 -08:00
Elan Ruusamäe
d7df731597 do not require custom build of tini
https://github.com/krallin/tini/issues/55#issuecomment-260507562
https://github.com/krallin/tini/issues/55#issuecomment-260538243
https://github.com/docker/docker/pull/28037

Signed-off-by: Elan Ruusamäe <glen@delfi.ee>
2016-11-16 00:08:55 +02:00
Victor Vieux
0427afa409 Merge pull request #27955 from mlaventure/runc-docker-info
Add external binaries version to docker info
2016-11-10 21:27:14 -08:00
Evan Hazlett
189f89301e more review updates
- use /secrets for swarm secret create route
- do not specify omitempty for secret and secret reference
- simplify lookup for secret ids
- do not use pointer for secret grpc conversion

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2016-11-09 14:27:43 -05:00
Evan Hazlett
857e60c2f9 review changes
- fix lint issues
- use errors pkg for wrapping errors
- cleanup on error when setting up secrets mount
- fix erroneous import
- remove unneeded switch for secret reference mode
- return single mount for secrets instead of slice

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2016-11-09 14:27:43 -05:00
Evan Hazlett
3716ec25b4 secrets: secret management for swarm
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>

wip: use tmpfs for swarm secrets

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>

wip: inject secrets from swarm secret store

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>

secrets: use secret names in cli for service create

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>

switch to use mounts instead of volumes

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>

vendor: use ehazlett swarmkit

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>

secrets: finish secret update

Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2016-11-09 14:27:43 -05:00
Kenfe-Mickael Laventure
2790ac68b3 Add expected 3rd party binaries commit ids to info
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2016-11-09 07:42:44 -08:00
Akihiro Suda
18768fdc2e api: add TypeTmpfs to api/types/mount
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2016-10-28 08:38:32 +00:00
Erik St. Martin
56f77d5ade Implementing support for --cpu-rt-period and --cpu-rt-runtime so that
containers may specify these cgroup values at runtime. This will allow
processes to change their priority to real-time within the container
when CONFIG_RT_GROUP_SCHED is enabled in the kernel. See #22380.

Also added sanity checks for the new --cpu-rt-runtime and --cpu-rt-period
flags to ensure that that the kernel supports these features and that
runtime is not greater than period.

Daemon will support a --cpu-rt-runtime flag to initialize the parent
cgroup on startup, this prevents the administrator from alotting runtime
to docker after each restart.

There are additional checks that could be added but maybe too far? Check
parent cgroups to ensure values are <= parent, inspecting rtprio ulimit
and issuing a warning.

Signed-off-by: Erik St. Martin <alakriti@gmail.com>
2016-10-26 11:33:06 -04:00
Michael Crosby
97660c6ec5 Merge pull request #26961 from Microsoft/jjh/oci
Windows: OCI runtime spec compliance
2016-09-30 10:13:57 -07:00
Tonis Tiigi
e981459609 Fix missing hostname and links in exec env
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2016-09-29 13:46:10 -07:00
John Howard
02309170a5 Remove hacked Windows OCI spec, compile fixups
Signed-off-by: John Howard <jhoward@microsoft.com>
2016-09-27 12:07:35 -07:00
Antonio Murdaca
6a12685bb7
configure docker-init binary path
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-09-27 14:49:17 +02:00
Michael Crosby
ee3ac3aa66 Add init process for zombie fighting
This adds a small C binary for fighting zombies.  It is mounted under
`/dev/init` and is prepended to the args specified by the user.  You
enable it via a daemon flag, `dockerd --init`, as it is disable by
default for backwards compat.

You can also override the daemon option or specify this on a per
container basis with `docker run --init=true|false`.

You can test this by running a process like this as the pid 1 in a
container and see the extra zombie that appears in the container as it
is running.

```c

int main(int argc, char ** argv) {
	pid_t pid = fork();
	if (pid == 0) {
		pid = fork();
		if (pid == 0) {
			exit(0);
		}
		sleep(3);
		exit(0);
	}
	printf("got pid %d and exited\n", pid);
	sleep(20);
}
```

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-09-19 17:33:50 -07:00
Yong Tang
7d705a7355 Fix ulimits in docker inspect when daemon default exists
This fix tries to fix 26326 where `docker inspect` will not show
ulimit even when daemon default ulimit has been set.

This fix merge the HostConfig's ulimit with daemon default in
`docker inspect`, so that when daemon is started with `default-ulimit`
and HostConfig's ulimit is not set, `docker inspect` will output
the daemon default.

An integration test has been added to cover the changes.

This fix fixes 26326.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2016-09-07 23:15:22 -07:00
Michael Crosby
91e197d614 Add engine-api types to docker
This moves the types for the `engine-api` repo to the existing types
package.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-09-07 11:05:58 -07:00
Michael Crosby
041e5a21dc Replace old oci specs import with runtime-specs
Fixes #25804

The upstream repo changed the import paths.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2016-08-17 09:38:34 -07:00
Brian Goff
6d98e344c7 revendor engine-api
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2016-08-16 14:16:12 -04:00
nick
7135afa79b Fix misspell typos
Signed-off-by: nick <nicholasrusso@icloud.com>
2016-06-19 09:53:31 -07:00
Antonio Murdaca
756f6cef4a daemon: allow tmpfs to trump over VOLUME(s)
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-06-15 16:01:51 +02:00