This is required to address a race condition described in #5553,
where a container can be partially deleted -- for example, the
root filesystem but not the init filesystem -- which makes
it impossible to delete the container without re-adding the
missing filesystems manually.
This behavior has been witnessed when rebooting boxes that
are configured to remove containers on shutdown in parallel
with stopping the Docker daemon.
Docker-DCO-1.1-Signed-off-by: Gabriel Monroy <gabriel@opdemand.com> (github: gabrtv)
We don't have the flexibility to do extra things with lxc because it is
a black box and most fo the magic happens before we get a chance to
interact with it in dockerinit.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
There is not need for the remount hack, we use aa_change_onexec so the
apparmor profile is not applied until we exec the users app.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
- git mv archived/* .
- put the links back into the summary document
- reduce the header depth by 1 so the TOC lists each API version
- update the mkdocs.yaml to render the archived API docs, but not add
them to the menu/nav
Docker-DCO-1.1-Signed-off-by: Sven Dowideit <SvenDowideit@fosiki.com> (github: SvenDowideit)
This also cleans up some of the left over restriction paths code from
before.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
It has been pointed out that some files in /proc and /sys can be used
to break out of containers. However, if those filesystems are mounted
read-only, most of the known exploits are mitigated, since they rely
on writing some file in those filesystems.
This does not replace security modules (like SELinux or AppArmor), it
is just another layer of security. Likewise, it doesn't mean that the
other mitigations (shadowing parts of /proc or /sys with bind mounts)
are useless. Those measures are still useful. As such, the shadowing
of /proc/kcore is still enabled with both LXC and native drivers.
Special care has to be taken with /proc/1/attr, which still needs to
be mounted read-write in order to enable the AppArmor profile. It is
bind-mounted from a private read-write mount of procfs.
All that enforcement is done in dockerinit. The code doing the real
work is in libcontainer. The init function for the LXC driver calls
the function from libcontainer to avoid code duplication.
Docker-DCO-1.1-Signed-off-by: Jérôme Petazzoni <jerome@docker.com> (github: jpetazzo)
Kernel capabilities for privileged syslog operations are currently splitted into
CAP_SYS_ADMIN and CAP_SYSLOG since the following commit:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ce6ada35bdf710d16582cc4869c26722547e6f11
This patch drops CAP_SYSLOG to prevent containers from messing with
host's syslog (e.g. `dmesg -c` clears up host's printk ring buffer).
Closes#5491
Docker-DCO-1.1-Signed-off-by: Eiichi Tsukata <devel@etsukata.com> (github: Etsukata)
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
This is needed for Send/Recieve to correctly handle borders between
the messages.
The framing uses a single 32bit uint32 length for each frame, of which
the high bit is used to indicate whether the message contains a file
descriptor or not. This is enough to separate out each message sent
and to decide to which message each file descriptors belongs, even
though multiple Sends may be coalesced into a single read, and/or one
Send can be split into multiple writes.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Docker-DCO-1.1-Signed-off-by: Solomon Hykes <solomon@docker.com> (github: shykes)