Commit graph

7491 commits

Author SHA1 Message Date
Jérôme Petazzoni
1c4202a614 Mount /proc and /sys read-only, except in privileged containers.
It has been pointed out that some files in /proc and /sys can be used
to break out of containers. However, if those filesystems are mounted
read-only, most of the known exploits are mitigated, since they rely
on writing some file in those filesystems.

This does not replace security modules (like SELinux or AppArmor), it
is just another layer of security. Likewise, it doesn't mean that the
other mitigations (shadowing parts of /proc or /sys with bind mounts)
are useless. Those measures are still useful. As such, the shadowing
of /proc/kcore is still enabled with both LXC and native drivers.

Special care has to be taken with /proc/1/attr, which still needs to
be mounted read-write in order to enable the AppArmor profile. It is
bind-mounted from a private read-write mount of procfs.

All that enforcement is done in dockerinit. The code doing the real
work is in libcontainer. The init function for the LXC driver calls
the function from libcontainer to avoid code duplication.

Docker-DCO-1.1-Signed-off-by: Jérôme Petazzoni <jerome@docker.com> (github: jpetazzo)
2014-05-01 15:26:58 -07:00
Michael Crosby
20bcb80f40 Merge pull request #5457 from tiborvass/5423-bridge-ip
Fix bridge ip comparison
2014-05-01 11:56:47 -07:00
Eiichi Tsukata
cac0cea03f drop CAP_SYSLOG capability
Kernel capabilities for privileged syslog operations are currently splitted into
CAP_SYS_ADMIN and CAP_SYSLOG since the following commit:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ce6ada35bdf710d16582cc4869c26722547e6f11

This patch drops CAP_SYSLOG to prevent containers from messing with
host's syslog (e.g. `dmesg -c` clears up host's printk ring buffer).

Closes #5491

Docker-DCO-1.1-Signed-off-by: Eiichi Tsukata <devel@etsukata.com> (github: Etsukata)
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-05-01 11:43:55 -07:00
Guillaume J. Charmes
fe4a25546a Merge pull request #5515 from crosbymichael/refactor-libcontainer2
Remove CommandFactory and NsInit interface
2014-05-01 11:41:54 -07:00
Alexandr Morozov
d1297feef8 Timestamps for docker logs.
Fixes #1165
Docker-DCO-1.1-Signed-off-by: Alexandr Morozov <lk4d4math@gmail.com> (github: LK4D4)
2014-05-01 20:40:36 +04:00
Michael Crosby
da0d6dbd7b Make native driver use Exec func with different CreateCommand
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 18:49:24 -07:00
Michael Crosby
60e4276f5a Integrate new structure into docker's native driver
This duplicates some of the Exec code but I think it it worth it because
the native driver is more straight forward and does not have the
complexity have handling the type issues for now.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 18:20:01 -07:00
Michael Crosby
f110401437 Remove statewriter interface, export more libcontainer funcs
This temp. expands the Exec method's signature but adds a more robust
way to know when the container's process is actually released and begins
to run.  The network interfaces are not guaranteed to be up yet but this
provides a more accurate view with a single callback at this time.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 15:52:40 -07:00
Michael Crosby
162dafbcd5 Remove logger from nsinit struct
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-30 15:24:18 -07:00
Guillaume J. Charmes
2a711d16e0 Merge pull request #5448 from crosbymichael/selinux-defaults
Add selinux label support for processes and mount
2014-04-30 14:14:39 -07:00
Tibor Vass
986c647d5a Fix bridge ip comparison
Docker-DCO-1.1-Signed-off-by: Tibor Vass <teabee89@gmail.com> (github: tiborvass)
2014-04-30 12:36:16 -07:00
Michael Crosby
e88ef454b7 Merge pull request #5464 from tianon/close-leftover-fds 2014-04-30 12:27:52 -07:00
Tianon Gravi
d5d62ff955 Close extraneous file descriptors in containers
Without this patch, containers inherit the open file descriptors of the daemon, so my "exec 42>&2" allows us to "echo >&42 some nasty error with some bad advice" directly into the daemon log. :)

Also, "hack/dind" was already doing this due to issues caused by the inheritance, so I'm removing that hack too since this patch obsoletes it by generalizing it for all containers.

Docker-DCO-1.1-Signed-off-by: Andrew Page <admwiggin@gmail.com> (github: tianon)
2014-04-29 16:45:28 -06:00
Michael Crosby
1a5ffef6c6 Do not return labels when in privileged mode
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-29 03:40:06 -07:00
Michael Crosby
46e05ed2d9 Update process labels to be set at create not start
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-29 03:40:05 -07:00
Michael Crosby
ae00649305 Update devicemapper to pass mount flag
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-29 03:40:05 -07:00
Dan Walsh
b7942ec2ca This patch reworks the SELinux patch to be only run on demand by the daemon
Added --selinux-enable switch to daemon to enable SELinux labeling.

The daemon will now generate a new unique random SELinux label when a
container starts, and remove it when the container is removed.   The MCS
labels will be stored in the daemon memory.  The labels of containers will
be stored in the container.json file.

When the daemon restarts on boot or if done by an admin, it will read all containers json files and reserve the MCS labels.

A potential problem would be conflicts if you setup thousands of containers,
current scheme would handle ~500,000 containers.

Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: crosbymichael)
2014-04-29 03:40:05 -07:00
Michael Crosby
f0e6e135a8 Initial work on selinux patch
This has every container using the docker daemon's pid for the processes
label so it does not work correctly.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-29 03:40:05 -07:00
Michael Crosby
934bd15565 Merge pull request #5389 from tiborvass/5152-symlink-in-volume
Fixes #5152 : symlink in volume path
2014-04-28 17:27:18 -07:00
Brian Goff
ff7b52abd3 Fixes permissions on volumes when dir in container is empty
Docker-DCO-1.1-Signed-off-by: Brian Goff <cpuguy83@gmail.com> (github: cpuguy83)
2014-04-28 16:57:28 -04:00
Tibor Vass
e9a42a45bf Fixes #5152 : symlink in volume path
Docker-DCO-1.1-Signed-off-by: Tibor Vass <teabee89@gmail.com> (github: tiborvass)
2014-04-28 13:18:12 -07:00
unclejack
44140f7909 Merge pull request #5411 from crosbymichael/lockdown
Update default restrictions for exec drivers
2014-04-26 03:27:56 +03:00
unclejack
077b7d0359 Merge pull request #5342 from danielnorberg/avoid-suicide
avoid suicide
2014-04-25 21:44:45 +03:00
Alexander Larsson
6d631968fa devmapper: Store metadata in one file per device
This allows multiple instances of the backend in different containers
to access devices (although generally only one can modify/create them).

Any old metadata is converted on the first run.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-25 15:21:56 +02:00
Alexander Larsson
f26203cf54 devmapper: Simplify thin pool device id allocation
Instead of globally keeping track of the free device ids we just
start from 0 each run and handle EEXIST error and try the next one.

This way we don't need any global state for the device ids, which
means we can read device metadata lazily. This is important for
multi-process use of the backend.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-25 14:26:27 +02:00
Alexander Larsson
586a511cb5 devmapper: Move error detection to devmapper.go
This moves the EBUSY detection to devmapper.go, and then returns
a real ErrBusy that deviceset uses.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-25 14:26:27 +02:00
Alexander Larsson
8116b86e05 devmapper: Maks createSnapDevice a function, not a method
No idea why this was a method. Maybe a cut and paste bug.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-25 14:26:27 +02:00
Guillaume J. Charmes
85540f6aa0 Merge pull request #5373 from vmarmol/master
Separating cgroup Memory and MemoryReservation.
2014-04-24 14:28:40 -07:00
Michael Crosby
8af84c5e23 Merge pull request #5335 from alexlarsson/remove-ghost
container: Remove Ghost state
2014-04-24 11:55:05 -07:00
Victor Marmol
f188b9f623 Separating cgroup Memory and MemoryReservation.
This will allow for these to be set independently. Keep the current Docker behavior where Memory and MemoryReservation are set to the value of Memory.

Docker-DCO-1.1-Signed-off-by: Victor Marmol <vmarmol@google.com> (github: vmarmol)
2014-04-24 11:09:38 -07:00
Michael Crosby
2d6c367434 Increment native driver version with these changes
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
5ba1242bdc Mount over dev and only copy allowed nodes in
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
81e5026a6a No not mount sysfs by default for non privilged containers
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
0779a8c328 Add lxc support for restricting proc
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:20 -07:00
Michael Crosby
60a90970bc Add restrictions to proc in libcontainer
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-24 10:35:19 -07:00
Daniel Norberg
b3ddc31b95 avoid suicide
container.Kill() might read a pid of 0 from
container.State.Pid due to losing a race with
container.monitor() calling
container.State.SetStopped(). Sending a SIGKILL to
pid 0 is undesirable as "If pid equals 0, then sig
is sent to every process in the process group of
the calling process."

Docker-DCO-1.1-Signed-off-by: Daniel Norberg <daniel.norberg@gmail.com> (github: danielnorberg)
2014-04-23 11:06:59 -04:00
Alexander Larsson
73d9ede12c devicemapper: Don't mount in Create()
We used to mount in Create() to be able to create a few files that
needs to be in each device. However, this mount is problematic for
selinux, as we need to set the mount label at mount-time, and it
is not known at the time of Create().

This change just moves the file creation to first Get() call and
drops the mount from Create(). Additionally, this lets us remove
some complexities we had to avoid an extra unmount+mount cycle.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-23 13:50:53 +02:00
Alexander Larsson
cf997aa905 container: Remove Ghost state
container.Register() checks both IsRunning() and IsGhost(), but at
this point IsGhost() is always true if IsRunning() is true. For a
newly created container both are false, and for a restored-from-disk
container Daemon.load() sets Ghost to true if IsRunning is true. So we
just drop the IsGhost check.

This was the last call to IsGhost, so we remove It and all other
traces of the ghost state.

Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-22 09:49:53 +02:00
Guillaume J. Charmes
813cebc64f
Merge branch 'master' into load-profile
Conflicts:
	daemon/execdriver/native/create.go
	daemon/execdriver/native/driver.go

Docker-DCO-1.1-Signed-off-by: Guillaume J. Charmes <guillaume@charmes.net> (github: creack)
2014-04-21 10:32:13 -07:00
Michael Crosby
eceeebc22d Remove IsGhost checks around networking
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
2014-04-17 23:49:59 +00:00
Alexander Larsson
359b7df5d2 Rename runtime/* to daemon/*
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
2014-04-17 14:43:01 -07:00