:title: Docker Security
:description: Review of the Docker Daemon attack surface
:keywords: Docker, Docker documentation, security

.. _dockersecurity:

Docker Security
===============

*Adapted from* `Containers & Docker: How Secure are They? <blogsecurity_>`_

There are three major areas to consider when reviewing Docker security:

* the intrinsic security of containers, as implemented by kernel
  namespaces and cgroups;
* the attack surface of the Docker daemon itself;
* the "hardening" security features of the kernel and how they
  interact with containers.

Kernel Namespaces
-----------------

Docker containers are essentially LXC containers, and they come with
the same security features. When you start a container with ``docker
run``, behind the scenes Docker uses ``lxc-start`` to execute the
Docker container. This creates a set of namespaces and control groups
for the container. Those namespaces and control groups are not created
by Docker itself, but by ``lxc-start``. This means that as the LXC
userland tools evolve (and provide additional namespaces and isolation
features), Docker will automatically make use of them.

**Namespaces provide the first and most straightforward form of
isolation**: processes running within a container cannot see, let
alone affect, processes running in another container, or in the host
system.
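
One way to observe this isolation from inside a process is to list the
namespaces it belongs to. The following is a minimal sketch, assuming a
Linux kernel that exposes ``/proc/<pid>/ns`` (older kernels may not);
the helper name ``list_namespaces`` is hypothetical. Two processes that
share a namespace report the same identifier for it, so a containerized
process would be expected to report different identifiers than a
process on the host.

```python
import os


def list_namespaces(ns_dir="/proc/self/ns"):
    """Return {namespace_name: identifier} for the entries found in ns_dir.

    Each entry is a symlink whose target looks like 'pid:[4026531836]'.
    An empty dict is returned if the directory is missing (non-Linux host
    or an older kernel).
    """
    namespaces = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces
    for name in names:
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries that cannot be read as symlinks.
            continue
    return namespaces


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(name, ident)
```

Running the script once on the host and once inside a container, then
comparing the two listings, is a quick way to check which namespaces
are actually separate.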

**Each container also gets its own network stack**, meaning that a
container doesn’t get privileged access to the sockets or interfaces
of another container. Of course, if the host system is set up
accordingly, containers can interact with each other through their
respective network interfaces, just like they can interact with
external hosts. When you specify public ports for your containers or
use :ref:`links <working_with_links_names>`, IP traffic is allowed
between containers. They can ping each other, send/receive UDP
packets, and establish TCP connections, but that can be restricted if
necessary. From a network architecture point of view, all containers
on a given Docker host are sitting on bridge interfaces. This means
that they are just like physical machines connected through a common
Ethernet switch; no more, no less.

How mature is the code providing kernel namespaces and private
networking? Kernel namespaces were introduced `between kernel version
2.6.15 and 2.6.26
<http://lxc.sourceforge.net/index.php/about/kernel-namespaces/>`_. This
means that since July 2008 (date of the 2.6.26 release, now 5 years
ago), namespace code has been exercised and scrutinized on a large
number of production systems. And there is more: the design and
inspiration for the namespaces code are even older. Namespaces are
actually an effort to reimplement the features of `OpenVZ
<http://en.wikipedia.org/wiki/OpenVZ>`_ in such a way that they could
be merged within the mainstream kernel. And OpenVZ was initially
released in 2005, so both the design and the implementation are
pretty mature.

Control Groups
--------------

Control Groups are the other key component of Linux Containers. They
implement resource accounting and limiting. They provide a lot of very
useful metrics, but they also help to ensure that each container gets
its fair share of memory, CPU, disk I/O; and, more importantly, that a
single container cannot bring the system down by exhausting one of
those resources.

So while they do not play a role in preventing one container from
accessing or affecting the data and processes of another container,
they are essential to fend off some denial-of-service attacks. They
are particularly important on multi-tenant platforms, like public and
private PaaS, to guarantee a consistent uptime (and performance) even
when some applications start to misbehave.

Control Groups have been around for a while as well: the code was
started in 2006, and initially merged in kernel 2.6.24.
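
To see which control groups a process has been placed in, the kernel
exposes a per-process file, ``/proc/<pid>/cgroup``. The sketch below
parses text in that format; the sample data and the ``/lxc/web1`` path
are hypothetical, and the helper name ``parse_cgroup_file`` is not part
of Docker or LXC.

```python
def parse_cgroup_file(text):
    """Map each controller name to the cgroup path from /proc/<pid>/cgroup text.

    Each non-empty line has the form 'hierarchy-id:controllers:path', where
    'controllers' is a comma-separated list (for example 'cpu,cpuacct').
    """
    membership = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        _hierarchy, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            if controller:
                membership[controller] = path
    return membership


# Hypothetical sample, in the format the kernel uses for this file.
SAMPLE = """\
4:memory:/lxc/web1
3:cpu,cpuacct:/lxc/web1
2:blkio:/
"""

if __name__ == "__main__":
    for controller, path in sorted(parse_cgroup_file(SAMPLE).items()):
        print(controller, path)
```

In this sample, the process is limited by the ``memory`` and ``cpu``
controllers of its own group, while ``blkio`` still points at the root
group, meaning no container-specific disk I/O limit applies.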

.. _dockersecurity_daemon:

Docker Daemon Attack Surface
----------------------------

Running containers (and applications) with Docker implies running the
Docker daemon. This daemon currently requires root privileges, and you
should therefore be aware of some important details.

First of all, **only trusted users should be allowed to control your
Docker daemon**. This is a direct consequence of some powerful Docker
features. Specifically, Docker allows you to share a directory between
the Docker host and a guest container; and it allows you to do so
without limiting the access rights of the container. This means that
you can start a container where the ``/host`` directory will be the
``/`` directory on your host; and the container will be able to alter
your host filesystem without any restriction. Does this sound crazy?
Well, you have to know that **all virtualization systems allowing
filesystem resource sharing behave the same way**. Nothing prevents
you from sharing your root filesystem (or even your root block device)
with a virtual machine.

This has a strong security implication: if you drive Docker from,
e.g., a web server to provision containers through an API, you should
be even more careful than usual with parameter checking, to make sure
that a malicious user cannot pass crafted parameters causing Docker to
create arbitrary containers.
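
As an illustration of such parameter checking, a provisioning service
could refuse any shared directory that falls outside an explicit list
of allowed locations before it ever talks to Docker. This is only a
sketch under stated assumptions: the function name ``is_allowed_bind``
and the ``/srv/volumes`` directory are hypothetical, and symbolic links
are deliberately not handled.

```python
import posixpath


def is_allowed_bind(host_path, allowed_roots):
    """Return True only if host_path lies inside one of allowed_roots.

    Paths are normalized first so that '..' segments cannot be used to
    climb out of an allowed directory. Symlinks are NOT resolved here;
    a real deployment would also need to account for them.
    """
    if not host_path.startswith("/"):
        # Relative paths are ambiguous, so they are always rejected.
        return False
    normalized = posixpath.normpath(host_path)
    for root in allowed_roots:
        root = posixpath.normpath(root)
        if normalized == root or normalized.startswith(root.rstrip("/") + "/"):
            return True
    return False


if __name__ == "__main__":
    allowed = ["/srv/volumes"]  # hypothetical directory reserved for containers
    for candidate in ["/srv/volumes/app1", "/", "/srv/volumes/../../etc"]:
        print(candidate, is_allowed_bind(candidate, allowed))
```

The check is a whitelist, not a blacklist: anything not explicitly
allowed is refused, which is the safer default when the input comes
from untrusted users.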

For this reason, the REST API endpoint (used by the Docker CLI to
communicate with the Docker daemon) changed in Docker 0.5.2, and now
uses a UNIX socket instead of a TCP socket bound on 127.0.0.1 (the
latter being prone to cross-site request forgery attacks if you happen
to run Docker directly on your local machine, outside of a VM). You
can then use traditional UNIX permission checks to limit access to the
control socket.
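
The permission check itself can be sketched as follows. On Linux,
connecting to a UNIX socket requires write permission on the socket
file, so the write bits of its mode decide who may reach the daemon.
The helper name ``socket_access_summary`` is hypothetical, and the
socket path in the usage note is an assumption about a typical
installation, not something stated above.

```python
import stat


def socket_access_summary(mode):
    """Summarize who may connect to a UNIX socket with the given st_mode.

    Connecting to a UNIX socket requires write permission on it, so only
    the write bits are inspected.
    """
    return {
        "owner": bool(mode & stat.S_IWUSR),
        "group": bool(mode & stat.S_IWGRP),
        "others": bool(mode & stat.S_IWOTH),
    }


if __name__ == "__main__":
    # Mode 0o660: owner and group may connect, everyone else may not.
    print(socket_access_summary(stat.S_IFSOCK | 0o660))
```

On a live host, the mode would come from something like
``os.stat("/var/run/docker.sock").st_mode`` (path assumed). A result
with ``"others": True`` would mean that any local user can control the
daemon.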

You can also expose the REST API over HTTP if you explicitly decide
to. However, given the security implication described above, you
should then ensure that it is reachable only from a trusted network or
VPN, or protected with e.g. ``stunnel`` and client SSL certificates.

Recent improvements in Linux namespaces will soon make it possible to
run full-featured containers without root privileges, thanks to the
new user namespace. This is covered in detail `here
<http://s3hh.wordpress.com/2013/07/19/creating-and-using-containers-without-privilege/>`_. Moreover,
this will solve the problem caused by sharing filesystems between host
and guest, since the user namespace allows users within containers
(including the root user) to be mapped to other users in the host
system.
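
The mapping performed by a user namespace can be illustrated with the
format the kernel uses in ``/proc/<pid>/uid_map``, where each line
gives the first ID inside the namespace, the first ID outside it, and
the length of the range. The sketch below is an assumption-laden
illustration: the helper name ``map_to_host_uid`` and the sample range
are hypothetical and do not describe Docker's configuration.

```python
def map_to_host_uid(container_uid, uid_map_text):
    """Translate a UID inside a user namespace to the UID seen by the host.

    uid_map_text uses the format of /proc/<pid>/uid_map: each line holds
    'first-id-inside  first-id-outside  count'. Returns None when the UID
    falls outside every mapped range.
    """
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        inside, outside, count = (int(field) for field in fields)
        if inside <= container_uid < inside + count:
            return outside + (container_uid - inside)
    return None


# Hypothetical mapping: container UIDs 0-65535 become host UIDs 100000-165535.
SAMPLE_UID_MAP = "0 100000 65536\n"

if __name__ == "__main__":
    print(map_to_host_uid(0, SAMPLE_UID_MAP))
```

With such a mapping, files created by "root" inside the container
belong to an unprivileged user on the host, which is what limits the
damage when a filesystem is shared.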

The end goal for Docker is therefore to implement two additional
security improvements:

* map the root user of a container to a non-root user of the Docker
  host, to mitigate the effects of a container-to-host privilege
  escalation;
* allow the Docker daemon to run without root privileges, and delegate
  operations requiring those privileges to well-audited sub-processes,
  each with its own (very limited) scope: virtual network setup,
  filesystem management, etc.

Finally, if you run Docker on a server, it is recommended to run
Docker exclusively on that server, and move all other services into
containers controlled by Docker. Of course, it is fine to keep your
favorite admin tools (probably at least an SSH server), as well as
existing monitoring/supervision processes (e.g. NRPE, collectd, etc).

Linux Kernel Capabilities
-------------------------

By default, Docker starts containers with a very restricted set of
capabilities. What does that mean?

Capabilities turn the binary "root/non-root" dichotomy into a
fine-grained access control system. Processes (like web servers) that
just need to bind on a port below 1024 do not have to run as root:
they can just be granted the ``net_bind_service`` capability
instead. And there are many other capabilities, for almost all the
specific areas where root privileges are usually needed.
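
Concretely, the kernel stores a process's capabilities as a bitmask,
visible for instance in the ``CapEff`` field of ``/proc/<pid>/status``.
The following sketch decodes such a mask. The bit positions come from
``<linux/capability.h>``, but only a handful are listed, and the helper
name ``decode_capabilities`` is hypothetical.

```python
# Bit positions from <linux/capability.h>; only a subset is listed here.
CAPABILITY_BITS = {
    "CAP_CHOWN": 0,
    "CAP_LINUX_IMMUTABLE": 9,
    "CAP_NET_BIND_SERVICE": 10,
    "CAP_NET_RAW": 13,
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_ADMIN": 21,
    "CAP_MKNOD": 27,
}


def decode_capabilities(mask_hex):
    """Return the sorted names (from CAPABILITY_BITS) set in a hex bitmask.

    mask_hex is a string such as the 'CapEff' value in /proc/<pid>/status.
    """
    mask = int(mask_hex, 16)
    return sorted(
        name for name, bit in CAPABILITY_BITS.items() if mask & (1 << bit)
    )


if __name__ == "__main__":
    # A mask with only bit 10 set: the process may bind to low ports, nothing more.
    print(decode_capabilities("0000000000000400"))
```

A web server holding only this one capability can listen on port 80
without being able to load modules, mount filesystems, or do anything
else that root could.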

This means a lot for container security; let’s see why!

Your average server (bare metal or virtual machine) needs to run a
bunch of processes as root. Those typically include SSH, cron,
syslogd; hardware management tools (to e.g. load modules), network
configuration tools (to handle e.g. DHCP, WPA, or VPNs), and much
more. A container is very different, because almost all of those tasks
are handled by the infrastructure around the container:

* SSH access will typically be managed by a single server running in
  the Docker host;
* ``cron``, when necessary, should run as a user process, dedicated
  and tailored for the app that needs its scheduling service, rather
  than as a platform-wide facility;
* log management will also typically be handled by Docker, or by
  third-party services like Loggly or Splunk;
* hardware management is irrelevant, meaning that you never need to
  run ``udevd`` or equivalent daemons within containers;
* network management happens outside of the containers, enforcing
  separation of concerns as much as possible, meaning that a container
  should never need to perform ``ifconfig``, ``route``, or ``ip``
  commands (except when a container is specifically engineered to
  behave like a router or firewall, of course).

This means that in most cases, containers will not need "real" root
privileges *at all*. And therefore, containers can run with a reduced
capability set, meaning that "root" within a container has far fewer
privileges than the real "root". For instance, it is possible to:

* deny all "mount" operations;
* deny access to raw sockets (to prevent packet spoofing);
* deny access to some filesystem operations, like creating new device
  nodes, changing the owner of files, or altering attributes
  (including the immutable flag);
* deny module loading;
* and many others.
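
Each of the restrictions listed above corresponds to a specific
capability described in ``capabilities(7)``. The sketch below makes
that correspondence explicit. The operation labels and the helper name
``denied_operations`` are hypothetical, and the set of dropped
capabilities passed in is illustrative, not Docker's actual list.

```python
# Capability that gates each operation listed above, per capabilities(7).
REQUIRED_CAPABILITY = {
    "mount filesystems": "CAP_SYS_ADMIN",
    "open raw sockets": "CAP_NET_RAW",
    "create device nodes": "CAP_MKNOD",
    "change file ownership": "CAP_CHOWN",
    "set the immutable flag": "CAP_LINUX_IMMUTABLE",
    "load kernel modules": "CAP_SYS_MODULE",
}


def denied_operations(dropped):
    """Return the operations from REQUIRED_CAPABILITY blocked by `dropped`."""
    dropped = set(dropped)
    return sorted(
        operation
        for operation, capability in REQUIRED_CAPABILITY.items()
        if capability in dropped
    )


if __name__ == "__main__":
    # Illustrative set of dropped capabilities.
    for operation in denied_operations(["CAP_SYS_MODULE", "CAP_NET_RAW"]):
        print("denied:", operation)
```

Note that ``CAP_SYS_ADMIN`` gates far more than mounting, which is one
reason dropping it removes such a large part of root's power.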

This means that even if an intruder manages to escalate to root within
a container, it will be much harder to do serious damage, or to
escalate to the host.

This won't affect regular web apps; but malicious users will find that
the arsenal at their disposal has shrunk considerably! You can see
`the list of dropped capabilities in the Docker code
<https://github.com/dotcloud/docker/blob/v0.5.0/lxc_template.go#L97>`_,
and a full list of available capabilities in `Linux manpages
<http://man7.org/linux/man-pages/man7/capabilities.7.html>`_.

Of course, you can always enable extra capabilities if you really need
them (for instance, if you want to use a FUSE-based filesystem), but
by default, Docker containers will be locked down to ensure maximum
safety.

Other Kernel Security Features
------------------------------

Capabilities are just one of the many security features provided by
modern Linux kernels. It is also possible to leverage existing,
well-known systems like TOMOYO, AppArmor, SELinux, GRSEC, etc. with
Docker.

While Docker currently only enables capabilities, it doesn't interfere
with the other systems. This means that there are many different ways
to harden a Docker host. Here are a few examples.

* You can run a kernel with GRSEC and PAX. This will add many safety
  checks, both at compile-time and run-time; it will also defeat many
  exploits, thanks to techniques like address randomization. It
  doesn’t require Docker-specific configuration, since those security
  features apply system-wide, independently of containers.
* If your distribution comes with security model templates for LXC
  containers, you can use them out of the box. For instance, Ubuntu
  comes with AppArmor templates for LXC, and those templates provide
  an extra safety net (even though it overlaps greatly with
  capabilities).
* You can define your own policies using your favorite access control
  mechanism. Since Docker containers are standard LXC containers,
  there is nothing “magic” or specific to Docker.
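
Before relying on one of these mechanisms, it helps to check which of
them is actually active on the host. The sketch below assumes a kernel
that lists its active security modules in ``/sys/kernel/security/lsm``;
this file is only present on relatively recent kernels with securityfs
mounted, and it does not cover patch sets such as GRSEC. The helper
names are hypothetical.

```python
def parse_lsm_list(content):
    """Split a comma-separated module list such as 'capability,yama,apparmor'."""
    return [name.strip() for name in content.split(",") if name.strip()]


def active_security_modules(lsm_path="/sys/kernel/security/lsm"):
    """Return the active Linux Security Modules, or [] if they cannot be read.

    The file is absent on older kernels or when securityfs is not mounted,
    in which case an empty list is returned rather than raising.
    """
    try:
        with open(lsm_path) as handle:
            return parse_lsm_list(handle.read())
    except OSError:
        return []


if __name__ == "__main__":
    modules = active_security_modules()
    print("AppArmor active:", "apparmor" in modules)
    print("SELinux active:", "selinux" in modules)
```

An empty result means only that the list could not be read, not that
the host is unprotected.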

Just like there are many third-party tools to augment Docker
containers with e.g. special network topologies or shared filesystems,
you can expect to see tools to harden existing Docker containers
without affecting Docker’s core.

Conclusions
-----------

Docker containers are, by default, quite secure; especially if you
take care of running your processes inside the containers as
non-privileged users (i.e. non-root).

You can add an extra layer of safety by enabling AppArmor, SELinux,
GRSEC, or your favorite hardening solution.

Last but not least, if you see interesting security features in other
containerization systems, you will be able to implement them as well
with Docker, since everything is provided by the kernel anyway.

For more context and especially for comparisons with VMs and other
container systems, please also see the `original blog post
<blogsecurity_>`_.

.. _blogsecurity: http://blog.docker.io/2013/08/containers-docker-how-secure-are-they/