moby/daemon/exec.go

341 lines
9.3 KiB
Go
Raw Normal View History

package daemon // import "github.com/docker/docker/daemon"
import (
"context"
Remove static errors from errors package. Moving all strings to the errors package wasn't a good idea after all. Our custom implementation of Go errors predates everything that's nice and good about working with errors in Go. Take as an example what we have to do to get an error message: ```go func GetErrorMessage(err error) string { switch err.(type) { case errcode.Error: e, _ := err.(errcode.Error) return e.Message case errcode.ErrorCode: ec, _ := err.(errcode.ErrorCode) return ec.Message() default: return err.Error() } } ``` This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake. Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors. Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API: ```go switch err.(type) { case errcode.ErrorCode: daError, _ := err.(errcode.ErrorCode) statusCode = daError.Descriptor().HTTPStatusCode errMsg = daError.Message() case errcode.Error: // For reference, if you're looking for a particular error // then you can do something like : // import ( derr "github.com/docker/docker/errors" ) // if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... } daError, _ := err.(errcode.Error) statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode errMsg = daError.Message default: // This part of will be removed once we've // converted everything over to use the errcode package // FIXME: this is brittle and should not be necessary. // If we need to differentiate between different possible error types, // we should create appropriate error types with clearly defined meaning errStr := strings.ToLower(err.Error()) for keyword, status := range map[string]int{ "not found": http.StatusNotFound, "no such": http.StatusNotFound, "bad parameter": http.StatusBadRequest, "conflict": http.StatusConflict, "impossible": http.StatusNotAcceptable, "wrong login/password": http.StatusUnauthorized, "hasn't been activated": http.StatusForbidden, } { if strings.Contains(errStr, keyword) { statusCode = status break } } } ``` You can notice two things in that code: 1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are. 2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation. This change removes all our status errors from the errors package and puts them back in their specific contexts. IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages. It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface: ```go type errorWithStatus interface { HTTPErrorStatusCode() int } ``` This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method. I included helper functions to generate errors that use custom status code in `errors/errors.go`. By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it. Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-25 15:53:35 +00:00
"fmt"
"io"
"runtime"
"strings"
"time"
"github.com/docker/docker/api/types"
"github.com/docker/docker/api/types/strslice"
"github.com/docker/docker/container"
"github.com/docker/docker/container/stream"
"github.com/docker/docker/daemon/exec"
"github.com/docker/docker/errdefs"
"github.com/docker/docker/pkg/pools"
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
"github.com/docker/docker/pkg/signal"
"github.com/docker/docker/pkg/term"
specs "github.com/opencontainers/runtime-spec/specs-go"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
)
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
// Seconds to wait after sending TERM before trying KILL
const termProcessTimeout = 10 * time.Second
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
func (d *Daemon) registerExecCommand(container *container.Container, config *exec.Config) {
// Storing execs in container in order to kill them gracefully whenever the container is stopped or removed.
container.ExecCommands.Add(config.ID, config)
// Storing execs in daemon for easy access via Engine API.
d.execCommands.Add(config.ID, config)
}
// ExecExists looks up the exec instance and returns a bool if it exists or not.
// It will also return the error produced by `getConfig`
func (d *Daemon) ExecExists(name string) (bool, error) {
if _, err := d.getExecConfig(name); err != nil {
return false, err
}
return true, nil
}
// getExecConfig looks up the exec instance by name. If the container associated
// with the exec instance is stopped or paused, it will return an error.
func (d *Daemon) getExecConfig(name string) (*exec.Config, error) {
ec := d.execCommands.Get(name)
if ec == nil {
return nil, errExecNotFound(name)
}
// If the exec is found but its container is not in the daemon's list of
// containers then it must have been deleted, in which case instead of
// saying the container isn't running, we should return a 404 so that
// the user sees the same error now that they will after the
// 5 minute clean-up loop is run which erases old/dead execs.
container := d.containers.Get(ec.ContainerID)
if container == nil {
return nil, containerNotFound(name)
}
if !container.IsRunning() {
return nil, fmt.Errorf("Container %s is not running: %s", container.ID, container.State.String())
}
if container.IsPaused() {
return nil, errExecPaused(container.ID)
}
if container.IsRestarting() {
return nil, errContainerIsRestarting(container.ID)
}
return ec, nil
}
func (d *Daemon) unregisterExecCommand(container *container.Container, execConfig *exec.Config) {
container.ExecCommands.Delete(execConfig.ID, execConfig.Pid)
d.execCommands.Delete(execConfig.ID, execConfig.Pid)
}
func (d *Daemon) getActiveContainer(name string) (*container.Container, error) {
container, err := d.GetContainer(name)
if err != nil {
return nil, err
}
if !container.IsRunning() {
return nil, errNotRunning(container.ID)
}
if container.IsPaused() {
Remove static errors from errors package. Moving all strings to the errors package wasn't a good idea after all. Our custom implementation of Go errors predates everything that's nice and good about working with errors in Go. Take as an example what we have to do to get an error message: ```go func GetErrorMessage(err error) string { switch err.(type) { case errcode.Error: e, _ := err.(errcode.Error) return e.Message case errcode.ErrorCode: ec, _ := err.(errcode.ErrorCode) return ec.Message() default: return err.Error() } } ``` This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake. Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors. Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API: ```go switch err.(type) { case errcode.ErrorCode: daError, _ := err.(errcode.ErrorCode) statusCode = daError.Descriptor().HTTPStatusCode errMsg = daError.Message() case errcode.Error: // For reference, if you're looking for a particular error // then you can do something like : // import ( derr "github.com/docker/docker/errors" ) // if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... } daError, _ := err.(errcode.Error) statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode errMsg = daError.Message default: // This part of will be removed once we've // converted everything over to use the errcode package // FIXME: this is brittle and should not be necessary. // If we need to differentiate between different possible error types, // we should create appropriate error types with clearly defined meaning errStr := strings.ToLower(err.Error()) for keyword, status := range map[string]int{ "not found": http.StatusNotFound, "no such": http.StatusNotFound, "bad parameter": http.StatusBadRequest, "conflict": http.StatusConflict, "impossible": http.StatusNotAcceptable, "wrong login/password": http.StatusUnauthorized, "hasn't been activated": http.StatusForbidden, } { if strings.Contains(errStr, keyword) { statusCode = status break } } } ``` You can notice two things in that code: 1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are. 2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation. This change removes all our status errors from the errors package and puts them back in their specific contexts. IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages. It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface: ```go type errorWithStatus interface { HTTPErrorStatusCode() int } ``` This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method. I included helper functions to generate errors that use custom status code in `errors/errors.go`. By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it. Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-25 15:53:35 +00:00
return nil, errExecPaused(name)
}
if container.IsRestarting() {
Remove static errors from errors package. Moving all strings to the errors package wasn't a good idea after all. Our custom implementation of Go errors predates everything that's nice and good about working with errors in Go. Take as an example what we have to do to get an error message: ```go func GetErrorMessage(err error) string { switch err.(type) { case errcode.Error: e, _ := err.(errcode.Error) return e.Message case errcode.ErrorCode: ec, _ := err.(errcode.ErrorCode) return ec.Message() default: return err.Error() } } ``` This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake. Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors. Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API: ```go switch err.(type) { case errcode.ErrorCode: daError, _ := err.(errcode.ErrorCode) statusCode = daError.Descriptor().HTTPStatusCode errMsg = daError.Message() case errcode.Error: // For reference, if you're looking for a particular error // then you can do something like : // import ( derr "github.com/docker/docker/errors" ) // if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... } daError, _ := err.(errcode.Error) statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode errMsg = daError.Message default: // This part of will be removed once we've // converted everything over to use the errcode package // FIXME: this is brittle and should not be necessary. // If we need to differentiate between different possible error types, // we should create appropriate error types with clearly defined meaning errStr := strings.ToLower(err.Error()) for keyword, status := range map[string]int{ "not found": http.StatusNotFound, "no such": http.StatusNotFound, "bad parameter": http.StatusBadRequest, "conflict": http.StatusConflict, "impossible": http.StatusNotAcceptable, "wrong login/password": http.StatusUnauthorized, "hasn't been activated": http.StatusForbidden, } { if strings.Contains(errStr, keyword) { statusCode = status break } } } ``` You can notice two things in that code: 1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are. 2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation. This change removes all our status errors from the errors package and puts them back in their specific contexts. IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages. It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface: ```go type errorWithStatus interface { HTTPErrorStatusCode() int } ``` This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method. I included helper functions to generate errors that use custom status code in `errors/errors.go`. By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it. Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-25 15:53:35 +00:00
return nil, errContainerIsRestarting(container.ID)
}
return container, nil
}
// ContainerExecCreate sets up an exec in a running container.
func (d *Daemon) ContainerExecCreate(name string, config *types.ExecConfig) (string, error) {
cntr, err := d.getActiveContainer(name)
if err != nil {
return "", err
}
cmd := strslice.StrSlice(config.Cmd)
entrypoint, args := d.getEntrypointAndArgs(strslice.StrSlice{}, cmd)
keys := []byte{}
if config.DetachKeys != "" {
keys, err = term.ToBytes(config.DetachKeys)
if err != nil {
err = fmt.Errorf("Invalid escape keys (%s) provided", config.DetachKeys)
return "", err
}
}
execConfig := exec.NewConfig()
execConfig.OpenStdin = config.AttachStdin
execConfig.OpenStdout = config.AttachStdout
execConfig.OpenStderr = config.AttachStderr
execConfig.ContainerID = cntr.ID
execConfig.DetachKeys = keys
execConfig.Entrypoint = entrypoint
execConfig.Args = args
execConfig.Tty = config.Tty
execConfig.Privileged = config.Privileged
execConfig.User = config.User
execConfig.WorkingDir = config.WorkingDir
linkedEnv, err := d.setupLinkedContainers(cntr)
if err != nil {
return "", err
}
execConfig.Env = container.ReplaceOrAppendEnvValues(cntr.CreateDaemonEnvironment(config.Tty, linkedEnv), config.Env)
if len(execConfig.User) == 0 {
execConfig.User = cntr.Config.User
}
if len(execConfig.WorkingDir) == 0 {
execConfig.WorkingDir = cntr.Config.WorkingDir
}
d.registerExecCommand(cntr, execConfig)
attributes := map[string]string{
"execID": execConfig.ID,
}
d.LogContainerEventWithAttributes(cntr, "exec_create: "+execConfig.Entrypoint+" "+strings.Join(execConfig.Args, " "), attributes)
return execConfig.ID, nil
}
// ContainerExecStart starts a previously set up exec instance. The
// std streams are set up.
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
// If ctx is cancelled, the process is terminated.
func (d *Daemon) ContainerExecStart(ctx context.Context, name string, stdin io.Reader, stdout io.Writer, stderr io.Writer) (err error) {
var (
cStdin io.ReadCloser
cStdout, cStderr io.Writer
)
ec, err := d.getExecConfig(name)
if err != nil {
Remove static errors from errors package. Moving all strings to the errors package wasn't a good idea after all. Our custom implementation of Go errors predates everything that's nice and good about working with errors in Go. Take as an example what we have to do to get an error message: ```go func GetErrorMessage(err error) string { switch err.(type) { case errcode.Error: e, _ := err.(errcode.Error) return e.Message case errcode.ErrorCode: ec, _ := err.(errcode.ErrorCode) return ec.Message() default: return err.Error() } } ``` This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake. Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors. Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API: ```go switch err.(type) { case errcode.ErrorCode: daError, _ := err.(errcode.ErrorCode) statusCode = daError.Descriptor().HTTPStatusCode errMsg = daError.Message() case errcode.Error: // For reference, if you're looking for a particular error // then you can do something like : // import ( derr "github.com/docker/docker/errors" ) // if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... } daError, _ := err.(errcode.Error) statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode errMsg = daError.Message default: // This part of will be removed once we've // converted everything over to use the errcode package // FIXME: this is brittle and should not be necessary. // If we need to differentiate between different possible error types, // we should create appropriate error types with clearly defined meaning errStr := strings.ToLower(err.Error()) for keyword, status := range map[string]int{ "not found": http.StatusNotFound, "no such": http.StatusNotFound, "bad parameter": http.StatusBadRequest, "conflict": http.StatusConflict, "impossible": http.StatusNotAcceptable, "wrong login/password": http.StatusUnauthorized, "hasn't been activated": http.StatusForbidden, } { if strings.Contains(errStr, keyword) { statusCode = status break } } } ``` You can notice two things in that code: 1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are. 2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation. This change removes all our status errors from the errors package and puts them back in their specific contexts. IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages. It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface: ```go type errorWithStatus interface { HTTPErrorStatusCode() int } ``` This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method. I included helper functions to generate errors that use custom status code in `errors/errors.go`. By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it. Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-25 15:53:35 +00:00
return errExecNotFound(name)
}
ec.Lock()
if ec.ExitCode != nil {
ec.Unlock()
Remove static errors from errors package. Moving all strings to the errors package wasn't a good idea after all. Our custom implementation of Go errors predates everything that's nice and good about working with errors in Go. Take as an example what we have to do to get an error message: ```go func GetErrorMessage(err error) string { switch err.(type) { case errcode.Error: e, _ := err.(errcode.Error) return e.Message case errcode.ErrorCode: ec, _ := err.(errcode.ErrorCode) return ec.Message() default: return err.Error() } } ``` This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake. Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors. Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API: ```go switch err.(type) { case errcode.ErrorCode: daError, _ := err.(errcode.ErrorCode) statusCode = daError.Descriptor().HTTPStatusCode errMsg = daError.Message() case errcode.Error: // For reference, if you're looking for a particular error // then you can do something like : // import ( derr "github.com/docker/docker/errors" ) // if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... } daError, _ := err.(errcode.Error) statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode errMsg = daError.Message default: // This part of will be removed once we've // converted everything over to use the errcode package // FIXME: this is brittle and should not be necessary. // If we need to differentiate between different possible error types, // we should create appropriate error types with clearly defined meaning errStr := strings.ToLower(err.Error()) for keyword, status := range map[string]int{ "not found": http.StatusNotFound, "no such": http.StatusNotFound, "bad parameter": http.StatusBadRequest, "conflict": http.StatusConflict, "impossible": http.StatusNotAcceptable, "wrong login/password": http.StatusUnauthorized, "hasn't been activated": http.StatusForbidden, } { if strings.Contains(errStr, keyword) { statusCode = status break } } } ``` You can notice two things in that code: 1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are. 2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation. This change removes all our status errors from the errors package and puts them back in their specific contexts. IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages. It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface: ```go type errorWithStatus interface { HTTPErrorStatusCode() int } ``` This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method. I included helper functions to generate errors that use custom status code in `errors/errors.go`. By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it. Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-25 15:53:35 +00:00
err := fmt.Errorf("Error: Exec command %s has already run", ec.ID)
return errdefs.Conflict(err)
}
if ec.Running {
ec.Unlock()
return errdefs.Conflict(fmt.Errorf("Error: Exec command %s is already running", ec.ID))
}
ec.Running = true
ec.Unlock()
c := d.containers.Get(ec.ContainerID)
logrus.Debugf("starting exec command %s in container %s", ec.ID, c.ID)
attributes := map[string]string{
"execID": ec.ID,
}
d.LogContainerEventWithAttributes(c, "exec_start: "+ec.Entrypoint+" "+strings.Join(ec.Args, " "), attributes)
defer func() {
if err != nil {
ec.Lock()
ec.Running = false
exitCode := 126
ec.ExitCode = &exitCode
if err := ec.CloseStreams(); err != nil {
logrus.Errorf("failed to cleanup exec %s streams: %s", c.ID, err)
}
ec.Unlock()
c.ExecCommands.Delete(ec.ID, ec.Pid)
}
}()
if ec.OpenStdin && stdin != nil {
r, w := io.Pipe()
go func() {
defer w.Close()
defer logrus.Debug("Closing buffered stdin pipe")
pools.Copy(w, stdin)
}()
cStdin = r
}
if ec.OpenStdout {
cStdout = stdout
}
if ec.OpenStderr {
cStderr = stderr
}
if ec.OpenStdin {
ec.StreamConfig.NewInputPipes()
} else {
ec.StreamConfig.NewNopInputPipe()
}
p := &specs.Process{}
if runtime.GOOS != "windows" {
container, err := d.containerdCli.LoadContainer(ctx, ec.ContainerID)
if err != nil {
return err
}
spec, err := container.Spec(ctx)
if err != nil {
return err
}
p = spec.Process
}
p.Args = append([]string{ec.Entrypoint}, ec.Args...)
p.Env = ec.Env
p.Cwd = ec.WorkingDir
p.Terminal = ec.Tty
if p.Cwd == "" {
p.Cwd = "/"
}
if err := d.execSetPlatformOpt(c, ec, p); err != nil {
return err
}
attachConfig := stream.AttachConfig{
TTY: ec.Tty,
UseStdin: cStdin != nil,
UseStdout: cStdout != nil,
UseStderr: cStderr != nil,
Stdin: cStdin,
Stdout: cStdout,
Stderr: cStderr,
DetachKeys: ec.DetachKeys,
CloseStdin: true,
}
ec.StreamConfig.AttachStreams(&attachConfig)
attachErr := ec.StreamConfig.CopyStreams(ctx, &attachConfig)
// Synchronize with libcontainerd event loop
ec.Lock()
c.ExecCommands.Lock()
systemPid, err := d.containerd.Exec(ctx, c.ID, ec.ID, p, cStdin != nil, ec.InitializeStdio)
// the exec context should be ready, or error happened.
// close the chan to notify readiness
close(ec.Started)
if err != nil {
c.ExecCommands.Unlock()
ec.Unlock()
return translateContainerdStartErr(ec.Entrypoint, ec.SetExitCode, err)
}
ec.Pid = systemPid
c.ExecCommands.Unlock()
ec.Unlock()
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
select {
case <-ctx.Done():
logrus.Debugf("Sending TERM signal to process %v in container %v", name, c.ID)
d.containerd.SignalProcess(ctx, c.ID, name, int(signal.SignalMap["TERM"]))
timeout := time.NewTimer(termProcessTimeout)
defer timeout.Stop()
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
select {
case <-timeout.C:
logrus.Infof("Container %v, process %v failed to exit within %v of signal TERM - using the force", c.ID, name, termProcessTimeout)
d.containerd.SignalProcess(ctx, c.ID, name, int(signal.SignalMap["KILL"]))
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
case <-attachErr:
// TERM signal worked
}
return ctx.Err()
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
case err := <-attachErr:
if err != nil {
Update ContainerWait API This patch adds the untilRemoved option to the ContainerWait API which allows the client to wait until the container is not only exited but also removed. This patch also adds some more CLI integration tests for waiting for a created container and waiting with the new --until-removed flag. Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Handle detach sequence in CLI Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Update Container Wait Conditions Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Apply container wait changes to API 1.30 The set of changes to the containerWait API missed the cut for the Docker 17.05 release (API version 1.29). This patch bumps the version checks to use 1.30 instead. This patch also makes a minor update to a testfile which was added to the builder/dockerfile package. Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Remove wait changes from CLI Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Address minor nits on wait changes - Changed the name of the tty Proxy wrapper to `escapeProxy` - Removed the unnecessary Error() method on container.State - Fixes a typo in comment (repeated word) Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Use router.WithCancel in the containerWait handler This handler previously added this functionality manually but now uses the existing wrapper which does it for us. Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Add WaitCondition constants to api/types/container Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Address more ContainerWait review comments - Update ContainerWait backend interface to not return pointer values for container.StateStatus type. - Updated container state's Wait() method comments to clarify that a context MUST be used for cancelling the request, setting timeouts, and to avoid goroutine leaks. - Removed unnecessary buffering when making channels in the client's ContainerWait methods. - Renamed result and error channels in client's ContainerWait methods to clarify that only a single result or error value would be sent on the channel. Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Move container.WaitCondition type to separate file ... to avoid conflict with swagger-generated code for API response Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn) Address more ContainerWait review comments Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
2017-03-31 03:01:41 +00:00
if _, ok := err.(term.EscapeError); !ok {
return errdefs.System(errors.Wrap(err, "exec attach failed"))
}
attributes := map[string]string{
"execID": ec.ID,
}
d.LogContainerEventWithAttributes(c, "exec_detach", attributes)
Add support for user-defined healthchecks This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails. The `HEALTHCHECK` instruction has two forms: * `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container) * `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image) The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`. The options that can appear before `CMD` are: * `--interval=DURATION` (default: `30s`) * `--timeout=DURATION` (default: `30s`) * `--retries=N` (default: `1`) The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes. If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed. It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`. There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect. The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details). The command's exit status indicates the health status of the container. The possible values are: - 0: success - the container is healthy and ready for use - 1: unhealthy - the container is not working correctly - 2: starting - the container is not ready for use yet, but is working correctly If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead. For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds: HEALTHCHECK --interval=5m --timeout=3s \ CMD curl -f http://localhost/ || exit 1 To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently). When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output. Signed-off-by: Thomas Leonard <thomas.leonard@docker.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-04-18 09:48:13 +00:00
}
}
return nil
}
// execCommandGC runs a ticker to clean up the daemon references
// of exec configs that are no longer part of the container.
func (d *Daemon) execCommandGC() {
for range time.Tick(5 * time.Minute) {
var (
cleaned int
liveExecCommands = d.containerExecIds()
)
for id, config := range d.execCommands.Commands() {
if config.CanRemove {
cleaned++
d.execCommands.Delete(id, config.Pid)
} else {
if _, exists := liveExecCommands[id]; !exists {
config.CanRemove = true
}
}
}
if cleaned > 0 {
logrus.Debugf("clean %d unused exec commands", cleaned)
}
}
}
// containerExecIds returns a list of all the current exec ids that are in use
// and running inside a container.
func (d *Daemon) containerExecIds() map[string]struct{} {
ids := map[string]struct{}{}
for _, c := range d.containers.List() {
for _, id := range c.ExecCommands.List() {
ids[id] = struct{}{}
}
}
return ids
}