DeviceMapper tasks in go use SetFinalizer to clean up C construct
counterparts in the C LVM library. While thats well and good, it relies
heavily on the exact interpretation of when the golang garbage collector
determines that an object is unreachable is subject to reclaimation.
While common sense would assert that for stack variables (which these DM
tasks always are), are unreachable when the stack frame in which they
are declared returns, thats not the case. According to this:
https://golang.org/pkg/runtime/#SetFinalizer
The garbage collector decides that, if a function calls into a
systemcall (which task.run() always will in LVM), and there are no
subsequent references to the task variable within that stack frame, then
it can be reclaimed. Those conditions are met in several devmapper.go
routines, and if the garbage collector runs in the middle of a
deviceMapper operation, then the task can be destroyed while the
operation is in progress, leading to crashes, failed operations and
other unpredictable behavior.
The fix is to use the KeepAlive interface:
https://golang.org/pkg/runtime/#KeepAlive
The KeepAlive method is effectively an empy reference that fools the
garbage collector into thinking that a variable is still reachable. By
adding a call to KeepAlive in the task.run() method, we can ensure that
the garbage collector won't reclaim a task object until its execution
within the deviceMapper C library is complete.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Switches the remaining syscalls except Errno to /x/sys/.
This was supposed to be part of 33180
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
This is necessary because normally `apparmor_parser -r` will try to
create a temporary directory on the host (which is not allowed if the
host has a rootfs). However, the -K option bypasses saving things to the
cache (which avoids this issue).
% apparmor_parser -r /tmp/docker-profile
mkstemp: Read-only file system
% apparmor_parser -Kr /tmp/docker-profile
%
In addition, add extra information to the ensureDefaultAppArmorProfile
errors so that problems like this are easier to debug.
Fixes: 2f7596aaef ("apparmor: do not save profile to /etc/apparmor.d")
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This patch adds the untilRemoved option to the ContainerWait API which
allows the client to wait until the container is not only exited but
also removed.
This patch also adds some more CLI integration tests for waiting for a
created container and waiting with the new --until-removed flag.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Handle detach sequence in CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Update Container Wait Conditions
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Apply container wait changes to API 1.30
The set of changes to the containerWait API missed the cut for the
Docker 17.05 release (API version 1.29). This patch bumps the version
checks to use 1.30 instead.
This patch also makes a minor update to a testfile which was added to
the builder/dockerfile package.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Remove wait changes from CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address minor nits on wait changes
- Changed the name of the tty Proxy wrapper to `escapeProxy`
- Removed the unnecessary Error() method on container.State
- Fixes a typo in comment (repeated word)
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Use router.WithCancel in the containerWait handler
This handler previously added this functionality manually but now uses
the existing wrapper which does it for us.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Add WaitCondition constants to api/types/container
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
- Update ContainerWait backend interface to not return pointer values
for container.StateStatus type.
- Updated container state's Wait() method comments to clarify that a
context MUST be used for cancelling the request, setting timeouts,
and to avoid goroutine leaks.
- Removed unnecessary buffering when making channels in the client's
ContainerWait methods.
- Renamed result and error channels in client's ContainerWait methods
to clarify that only a single result or error value would be sent
on the channel.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Move container.WaitCondition type to separate file
... to avoid conflict with swagger-generated code for API response
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Switches calls to syscall to x/sys, which is more up to date.
This is fixes a number of possible bugs on other architectures
where ioctl tcget and tcset aren't implemented correctly.
There are a few remaining syscall references, because x/sys doesn't
have an Errno implementation yet.
Also removes a ppc64le and cgo build tag that fixes building on
ppc64le without cgo
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
All LVM actions in the devicemapper library are asyncronous, involving a call to
a task enqueue function (dm_run_task) and a wait on a resultant udev event
(UdevWait). Currently devmapper.go defers all calls to UdevWait, which discards
the return value. While it still generates an error message in the log (if
debugging is enabled), the calling thread is still allowed to continue as if no
error has occured, leading to subsequent errors, and significant confusion when
debugging, due to those subsequent errors. Given that there is no risk of panic
between the task submission and the wait operation, it seems more reasonable to
preform the UdevWait inline at the end of any given lvm action so that errors
can be caught and returned before docker can continue and create additional
failures.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Currently, the devicemapper library sets cookies to correlate wait operations,
which must be unique (as the lvm2 library doesn't detect duplicate cookies).
The current method for cookie generation is to take the address of a cookie
variable. However, because the variable is declared on the stack, execution
patterns can lead to the cookie variable being declared at the same stack
location, which results in a high likelyhood of duplicate cookie use, which in
turn can lead to various odd lvm behaviors, which can be hard to track down
(object use before create, duplicate completions, etc). Lets guarantee that the
cookie we generate is unique by declaring it on the heap instead. This
guarantees that the address of the variable won't be reused until such time as
the UdevWait operation completes, and drops its reference to it, at which time
the gc can reclaim it.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
If a wait event fails when preforming a devicemapper operation, it would be good
to know, in addition to the cookie that its waiting on, we reported the error
that was reported from the lvm2 library.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Before this, if `forceRemove` is set the container data will be removed
no matter what, including if there are issues with removing container
on-disk state (rw layer, container root).
In practice this causes a lot of issues with leaked data sitting on
disk that users are not able to clean up themselves.
This is particularly a problem while the `EBUSY` errors on remove are so
prevalent. So for now let's not keep this behavior.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This is synonymous with `docker run --cidfile=FILE` and writes the digest of
the newly built image to the named file. This is intended to be used by build
systems which want to avoid tagging (perhaps because they are in CI or
otherwise want to avoid fixed names which can clash) by enabling e.g. Makefile
constructs like:
image.id: Dockerfile
docker build --iidfile=image.id .
do-some-more-stuff: image.id
do-stuff-with <image.id
Currently the only way to achieve this is to use `docker build -q` and capture
the stdout, but at the expense of losing the build output.
In non-silent mode (without `-q`) with API >= v1.29 the caller will now see a
`JSONMessage` with the `Aux` field containing a `types.BuildResult` in the
output stream for each image/layer produced during the build, with the final
one being the end product. Having all of the intermediate images might be
interesting in some cases.
In silent mode (with `-q`) there is no change, on success the only output will
be the resulting image digest as it was previosuly.
There was no wrapper to just output an Aux section without enclosing it in a
Progress, so add one here.
Added some tests to integration cli tests.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Previously, only perm-related bits where preserved when rewriting
FileMode in tar entries on Windows. This had the nasty side effect of
having tarsum returning different values when executing from a tar filed
produced on Windows or Linux.
This fix the issue, and pave the way for incremental build context
to work in hybrid contexts.
Signed-off-by: Simon Ferquel <simon.ferquel@docker.com>