The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.
Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:
> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted
or
> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted
Before this patch, it was just
> operation not permitted
[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It has been pointed out that we're ignoring EINVAL from umount(2)
everywhere, so let's move it to a lower-level function. Also, its
implementation should be the same for any UNIX incarnation, so
let's consolidate it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function ensures the argument is the mount point
(i.e. if it's not, it bind mounts it to itself).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. There is no need to specify rw argument -- bind mounts are
read-write by default.
2. There is no point in parsing /proc/self/mountinfo after performing
a mount, especially if we don't check whether the fs is mounted or
not -- the only outcome from it could be an error from our mountinfo
parser, which makes no sense in this context.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Sometimes docker-master CI fails on rhel4+selinux configuration,
like this:
--- FAIL: TestMount (0.12s)
--- FAIL: TestMount/none-remount,size=128k (0.01s)
mounter_linux_test.go:209: unexpected mount option "seclabel" expected "rw,size=128k"
--- FAIL: TestMount/none-remount,ro,size=128k (0.01s)
mounter_linux_test.go:209: unexpected mount option "seclabel" expected "ro,size=128k"
Earlier, commit 8bebd42df2 (PR #34965) fixed this failure,
but not entirely (i.e. the test is now flaky). It looks like
either selinux detection code is not always working (it won't
work in d-in-d), or the kernel might or might not add 'seclabel'
option).
As the subject of this test case is definitely not selinux,
it can just ignore the option added by it.
While at it, fix error messages:
- add missing commas;
- fix a typo;
- allow for clear distinction between mount
and vfs (per-superblock) options.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Now, every Unmount() call takes a burden to parse the whole nine yards
of /proc/self/mountinfo to figure out whether the given mount point is
mounted or not (and returns an error in case parsing fails somehow).
Instead, let's just call umount() and ignore EINVAL, which results
in the same behavior, but much better performance.
Note that EINVAL is returned from umount(2) not only in the case when
`target` is not mounted, but also for invalid flags. As the flags are
hardcoded here, it can't be the case.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The mountinfo parser implemented via `fmt.Sscanf()` is slower than the one
using `strings.Split()` and `strconv.Atoi()`. This rewrite helps to speed it
up to a factor of 8x, here is a result from go bench:
> BenchmarkParsingScanf-4 300 22294112 ns/op
> BenchmarkParsingSplit-4 3000 2780703 ns/op
I tried other approaches, such as using `fmt.Sscanf()` for the first
three (integer) fields and `strings.Split()` for the rest, but it slows
things down considerably:
> BenchmarkParsingMixed-4 1000 8827058 ns/op
Note the old code uses `fmt.Sscanf`, when a linear search for '-' field,
when a split for the last 3 fields. The new code relies on a single
split.
I have also added more comments to aid in future development.
Finally, the test data is fixed to now have white space before the first field.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The flow of getSourceMount was:
1 get all entries from /proc/self/mountinfo
2 do a linear search for the `source` directory
3 if found, return its data
4 get the parent directory of `source`, goto 2
The repeated linear search through the whole mountinfo (which can have
thousands of records) is inefficient. Instead, let's just
1 collect all the relevant records (only those mount points
that can be a parent of `source`)
2 find the record with the longest mountpath, return its data
This was tested manually with something like
```go
func TestGetSourceMount(t *testing.T) {
mnt, flags, err := getSourceMount("/sys/devices/msr/")
assert.NoError(t, err)
t.Logf("mnt: %v, flags: %v", mnt, flags)
}
```
...but it relies on having a specific mount points on the system
being used for testing.
[v2: add unit tests for ParentsFilter]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is only interested only one or a few entries, not all of them.
One good example is `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()` which is only
interested in mount under a specific path.
This commit adds `filter` argument to `GetMounts()` to implement
two things:
1. filter out entries a caller is not interested in
2. stop processing if a caller is found what it wanted
`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.
A few filters are implemented:
- `PrefixFilter`: filters out all entries not under `prefix`
- `SingleEntryFilter`: looks for a specific entry
Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` is using `PrefixFilter()`.
Unit tests are added to check filters are working.
[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Sorting by mount point length can be implemented in a more
straightforward fashion since Go 1.8 introduced sort.Slice()
with an ability to provide a less() function in place.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This makes `go test .` to pass if run as non-root user, skipping
those tests that require superuser privileges (for `mount`).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When a recursive unmount fails, don't bother parsing the mount table to check
if what we expected to be a mountpoint is still mounted. `EINVAL` is
returned when you try to unmount something that is not a mountpoint, the
other cases of `EINVAL` would not apply here unless everything is just
wrong. Parsing the mount table over and over is relatively expensive,
especially in the code path that it's in.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure to call C.free on C string allocated using C.CString in every
exit path.
C.CString allocates memory in the C heap using malloc. It is the callers
responsibility to free them. See
https://golang.org/cmd/cgo/#hdr-Go_references_to_C for details.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
The case where we are trying to do a remount with changed filesystem specific options was missing,
we need to call `mount` as well here to change those options.
See #33844 for where we need this, as we change `tmpfs` options.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
Before this, if `forceRemove` is set the container data will be removed
no matter what, including if there are issues with removing container
on-disk state (rw layer, container root).
In practice this causes a lot of issues with leaked data sitting on
disk that users are not able to clean up themselves.
This is particularly a problem while the `EBUSY` errors on remove are so
prevalent. So for now let's not keep this behavior.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fixes issues where the underlying filesystem may be disconnected and
attempting to unmount may cause a hang.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Propagation type changes must be done as a separate call, in the
same way as read only bind mounts.
To fix this:
1. Ensure propagation type change flags aren't included in other calls.
2. Apply propagation type change in a separate call.
Also:
* Make it clear which parameters are ignored by passing them as empty.
* Add tests to ensure Mount options are applied correctly.
Fixes#30415
Signed-off-by: Steven Hartland <steven.hartland@multiplay.co.uk>
This fix tries to address the issue raised in #22420. When
`--tmpfs` is specified with `/tmp`, the default value is
`rw,nosuid,nodev,noexec,relatime,size=65536k`. When `--tmpfs`
is specified with `/tmp:rw`, then the value changed to
`rw,nosuid,nodev,noexec,relatime`.
The reason for such an inconsistency is because docker tries
to add `size=65536k` option only when user provides no option.
This fix tries to address this issue by always pre-progating
`size=65536k` along with `rw,nosuid,nodev,noexec,relatime`.
If user provides a different value (e.g., `size=8192k`), it
will override the `size=65536k` anyway since the combined
options will be parsed and merged to remove any duplicates.
Additional test cases have been added to cover the changes
in this fix.
This fix fixes#22420.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
dockerinit has been around for a very long time. It was originally used
as a way for us to do configuration for LXC containers once the
container had started. LXC is no longer supported, and /.dockerinit has
been dead code for quite a while. This removes all code and references
in code to dockerinit.
Signed-off-by: Aleksa Sarai <asarai@suse.com>
If you run a
docker run command with --tmpfs /mountpoint:noexec
Or certain options that get translated into mount options, the mount command can get passed "" for mount data.
So this should be valid.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
It will Tar up contents of child directory onto tmpfs if mounted over
This patch will use the new PreMount and PostMount hooks to "tar"
up the contents of the base image on top of tmpfs mount points.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Fix the following warnings:
pkg/mount/mountinfo.go:5:6: type name will be used as mount.MountInfo by other packages, and that stutters; consider calling this Info
pkg/mount/mountinfo.go:7:2: struct field Id should be ID
Signed-off-by: Antonio Murdaca <runcom@linux.com>