This change was introduced early in the development of rootless support,
before all the kinks were worked out and rootlesskit was built. The
author was testing the daemon inside a user namespace set up by runc,
observed that the unshare(2) syscall was returning EPERM, and assumed
that it was a fundamental limitation of user namespaces. Seeing as the
kernel documentation (of today) disagrees with that assessment and that
unshare demonstrably works inside user namespaces, I can only assume
that the EPERM was due to a quirk of their test environment, such as a
seccomp filter set up by runc blocking the unshare syscall.
https://github.com/moby/moby/pull/20902#issuecomment-236409406
Mount namespaces are necessary to address #38995 and #43390. Revert the
special-casing so those issues can also be fixed for rootless daemons.
This reverts commit dc950567c1.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Unshare the thread's file system attributes and, if applicable, mount
namespace so that the chroot operation does not affect the rest of the
process.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The applyLayer implementation in pkg/chrootarchive has to set the TMPDIR
environment variable so that archive.UnpackLayer() can successfully
create the whiteout-file temp directory. Change UnpackLayer to create
the temporary directory under the destination path so that environment
variables do not need to be touched.
Signed-off-by: Cory Snider <csnider@mirantis.com>
We were discarding the underlying error, which made it impossible for
callers to detect (e.g.) an os.ErrNotExist.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- use t.TempDir() to make sure we're testing from a clean state
- improve checks for errors to have the correct error-type where possible
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
system.MkdirAll is a special version of os.MkdirAll to handle creating directories
using Windows volume paths ("\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}").
This may be important when MkdirAll is used, which traverses all parent paths to
create them if missing (ultimately landing on the "volume" path).
Commit 62f648b061 introduced the system.MkdirAll
calls, as a change was made in applyLayer() for Windows to use Windows volume
paths as an alternative for chroot (which is not supported on Windows). Later
iterations changed this to regular Windows long-paths (`\\?\<path>`) in
230cfc6ed2, and 9b648dfac6.
Such paths are handled by the `os` package.
However, in these tests, the parent path already exists (all paths created are
a direct subdirectory within `tmpDir`). It looks like `MkdirAll` here is used
out of convenience to not have to handle `os.ErrExist` errors. As all these
tests are running in a fresh temporary directory, there should be no need to
handle those, and it's actually desirable to produce an error in that case, as
the directory already existing would be unexpected.
Because of the above, this patch changes `system.MkdirAll` to `os.Mkdir`. As we
are changing these lines, this patch also changes the legacy octal notation
(`0700`) to the now preferred `0o700`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Introduced in 3ac6394b80, which makes no mention
of a reason for extracting to the same directory as we created the archive from,
so I assume this was a copy/paste mistake and the path was meant to be "dest",
not "src".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The implementation of CanAccess() is very rudimentary, and should
not be used for anything other than a basic check (and maybe not
even for that). It's only used in a single location in the daemon,
so move it there, and un-export it to not encourage others to use
it out of context.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type felt really redundant; `pidfile.New()` takes the path of the file to
create as an argument, so this is already known. The only thing the PIDFile
type provided was a `Remove()` method, which was just calling `os.Remove()` on
the path of the file.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use bytes.TrimSpace instead of using the strings package, which is
more performant, and allows us to skip the intermediate variable.
Also combined some "if" statements to reduce cyclomatic complexity.
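A minimal sketch of the pattern, assuming the file content is already in a byte slice; `parsePID` is a hypothetical name, not the function in the package.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// parsePID trims surrounding whitespace on the byte slice itself and converts
// the result in a single expression, without an intermediate variable.
func parsePID(content []byte) (int, error) {
	return strconv.Atoi(string(bytes.TrimSpace(content)))
}

func main() {
	pid, err := parsePID([]byte(" 1234\n"))
	fmt.Println(pid, err)
}
```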
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's ok to ignore if the file doesn't exist, or if the file doesn't
have a PID in it, but we should produce an error if the file exists,
but we're unable to read it for other reasons.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The same attribute was generated in every iteration, once for each path that
was created. Generate it once instead, and pass it to our mkdirall()
implementation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The regex only matched volume paths without a trailing path-separator. In cases
where a path would be passed with a trailing path-separator, it would depend on
further code in mkdirall to strip the trailing slash, then to perform the regex
again in the next iteration.
While regexes aren't ideal, we're already executing this one, so we may as well
use it to match those situations as well (instead of executing it twice), to
allow us to return early.
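For illustration, a pattern that accepts an optional trailing separator could look like the following. The exact expression used in the package is not shown here, so the regex and the name `volumePath` are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// volumePath is a hypothetical pattern: it matches a bare Windows volume path
// such as \\?\Volume{GUID}, with or without one trailing backslash.
var volumePath = regexp.MustCompile(`^\\\\\?\\Volume\{[a-z0-9-]+\}\\?$`)

func main() {
	paths := []string{
		`\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}`,
		`\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}\`,
		`\\?\Volume{4c1b02c1-d990-11dc-99ae-806e6f6e6963}\foo`,
	}
	for _, p := range paths {
		fmt.Printf("%v %s\n", volumePath.MatchString(p), p)
	}
}
```

With a pattern like this, a volume path passed with a trailing separator is recognized on the first check, so the caller can return early.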
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Ideally, we would construct this lazily, but adding a function and a
sync.Once felt like a bit "too much".
Also updated the GoDoc for some functions to better describe what they do.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These consts were defined locally, but are now defined in golang.org/x/sys, so
we can use those.
Also added some documentation about how this function works, taking the description
from the GetExitCodeProcess function (processthreadsapi.h) API reference:
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getexitcodeprocess
> The GetExitCodeProcess function returns a valid error code defined by the
> application only after the thread terminates. Therefore, an application should
> not use `STILL_ACTIVE` (259) as an error code (`STILL_ACTIVE` is a macro for
> `STATUS_PENDING` (minwinbase.h)). If a thread returns `STILL_ACTIVE` (259) as
> an error code, then applications that test for that value could interpret it
> to mean that the thread is still running, and continue to test for the
> completion of the thread after the thread has terminated, which could put
> the application into an infinite loop.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most of the package was using stdlib's errors package, so replacing two calls
to pkg/errors with stdlib. Also fixing capitalization of error strings.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
On unix, it's an alias for os.MkdirAll, so remove its use to make it
more transparent what's being used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Merge the accessible() function into CanAccess, and check world-
readable permissions first, before checking owner and group.
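The ordering of the checks can be sketched as a pure function over the permission bits. This is an illustration only: the signature is hypothetical, and it tests the read bits to match the wording above, whereas the actual implementation may check a different permission bit.

```go
package main

import (
	"fmt"
	"os"
)

// canAccess is a hypothetical, simplified version of the check. It looks at
// the "other" bit first, and only then compares owner and group IDs.
func canAccess(perms os.FileMode, ownerUID, ownerGID, uid, gid uint32) bool {
	if perms&0o004 != 0 {
		return true // world-readable: no need to look at ownership
	}
	if ownerUID == uid && perms&0o400 != 0 {
		return true // readable by owner
	}
	if ownerGID == gid && perms&0o040 != 0 {
		return true // readable by group
	}
	return false
}

func main() {
	fmt.Println(canAccess(0o604, 0, 0, 1000, 1000))
	fmt.Println(canAccess(0o600, 0, 0, 1000, 1000))
}
```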
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use the IoctlRetInt, IoctlSetInt and IoctlLoopSetStatus64 helper
functions defined in the golang.org/x/sys/unix package instead of
manually wrapping these using a locally defined function.
Inspired by 3cc3d8a560
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These tests were effectively doing "subtests", using comments to describe each;
however:
- due to the use of `t.Fatal()`, the test would terminate at the first failure,
before completing all "subtests"
- the error returned by the function being tested (`Chtimes`) was not checked,
and the test used "indirect" checks to verify if it worked correctly. Adding
assertions to check that the function didn't produce an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removing the "Linux" suffix from one test, which should probably be
rewritten to be run on "unix", to provide test-coverage for those
implementations.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
With t.TempDir(), some of the test-utilities became so small that
it was more transparent to inline them. This also helps with separating
concerns, as we're in the process of thinning out and decoupling
some packages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It looks like this function was converting the time (`windows.NsecToTimespec()`),
only to convert it back (`windows.TimespecToNsec()`). This became clear when
moving the lines together:
```go
ctimespec := windows.NsecToTimespec(ctime.UnixNano())
c := windows.NsecToFiletime(windows.TimespecToNsec(ctimespec))
```
And looking at the Go source, it looks like they're indeed the exact inverse:
```go
func TimespecToNsec(ts Timespec) int64 { return int64(ts.Sec)*1e9 + int64(ts.Nsec) }

func NsecToTimespec(nsec int64) (ts Timespec) {
	ts.Sec = nsec / 1e9
	ts.Nsec = nsec % 1e9
	return
}
```
While modifying this code, also renaming the `e` variable to a more common `err`.
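The round trip can be checked with local copies of the two functions. This is a sketch: the `Timespec` type and both functions below are re-declared for the example (mirroring the snippet quoted above) so that it runs on any platform.

```go
package main

import "fmt"

// Timespec is a local stand-in for the type used by the quoted functions.
type Timespec struct {
	Sec  int64
	Nsec int64
}

// TimespecToNsec and NsecToTimespec are local copies of the quoted functions.
func TimespecToNsec(ts Timespec) int64 { return int64(ts.Sec)*1e9 + int64(ts.Nsec) }

func NsecToTimespec(nsec int64) (ts Timespec) {
	ts.Sec = nsec / 1e9
	ts.Nsec = nsec % 1e9
	return
}

// roundTrip converts nanoseconds to a Timespec and back again.
func roundTrip(nsec int64) int64 {
	return TimespecToNsec(NsecToTimespec(nsec))
}

func main() {
	for _, n := range []int64{0, 1, 999999999, 1500000000, -1, -1500000000} {
		fmt.Println(n, roundTrip(n) == n)
	}
}
```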
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This more closely matches how it's used everywhere. Also move the comment
describing "what" Chtimes() does inside its GoDoc.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code caused me some head-scratches, and initially I wondered
if this was a bug, but it looks to be intentional to set nsec, not
sec, as time.Unix() internally divides nsec, and sets sec accordingly;
https://github.com/golang/go/blob/go1.19.2/src/time/time.go#L1364-L1380
    // Unix returns the local Time corresponding to the given Unix time,
    // sec seconds and nsec nanoseconds since January 1, 1970 UTC.
    // It is valid to pass nsec outside the range [0, 999999999].
    // Not all sec values have a corresponding time value. One such
    // value is 1<<63-1 (the largest int64 value).
    func Unix(sec int64, nsec int64) Time {
    	if nsec < 0 || nsec >= 1e9 {
    		n := nsec / 1e9
    		sec += n
    		nsec -= n * 1e9
    		if nsec < 0 {
    			nsec += 1e9
    			sec--
    		}
    	}
    	return unixTime(sec, int32(nsec))
    }
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>