release notes: https://github.com/opencontainers/runtime-spec/releases/tag/v1.1.0-rc.2
Additions
- config-linux: add support for rsvd hugetlb cgroup
- features: add features.md to formalize the runc features JSON
- config-linux: add support for time namespace
Minor fixes and documentation
- config-linux: clarify where device nodes can be created
- runtime: remove "When serialized in JSON, the format MUST adhere to the following pattern"
- Update CI to Go 1.20
- config: clarify Linux mount options
- config-linux: fix url error
- schema: fix schema for timeOffsets
- schema: remove duplicate keys
full diff: https://github.com/opencontainers/runtime-spec/compare/v1.1.0-rc.1...v1.1.0-rc.2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The OCI image-spec now also provides ArgsEscaped for backward compatibility
with the option used by Docker.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function included a defer to close the net.Conn if an error occurred,
but the calling function (SetExternalKey()) also had a defer to close it
unconditionally.
Rewrite it to use json.NewEncoder(), which accepts a writer, and inline
the code.
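As a rough illustration of the pattern described above (not the actual patch), the sketch below encodes a value straight onto a connection with json.NewEncoder(), and leaves closing the connection to a single owner. The `setKeyData` type, the `sendKey` and `roundTrip` helpers, and the in-memory `net.Pipe()` connection are all hypothetical stand-ins.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// setKeyData is a hypothetical payload, used only for this example.
type setKeyData struct {
	ContainerID string
	Key         string
}

// sendKey encodes data directly onto the connection. It never closes
// conn: the caller owns the connection and closes it exactly once.
func sendKey(conn net.Conn, data setKeyData) error {
	return json.NewEncoder(conn).Encode(data)
}

// roundTrip sends a key over an in-memory connection and returns the
// key as decoded on the other end.
func roundTrip(key string) (string, error) {
	client, server := net.Pipe()
	defer server.Close()

	errCh := make(chan error, 1)
	go func() {
		defer client.Close() // the single, unconditional close
		errCh <- sendKey(client, setKeyData{ContainerID: "example", Key: key})
	}()

	var got setKeyData
	if err := json.NewDecoder(server).Decode(&got); err != nil {
		return "", err
	}
	if err := <-errCh; err != nil {
		return "", err
	}
	return got.Key, nil
}

func main() {
	key, err := roundTrip("secret")
	fmt.Println(key, err)
}
```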
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's a no-op on Windows and other non-Linux, non-FreeBSD platforms,
so there's no need to register the re-exec.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just print the error and os.Exit() instead, which makes it more
explicit that we're exiting, and there's no need to decorate the
error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the function into a "backing" function that returns an error, and the
re-exec entrypoint, which handles the error, to provide a more idiomatic approach.
This was part of a larger change across multiple re-exec functions (now removed).
For history's sake, here's the description of that change:
The `reexec.Register()` function accepts reexec entrypoints, which are a `func()`
without return (matching a binary's `main()` function). As these functions cannot
return an error, it's the entrypoint's responsibility to handle any error, and to
indicate failures through `os.Exit()`.
I noticed that some of these entrypoint functions had `defer` statements, but
called `os.Exit()` either explicitly or implicitly (e.g. through `logrus.Fatal()`).
Deferred calls are not executed if `os.Exit()` is called, which rendered these
statements useless.
While I doubt these were problematic (I expect files to be closed when the process
exits, and `runtime.LockOSThread()` to not have side-effects after exit), it also
didn't seem to "hurt" to call these as was expected by the function.
This patch rewrites some of the entrypoints to split them into a "backing function"
that can return an error (being slightly more idiomatic Go) and a wrapper function
to act as entrypoint (which can handle the error and exit the executable).
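A minimal sketch of that split, assuming a made-up `run` backing function and `entrypoint` wrapper (neither name comes from the patch). The `cleanedUp` flag stands in for real cleanup such as closing a file; because `run` returns instead of exiting, its deferred call does execute.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// cleanedUp records whether the deferred cleanup ran (hypothetical,
// standing in for something like closing a file).
var cleanedUp bool

// run is the "backing" function: it may use defer and return an error.
func run(args []string) error {
	defer func() { cleanedUp = true }()
	if len(args) == 0 {
		return errors.New("missing argument")
	}
	fmt.Println("processing", args[0])
	return nil
}

// entrypoint has the func() shape of a re-exec entrypoint. It only
// reports the error and exits; by the time os.Exit() is called, run has
// already returned, so run's deferred calls have been executed.
func entrypoint() {
	if err := run(os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}

func main() {
	// Call the backing function directly so the example does not exit early.
	err := run(nil)
	fmt.Println("error:", err, "cleanup ran:", cleanedUp)
}
```

In real use, `entrypoint` would be what gets passed to `reexec.Register()`, while tests can call `run` directly and inspect the returned error.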
To some extent, I'm wondering if we should change the signatures of the entrypoints
to return an error so that `reexec.Init()` can handle (or return) the errors, so
that logging can be handled more consistently (currently, some use logrus,
some just print); this would also keep logging out of some packages, as well as
allow us to provide more metadata about the error (which reexec produced the
error, for example).
A quick search showed that there's some external consumers of pkg/reexec, so I
kept this for a future discussion / exercise.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Create dangling images for imported images which don't have a name
annotation attached. Previously the content got loaded, but no image
referencing it was created which caused it to be garbage collected
immediately.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
release notes: https://github.com/opencontainers/runc/releases/tag/v1.1.7
full diff: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
This is the seventh patch release in the 1.1.z release of runc, and is
the last planned release of the 1.1.z series. It contains a fix for
cgroup device rules with systemd when handling device rules for devices
that don't exist (though for devices whose drivers don't correctly
register themselves in the kernel -- such as the NVIDIA devices -- the
full fix only works with systemd v240+).
- When used with systemd v240+, systemd cgroup drivers no longer skip
DeviceAllow rules if the device does not exist (a regression introduced
in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
removing an extra warning emitted by runc run/start.
- The source code now has a new file, runc.keyring, which contains the keys
used to sign runc releases.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Verify the content to be equal, not "contains"; this output should be
predictable.
- Also verify the content returned by the function to match.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like the intent is to exclude Windows (which wouldn't have /etc/resolv.conf
nor systemd), but most tests would run fine elsewhere. This allows running the
tests on macOS for local testing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use t.TempDir() for convenience, and change some t.Fatal() calls to t.Error(),
so that all tests can run instead of failing early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test was assuming that the "source" file was always "/etc/resolv.conf",
but the `Get()` function uses `Path()` to find the location of resolv.conf,
which may be different.
While at it, also changed some `t.Fatalf()` to `t.Errorf()`, and renamed
some variables for clarity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After my last change, I noticed that the hash is used as a []byte in most
cases (other than tests). This patch updates the type to use a []byte, which
(although unlikely to be very important) also improves performance:
Compared to the previous version:

    benchstat new.txt new2.txt
    name         old time/op    new time/op    delta
    HashData-10     128ns ± 1%     116ns ± 1%   -9.77%  (p=0.000 n=20+20)

    name         old alloc/op   new alloc/op   delta
    HashData-10      208B ± 0%       88B ± 0%  -57.69%  (p=0.000 n=20+20)

    name         old allocs/op  new allocs/op  delta
    HashData-10      3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=20+20)

And compared to the original version:

    benchstat old.txt new2.txt
    name         old time/op    new time/op    delta
    HashData-10     201ns ± 1%     116ns ± 1%  -42.39%  (p=0.000 n=18+20)

    name         old alloc/op   new alloc/op   delta
    HashData-10      416B ± 0%       88B ± 0%  -78.85%  (p=0.000 n=20+20)

    name         old allocs/op  new allocs/op  delta
    HashData-10      6.00 ± 0%      2.00 ± 0%  -66.67%  (p=0.000 n=20+20)
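For illustration only, a hash helper that returns a []byte could look roughly like the sketch below. The "sha256:" prefix, the use of SHA-256, and the exact shape of `hashData` are assumptions made for the example, not a copy of the patched code. Writing the hex digits into a pre-sized buffer avoids the intermediate string that hex.EncodeToString() would allocate.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// prefix is an assumed algorithm prefix for the returned hash.
const prefix = "sha256:"

// hashData returns the prefixed, hex-encoded SHA-256 of data as a
// []byte, so callers can compare it with bytes.Equal without
// converting to and from string.
func hashData(data []byte) []byte {
	sum := sha256.Sum256(data)
	out := make([]byte, len(prefix)+hex.EncodedLen(len(sum)))
	copy(out, prefix)
	hex.Encode(out[len(prefix):], sum[:])
	return out
}

func main() {
	fmt.Printf("%s\n", hashData([]byte("nameserver 127.0.0.1\n")))
}
```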
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The code seemed overly complicated, requiring a reader to be constructed,
where in all cases, the data was already available in a variable. This patch
simplifies the utility to not require a reader, which also makes it a bit
more performant:

    go install golang.org/x/perf/cmd/benchstat@latest
    GO111MODULE=off go test -run='^$' -bench=. -count=20 > old.txt
    GO111MODULE=off go test -run='^$' -bench=. -count=20 > new.txt

    benchstat old.txt new.txt
    name         old time/op    new time/op    delta
    HashData-10     201ns ± 1%     128ns ± 1%  -36.16%  (p=0.000 n=18+20)

    name         old alloc/op   new alloc/op   delta
    HashData-10      416B ± 0%      208B ± 0%  -50.00%  (p=0.000 n=20+20)

    name         old allocs/op  new allocs/op  delta
    HashData-10      6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.000 n=20+20)

A small change was made in `Build()`, which previously returned the resolv.conf
data, even if the function failed to write it. In the new variation, `nil` is
consistently returned on failures.
Note that in various places, the hash is not even used, so we may be able to
simplify things more after this.
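The difference between the two shapes can be sketched as follows. Both functions are hypothetical reconstructions for comparison (including the "sha256:" prefix), not the code from before or after this patch: `hashReader` needs an io.Reader and a streaming hasher, while `hashData` works on bytes that are already in memory and cannot fail.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

// hashReader is the reader-based shape: callers that already hold the
// data must wrap it in a reader first, and must handle an error.
func hashReader(src io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, src); err != nil {
		return "", err
	}
	return "sha256:" + hex.EncodeToString(h.Sum(nil)), nil
}

// hashData is the simplified shape: it hashes the bytes directly.
func hashData(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// sameHash reports whether both shapes agree for the given input.
func sameHash(s string) bool {
	fromReader, err := hashReader(bytes.NewReader([]byte(s)))
	return err == nil && fromReader == hashData([]byte(s))
}

func main() {
	fmt.Println(sameHash("nameserver 127.0.0.1\n"))
}
```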
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As of Go 1.8, "net/http".Server provides facilities to close all
listeners, making the same facilities in server.Server redundant.
http.Server also improves upon server.Server by providing a facility to
wait for outstanding requests to complete after closing all listeners.
Leverage those facilities to give in-flight requests up to five seconds
to finish up after all containers have been shut down.
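A minimal sketch of that facility, assuming a throwaway server on a loopback port (the handler and the `serveAndShutdown` and `demo` helpers are made up for the example). http.Server.Shutdown() closes the listeners and then waits for in-flight requests until the context expires.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serveAndShutdown starts a server on a random loopback port, then
// shuts it down, giving in-flight requests up to grace to finish.
func serveAndShutdown(grace time.Duration) error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		}),
	}
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	// Shutdown closes all listeners, then waits for active requests
	// to complete or for ctx to expire, whichever comes first.
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	// After Shutdown, Serve returns http.ErrServerClosed.
	if err := <-serveErr; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo uses the five-second grace period mentioned above.
func demo() error {
	return serveAndShutdown(5 * time.Second)
}

func main() {
	fmt.Println("shutdown error:", demo())
}
```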
Signed-off-by: Cory Snider <csnider@mirantis.com>
- Use logrus.Fields instead of multiple WithField
- Split one giant debug log into one log per image
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Logging through a dependency-injected interface value was a vestige of
when Trap was in pkg/signal to avoid importing logrus in a reusable
package: cc4da81128.
Now that Trap lives under cmd/dockerd, nobody will be importing this so
we no longer need to worry about minimizing the package's dependencies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Always calling os.Exit() on clean shutdown may not always be desirable
as deferred functions are not run. Let the cleanup callback decide
whether or not to call os.Exit() itself. Allow the process to exit the
normal way, by returning from func main().
Simplify the trap.Trap implementation. The signal notifications are
buffered in a channel so there is little need to spawn a new goroutine
for each received signal. With all signals being handled in the same
goroutine, there are no longer any concurrency concerns around the
interrupt counter.
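The single-goroutine approach can be sketched roughly as below. This is not the trap.Trap implementation: the `forceExitAfter` threshold, the `handleSignals`, `trap` and `simulate` names, and the exit code are assumptions for illustration. Because every signal is read from the buffered channel by one goroutine, the counter is a plain int.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// forceExitAfter is a hypothetical threshold of repeated interrupts
// after which the process stops waiting for cleanup.
const forceExitAfter = 3

// handleSignals processes all signals in the calling goroutine, so the
// interrupt counter needs no locking or atomics.
func handleSignals(sigs <-chan os.Signal, cleanup func(), forceExit func(os.Signal)) {
	interrupts := 0
	for sig := range sigs {
		interrupts++
		switch {
		case interrupts == 1:
			// Run cleanup concurrently so later signals are still counted.
			// The cleanup callback decides whether to call os.Exit().
			go cleanup()
		case interrupts >= forceExitAfter:
			forceExit(sig)
			return
		}
	}
}

// trap wires handleSignals to real signals through a buffered channel.
func trap(cleanup func()) {
	sigs := make(chan os.Signal, forceExitAfter)
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)
	go handleSignals(sigs, cleanup, func(os.Signal) { os.Exit(1) })
}

// simulate feeds n interrupts through handleSignals without touching
// real signals, and reports what happened.
func simulate(n int) (cleaned, forced bool) {
	sigs := make(chan os.Signal, n)
	for i := 0; i < n; i++ {
		sigs <- os.Interrupt
	}
	close(sigs)
	done := make(chan struct{})
	handleSignals(sigs,
		func() { cleaned = true; close(done) },
		func(os.Signal) { forced = true })
	if n > 0 {
		<-done // wait for the cleanup goroutine to finish
	}
	return cleaned, forced
}

func main() {
	cleaned, forced := simulate(forceExitAfter)
	fmt.Println("cleanup ran:", cleaned, "forced exit:", forced)
}
```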
Signed-off-by: Cory Snider <csnider@mirantis.com>
The image store sends events when a new image is created/tagged. Using
it instead of the reference store makes sure we send the "tag" event
when a new image is built using buildx.
Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
Don't panic when processing containers created under the fork's containerd
integration (this field was added upstream and didn't exist in the
fork).
Co-authored-by: Djordje Lukic <djordje.lukic@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The fix to ignore SIGPIPE signals was originally added in the Go 1.4
era. signal.Ignore was first added in Go 1.5.
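For reference, a minimal sketch of ignoring SIGPIPE with signal.Ignore(); the `ignoreSIGPIPE` helper is made up for the example, and signal.Ignored() (available since Go 1.11) is used only to confirm the result.

```go
package main

import (
	"fmt"
	"os/signal"
	"syscall"
)

// ignoreSIGPIPE asks the runtime to ignore SIGPIPE and reports whether
// the signal is now ignored. No channel or goroutine is needed.
func ignoreSIGPIPE() bool {
	signal.Ignore(syscall.SIGPIPE)
	return signal.Ignored(syscall.SIGPIPE)
}

func main() {
	fmt.Println("SIGPIPE ignored:", ignoreSIGPIPE())
}
```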
Signed-off-by: Cory Snider <csnider@mirantis.com>
Since cc19eba (backported to v23.0.4), the PreferredPool for docker0 is
set only when the user provides the bip config parameter or when the
default bridge already exists. That means that if a user provides the
fixed-cidr parameter on a fresh install, or reboots their computer/server
without bip set, dockerd throws the following error when it starts:
> failed to start daemon: Error initializing network controller: Error
> creating default "bridge" network: failed to parse pool request for
> address space "LocalDefault" pool "" subpool "100.64.0.0/26": Invalid
> Address SubPool
See #45356.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>