The configDir (and "DOCKER_CONFIG" environment variable) is now only used
for the default location for TLS certificates to secure the daemon API.
It is a leftover from when the "docker" and "dockerd" CLI shared the same
binary, allowing the DOCKER_CONFIG environment variable to set the location
for certificates to be used by both.
This patch merges it into cmd/dockerd, which is where it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The cli package is a leftover from when the "docker" and "dockerd" cli
were both maintained in this repository; the only consumer of this is
now the dockerd CLI, so we can move this code (it should not be imported
by anyone).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Moby is not a Go module; to prevent anyone from mistakenly trying to
convert it to one before we are ready, introduce a check (usable in CI
and locally) for a go.mod file.
This is preferable to trying to .gitignore the file as we can ensure
that a mistakenly created go.mod is surfaced by Git-based tooling and is
less likely to surprise a contributor.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
To make the local build environment more correct and consistent, we
should never leave an uncommitted go.mod in the tree; however, we need a
go.mod for certain commands to work properly. Use a wrapper script to
create and destroy the go.mod as needed instead of potentially changing
tooling behavior by leaving it.
If a go.mod already exists, this script will warn and call the wrapped
command with GO111MODULE=on.
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
(*Daemon).registerLinks() calling the WriteHostConfig() method of its
container argument is a vestigial behaviour. In the distant past,
registerLinks() would persist the container links in an SQLite database
and drop the link config from the container's persisted HostConfig. This
changed in Docker v1.10 (#16032) which migrated away from SQLite and
began using the link config in the container's HostConfig as the
persistent source of truth. registerLinks() no longer mutates the
HostConfig at all so persisting the HostConfig to disk falls outside of
its scope of responsibilities.
Signed-off-by: Cory Snider <csnider@mirantis.com>
(*Container).CheckpointTo() upserts a snapshot of the container to the
daemon's in-memory ViewDB and also persists the snapshot to disk. It
does not register the live container object with the daemon's container
store, however. The ViewDB and container store are used as the source of
truth for different operations, so having a container registered in one
but not the other can result in inconsistencies. In particular, the List
Containers API uses the ViewDB as its source of truth and the Container
Inspect API uses the container store.
The (*Daemon).setHostConfig() method is called fairly early in the
process of creating a container, long before the container is registered
in the daemon's container store. Due to a rogue CheckpointTo() call
inside setHostConfig(), there is a window of time where a container can
be included in a List Containers API response but "not exist" according
to the Container Inspect API and similar endpoints which operate on a
particular container. Remove the rogue call so that the caller has full
control over when the container is checkpointed and update callers to
checkpoint explicitly. No changes to (*Daemon).create() are needed as it
checkpoints the fully-created container via (*Daemon).Register().
Fixes#44512.
Signed-off-by: Cory Snider <csnider@mirantis.com>
GetContainer() would return (nil, nil) when looking up a container
if the container was inserted into the containersReplica ViewDB but not
the containers Store at the time of the lookup. Callers which reasonably
assume that the returned err == nil implies returned container != nil
would dereference a nil pointer and panic. Change GetContainer() so that
it always returns a container or an error.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This allows differentiating how the detailed data is collected between
the containerd-integration code and the existing implementation.
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The List Images API endpoint has accepted multiple values for the
`since` and `before` filter predicates, but thanks to Go's randomizing
of map iteration order, it would pick an arbitrary image to compare
created timestamps against. In other words, the behaviour was undefined.
Change these filter predicates to have well-defined semantics: the
logical AND of all values for each of the respective predicates. As
timestamps are a totally-ordered relation, this is exactly equivalent to
applying the newest and oldest creation timestamps for the `since` and
`before` predicates, respectively.
Signed-off-by: Cory Snider <csnider@mirantis.com>
no changes in vendored code, but containerd v1.6.12 is a security release,
so updating, to prevent scanners marking the dependency to have a vulnerability.
full diff: https://github.com/containerd/containerd/compare/v1.6.11...v1.6.12
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
golang.org/x/net contains a fix for CVE-2022-41717, which was addressed
in stdlib in go1.19.4 and go1.18.9;
> net/http: limit canonical header cache by bytes, not entries
>
> An attacker can cause excessive memory growth in a Go server accepting
> HTTP/2 requests.
>
> HTTP/2 server connections contain a cache of HTTP header keys sent by
> the client. While the total number of entries in this cache is capped,
> an attacker sending very large keys can cause the server to allocate
> approximately 64 MiB per open connection.
>
> This issue is also fixed in golang.org/x/net/http2 v0.4.0,
> for users manually configuring HTTP/2.
full diff: https://github.com/golang/net/compare/v0.2.0...v0.4.0
other dependency updates (due to circular dependencies):
- golang.org/x/sys v0.3.0: https://github.com/golang/sys/compare/v0.2.0...v0.3.0
- golang.org/x/text v0.5.0: https://github.com/golang/text/compare/v0.4.0...v0.5.0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Fix nil pointer deference for Windows containers in CRI plugin
- Fix lease labels unexpectedly overwriting expiration
- Fix for simultaneous diff creation using the same parent snapshot
full diff: https://github.com/containerd/containerd/v1.6.10...v1.6.11
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Includes security fixes for net/http (CVE-2022-41717, CVE-2022-41720),
and os (CVE-2022-41720).
These minor releases include 2 security fixes following the security policy:
- os, net/http: avoid escapes from os.DirFS and http.Dir on Windows
The os.DirFS function and http.Dir type provide access to a tree of files
rooted at a given directory. These functions permitted access to Windows
device files under that root. For example, os.DirFS("C:/tmp").Open("COM1")
would open the COM1 device.
Both os.DirFS and http.Dir only provide read-only filesystem access.
In addition, on Windows, an os.DirFS for the directory \(the root of the
current drive) can permit a maliciously crafted path to escape from the
drive and access any path on the system.
The behavior of os.DirFS("") has changed. Previously, an empty root was
treated equivalently to "/", so os.DirFS("").Open("tmp") would open the
path "/tmp". This now returns an error.
This is CVE-2022-41720 and Go issue https://go.dev/issue/56694.
- net/http: limit canonical header cache by bytes, not entries
An attacker can cause excessive memory growth in a Go server accepting
HTTP/2 requests.
HTTP/2 server connections contain a cache of HTTP header keys sent by
the client. While the total number of entries in this cache is capped,
an attacker sending very large keys can cause the server to allocate
approximately 64 MiB per open connection.
This issue is also fixed in golang.org/x/net/http2 vX.Y.Z, for users
manually configuring HTTP/2.
Thanks to Josselin Costanzi for reporting this issue.
This is CVE-2022-41717 and Go issue https://go.dev/issue/56350.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.19.4
And the milestone on the issue tracker:
https://github.com/golang/go/issues?q=milestone%3AGo1.19.4+label%3ACherryPickApproved
Full diff: https://github.com/golang/go/compare/go1.19.3...go1.19.4
The golang.org/x/net fix is in 1e63c2f08a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Kevin has been doing a lot of work helping modernize our CI, and helping
with BuildKit-related changes. Being our "resident GitHub actions expert",
I think it make sense to add him to the maintainers list as a curator (for
starters), to grant him permissions on the repository.
Perhaps we should nominate him to become a maintainer (with a focus on his
area of expertise) :)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This addresses a regression introduced in 407e3a4552,
which turned out to be "too strict", as there's old images that use, for example;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
unsupported media type application/octet-stream
Before 407e3a4552, such mediatypes were accepted;
docker pull python:3.5.1-alpine
3.5.1-alpine: Pulling from library/python
e110a4a17941: Pull complete
30dac23631f0: Pull complete
202fc3980a36: Pull complete
Digest: sha256:f88925c97b9709dd6da0cb2f811726da9d724464e9be17a964c70f067d2aa64a
Status: Downloaded newer image for python:3.5.1-alpine
docker.io/library/python:3.5.1-alpine
This patch copies the additional media-types, using the list of types that
were added in a215e15cb1, which fixed a
similar issue.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This syncs the seccomp-profile with the latest changes in containerd's
profile, applying the same changes as 17a9324035
Some background from the associated ticket:
> We want to use vsock for guest-host communication on KubeVirt
> (https://github.com/kubevirt/kubevirt). In KubeVirt we run VMs in pods.
>
> However since anyone can just connect from any pod to any VM with the
> default seccomp settings, we cannot limit connection attempts to our
> privileged node-agent.
>
> ### Describe the solution you'd like
> We want to deny the `socket` syscall for the `AF_VSOCK` family by default.
>
> I see in [1] and [2] that AF_VSOCK was actually already blocked for some
> time, but that got reverted since some architectures support the `socketcall`
> syscall which can't be restricted properly. However we are mostly interested
> in `arm64` and `amd64` where limiting `socket` would probably be enough.
>
> ### Additional context
> I know that in theory we could use our own seccomp profiles, but we would want
> to provide security for as many users as possible which use KubeVirt, and there
> it would be very helpful if this protection could be added by being part of the
> DefaultRuntime profile to easily ensure that it is active for all pods [3].
>
> Impact on existing workloads: It is unlikely that this will disturb any existing
> workload, becuase VSOCK is almost exclusively used for host-guest commmunication.
> However if someone would still use it: Privileged pods would still be able to
> use `socket` for `AF_VSOCK`, custom seccomp policies could be applied too.
> Further it was already blocked for quite some time and the blockade got lifted
> due to reasons not related to AF_VSOCK.
>
> The PR in KubeVirt which adds VSOCK support for additional context: [4]
>
> [1]: https://github.com/moby/moby/pull/29076#commitcomment-21831387
> [2]: dcf2632945
> [3]: https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads
> [4]: https://github.com/kubevirt/kubevirt/pull/8546
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This commit allows to remove dependency on the mutable version armon/go-radix.
The go-immutable-radix package is better maintained.
It is likely that a bit more memory will be used when using the
immutable version, though discarded nodes are being reused in a pool.
These changes happen when networks are added/removed or nodes come and
go in a cluster, so we are still talking about a relatively low
frequency event.
The major changes compared to the old radix are when modifying (insert
or delete) a tree, and those are pretty self-contained: we replace the
entire immutable tree under a lock.
Signed-off-by: Tibor Vass <teabee89@gmail.com>