Linux kernel prior to v3.16 was not supporting netns for vxlan
interfaces. As such, moby/libnetwork#821 introduced a "host mode" to the
overlay driver. The related kernel fix is available for rhel7 users
since v7.2.
This mode could be forced through the use of the env var
_OVERLAY_HOST_MODE. However this env var has never been documented and
is not referenced in any blog post, so there's little chance many people
rely on it. Moreover, this host mode is deemed as an implementation
details by maintainers. As such, we can consider it dead and we can
remove it without a prior deprecation warning.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since 0fa873c, there's no function writing overlay networks to some
datastore. As such, overlay network struct doesn't need to implement
KVObject interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since a few commits, subnet's vni don't change during the lifetime of
the subnet struct, so there's no need to lock the network before
accessing it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since the previous commit, data from the local store are never read,
thus proving it was only used for Classic Swarm.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver in Swarm v2 mode doesn't support live-restore, ie.
the daemon won't even start if the node is part of a Swarm cluster and
live-restore is enabled. This feature was only used by Swarm Classic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
VNI allocations made by the overlay driver were only used by Classic
Swarm. With Swarm v2 mode, the driver ovmanager is responsible of
allocating & releasing them.
Previously, vxlanIdm was initialized when a global store was available
but since 142b522, no global store can be instantiated. As such,
releaseVxlanID actually does actually nothing and iptables rules are
never removed.
The last line of dead code detected by golangci-lint is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Prior to 0fa873c, the serf-based event loop was started when a global
store was available. Since there's no more global store, this event loop
and all its associated code is dead.
Most dead code detected by golangci-lint in prior commits is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- LocalKVProvider, LocalKVProviderURL, LocalKVProviderConfig,
GlobalKVProvider, GlobalKVProviderURL and GlobalKVProviderConfig
are all unused since moby/libnetwork@be2b6962 (moby/libnetwork#908).
- GlobalKVClient is unused since 0fa873c and c8d2c6e.
- MakeKVProvider, MakeKVProviderURL and MakeKVProviderConfig are unused
since 96cfb076 (moby/moby#44683).
- MakeKVClient is unused since 142b5229 (moby/moby#44875).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is unused anymore since 142b522 (moby/moby#44875).
It was also creating a local store whenever netlabel.LocalKVClient was
specificed in its config argument. This store is unused since
moby/libnetwork@9e72136 (moby/libnetwork#1636).
Finally, the sync.Once properties are never used and thus can be
deleted.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is not used anymore since 142b522 (moby/moby#44875).
golangci-lint now detects dead code. This will be fixed in subsequent
commits.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This command was useful when overlay networks based on external KV store
was developed but is unused nowadays.
As the last reference to OverlayBindInterface and OverlayNeighborIP
netlabels are in the ovrouter cmd, they're removed too.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This only makes the containerd ImageService implementation respect
context cancellation though.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Implement Children method for containerd image store which makes the
`ancestor` filter work for `docker ps`. Checking if image is a children
of other image is implemented by comparing their rootfs diffids because
containerd image store doesn't have a concept of image parentship like
the graphdriver store. The child is expected to have more layers than
the parent and should start with all parent layers.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Before:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Use CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variables are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
After:
```console
$ docker-rootless-setuptool.sh install
...
[INFO] Using CLI context "rootless"
Current context is now "rootless"
[INFO] Make sure the following environment variable(s) are set (or add them to ~/.bashrc):
export PATH=/usr/local/bin:$PATH
[INFO] Some applications may require the following environment variable too:
export DOCKER_HOST=unix:///run/user/1001/docker.sock
```
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Drop support for platforms which only have xt_u32 but not xt_bpf. No
attempt is made to clean up old xt_u32 iptables rules left over from a
previous daemon instance.
Signed-off-by: Cory Snider <csnider@mirantis.com>
go1.20.3 (released 2023-04-04) includes security fixes to the go/parser,
html/template, mime/multipart, net/http, and net/textproto packages, as well
as bug fixes to the compiler, the linker, the runtime, and the time package.
See the Go 1.20.3 milestone on our issue tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.20.3+label%3ACherryPickApproved
full diff: https://github.com/golang/go/compare/go1.20.2...go1.20.3
Further details from the announcement on the mailing list:
We have just released Go versions 1.20.3 and 1.19.8, minor point releases.
These minor releases include 4 security fixes following the security policy:
- go/parser: infinite loop in parsing
Calling any of the Parse functions on Go source code which contains `//line`
directives with very large line numbers can cause an infinite loop due to
integer overflow.
Thanks to Philippe Antoine (Catena cyber) for reporting this issue.
This is CVE-2023-24537 and Go issue https://go.dev/issue/59180.
- html/template: backticks not treated as string delimiters
Templates did not properly consider backticks (`) as Javascript string
delimiters, and as such did not escape them as expected. Backticks are
used, since ES6, for JS template literals. If a template contained a Go
template action within a Javascript template literal, the contents of the
action could be used to terminate the literal, injecting arbitrary Javascript
code into the Go template.
As ES6 template literals are rather complex, and themselves can do string
interpolation, we've decided to simply disallow Go template actions from being
used inside of them (e.g. "var a = {{.}}"), since there is no obviously safe
way to allow this behavior. This takes the same approach as
github.com/google/safehtml. Template.Parse will now return an Error when it
encounters templates like this, with a currently unexported ErrorCode with a
value of 12. This ErrorCode will be exported in the next major release.
Users who rely on this behavior can re-enable it using the GODEBUG flag
jstmpllitinterp=1, with the caveat that backticks will now be escaped. This
should be used with caution.
Thanks to Sohom Datta, Manipal Institute of Technology, for reporting this issue.
This is CVE-2023-24538 and Go issue https://go.dev/issue/59234.
- net/http, net/textproto: denial of service from excessive memory allocation
HTTP and MIME header parsing could allocate large amounts of memory, even when
parsing small inputs.
Certain unusual patterns of input data could cause the common function used to
parse HTTP and MIME headers to allocate substantially more memory than
required to hold the parsed headers. An attacker can exploit this behavior to
cause an HTTP server to allocate large amounts of memory from a small request,
potentially leading to memory exhaustion and a denial of service.
Header parsing now correctly allocates only the memory required to hold parsed
headers.
Thanks to Jakob Ackermann (@das7pad) for discovering this issue.
This is CVE-2023-24534 and Go issue https://go.dev/issue/58975.
- net/http, net/textproto, mime/multipart: denial of service from excessive resource consumption
Multipart form parsing can consume large amounts of CPU and memory when
processing form inputs containing very large numbers of parts. This stems from
several causes:
mime/multipart.Reader.ReadForm limits the total memory a parsed multipart form
can consume. ReadForm could undercount the amount of memory consumed, leading
it to accept larger inputs than intended. Limiting total memory does not
account for increased pressure on the garbage collector from large numbers of
small allocations in forms with many parts. ReadForm could allocate a large
number of short-lived buffers, further increasing pressure on the garbage
collector. The combination of these factors can permit an attacker to cause an
program that parses multipart forms to consume large amounts of CPU and
memory, potentially resulting in a denial of service. This affects programs
that use mime/multipart.Reader.ReadForm, as well as form parsing in the
net/http package with the Request methods FormFile, FormValue,
ParseMultipartForm, and PostFormValue.
ReadForm now does a better job of estimating the memory consumption of parsed
forms, and performs many fewer short-lived allocations.
In addition, mime/multipart.Reader now imposes the following limits on the
size of parsed forms:
Forms parsed with ReadForm may contain no more than 1000 parts. This limit may
be adjusted with the environment variable GODEBUG=multipartmaxparts=. Form
parts parsed with NextPart and NextRawPart may contain no more than 10,000
header fields. In addition, forms parsed with ReadForm may contain no more
than 10,000 header fields across all parts. This limit may be adjusted with
the environment variable GODEBUG=multipartmaxheaders=.
Thanks to Jakob Ackermann for discovering this issue.
This is CVE-2023-24536 and Go issue https://go.dev/issue/59153.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
While we currently do not provide an option to specify the snapshotter to use
for individual containers (we may want to add this option in future), currently
it already is possible to configure the snapshotter in the daemon configuration,
which could (likely) cause issues when changing and restarting the daemon.
This patch updates some code-paths that have the container available to use
the snapshotter that's configured for the container (instead of the default
snapshotter configured).
There are still code-paths to be looked into, and a tracking ticket as well as
some TODO's were added for those.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previously, the AWSLogs driver attempted to implement
non-blocking itself. Non-blocking is supposed to
implemented solely by the Docker RingBuffer that
wraps the log driver.
Please see issue and explanation here:
https://github.com/moby/moby/issues/45217
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
Prevent the daemon from erroring out if daemon.json contains default
network options for network drivers aside from bridge. Configuring
defaults for the bridge driver previously worked by coincidence because
the unrelated CLI flag '--bridge' exists.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Closing stdin of a container or exec (a.k.a.: task or process) has been
somewhat broken ever since support for ContainerD 1.0 was introduced
back in Docker v17.11: the error returned from the CloseIO() call was
effectively ignored due to it being assigned to a local variable which
shadowed the intended variable. Serendipitously, that oversight
prevented a data race. In my recent refactor of libcontainerd, I
corrected the variable shadowing issue and introduced the aforementioned
data race in the process.
Avoid deadlocking when closing stdin without swallowing errors or
introducing data races by calling CloseIO() synchronously if the process
handle is available, falling back to an asynchronous close-and-log
strategy otherwise. This solution is inelegant and complex, but looks to
be the best that could be done without changing the libcontainerd API.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- use apiClient for api-clients to reduce shadowing (also more "accurate")
- use "ctr" instead of "container"
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Attestation manifests have an OCI image media type, which makes them
being listed like they were a separate platform supported by the image.
Don't use `images.Platforms` and walk the manifest list ourselves
looking for all manifests that are an actual image manifest.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
commit 3c69b9f2c5 replaced these functions
and types with github.com/moby/patternmatcher. That commit has shipped with
docker 23.0, and BuildKit v0.11 no longer uses the old functions, so we can
remove these.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The 'Deprecated:' line in NewClient's doc comment was not in a new
paragraph, so GoDoc, linters, and IDEs were unaware that it was
deprecated. The package documentation also continued to reference
NewClient. Update the doc comments to finish documenting that NewClient
is deprecated.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Distribution source labels don't store port of the repository. If the
content was obtained from repository 172.17.0.2:5000 then its
corresponding label will have a key "containerd.io/distribution.source.172.17.0.2".
Fix the check in canBeMounted to ignore the :port part of the domain.
This also removes the check which prevented insecure repositories to use
cross-repo mount - the real cause was the mismatch in domain comparison.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Distribution source label can specify multiple repositories - in this
case value is a comma separated list of source repositories.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>