During review, it was decided to remove `LimitNOFILE` from `docker.service` to rely on the systemd v240 implicit default of `1024:524288`. On supported platforms with systemd prior to v240, packagers will patch the service with an explicit `LimitNOFILE=1024:524288`.
- `1024` soft limit is an implicit default, avoiding unexpected breakage. Software that needs a higher limit should request to raise the soft limit for its process.
- `524288` hard limit is an implicit default since systemd v240 and is adequate for most processes (_half of the historical limit from `fs.nr_open` of `1048576`_), while 4096 is the implicit default from the kernel (often too low). Individual containers can be started with `--ulimit` when a larger hard limit is required.
- The hard limit may not exceed `fs.nr_open` (_which a value of `infinity` will resolve to_). On most systems with systemd v240 or newer, this will resolve to an excessive size of 2^30 (over 1 billion).
- When set to `infinity` (usually as the soft limit) software may experience significantly increased resource usage, resulting in a performance regression or runtime failures that are difficult to troubleshoot.
- OpenRC current config approach lacks support for different soft/hard limits being set as it adjusts additional limits and `ulimit` does not support mixed usage of `-H` + `-S`. A soft limit of `524288` is not ideal, but 2^19 is much less overhead than 2^30, whilst a hard limit of 4096 would be problematic for Docker.
Signed-off-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
These tests were made parallel to speed up the execution, but this
turned out to be flaky, because they mutate some shared state.
The tests use shared `storage` variable without any synchronization.
However, adding synchronization is not enough in all cases, some tests
register the same plugin, so they can't be run in parallel to each
other.
This commit adds the synchronization around `storage` variable
modification and removes parallel from the tests where it's not enough.
Before:
```
$ go test -race -v . -count 1
...
--- FAIL: TestGet (15.02s)
--- FAIL: TestGet/not_implemented (0.00s)
testing.go:1446: race detected during execution of test
testing.go:1446: race detected during execution of test
FAIL
FAIL github.com/docker/docker/pkg/plugins 17.655s
FAIL
```
After:
```
$ go test -race -v . -count 1
ok github.com/docker/docker/pkg/plugins 32.702s
```
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This function is called by `daemon.containerCreate()` which is already
wrapping errors coming from `verifyNetworkingConfig()` with
`errdefs.InvalidParameter()`. So `verifyNetworkingConfig()` should only
return standard errors.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
"HEAD" will still be used as a version if no DOCKER_COMMIT is provided
(for example when not running via `make`), but it won't prevent it being
set to the GITHUB_SHA variable when it's present.
This should fix `Git commit` reported by `docker version` for the
binaries generated by `moby-bin`.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
- store network.Name() in a variable to reduce repeatedly locking/unlocking
of the network (although this is very, very minimal in the grand scheme
of things).
- un-wrap long conditions
- ever so slightly optimise some conditions by changeing the order of checks.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was initializing a new PortBinding, and creating a deep copy
for each binding. It's unclear what the intent was here, but at least
PortBinding.GetCopy() wasn't adding much value, as it created a new
PortBinding, [copying all values from the original][1], which includes
a [copy of IPAddresses in it][2]. Our original "template" did not have any
of that, so let's forego that, and just create new PortBindings as we go.
[1]: 454b6a7cf5/libnetwork/types/types.go (L110-L120)
[2]: 454b6a7cf5/libnetwork/types/types.go (L236-L244)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were not adding much, so just getting rid of them. Also added a
TODO to move this code to the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move variables closer to where they're used instead of defining them all
at the start of the function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`getPortMapInfo` does many things; it creates a copy of all the sandbox
endpoints, gets the driver, endpoints, and network from store, and creates
port-bindings for all exposed and mapped ports.
We should look if we can create a more minimal implementation for this
purpose, but in the meantime, let's prevent it being called if we don't
need it by making it the second condition in the check.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- don't initialize slices; it's not needed to append to them
- store network-ID in a var to prevent repeated lock/unlocking in nw.ID()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The original code in container.Exec was potentially leaking the copy
goroutine when the context was cancelled or timed out. The new
`demultiplexStreams()` function won't return until the goroutine has
finished its work, and to ensure that it takes care of closing the
hijacked connection.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Includes a fix for CVE-2023-29409
go1.20.7 (released 2023-08-01) includes a security fix to the crypto/tls
package, as well as bug fixes to the assembler and the compiler. See the
Go 1.20.7 milestone on our issue tracker for details:
- https://github.com/golang/go/issues?q=milestone%3AGo1.20.7+label%3ACherryPickApproved
- full diff: https://github.com/golang/go/compare/go1.20.6...go1.20.7
From the mailing list announcement:
[security] Go 1.20.7 and Go 1.19.12 are released
Hello gophers,
We have just released Go versions 1.20.7 and 1.19.12, minor point releases.
These minor releases include 1 security fixes following the security policy:
- crypto/tls: restrict RSA keys in certificates to <= 8192 bits
Extremely large RSA keys in certificate chains can cause a client/server
to expend significant CPU time verifying signatures. Limit this by
restricting the size of RSA keys transmitted during handshakes to <=
8192 bits.
Based on a survey of publicly trusted RSA keys, there are currently only
three certificates in circulation with keys larger than this, and all
three appear to be test certificates that are not actively deployed. It
is possible there are larger keys in use in private PKIs, but we target
the web PKI, so causing breakage here in the interests of increasing the
default safety of users of crypto/tls seems reasonable.
Thanks to Mateusz Poliwczak for reporting this issue.
View the release notes for more information:
https://go.dev/doc/devel/release#go1.20.7
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it slightly clearer what it does, as "resolve" may give the
impression it's doing more than just returning the TLS config configured
for the client.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
fallbackDial was only used in a single place, and it was defined far away
from where it's used, so let's inline it, so that it's clear at a glance
what we're doing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The is-automated field is being deprecated by Docker Hub's search API,
and will always be set to "false" in future.
This patch deprecates the field and related filter for the Engine's API.
In future, the `is-automated` filter will no longer yield any results
when searching for `is-automated=true`, and will be ignored when
searching for `is-automated=false`.
Given that this field is deprecated by an external API, the deprecation
will not be versioned, and will apply to any API version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I noticed this was always being skipped because of race conditions
checking the logs.
This change adds a log scanner which will look through the logs line by
line rather than allocating a big buffer.
Additionally it adds a `poll.Check` which we can use to actually wait
for the desired log entry.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPv4AddrNoMatchError and IPv6AddrNoMatchError are currently implementing
BadRequestError. They are returned in two cases, and none are due to a
bad user request:
- When calling daemon's CreateNetwork route, if the bridge's IPv4
address or none of the bridge's IPv6 addresses match what's requested.
If that happens, there's a big issue somewhere in libnetwork or the
kernel.
- When restoring a network, for the same reason. In that case, the
on-disk state drifted from the interface state.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error can only be reached because of an error in our code, so it's
not a "bad user request". As it's never type asserted, no need to keep
it around.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error is only used in defensive checks whereas the precondition is
already checked by caller. If we reach it, we messed something else. So
it's definitely not a BadRequest. Also, it's not type asserted anywhere,
so just inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>