Slightly refactor Resolver.dialExtDNS:
- use net.JoinHostPort to properly format IPv6 addresses
- define a const for the default port, and avoid int -> string
conversion if no custom port is defined
- slightly simplify logic if the HostLoopback is used (at the cost of
duplicating one line); in that case we don't need to define the closure
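For illustration, a minimal sketch of the address formatting (this is not the actual dialExtDNS code; `defaultDNSPort` and `extDNSAddr` are made-up names, and treating port 0 as "no custom port" is an assumption):

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// defaultDNSPort is a hypothetical name for the default-port const. Keeping
// it a string avoids an int -> string conversion in the common case.
const defaultDNSPort = "53"

// extDNSAddr is a hypothetical helper that formats the "host:port" address
// of an external DNS server. A port of 0 is assumed to mean "no custom port".
func extDNSAddr(ip string, port int) string {
	if port == 0 {
		return net.JoinHostPort(ip, defaultDNSPort)
	}
	return net.JoinHostPort(ip, strconv.Itoa(port))
}

func main() {
	fmt.Println(extDNSAddr("8.8.8.8", 0))
	// net.JoinHostPort wraps IPv6 literals in square brackets.
	fmt.Println(extDNSAddr("2001:4860:4860::8888", 5353))
}
```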
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in 36fd9d02be
(libnetwork: ce6c6e8c35),
because there were multiple places where a DNS response was created,
which had to use the same options. However, new "common" options were
added since, and having it in a function separate from the other (also
common) options was just hiding logic, so let's remove it.
What the above probably _should_ have done was to create a common utility
to create a DNS response (as all other options are shared as well). This
was actually done in 0c22e1bd07 (libnetwork:
be3531759b),
which added a `createRespMsg` utility, but missed that it could be used
for both cases.
This patch:
- removes the setCommonFlags function
- uses createRespMsg instead to share common options
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Removes the deprecated consts, which moved to a separate "scope" package
in commit 6ec03d6745, and are no longer used;
- datastore.LocalScope
- datastore.GlobalScope
- datastore.SwarmScope
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
UnavailableError is now compatible with errdefs.UnavailableError. These
errors will now return a 503 instead of a 500.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
InvalidParameter is now compatible with errdefs.InvalidParameter. Thus,
these errors will now return a 400 status code instead of a 500.
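A minimal sketch of why matching the errdefs interfaces changes the status code, assuming errors are recognised through marker methods such as `InvalidParameter()` and `Unavailable()`. The error types and the `statusFor` mapping below are simplified local stand-ins, not the real libnetwork or API code:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// invalidParamErr and unavailableErr are local stand-ins for the libnetwork
// error types; only the marker methods matter here.
type invalidParamErr string

func (e invalidParamErr) Error() string { return string(e) }
func (e invalidParamErr) InvalidParameter() {}

type unavailableErr string

func (e unavailableErr) Error() string { return string(e) }
func (e unavailableErr) Unavailable() {}

// Example errors used below: one plain error, one wrapped typed error.
var (
	errPlain   = errors.New("boom")
	errWrapped = fmt.Errorf("wrapped: %w", unavailableErr("store not ready"))
)

// statusFor is a hypothetical, much reduced error -> HTTP status mapping.
func statusFor(err error) int {
	var (
		invalid     interface{ InvalidParameter() }
		unavailable interface{ Unavailable() }
	)
	switch {
	case errors.As(err, &invalid):
		return http.StatusBadRequest
	case errors.As(err, &unavailable):
		return http.StatusServiceUnavailable
	default:
		// Errors without a known marker fall back to a 500.
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(statusFor(invalidParamErr("bad subnet")))
	fmt.Println(statusFor(errWrapped))
	fmt.Println(statusFor(errPlain))
}
```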
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- un-export ZoneSettings, because it's only used internally
- make conversion to an "interface" slice a method on the struct
- remove the getDockerZoneSettings() function, and move the type-definition
close to where it's used, as it was only used in a single location
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This test didn't make a lot of sense, because `checkRunning()` depends on
the `connection` package-var being set, which is done by `firewalldInit()`,
so would never be true on its own.
Add a small utility that opens its own D-Bus connection to verify if
firewalld is running, and otherwise skips the tests (preserving any
error in the process).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
DelInterfaceFirewalld returns an error if the interface to delete was
not found. Let's ignore cases where we were successfully able to get
the list of interfaces in the zone, but the interface was not part of
the zone.
This patch changes the error for these cases to an errdefs.ErrNotFound,
and updates IPTable.ProgramChain to ignore those errors.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's used in various defers, but was using `err` as name, which can be
confusing, and increases the risk of accidentally shadowing the error.
This patch:
- introduces a `retErr` output variable, to be used in defer statements.
- explicitly changes some `err` uses to locally-scoped variables.
- moves some variable definitions closer to where they're used (where possible).
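A minimal sketch of the pattern, not the actual controller code (`createThing` and `step` are made-up names): the defer inspects the named `retErr` return, while intermediate errors live in locally-scoped variables.

```go
package main

import (
	"errors"
	"fmt"
)

// createThing is a hypothetical function; rolledBack records whether the
// deferred cleanup ran.
func createThing(failSecondStep bool, rolledBack *bool) (retErr error) {
	// The defer looks at the named return value, so it sees whatever error
	// is finally returned, regardless of the local "err" variables below.
	defer func() {
		if retErr != nil {
			*rolledBack = true
		}
	}()

	// Locally-scoped err: cannot be confused with the returned error.
	if err := step(false); err != nil {
		return fmt.Errorf("first step: %w", err)
	}
	if err := step(failSecondStep); err != nil {
		return fmt.Errorf("second step: %w", err)
	}
	return nil
}

// step is a placeholder for real work.
func step(fail bool) error {
	if fail {
		return errors.New("failed")
	}
	return nil
}

func main() {
	var rolledBack bool
	err := createThing(true, &rolledBack)
	fmt.Println(err, rolledBack)
}
```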
While working on this change, there was one point in the code where
error handling was ambiguous. I added a note for that, in case this
was not a bug:
> This code was previously assigning the error to the global "err"
> variable (before it was renamed to "retErr"), but in case of a
> "MaskableError" did not *return* the error:
> b325dcbff6/libnetwork/controller.go (L566-L573)
>
> Depending on code paths further down, that meant that this error
> was either overwritten by other errors (and thus not handled in
> defer statements) or handled (if no other code was overwriting it).
>
> I suspect this was a bug (but possibly without effect), but it could
> have been intentional. This logic is confusing at least, and even
> more so combined with the handling in defer statements that check for
> both the "err" return AND "skipCfgEpCount":
> b325dcbff6/libnetwork/controller.go (L586-L602)
>
> To save future visitors some time to dig up history:
>
> - config-only networks were added in 25082206df
> - the special error-handling and "skipCfgEpCount" was added in ddd22a8198
> - and updated in 87b082f365 to not use string-matching
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There were quite some places where the type collided with variables
named `agent`. Let's rename the type.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function has _four_ output variables of the same type, and several
defer statements that checked the error returned (but using the `err`
variable).
This patch names the return variables to make it clearer what's being
returned, and renames the error-return to `retErr` to make it clearer
where we're dealing with the returned error (and not any local err), to
prevent accidentally shadowing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
There's nothing handling these results, and they're logged as debug-logs,
so we may as well remove the returned variables.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both functions were generating debug logs if there was nothing to log.
The function already produces logs if things failed while deleting entries,
so these logs would only be printed if there was nothing to delete, so can
safely be discarded.
Before this change:
DEBU[2023-08-14T12:33:23.082052638Z] Revoking external connectivity on endpoint sweet_swirles (1519f9376a3abe7a1c981600c25e8df6bbd0a3bc3a074f1c2b3bcbad0438443b)
DEBU[2023-08-14T12:33:23.085782847Z] DeleteConntrackEntries purged ipv4:0, ipv6:0
DEBU[2023-08-14T12:33:23.085793847Z] DeleteConntrackEntriesByPort for udp ports purged ipv4:0, ipv6:0
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only called as part of the "libnetwork-setkey" re-exec, so un-exporting
it to make clear it's not for external use.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The basepath is only used on Linux, so no need to call it on other
platforms. SetBasePath was already stubbed out on other platforms,
but "osl" was still imported in various places where it was not actually
used, so trying to reduce imports to get a better picture of what parts
are used (and not used).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Some tests were implicitly skipped through the `getTestEnv()` utility,
which made it hard to discover they were not run on Windows.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to spot if code is only used on Linux. Note that "all of"
the bridge driver is Linux-only.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The mutex is only used on reads, but there's nothing protecting writes,
and it looks like nothing is mutating fields after creation, so let's
remove this altogether.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
No context in the commit that added it, but PR discussion shows that
the API was mostly exploratory, and it was 8 years ago, so let's not
head in that direction :) b646784859
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that we removed the interface, there's no need to cast the Network
to a NetworkInfo interface, so we can remove uses of the `Info()` method.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These errors aren't used in our repo and seem unused by the OSS
community (this was checked with Sourcegraph).
- ErrIpamInternalError has never been used
- ErrInvalidRequest is unused since moby/libnetwork@c85356efa
- ErrPoolNotFound has never been used
- ErrOverlapPool has never been used
- ErrNoAvailablePool has never been used
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Check the preferredPool first, as other checks could be doing more
(such as locking, or validating / parsing). Also adding a note, as
it's unclear why we're ignoring invalid pools here.
The "invalid" condition was added in [libnetwork#1095][1], which
moved code to reduce os-specific dependencies in the ipam package,
but also introduced a types.IsIPNetValid() function, which considers
"0.0.0.0/0" invalid, and added it to the condition to return early.
Unfortunately review does not mention this change, so there's no
context why. Possibly this was done to prevent errors further down
the line (when checking for overlaps), but returning an error here
instead would likely have avoided that as well, so we can only guess.
To make this code slightly more transparent, this patch also inlines
the "types.IsIPNetValid" function, as it's not used anywhere else,
and inlining it makes it more visible.
[1]: 5ca79d6b87 (diff-bdcd879439d041827d334846f9aba01de6e3683ed8fdd01e63917dae6df23846)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This code was only run if no preferred pool was specified, however,
since [libnetwork#1162][2], the function would already return early
if a preferred pool was set (and the overlap check would be skipped),
so this was now just dead code.
[2]: 9cc3385f44
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function intentionally holds a lock / lease on address-pools to
prevent trying the same pool repeatedly.
Let's try to make this logic slightly more transparent, and prevent
defining defers in a loop. Releasing all the pools in a single defer
also allows us to get the network-name once, which prevents locking
and unlocking the network for each iteration.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both functions have multiple output vars with generic types, which made
it hard to grasp what's what.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to consume, without first having to create an empty
PoolID.
Performance is the same:
BenchmarkPoolIDFromString-10 6100345 196.5 ns/op 112 B/op 3 allocs/op
BenchmarkPoolIDFromString-10 6252750 192.0 ns/op 112 B/op 3 allocs/op
Note that I opted not to change the return-type to a pointer, as that seems
to perform worse;
BenchmarkPoolIDFromString-10 6252750 192.0 ns/op 112 B/op 3 allocs/op
BenchmarkPoolIDFromString-10 5288682 226.6 ns/op 192 B/op 4 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
As this function may be called repeatedly to convert to/from a string,
it may be worth optimizing it a bit. Adding a minimal Benchmark for
it as well.
Before/after:
BenchmarkPoolIDToString-10 2842830 424.3 ns/op 232 B/op 12 allocs/op
BenchmarkPoolIDToString-10 7176738 166.8 ns/op 112 B/op 7 allocs/op
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
network.requestPoolHelper and Allocator.RequestPool have many args and
output vars with generic types. Add names for them to make it easier to
grasp what's what.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The options are unused, other than for debug-logging, which made it look
as if they were actually consumed anywhere, but they aren't.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it slightly more readable to see what's returned in each of
the code-paths. Also move validation of pool/subpool earlier in the
function.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Collect a list of all the links we successfully enabled (if any), and
use a single defer to disable them.
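Roughly, the pattern looks like the sketch below. This is not the bridge driver's code: `enableLinks`, the callbacks, and `errEnable` are made-up names, and it assumes the links only need to be disabled again when the function fails.

```go
package main

import (
	"errors"
	"fmt"
)

// errEnable is a sample failure used in the example below.
var errEnable = errors.New("cannot enable link")

// enableLinks is a hypothetical helper: it enables links one by one, and a
// single defer disables the ones that succeeded if a later one fails.
func enableLinks(links []string, enable func(string) error, disable func(string)) (retErr error) {
	var enabled []string
	defer func() {
		if retErr == nil {
			return
		}
		for _, l := range enabled {
			disable(l)
		}
	}()

	for _, l := range links {
		if err := enable(l); err != nil {
			return err
		}
		enabled = append(enabled, l)
	}
	return nil
}

func main() {
	var disabled []string
	err := enableLinks(
		[]string{"a", "b", "c"},
		func(l string) error {
			if l == "c" {
				return errEnable
			}
			return nil
		},
		func(l string) { disabled = append(disabled, l) },
	)
	fmt.Println(err, disabled)
}
```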
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The iptables package has types defined for these actions; use them directly
instead of creating a string only to convert it to a known value.
As the linkContainers() function is only used internally, and with fixed
values, we can also remove the validation, and InvalidIPTablesCfgError
error, which is now unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Partially revert commit 94b880f.
The CheckDuplicate field has been introduced in commit 2ab94e1. At that
time, this check was done in the network router. It was then moved to
the daemon package in commit 3ca2982. However, commit 94b880f duplicated
the logic into the network router for no apparent reason. Finally,
commit ab18718 made sure a 409 would be returned instead of a 500.
As this logic is first done by the daemon, the error -> warning
conversion can't happen because CheckDuplicate has to be true for the
daemon package to return an error. If it's false, the daemon proceeds
with the network creation, sets the Warning field of its return value and
returns no error.
Thus, the CheckDuplicate logic in the api is removed and
libnetwork.NetworkNameError now implements the ErrConflict interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The MediaType was changed twice in;
- b3b7eb2723 ("application/vnd.docker.plugins.v1+json" -> "application/vnd.docker.plugins.v1.1+json")
- 54587d861d ("application/vnd.docker.plugins.v1.1+json" -> "application/vnd.docker.plugins.v1.2+json")
But the (integration) tests were still using the old version, so let's
use the VersionMimeType const that's defined, and use the updated version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Outside of some tests, these options are the only code setting these fields,
so we can update them to set the value, instead of appending.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPv4AddrNoMatchError and IPv6AddrNoMatchError are currently implementing
BadRequestError. They are returned in two cases, and none are due to a
bad user request:
- When calling daemon's CreateNetwork route, if the bridge's IPv4
address or none of the bridge's IPv6 addresses match what's requested.
If that happens, there's a big issue somewhere in libnetwork or the
kernel.
- When restoring a network, for the same reason. In that case, the
on-disk state drifted from the interface state.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error can only be reached because of an error in our code, so it's
not a "bad user request". As it's never type asserted, no need to keep
it around.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This error is only used in defensive checks whereas the precondition is
already checked by the caller. If we reach it, we messed up something else. So
it's definitely not a BadRequest. Also, it's not type asserted anywhere,
so just inline it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
It was not used as a sentinel error, and didn't carry a specific type,
which made it a rather complex way to create an error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This type was moved to the types package as part of a refactor
in 778e2a72b3
but the introduction of the sandbox API changed the existing API to
weak types (not using a plain string);
9a47be244a
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- InvalidIPTablesCfgError: implement InternalError instead of
BadRequestError. This error is returned when an invalid iptables
action is passed as argument (ie. none of -A, -I, or -D).
- ErrInvalidDriverConfig: don't implement BadRequestError. This is
returned when libnetwork controller initialization passes a bad driver
config -- there's no call from an HTTP route.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Follow-up to fca38bcd0a, which made the
Discover API optional for drivers to implement, but forgot to remove the
stubs from the Windows drivers, which didn't implement this API.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The "Capability" type defines DataScope and ConnectivityScope fields,
but their value was set from consts in the datastore package, which
required importing that package and its dependencies for the consts
only.
This patch:
- Moves the consts to a separate "scope" package
- Adds aliases for the consts in the datastore package.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most drivers do not implement this, so detect if a driver implements
the discoverAPI, and remove the implementation from drivers that do
not support it.
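Detecting an optional API usually comes down to a type assertion. A minimal sketch with simplified stand-in interfaces (`Driver`, `discoverer`, and `pushDiscovery` are illustrative names, and the method signature is not the real driverapi one):

```go
package main

import "fmt"

// Driver is a stand-in for the mandatory driver interface.
type Driver interface {
	Type() string
}

// discoverer is a stand-in for the optional discover API.
type discoverer interface {
	DiscoverNew(data any) error
}

// plainDriver does not implement the optional API.
type plainDriver struct{}

func (plainDriver) Type() string { return "plain" }

// discoveringDriver implements the optional API.
type discoveringDriver struct{ seen []any }

func (*discoveringDriver) Type() string { return "discovering" }

func (d *discoveringDriver) DiscoverNew(data any) error {
	d.seen = append(d.seen, data)
	return nil
}

// pushDiscovery notifies d only if it implements the optional interface,
// and reports whether it did.
func pushDiscovery(d Driver, data any) (bool, error) {
	disc, ok := d.(discoverer)
	if !ok {
		return false, nil
	}
	return true, disc.DiscoverNew(data)
}

func main() {
	fmt.Println(pushDiscovery(plainDriver{}, "node-1"))
	fmt.Println(pushDiscovery(&discoveringDriver{}, "node-1"))
}
```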
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
IPv6 ipt rules are exactly the same as IPv4 rules, although the two
protocols don't use the same networking model. This has bad consequences,
for instance: 1. the current v6 rules disallow Neighbor
Solicitation/Advertisement; 2. multicast addresses can't be used; 3.
link-local addresses are blocked too.
To solve this, this commit changes the following rules:
```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
```
into:
```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 ! -i br-21502e5b2c6c -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c ! -o br-21502e5b2c6c -j DROP
```
These rules only limit the traffic ingressing/egressing the bridge, but
not traffic between veth on the same bridge.
Note that the kernel takes care of dropping invalid IPv6 packets, e.g.
loopback spoofing, thus these rules don't need to be more specific.
Solve #45460.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- use gotest.tools assertions
- use consts and struct-literals where possible
- use assert.Check instead of t.Fatal() where possible
- fix some unhandled errors
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The rootChain variable that the Key function references is a
package-global slice. As the append() built-in may append to the slice's
backing array in place, it is theoretically possible for the temporary
slices in concurrent Key() calls to share the same backing array, which
would be a data race. Thankfully in my tests (on Go 1.20.6)
cap(rootChain) == len(rootChain)
held true, so in practice a new slice is always allocated and there is
no race. But that is a very brittle assumption to depend upon, which
could blow up in our faces at any time without warning. Rewrite the
implementation in a way which cannot lead to data races.
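The aliasing can be reproduced without any concurrency once the shared slice has spare capacity. A minimal sketch (`keyAppend` and `keyCopy` are illustrative names, not the actual Key implementation, and the slice contents are made up):

```go
package main

import "fmt"

// keyAppend mirrors the risky pattern: appending to a shared slice may
// write into its backing array in place.
func keyAppend(root []string, id string) []string {
	return append(root, id)
}

// keyCopy always builds the key in a freshly allocated backing array.
func keyCopy(root []string, id string) []string {
	k := make([]string, 0, len(root)+1)
	k = append(k, root...)
	return append(k, id)
}

func main() {
	// A root slice with spare capacity (len 2, cap 4).
	root := make([]string, 2, 4)
	root[0], root[1] = "docker", "network"

	a := keyAppend(root, "a")
	b := keyAppend(root, "b")
	// a and b share a backing array, so the second append overwrote a[2].
	fmt.Println(a[2], b[2])

	c := keyCopy(root, "c")
	d := keyCopy(root, "d")
	// Each key has its own backing array.
	fmt.Println(c[2], d[2])
}
```

With concurrent callers, the in-place write in the first variant is what would turn into a data race.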
Signed-off-by: Cory Snider <csnider@mirantis.com>
It only had a single implementation, so let's remove the interface.
While changing, also renaming;
- datastore.DataStore -> datastore.Store
- datastore.NewDataStore -> datastore.New
- datastore.NewDataStoreFromConfig -> datastore.FromConfig
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These were only used internally, and ErrConntrackNotConfigurable was not used
as a sentinel error anywhere. Remove ErrConntrackNotConfigurable, and change
IsConntrackProgrammable to return an error instead.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
arrangeUserFilterRule uses the package-level [`ctrl` variable][1], which
holds a reference to a controller instance. This variable is set by
[`setupArrangeUserFilterRule()`][2], which is called when initializing
a controller ([`libnetwork.New`][3]).
In normal circumstances, there would only be one controller, created during
daemon startup, and the instance of the controller would be the same as
the controller that `NewNetwork` is called from, but there's no protection
for the `ctrl` variable, and various integration tests create their own
controller instance.
The global `ctrl` var was introduced in [54e7900fb89b1aeeb188d935f29cf05514fd419b][4],
with the assumption that [only one controller could ever exist][5].
This patch tries to reduce uses of the `ctrl` variable, and as we're calling
this code from inside a method on a specific controller, we inline the code
and use that specific controller instead.
[1]: 37b908aa62/libnetwork/firewall_linux.go (L12)
[2]: 37b908aa62/libnetwork/firewall_linux.go (L14-L17)
[3]: 37b908aa62/libnetwork/controller.go (L163)
[4]: 54e7900fb8
[5]: https://github.com/moby/libnetwork/pull/2471#discussion_r343457183
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was added in libnetwork through 50964c9948
and, based on the name of the function and its signature, I think it
was meant to be a test. This patch refactors it to be one.
Changing it into a test made it slightly broken:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
errors_test.go:15: Failed to detect err network not found is of type BadRequestError. Got type: libnetwork.ErrNoSuchNetwork
errors_test.go:15: Failed to detect err endpoint not found is of type BadRequestError. Got type: libnetwork.ErrNoSuchEndpoint
errors_test.go:42: Failed to detect err unknown driver "" is of type ForbiddenError. Got type: libnetwork.NetworkTypeError
errors_test.go:42: Failed to detect err unknown network id is of type ForbiddenError. Got type: *libnetwork.UnknownNetworkError
errors_test.go:42: Failed to detect err unknown endpoint id is of type ForbiddenError. Got type: *libnetwork.UnknownEndpointError
--- FAIL: TestErrorInterfaces (0.00s)
FAIL
This was because some errors were tested twice, but for the wrong type
(`NetworkTypeError`, `UnknownNetworkError`, `UnknownEndpointError`).
Moving them to the right test left no test-cases for `types.ForbiddenError`,
so I added `ActiveContainerError` to not make that part of the code feel lonely.
Other failures were because some errors were changed from `types.BadRequestError`
to a `types.NotFoundError` error in commit ba012a703a,
so I moved those to the right part.
Before this patch:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
--- PASS: TestErrorInterfaces (0.00s)
PASS
ok github.com/docker/docker/libnetwork 0.013s
After this patch:
go test -v -run TestErrorInterfaces
=== RUN TestErrorInterfaces
--- PASS: TestErrorInterfaces (0.00s)
PASS
ok github.com/docker/docker/libnetwork 0.013s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
commit ffd75c2e0c updated this function to
set up the DOCKER-USER chain for both iptables and ip6tables, however the
function would return early if a failure happened (instead of continuing
with the next iptables version).
This patch extracts setting up the chain to a separate function, and updates
arrangeUserFilterRule to log the failure as a warning, but continue with
the next iptables version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These functions were mostly identical, except for iptables being enabled
by default (unless explicitly disabled by config).
Rewrite the function to an enabledIptablesVersions, which returns the list
of iptables-versions that are enabled for the controller. This prevents
having to acquire a lock twice, and simplifies arrangeUserFilterRule, which
can now just iterate over the enabled versions.
Also moving this function to a linux-only file, as other platforms don't have
the iptables types defined.
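A rough sketch of the resulting shape, using local stand-ins for the controller, its config, and the iptables version type. The field names are made up, and ip6tables being opt-in is an assumption:

```go
package main

import (
	"fmt"
	"sync"
)

// ipVersion is a local stand-in for the iptables version type.
type ipVersion string

const (
	ipv4 ipVersion = "IPV4"
	ipv6 ipVersion = "IPV6"
)

// config holds hypothetical settings: iptables is on unless explicitly
// disabled, ip6tables is assumed to be off unless explicitly enabled.
type config struct {
	disableIPTables bool
	enableIP6Tables bool
}

// controller is a minimal stand-in for the real controller.
type controller struct {
	mu  sync.Mutex
	cfg config
}

// enabledIptablesVersions returns the enabled versions, taking the lock once.
func (c *controller) enabledIptablesVersions() []ipVersion {
	c.mu.Lock()
	defer c.mu.Unlock()

	var versions []ipVersion
	if !c.cfg.disableIPTables {
		versions = append(versions, ipv4)
	}
	if c.cfg.enableIP6Tables {
		versions = append(versions, ipv6)
	}
	return versions
}

func main() {
	c := &controller{cfg: config{enableIP6Tables: true}}
	// Callers can simply iterate over whatever is enabled.
	for _, v := range c.enabledIptablesVersions() {
		fmt.Println("setting up DOCKER-USER chain for", v)
	}
}
```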
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The driver-configurations are only set when creating a new controller,
using the `config.OptionDriverConfig()` option that can be passed to
`New()`, and used as "read-only" after that.
Taking away any other paths that set these options, the only type used
for per-driver options is a `map[string]interface{}`, so we can change
the type from `map[string]interface{}` to a `map[string]map[string]interface{}`,
(or its "modern" variant: `map[string]map[string]any`), so that it's
no longer needed to cast the type before use.
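To illustrate the difference, a small sketch comparing both shapes (the function names and the `example-option` key are made up for the example):

```go
package main

import "fmt"

// optionsUntyped shows the old shape: the per-driver value is an untyped
// interface, so it has to be type-asserted before use.
func optionsUntyped(cfg map[string]any, driver string) map[string]any {
	opts, ok := cfg[driver].(map[string]any)
	if !ok {
		return nil
	}
	return opts
}

// optionsTyped shows the new shape: a plain map lookup, no assertion.
func optionsTyped(cfg map[string]map[string]any, driver string) map[string]any {
	return cfg[driver]
}

func main() {
	untyped := map[string]any{
		"bridge": map[string]any{"example-option": true},
	}
	typed := map[string]map[string]any{
		"bridge": {"example-option": true},
	}
	fmt.Println(optionsUntyped(untyped, "bridge"))
	fmt.Println(optionsTyped(typed, "bridge"))
}
```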
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Don't fail early if we can still test more, and be slightly more strict
in what error we're looking for.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test already creates instances for each ip-version, so let's
re-use them. While changing, also use assert.Check to not fail early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
New() allows for driver-options to be passed using the config.OptionDriverConfig.
Update the test to not manually mutate the controller's configuration after
creating.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Not critical, but when used from ChainInfo, we had to construct an IPTable
based on the version of the ChainInfo, which then only used the version
we passed to get the right loopback.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make some variables local to the if-branches to be slightly more idiomatic,
and to make clear they're only used in that branch.
Move the bestEffortLock locking later in IPTable.raw(), because that function
could return before the lock was even needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that all consumers of these functions are passing non-empty values,
let's validate that no empty strings for either chain or table are passed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's only used internally, and it was last used in commit:
0220b06cd6
But moved into the iptables package in this commit:
998f3ce22c
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This utility was not used for "Config", but for Networks and Endpoints.
Having this utility made it look like more than it was, and the related
test was effectively testing stdlib.
Abstracting the validation also was hiding that, while validation does
not allow "empty" names, it happily allows leading/trailing whitespace,
and does not remove that before creating networks or endpoints;
docker network create "bridge "
docker network create "bridge "
docker network create "bridge "
docker network create " bridge "
docker network create " bridge "
docker network create " bridge"
docker network ls --filter driver=bridge
NETWORK ID NAME DRIVER SCOPE
d4d53210f185 bridge bridge local
e9afba0d99de bridge bridge local
69fb7a7ba67c bridge bridge local
a452bf065403 bridge bridge local
49d96c59061d bridge bridge local
8eae1c4be12c bridge bridge local
86dd65b881b9 bridge bridge local
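A sketch of the kind of check that produces this behaviour, assuming the validation only tested whether the name is empty after trimming (`validateName` and `errInvalidName` are stand-ins, not the removed utility itself):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errInvalidName is a stand-in for the error returned for invalid names.
var errInvalidName = errors.New("invalid name")

// validateName rejects names that are empty after trimming, but the
// untrimmed name is what callers keep using afterwards.
func validateName(name string) error {
	if strings.TrimSpace(name) == "" {
		return errInvalidName
	}
	return nil
}

func main() {
	for _, name := range []string{"", "   ", "bridge", " bridge "} {
		fmt.Printf("%q -> %v\n", name, validateName(name))
	}
}
```

Names that only differ in surrounding whitespace pass the check and remain distinct, which is how several networks that all look like "bridge" can be created.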
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the NewGeneric utility as it was not used anywhere, except for
in tests.
Also "modernize" the type, and use `any` instead of `interface{}`.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Use a more modern approach to check error-types
- Touch-up grammar of the error-message
- Remove redundant "nil" check for errors, as it's never nil at that point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
...that Swarmkit no longer needs now that it has been migrated to use
the new-style driver registration APIs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The only remaining user is Swarmkit, which now has its own private copy
of the package tailored to its needs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
...which ignore the config argument. Notably, none of the network
drivers referenced by Swarmkit use config, which is good as Swarmkit
unconditionally passes nil for the config when registering drivers.
Signed-off-by: Cory Snider <csnider@mirantis.com>
setupBridgeNetFiltering:
- Indicate that the bridgeInterface argument is unused (but it's needed
to satisfy the signature).
- Return instead of nullifying the err. Still not great, but I thought it
was a very slightly more logical thing to do.
checkBridgeNetFiltering:
- Remove unused argument, and scope ipVerName to the branch where it's
used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
initConnection was effectively just part of the constructor; it was not
used elsewhere. Merge the two functions to simplify things a bit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This const was added in 8301dcc6d7, before
being moved to libnetwork, and moved back, but it was never used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- remove local bridgeName variable that shadowed the const, but
used the same value
- remove some redundant `var` declarations, and changed fixed
values to a const
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
None of these errors were string-matched anywhere, so let's change them
to be non-capitalized, as they should.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like this error was added in 1cbdaebaa1,
and later moved to libnetwork in 44c96449c2
which also updated the description to something that doesn't match what
it means.
In either case, this error was never used as a special / sentinel error,
so we can just use a regular error return.
While at it, I also lower-cased the error-message; it's not string-matched
anywhere, so we can update it to make linters more happy.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- validate input variables before constructing the ChainInfo
- only construct the ChainInfo if things were successful
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Clarify that the argument to New is an exclusive upper bound.
Correct the documentation for SetAnyInRange: the end argument is
inclusive rather than exclusive.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The idm package wraps bitseq.Handle to provide an offset and
synchronization. bitseq.Handle wraps bitmap.Bitmap to provide
persistence in a datastore. As no datastore is passed and the offset is
zero, the idm.Idm instance is nothing more than a concurrency-safe
wrapper around a bitmap.Bitmap with differently-named methods. Switch
over to using bitmap.Bitmap directly, using the ovmanager driver's mutex
for concurrency control.
Hold the driver mutex for the entire duration that VXLANs are being
assigned to the new network. This makes allocating VXLANs for a network
an atomic operation.
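A minimal sketch of what "atomic" means here, with a plain map standing in for bitmap.Bitmap. The `driver` type, `allocate`, and `nextFree` are illustrative, and rolling back a partial allocation on failure is an assumption of this sketch:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errExhausted = errors.New("no free id")

// driver is a stand-in for the ovmanager driver; inUse replaces the bitmap.
type driver struct {
	mu    sync.Mutex
	inUse map[uint32]bool
	max   uint32 // exclusive upper bound for IDs
}

// allocate reserves n IDs as one atomic operation: the lock is held for the
// whole loop, and a partial allocation is rolled back before unlocking.
func (d *driver) allocate(n int) (ids []uint32, retErr error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	defer func() {
		// Runs before the unlock above (defers run last-in, first-out).
		if retErr != nil {
			for _, id := range ids {
				delete(d.inUse, id)
			}
			ids = nil
		}
	}()

	for i := 0; i < n; i++ {
		id, err := d.nextFree()
		if err != nil {
			return ids, err
		}
		d.inUse[id] = true
		ids = append(ids, id)
	}
	return ids, nil
}

// nextFree returns the lowest unused ID; d.mu must be held by the caller.
func (d *driver) nextFree() (uint32, error) {
	for id := uint32(0); id < d.max; id++ {
		if !d.inUse[id] {
			return id, nil
		}
	}
	return 0, errExhausted
}

func main() {
	d := &driver{inUse: map[uint32]bool{}, max: 3}
	fmt.Println(d.allocate(2))
	fmt.Println(d.allocate(2)) // fails: only one ID is left, nothing is kept
	fmt.Println(d.allocate(1))
}
```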
Signed-off-by: Cory Snider <csnider@mirantis.com>
In the network.obtainVxlanID() method, the mutex only guards a local
variable and a function argument. Locking is therefore unnecessary.
The network.releaseVxlanID() method is only called in two contexts:
driver.NetworkAllocate(), where the network struct is a local variable
and network.releaseVxlanID() is only called in failure code-paths in
which the network does not escape; and driver.NetworkFree(), while the
driver mutex is held. Locking is therefore unnecessary.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The OptionLocalKVProvider, OptionLocalKVProviderURL, and OptionLocalKVProviderConfig
options were only used in tests, so un-export them, and move them to the
test-files.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was implemented in dd4950f36d
which added a "key" field, but that field was never used anywhere, and
still appears unused.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `store.Watch()` was only used in `Controller.processEndpointCreate()`,
and skipped if the store was not "watchable" (`store.Watchable()`).
Whether a store is watchable depends on the store's datastore.scope;
local stores are not watchable;
    func (ds *datastore) Watchable() bool {
        return ds.scope != LocalScope
    }
datastore is only initialized in two locations, and both locations set the
scope field to LocalScope:
datastore.newClient() (also called by datastore.NewDataStore()):
3e4c9d90cf/libnetwork/datastore/datastore.go (L213)
datastore.NewTestDataStore() (used in tests);
3e4c9d90cf/libnetwork/datastore/datastore_test.go (L14-L17)
Furthermore, the backing BoltDB kvstore does not implement the Watch()
method;
3e4c9d90cf/libnetwork/internal/kvstore/boltdb/boltdb.go (L464-L467)
Based on the above;
- our datastore is never Watchable()
- so datastore.Watch() is never used
This patch removes the Watchable(), Watch(), and RestartWatch() functions,
as well as the code handling watching.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The sequential field determined whether a lock was needed when storing
and retrieving data. This field was always set to true, with the exception
of NewTestDataStore() in the tests.
This field was added in a18e2f9965
to make locking optional for non-local scoped stores. Such stores are no
longer used, so we can remove this field.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the code slightly more idiomatic; remove some "var" declarations,
remove some intermediate variables and redundant error-checks, and remove
the filePerm const.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This boolean was not used anywhere, so we can remove it. Also cleaning up
the implementation a bit.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The WriteOptions struct was only used to set the "IsDir" option. This option
was added in d635a8e32b
and was only supported by the etcd libkv store.
The BoltDB store does not support this option, making the WriteOptions
struct fully redundant.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The only remaining kvstore is BoltDB, which doesn't use TLS connections
or authentication, so we can remove these options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the intermediate variable, and move the option closer
to where it's used, as in some cases we created the variable,
but could return with an error before it was used.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function was not using the DriverCallback interface, and only
required the Registerer interface.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
A reduced set of the dependency, only taking the parts that are used. Taken from
upstream commit: dfacc563de
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
cd ~/projects
# create a temporary clone of libkv
git clone https://github.com/docker/libkv.git temp_libkv
cd temp_libkv
# create branch to work with
git checkout -b migrate_libkv
# remove all code, except for the files we need; rename the remaining ones to their new target location
git filter-repo --force \
--path libkv.go \
--path store/store.go \
--path store/boltdb/boltdb.go \
--path-rename libkv.go:libnetwork/internal/kvstore/kvstore_manage.go \
--path-rename store/store.go:libnetwork/internal/kvstore/kvstore.go \
--path-rename store/boltdb/:libnetwork/internal/kvstore/boltdb/
# go to the target github.com/moby/moby repository
cd ~/projects/docker
# create a branch to work with
git checkout -b integrate_libkv
# add the temporary repository as an upstream and make sure it's up-to-date
git remote add temp_libkv ~/projects/temp_libkv
git fetch temp_libkv
# merge the upstream code
git merge --allow-unrelated-histories --signoff -S temp_libkv/migrate_libkv
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Resolver.setupIPTable() checks whether it needs to flush or create the
user chains used for NATing container DNS requests by testing for the
existence of the rules which jump to said user chains. Unfortunately it
does so using the IPTable.RawCombinedOutputNative() method, which
returns a non-nil error if the iptables command returns any output even
if the command exits with a zero status code. While that is fine with
iptables-legacy as it prints no output if the rule exists, iptables-nft
v1.8.7 prints some information about the rule. Consequently,
Resolver.setupIPTable() would incorrectly think that the rule does not
exist during container restore and attempt to create it. This happened
to work by coincidence before 8f5a9a741b
because the failure to create the already-existing table would be
ignored and the new NAT rules would be inserted before the stale rules
left in the table from when the container was last started/restored. Now
that failing to create the table is treated as a fatal error, the
incompatibility with iptables-nft is no longer hidden.
Switch to using IPTable.ExistsNative() to test for the existence of the
jump rules as it correctly only checks the iptables command's exit
status without regard for whether it outputs anything.
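The difference between the two checks can be sketched as follows. This is
a minimal illustration, not the libnetwork code: the helper names are
hypothetical, and an `sh` one-liner stands in for an iptables-nft rule
check that exits zero but prints information about the rule.

```go
package main

import (
	"fmt"
	"os/exec"
)

// succeeds reports whether the command exited with status zero.
// Anything the command prints is deliberately ignored.
func succeeds(name string, args ...string) bool {
	return exec.Command(name, args...).Run() == nil
}

// succeedsSilently models the stricter check described above: any output
// at all is treated as a failure, even when the exit status is zero.
func succeedsSilently(name string, args ...string) bool {
	out, err := exec.Command(name, args...).CombinedOutput()
	return err == nil && len(out) == 0
}

func main() {
	// Hypothetical stand-in for a rule check that exits 0 but prints.
	args := []string{"-c", "echo 'rule details'; exit 0"}
	fmt.Println("exit status only:", succeeds("sh", args...))
	fmt.Println("exit status and no output:", succeedsSilently("sh", args...))
}
```

Only the first helper treats the noisy-but-successful command as success,
which is the behaviour wanted when testing for an existing rule.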
Signed-off-by: Cory Snider <csnider@mirantis.com>
The method to restore a network namespace takes a collection of
interfaces to restore with the options to apply. The interface names are
structured data, tuples of (SrcName, DstPrefix) but for whatever reason
are being passed into Restore() serialized to strings. A refactor,
f0be4d126d, accidentally broke the
serialization by dropping the delimiter. Rather than fix the
serialization and leave the time-bomb for someone else to trip over,
pass the interface names as structured data.
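To illustrate why the serialized form is fragile, here is a small sketch.
The ifaceName type and joinNoDelimiter helper are hypothetical and only
model the (SrcName, DstPrefix) tuple; they are not the actual osl types.

```go
package main

import "fmt"

// ifaceName is a hypothetical stand-in for the (SrcName, DstPrefix) tuple.
type ifaceName struct {
	SrcName   string
	DstPrefix string
}

// joinNoDelimiter shows the fragile approach: once the delimiter is lost,
// the two fields can no longer be told apart.
func joinNoDelimiter(n ifaceName) string {
	return n.SrcName + n.DstPrefix
}

func main() {
	a := ifaceName{SrcName: "veth1", DstPrefix: "eth"}
	b := ifaceName{SrcName: "veth", DstPrefix: "1eth"}

	// Distinct tuples collapse to the same string...
	fmt.Println("strings equal:", joinNoDelimiter(a) == joinNoDelimiter(b))
	// ...while the structured values remain distinguishable.
	fmt.Println("structs equal:", a == b)
}
```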
Signed-off-by: Cory Snider <csnider@mirantis.com>
While the VXLAN interface and the iptables rules to mark outgoing VXLAN
packets for encryption are configured to use the Swarm data path port,
the XFRM policies for actually applying the encryption are hardcoded to
match packets with destination port 4789/udp. Consequently, encrypted
overlay networks do not pass traffic when the Swarm is configured with
any other data path port: encryption is not applied to the outgoing
VXLAN packets and the destination host drops the received cleartext
packets. Use the configured data path port instead of hardcoding port
4789 in the XFRM policies.
Signed-off-by: Cory Snider <csnider@mirantis.com>
TestProxyNXDOMAIN has proven to be susceptible to failing as a
consequence of unlocked threads being set to the wrong network
namespace. As the failure mode looks a lot like a bug in the test
itself, it seems prudent to add a check for mismatched namespaces to the
test so we will know for next time that the root cause lies elsewhere.
Signed-off-by: Cory Snider <csnider@mirantis.com>
osl.setIPv6 mistakenly captured the calling goroutine's thread's network
namespace instead of the network namespace of the thread getting its
namespace temporarily changed. As this function appears to only be
called from contexts in the process's initial network namespace, this
mistake would be of little consequence at runtime. The libnetwork unit
tests, on the other hand, unshare network namespaces so as not to
interfere with each other or the host's network namespace. But due to
this bug, the isolation backfires and the network namespace of
goroutines used by a test which are expected to be in the initial
network namespace can randomly become the isolated network namespace of
some other test. Symptoms include a loopback network server running in
one goroutine being inexplicably and randomly unreachable by a
client in another goroutine.
Capture the original network namespace of the thread from the thread to
be tampered with, after locking the goroutine to the thread.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Swapping out the global logger on the fly is causing tests to flake out
by logging to a test's log output after the test function has returned.
Refactor Resolver to use a dependency-injected logger and the resolver
unit tests to inject a private logger instance into the Resolver under
test.
Signed-off-by: Cory Snider <csnider@mirantis.com>
tstwriter mocks the server-side connection between the resolver and the
container, not the resolver and the external DNS server, so returning
the external DNS server's address as w.LocalAddr() is technically
incorrect and misleading. Only the protocols need to match as the
resolver uses the client's choice of protocol to determine which
protocol to use when forwarding the query to the external DNS server.
While this change has no material impact on the tests, it makes the
tests slightly more comprehensible for the next person.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Our resolver is just a forwarder for external DNS so it should act like
it. Unless it's a server failure or refusal, take the response at face
value and forward it along to the client. RFC 8020 is only applicable to
caching recursive name servers and our resolver is neither caching nor
recursive.
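A sketch of the resulting decision, using the response codes assigned in
RFC 1035. The function name is hypothetical, and what happens when a
response is not forwarded (for example, trying another external server)
is an assumption not spelled out above.

```go
package main

import "fmt"

// DNS response codes, as assigned in RFC 1035.
const (
	rcodeSuccess       = 0
	rcodeServerFailure = 2 // SERVFAIL
	rcodeNameError     = 3 // NXDOMAIN
	rcodeRefused       = 5 // REFUSED
)

// forwardToClient reports whether a response from the external server
// should be passed on to the client as-is. Only server failures and
// refusals are held back; everything else, NXDOMAIN included, is taken
// at face value.
func forwardToClient(rcode int) bool {
	switch rcode {
	case rcodeServerFailure, rcodeRefused:
		return false
	default:
		return true
	}
}

func main() {
	for _, rc := range []int{rcodeSuccess, rcodeServerFailure, rcodeNameError, rcodeRefused} {
		fmt.Printf("rcode %d forwarded: %v\n", rc, forwardToClient(rc))
	}
}
```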
Signed-off-by: Cory Snider <csnider@mirantis.com>
Now that most uses of reexec have been replaced with non-reexec
solutions, most of the reexec.Init() calls peppered throughout the test
suites are unnecessary. Furthermore, most of the reexec.Init() calls in
test code neglect to check the return value to determine whether to
exit, which would result in the reexec'ed subprocesses proceeding to run
the tests, which would reexec another subprocess which would proceed to
run the tests, recursively. (That would explain why every reexec
callback used to unconditionally call os.Exit() instead of returning...)
Remove unneeded reexec.Init() calls from test and example code which no
longer needs it, and fix the reexec.Init() calls which are not inert to
exit after a reexec callback is invoked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
- sandbox, endpoint changed in c71555f030, but
missed updating the stubs.
- add missing stub for Controller.cleanupServiceDiscovery()
- While at it also doing some minor (formatting) changes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function included a defer to close the net.Conn if an error occurred,
but the calling function (SetExternalKey()) also had a defer to close it
unconditionally.
Rewrite it to use json.NewEncoder(), which accepts a writer, and inline
the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's a no-op on Windows and other non-Linux, non-FreeBSD platforms,
so there's no need to register the re-exec.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Just print the error and os.Exit() instead, which makes it more
explicit that we're exiting, and there's no need to decorate the
error.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Split the function into a "backing" function that returns an error, and the
re-exec entrypoint, which handles the error to provide a more idiomatic approach.
This was part of a larger change across multiple re-exec functions (now removed).
For history's sake; here's the description for that;
The `reexec.Register()` function accepts reexec entrypoints, which are a `func()`
without return (matching a binary's `main()` function). As these functions cannot
return an error, it's the entrypoint's responsibility to handle any error, and to
indicate failures through `os.Exit()`.
I noticed that some of these entrypoint functions had `defer()` statements, but
called `os.Exit()` either explicitly or implicitly (e.g. through `logrus.Fatal()`).
defer statements are not executed if `os.Exit()` is called, which rendered these
statements useless.
While I doubt these were problematic (I expect files to be closed when the process
exits, and `runtime.LockOSThread()` to not have side-effects after exit), it also
didn't seem to "hurt" to call these as was expected by the function.
This patch rewrites some of the entrypoints to split them into a "backing function"
that can return an error (being slightly more idiomatic Go) and a wrapper function
to act as entrypoint (which can handle the error and exit the executable).
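A minimal sketch of that split, with hypothetical names (run, entrypoint)
and a counter that exists only to make the deferred call observable:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// cleanups counts deferred calls that actually ran; it exists only to
// make the behaviour observable in this sketch.
var cleanups int

// run is the "backing" function: it returns errors instead of exiting,
// so its deferred calls always execute before control returns.
func run(name string) error {
	defer func() { cleanups++ }()
	if name == "" {
		return errors.New("name must not be empty")
	}
	return nil
}

// entrypoint has the func() shape expected of a re-exec entrypoint. It
// is the only place that exits, and it does so after run has returned.
func entrypoint() {
	if err := run("example"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

func main() {
	entrypoint()
	fmt.Println("deferred cleanups run:", cleanups)
}
```

Because `os.Exit()` is only called once run has returned, the deferred
cleanup runs on both the success and the failure path.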
To some extent, I'm wondering if we should change the signatures of the entrypoints
to return an error so that `reexec.Init()` can handle (or return) the errors, so
that logging can be handled more consistently (currently, some use logrus,
some just print); this would also keep logging out of some packages, as well as
allow us to provide more metadata about the error (which reexec produced the
error, for example).
A quick search showed that there's some external consumers of pkg/reexec, so I
kept this for a future discussion / exercise.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
- Verify the content to be equal, not "contains"; this output should be
predictable.
- Also verify the content returned by the function to match.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looks like the intent is to exclude Windows (which wouldn't have /etc/resolv.conf
nor systemd), but most tests would run fine elsewhere. This allows running the
tests on macOS for local testing.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Use t.TempDir() for convenience, and change some t.Fatal's to Errors,
so that all tests can run instead of failing early.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The test was assuming that the "source" file was always "/etc/resolv.conf",
but the `Get()` function uses `Path()` to find the location of resolv.conf,
which may be different.
While at it, also changed some `t.Fatalf()` to `t.Errorf()`, and renamed
some variables for clarity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After my last change, I noticed that the hash is used as a []byte in most
cases (other than tests). This patch updates the type to use a []byte, which
(although unlikely to be very important) also improves performance:
Compared to the previous version:
benchstat new.txt new2.txt
name         old time/op    new time/op    delta
HashData-10     128ns ± 1%     116ns ± 1%   -9.77%  (p=0.000 n=20+20)
name         old alloc/op   new alloc/op   delta
HashData-10      208B ± 0%       88B ± 0%  -57.69%  (p=0.000 n=20+20)
name         old allocs/op  new allocs/op  delta
HashData-10      3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=20+20)
And compared to the original version:
benchstat old.txt new2.txt
name         old time/op    new time/op    delta
HashData-10     201ns ± 1%     116ns ± 1%  -42.39%  (p=0.000 n=18+20)
name         old alloc/op   new alloc/op   delta
HashData-10      416B ± 0%       88B ± 0%  -78.85%  (p=0.000 n=20+20)
name         old allocs/op  new allocs/op  delta
HashData-10      6.00 ± 0%      2.00 ± 0%  -66.67%  (p=0.000 n=20+20)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The code seemed overly complicated, requiring a reader to be constructed,
where in all cases, the data was already available in a variable. This patch
simplifies the utility to not require a reader, which also makes it a bit
more performant:
go install golang.org/x/perf/cmd/benchstat@latest
GO111MODULE=off go test -run='^$' -bench=. -count=20 > old.txt
GO111MODULE=off go test -run='^$' -bench=. -count=20 > new.txt
benchstat old.txt new.txt
name         old time/op    new time/op    delta
HashData-10     201ns ± 1%     128ns ± 1%  -36.16%  (p=0.000 n=18+20)
name         old alloc/op   new alloc/op   delta
HashData-10      416B ± 0%      208B ± 0%  -50.00%  (p=0.000 n=20+20)
name         old allocs/op  new allocs/op  delta
HashData-10      6.00 ± 0%      3.00 ± 0%  -50.00%  (p=0.000 n=20+20)
A small change was made in `Build()`, which previously returned the resolv.conf
data, even if the function failed to write it. In the new variation, `nil` is
consistently returned on failures.
Note that in various places, the hash is not even used, so we may be able to
simplify things more after this.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Since cc19eba (backported to v23.0.4), the PreferredPool for docker0 is
set only when the user provides the bip config parameter or when the
default bridge already exists. That means, if a user provides the
fixed-cidr parameter on a fresh install or reboots their computer/server
without bip set, dockerd throws the following error when it starts:
> failed to start daemon: Error initializing network controller: Error
> creating default "bridge" network: failed to parse pool request for
> address space "LocalDefault" pool "" subpool "100.64.0.0/26": Invalid
> Address SubPool
See #45356.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Use Linux BPF extensions to locate the offset of the VXLAN header within
the packet so that the same BPF program works with VXLAN packets
received over either IPv4 or IPv6.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This commit removes iptables rules configured for secure overlay
networks when a network is deleted. Prior to this commit, only
CreateNetwork() was taking care of removing stale iptables rules.
If one of the iptables rules can't be removed, the error is logged but
it doesn't prevent network deletion.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The (*network).ipamRelease function nils out the network's IPAM info
fields, putting the network struct into an inconsistent state. The
network-restore startup code panics if it tries to restore a network
from a struct which has fewer IPAM config entries than IPAM info
entries. Therefore (*network).delete contains a critical section: by
persisting the network to the store after ipamRelease(), the datastore
will contain an inconsistent network until the deletion operation
completes and finishes deleting the network from the datastore. If for
any reason the deletion operation is interrupted between ipamRelease()
and deleteFromStore(), the daemon will crash on startup when it tries to
restore the network.
Updating the datastore after releasing the network's IPAM pools may have
served a purpose in the past, when a global datastore was used for
intra-cluster communication and the IPAM allocator had persistent global
state, but nowadays there is no global datastore and the IPAM allocator
has no persistent state whatsoever. Remove the vestigial datastore
update as it is no longer necessary and only serves to cause problems.
If the network deletion is interrupted before the network is deleted
from the datastore, the deletion will resume during the next daemon
startup, including releasing the IPAM pools.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Linux kernels prior to v3.16 did not support netns for vxlan
interfaces. As such, moby/libnetwork#821 introduced a "host mode" to the
overlay driver. The related kernel fix has been available to RHEL 7 users
since v7.2.
This mode could be forced through the use of the env var
_OVERLAY_HOST_MODE. However this env var has never been documented and
is not referenced in any blog post, so there's little chance many people
rely on it. Moreover, this host mode is deemed an implementation
detail by maintainers. As such, we can consider it dead and we can
remove it without a prior deprecation warning.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since 0fa873c, there's no function writing overlay networks to some
datastore. As such, overlay network struct doesn't need to implement
KVObject interface.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
As of a few commits ago, a subnet's VNI doesn't change during the
lifetime of the subnet struct, so there's no need to lock the network
before accessing it.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Since the previous commit, data from the local store are never read,
thus proving it was only used for Classic Swarm.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver in Swarm v2 mode doesn't support live-restore, i.e.
the daemon won't even start if the node is part of a Swarm cluster and
live-restore is enabled. This feature was only used by Swarm Classic.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
VNI allocations made by the overlay driver were only used by Classic
Swarm. With Swarm v2 mode, the driver ovmanager is responsible for
allocating & releasing them.
Previously, vxlanIdm was initialized when a global store was available
but since 142b522, no global store can be instantiated. As such,
releaseVxlanID actually does nothing and iptables rules are
never removed.
The last line of dead code detected by golangci-lint is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Prior to 0fa873c, the serf-based event loop was started when a global
store was available. Since there's no more global store, this event loop
and all its associated code is dead.
Most dead code detected by golangci-lint in prior commits is now gone.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- LocalKVProvider, LocalKVProviderURL, LocalKVProviderConfig,
GlobalKVProvider, GlobalKVProviderURL and GlobalKVProviderConfig
are all unused since moby/libnetwork@be2b6962 (moby/libnetwork#908).
- GlobalKVClient is unused since 0fa873c and c8d2c6e.
- MakeKVProvider, MakeKVProviderURL and MakeKVProviderConfig are unused
since 96cfb076 (moby/moby#44683).
- MakeKVClient is unused since 142b5229 (moby/moby#44875).
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label has been unused since 142b522 (moby/moby#44875).
It was also creating a local store whenever netlabel.LocalKVClient was
specified in its config argument. This store is unused since
moby/libnetwork@9e72136 (moby/libnetwork#1636).
Finally, the sync.Once properties are never used and thus can be
deleted.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The overlay driver was creating a global store whenever
netlabel.GlobalKVClient was specified in its config argument. This
specific label is not used anymore since 142b522 (moby/moby#44875).
golangci-lint now detects dead code. This will be fixed in subsequent
commits.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
This command was useful when overlay networks based on an external KV
store were being developed, but is unused nowadays.
As the last references to the OverlayBindInterface and OverlayNeighborIP
netlabels are in the ovrouter cmd, they're removed too.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Drop support for platforms which only have xt_u32 but not xt_bpf. No
attempt is made to clean up old xt_u32 iptables rules left over from a
previous daemon instance.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Encrypted overlay networks are unique in that they are the only kind of
network for which libnetwork programs an iptables rule to explicitly
accept incoming packets. No other network driver does this. The overlay
driver doesn't even do this for unencrypted networks!
Because the ACCEPT rule is appended to the end of the INPUT chain rather
than inserted at the front, the rule can be entirely inert on many
common configurations. For example, FirewallD programs an unconditional
REJECT rule at the end of the INPUT chain, so any ACCEPT rules appended
after it have no effect. And on systems where the rule is effective, its
presence may subvert the administrator's intentions. In particular,
automatically appending the ACCEPT rule could allow incoming traffic
which the administrator was expecting to be dropped implicitly with a
default-DROP policy.
Let the administrator always have the final say in how incoming
encrypted overlay packets are filtered by no longer automatically
programming INPUT ACCEPT iptables rules for them.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This is a follow-up to 48ad9e1. That commit removed the function
ElectInterfaceAddresses from utils_linux.go but not its FreeBSD &
Windows counterparts. As these functions are never called, they can be
safely removed.
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The ExtDNS2 field was added in
aad1632c15
to migrate existing state from < 1.14 to a new type. As it's unlikely
that installations still have state from before 1.14, rename ExtDNS2
back to ExtDNS and drop the migration code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Allow SetMatrix to be used as a value type with a ready-to-use zero
value. SetMatrix values are already non-copyable by virtue of having a
mutex field so there is no harm in allowing non-pointer values to be
used as local variables or struct fields. Any attempts to pass around
by-value copies, e.g. as function arguments, will be flagged by go vet.
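A sketch of the zero-value pattern, using a hypothetical and much
simplified setMatrix type (string keys and values only) rather than the
real SetMatrix API:

```go
package main

import (
	"fmt"
	"sync"
)

// setMatrix is a hypothetical, simplified stand-in for SetMatrix: a map
// of string sets guarded by a mutex. Its zero value is ready to use
// because the map is allocated lazily on first insert.
type setMatrix struct {
	mu     sync.Mutex
	matrix map[string]map[string]struct{}
}

// Insert adds value to the set stored under key and reports whether the
// value was newly added.
func (s *setMatrix) Insert(key, value string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.matrix == nil {
		s.matrix = make(map[string]map[string]struct{})
	}
	set, ok := s.matrix[key]
	if !ok {
		set = make(map[string]struct{})
		s.matrix[key] = set
	}
	if _, exists := set[value]; exists {
		return false
	}
	set[value] = struct{}{}
	return true
}

// Contains reports whether value is in the set stored under key. Reading
// from the nil map of a zero-value setMatrix is safe in Go.
func (s *setMatrix) Contains(key, value string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.matrix[key][value]
	return ok
}

func main() {
	var m setMatrix // no constructor needed
	fmt.Println(m.Contains("svc", "10.0.0.2"))
	fmt.Println(m.Insert("svc", "10.0.0.2"))
	fmt.Println(m.Contains("svc", "10.0.0.2"))
}
```

Methods use pointer receivers, so the value can live in a local variable
or struct field while the mutex is never copied.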
Signed-off-by: Cory Snider <csnider@mirantis.com>
FirewallD creates the root INPUT chain with a default-accept policy and
a terminal rule which rejects all packets not accepted by any prior
rule. Any subsequent rules appended to the chain are therefore inert.
The administrator would have to open the VXLAN UDP port to make overlay
networks work at all, which would result in all VXLAN traffic being
accepted and defeating our attempts to enforce encryption on encrypted
overlay networks.
Insert the rule to drop unencrypted VXLAN packets tagged for encrypted
overlay networks at the top of the INPUT chain so that enforcement of
mandatory encryption takes precedence over any accept rules configured
by the administrator. Continue to append the accept rule to the bottom
of the chain so as not to override any administrator-configured drop
rules.
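The effect of rule ordering can be modelled with a toy first-match
evaluator. This is only a simplified model with hypothetical types; real
iptables matching works on packet marks and headers, and the sketch
treats every VXLAN packet as belonging to an encrypted network.

```go
package main

import "fmt"

// packet is a toy model of an incoming packet.
type packet struct {
	vxlan     bool
	encrypted bool
}

type rule struct {
	match   func(packet) bool
	verdict string
}

// evaluate returns the verdict of the first matching rule, falling back
// to the chain policy when nothing matches.
func evaluate(chain []rule, policy string, p packet) string {
	for _, r := range chain {
		if r.match(p) {
			return r.verdict
		}
	}
	return policy
}

// baseChain models a FirewallD-style INPUT chain in which the
// administrator has opened the VXLAN port ahead of the terminal reject.
func baseChain() []rule {
	return []rule{
		{match: func(p packet) bool { return p.vxlan }, verdict: "ACCEPT"},
		{match: func(packet) bool { return true }, verdict: "REJECT"},
	}
}

// dropCleartext drops VXLAN packets that arrived unencrypted.
var dropCleartext = rule{
	match:   func(p packet) bool { return p.vxlan && !p.encrypted },
	verdict: "DROP",
}

func main() {
	cleartext := packet{vxlan: true, encrypted: false}

	appended := append(baseChain(), dropCleartext)
	inserted := append([]rule{dropCleartext}, baseChain()...)

	fmt.Println("drop rule appended:", evaluate(appended, "ACCEPT", cleartext))
	fmt.Println("drop rule inserted:", evaluate(inserted, "ACCEPT", cleartext))
}
```

In this model the appended drop rule is never reached, while the inserted
one takes effect before the administrator's accept rule.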
Signed-off-by: Cory Snider <csnider@mirantis.com>
Some newer distros such as RHEL 9 have stopped making the xt_u32 kernel
module available with the kernels they ship. They do ship the xt_bpf
kernel module, which can do everything xt_u32 can and more. Add an
alternative implementation of the iptables match rule which uses xt_bpf
to implement exactly the same logic as the u32 filter using a BPF
program. Try programming the BPF-powered rules as a fallback when
programming the u32-powered rules fails.
Signed-off-by: Cory Snider <csnider@mirantis.com>