Ensure the package can be imported, no matter the build constratints, by
adding an unconstrained doc.go containing a package statement.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This was added in 6cdc4ba6cd in 2016, likely
because at the time we were still building for CentOS 6 and Ubuntu 14.04.
All currently supported distros appear to be on _at least_ 219 now, so it looks
safe to remove this;
```bash
docker run -it --rm centos:7
yum install -y systemd-devel
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 219
pkg-config --print-provides 'libsystemd-journal'
libsystemd-journal = 219
```
And on a `debian:buster` (old stable)
```bash
docker run -it --rm debian:buster
apt-get update && apt-get install -y libsystemd-dev pkg-config
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 241
pkg-config --print-provides 'libsystemd-journal'
Package libsystemd-journal was not found in the pkg-config search path.
Perhaps you should add the directory containing `libsystemd-journal.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libsystemd-journal' found
```
OpenSUSE leap (I think that's built for s390x)
```bash
docker run -it --rm docker.io/opensuse/leap:15
zypper install -y systemd-devel
pkg-config 'libsystemd >= 209' && echo "OK" || echo "KO"
OK
pkg-config --print-provides 'libsystemd'
libsystemd = 246
pkg-config --print-provides 'libsystemd-journal'
Package libsystemd-journal was not found in the pkg-config search path.
Perhaps you should add the directory containing `libsystemd-journal.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libsystemd-journal' found
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This was introduced in 906b979b88, which changed
a `goto` to a `break`, but afaics, the intent was still to break out of the loop.
(linter didn't catch this before because it didn't have the right build-tag set)
daemon/logger/journald/read.go:238:4: SA4011: ineffective break statement. Did you mean to break out of the outer loop? (staticcheck)
break // won't be able to write anything anymore
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
1. Add integration tests for the ContainerLogs API call
Each test handle a distinct case of ContainerLogs output.
- Muxed stream, when container is started without tty
- Single stream, when container is started with tty
2. Add unit test for LogReader suite that tests concurrent logging
It checks that there are no race conditions when logging concurrently
from multiple goroutines.
Co-authored-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The recent fix for log corruption changed the signature of the
NewLogFile and WriteLogEntry functions and the test wasn't adjusted to
this change.
Fix the test by adjusting to the new LogFile API.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Make the allocated buffers bigger to allow better reusability and avoid
frequent reallocations.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The Message is not needed after it is marshalled, so no need to hold it
for the entire function scope.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Moved the buffer pools in json-file and local logging drivers to the
whole driver scope. It is more efficient to have a pool for the whole
driver rather than for each logger instance.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Marshalling log messages by json-file and local drivers involved
serializing the message into a shared buffer. This caused a regression
resulting in log corruption with recent changes where Log may be called
from multiple goroutines at the same time.
Solution is to use a sync.Pool to manage the buffers used for the
serialization. Also removed the MarshalFunc, which the driver had to
expose to the LogFile so that it can marshal the message. This is now
moved entirely to the driver.
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Reduce the amount of time ReadLogs holds the LogFile fsop lock by
releasing it as soon as all the files are opened, before parsing the
compressed file headers.
Signed-off-by: Cory Snider <csnider@mirantis.com>
File watches have been a source of complexity and unreliability in the
LogFile follow implementation, especially when combined with file
rotation. File change events can be unreliably delivered, especially on
Windows, and the polling fallback adds latency. Following across
rotations has never worked reliably on Windows. Without synchronization
between the log writer and readers, race conditions abound: readers can
read from the file while a log entry is only partially written, leading
to decode errors and necessitating retries.
In addition to the complexities stemming from file watches, the LogFile
follow implementation had complexity from needing to handle file
truncations, and (due to a now-fixed bug in the polling file watcher
implementation) evictions to unlock the log file so it could be rotated.
Log files are now always rotated, never truncated, so these situations
no longer need to be handled by the follow code.
Rewrite the LogFile follow implementation in terms of waiting until
LogFile notifies it that a new message has been written to the log file.
The LogFile informs the follower of the file offset of the last complete
write so that the follower knows not to read past that, preventing it
from attempting to decode partial messages and making retries
unnecessary. Synchronization between LogFile and its followers is used
at critical points to prevent missed notifications of writes and races
between file rotations and the follower opening files for read.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The refCounter used for sharing temporary decompressed log files and
tracking when the files can be deleted is keyed off the source file's
path. But the path of a log file is not stable: it is renamed on each
rotation. Consequently, when logging is configured with both rotation
and compression, multiple concurrent readers of a container's logs could
read logs out of order, see duplicates or decompress a log file which
has already been decompressed.
Replace refCounter with a new implementation, sharedTempFileConverter,
which is agnostic to the file path, keying off the source file's
identity instead. Additionally, sharedTempFileConverter handles the full
lifecycle of the temporary file, from creation to deletion. This is all
abstracted from the consumer: all the bookkeeping and cleanup is handled
behind the scenes when Close() is called on the returned reader value.
Only one file descriptor is used per temporary file, which is shared by
all readers.
A channel is used for concurrency control so that the lock can be
acquired inside a select statement. While not currently utilized, this
makes it possible to add support for cancellation to
sharedTempFileConverter in the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Truncating the current log file while a reader is still reading through
it results in log lines getting missed. In contrast, rotating the file
allows readers who have the file open can continue to read from it
undisturbed. Rotating frees up the file name for the logger to create a
new file in its place. This remains true even when max-file=1; the
current log file is "rotated" from its name without giving it a new one.
On POSIXy filesystem APIs, rotating the last file is straightforward:
unlink()ing a file name immediately deletes the name from the filesystem
and makes it available for reuse, even if processes have the file open
at the time. Windows on the other hand only makes the name available
for reuse once the file itself is deleted, which only happens when no
processes have it open. To reuse the file name while the file is still
in use, the file needs to be renamed. So that's what we have to do:
rotate the file to a temporary name before marking it for deletion.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The json-file driver appends a newline character to log messages with
PLogMetaData.Last set, but the local driver did not. Alter the behavior
of the local driver to match that of the json-file driver.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The LogFile follower would stop immediately upon the producer closing.
The close signal would race the file watcher; if a message were to be
logged and the logger immediately closed, the follower could miss that
last message if the close signal (formerly ProducerGone) was to win the
race. Add logic to perform one more round of reading when the producer
is closed to catch up on any final logs.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Whether or not the logger has been closed is a property of the logger,
and only of concern to its log reading implementation, not log watchers.
The loggers and their reader implementations can communicate as they see
fit. A single channel per logger which is closed when the logger is
closed is plenty sufficient to broadcast the state to log readers, with
no extra bookeeping or synchronization required.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The asynchronous startup of the log-reading goroutine made the
follow-tail tests nondeterministic. The Log calls in the tests which
were supposed to happen after the reader started reading would sometimes
execute before the reader, throwing off the counts. Tweak the ReadLogs
implementation so that the order of operations is deterministic.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an extensive test suite for validating the behavior of any
LogReader. Test the current LogFile-based implementations against it.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The jsonfilelog read benchmark was incorrectly reusing the same message
pointer in the producer loop. The message value would be reset after the
first call to jsonlogger.Log, resulting in all subsequent calls logging
a zero-valued message. This is not a representative workload for
benchmarking and throws off the throughput metric.
Reduce variation between benchmark runs by using a constant timestamp.
Write to the producer goroutine's error channel only on a non-nil error
to eliminate spurious synchronization between producer and consumer
goroutines external to the logger being benchmarked.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This commit replaces `os.Setenv` with `t.Setenv` in tests. The
environment variable is automatically restored to its original value
when the test and all its subtests complete.
Reference: https://pkg.go.dev/testing#T.Setenv
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Some packages were using `logrus.Fatal()` in init functions (which logs the error,
and (by default) calls `os.Exit(1)` after logging).
Given that logrus formatting and outputs have not yet been configured during the
initialization stage, it does not provide much benefits over a plain `panic()`.
This patch replaces some instances of `logrus.Fatal()` with `panic()`, which has
the added benefits of not introducing logrus as a dependency in some of these
packages, and also produces a stacktrace, which could help locating the problem
in the unlikely event an `init()` fails.
Before this change, an error would look like:
$ dockerd
FATA[0000] something bad happened
After this change, the same error looks like:
$ dockerd
panic: something bad happened
goroutine 1 [running]:
github.com/docker/docker/daemon/logger/awslogs.init.0()
/go/src/github.com/docker/docker/daemon/logger/awslogs/cloudwatchlogs.go:128 +0x89
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch:
- fix some cases where the host was ignored for valid addresses.
- removes a redundant use of urlutil.IsTransportURL(); instead adding code to
check if the given scheme (protocol) is supported.
- improve port validation for out of range ports.
- fix some missing validation: the driver was silently ignoring path elements,
but expected a host (not an URL)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
fix some missing validation: the driver was silently ignoring path elements
in some cases, and expecting a host (not an URL), and for unix sockets did
not validate if a path was specified.
For the latter case, we should have a fix in the upstream driver, as it
uses an empty path as default path for the socket (`defaultSocketPath`),
and performs no validation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch:
- removes a redundant use of urlutil.IsTransportURL(); instead adding some code
to check if the given scheme (protocol) is supported.
- define a `defaultPort` const for the default port.
- use `net.JoinHostPort()` instead of string concatenating, to account for possible
issues with IPv6 addresses.
- renames a variable that collided with an imported package.
- improves test coverage, and moves an integration test.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch removes the use of urlutil.IsURL(), in favor of just checking if the
provided scheme (protocol) is supported.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
pkg/urlutil (despite its poorly chosen name) is not really intended as a generic
utility to handle URLs, and should only be used by the builder to handle (remote)
build contexts.
This patch:
- removes a redundant use of urlutil.IsTransportURL(); code further below already
checked if the given scheme (protocol) was supported.
- renames some variables that collided with imported packages.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function is marked deprecated in Go 1.18; however, the suggested replacement
brings in a large amount of new code, and most strings we generate will be ASCII,
so this would only be in case it's used for some user-provided string. We also
don't have a language to use, so would be using the "default".
Adding a `//nolint` comment to suppress the linting failure instead.
daemon/logger/templates/templates.go:23:14: SA1019: strings.Title is deprecated: The rule Title uses for word boundaries does not handle Unicode punctuation properly. Use golang.org/x/text/cases instead. (staticcheck)
"title": strings.Title,
^
pkg/plugins/pluginrpc-gen/template.go:67:9: SA1019: strings.Title is deprecated: The rule Title uses for word boundaries does not handle Unicode punctuation properly. Use golang.org/x/text/cases instead. (staticcheck)
return strings.Title(s)
^
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The log message's timestamp was being read after it was returned to the
pool. By coincidence the timestamp field happened to not be zeroed on
reset so much of the time things would work as expected. But if the
message value was to be taken back out of the pool before WriteLogEntry
returned, the timestamp recorded in the gzip header of compressed
rotated log files would be incorrect.
Make future use-after-put bugs fail fast by zeroing all fields of the
Message value, including the timestamp, when it is put into the pool.
Signed-off-by: Cory Snider <csnider@mirantis.com>
followLogs() is getting really long (170+ lines) and complex.
The function has multiple inner functions that mutate its variables.
To refactor the function, this change introduces follow{} struct.
The inner functions are now defined as ordinal methods, which are
accessible from tests.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
* When async is enabled, this option defines the interval (ms) at which the connection
to the fluentd-address is re-established. This option is useful if the address
may resolve to one or more IP addresses, e.g. a Consul service address.
While the change in #42979 resolves the issue where a Docker container can be stuck
if the fluentd-address is unavailable, this functionality adds an additional benefit
in that a new and healthy fluentd-address can be resolved, allowing logs to flow once again.
This adds a `fluentd-async-reconnect-interval` log-opt for the fluentd logging driver.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Conor Evans <coevans@tcd.ie>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Co-authored-by: Conor Evans <coevans@tcd.ie>
Before this change, if Decode() couldn't read a log record fully,
the subsequent invocation of Decode() would read the record's non-header part
as a header and cause a huge heap allocation.
This change prevents such a case by having the intermediate buffer in
the decoder struct.
Fixes#42125.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>