Following a journal log almost always requires a descriptor to be
allocated. In cases where we're running out of descriptors, this means
we might get stuck while attempting to start following the journal, at a
point where it's too late to report it to the client and clean up
easily. The journal reading context will cache the value once it's
allocated, so here we move the check earlier, so that we can detect a
problem when we can still report it cleanly.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
(cherry picked from commit ab62ecf393)
When we set up to start following a journal, if we get error results
from sd_journal_get_fd() or sd_journal_get_events() that prevent us from
following the journal, report the error instead of just mysteriously
failing.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
(cherry picked from commit 8d597d25a8)
This cleans up some of the use of the filepoller which makes reading
significantly more robust and gives fewer changes to fallback to the
polling based watcher.
In a lot of cases, if the file was being rotated while we were adding it
to the watcher, it would return an error that the file doesn't exist and
would fallback.
In some cases this fallback could be triggered multiple times even if we
were already on the fallback/poll-based watcher.
It also fixes an open file leak caused by not closing files properly on
rotate, as well as not closing files that were read via the `tail`
function until after the log reader is completed.
Prior to the above changes, it was relatively simple to cause the log
reader to error out by having quick rotations, for example:
```
$ docker run --name test --log-opt max-size=10b --log-opt max-files=10
-d busybox sh -c 'while true; do usleep 500000; echo hello; done'
$ docker logs -f test
```
After these changes I can run this forever without error.
Another fix removes 2 `os.Stat` calls when rotating files. The stat
calls are not needed since we are just calling `os.Rename` anyway, which
will in turn also just produce the same error that `Stat` would.
These `Stat` calls were also quite expensive.
Removing these stat calls also seemed to resolve an issue causing slow
memory growth on the daemon.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
When following a journal-based log, it was possible for the worker
goroutine, which reads the journal using the journal context and sends
entry data down the message channel, to be scheduled after the function
which started it had returned. This could create problems, since the
invoking function was closing the journal context object and message
channel before it returned, which could trigger use-after-free segfaults
and write-to-closed-channel panics in the worker goroutine.
Make the cleanup in the invoking function conditional so that it's only
done when we're not following the logs, and if we are, that it's left to
the worker goroutine to close them.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
The journald log reader keeps a map of following readers so that it can
close them properly when the journald reader object itself is closed,
but it was possible for its worker goroutine to be scheduled so that the
worker attempted to remove a reader from the map before the reader had
been added to the map. This patch adds the item to the map before
starting the goroutine which is expected to eventually remove it.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
The GCP logging driver is calling out to GCP cloud service on package
init.
This is regardless if you are using GCP logging or not.
This change makes this happen on the first invocation of a new GCP
logging driver instance instead.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
this allows user to choose the compression type (i.e. gzip/zlib/none) using
--log-opt=gelf-compression-type=none or the compression level (-1..9) using
--log-opt=gelf-compression-level=0 for gelf driver.
Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
This change centralizes the template manipulation in a single package
and adds basic string functions to their execution.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Previously docker used obsolete rfc3164 syslog format for syslog. rfc3164 explicitly
uses semicolon as a separator between 'TAG' and 'Content' section of the log message.
Docker uses semicolon as a separator between image name and version tag.
When {{.ImageName}} was used as a tag expression and contained ":" syslog parser mistreated
"tag" part of the image name as syslog message body, which resulted in incorrect "syslogtag" been reported by syslog
daemon.
Use of rfc5424 log format partually fixes the issue as it does not use semicolon as a separator.
However using default rfc5424 syslog format itroduces backward incompatability because rsyslog template keyword %syslogtag%
is parsed differently. In rfc3164 it uses the "TAG" part reported before the "pid" part. In rfc5424 it uses "appname" part reported
before the pid part, while tag part is introduced by %msgid% part.
For more information on rsyslog configuration properties see: http://www.rsyslog.com/doc/master/configuration/properties.html
Added two options to specify logging in either rfc5424, rfc3164 format or unix format omitting hostname in order to keep backwards compatability with
previous versions.
Signed-off-by: Solganik Alexander <solganik@gmail.com>
When checking if we have the development files for libsystemd's journal
APIs, check for either 'libsystemd >= 209' and 'libsystemd-journal'. If
we find 'libsystemd', define the 'journald' tag, which defaults to using
the 'libsystemd.pc' file. If we find the older 'libsystemd-journal',
define both the 'journald' and 'journald_compat' tags, which causes the
'libsystemd-journal.pc' file to be consulted instead.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
inotify event is trigged immediately there's data written to disk.
But at the time that the inotify event is received, the json line might
not fully saved to disk. If the json decoder tries to decode in such
case, an io.UnexpectedEOF will be trigged.
We used to retry for several times to mitigate the io.UnexpectedEOF error.
But there are still flaky tests caused by the partial log entries.
The daemon knows exactly when there are new log entries emitted. We can
use the pubsub package to notify all the log readers instead of inotify.
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
try to fix broken test. will squash once tests pass
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
Signed-off-by: Jussi Nummelin <jussi.nummelin@gmail.com>
Changed buffer size to 1M and removed unnecessary fmt call
Signed-off-by: Jussi Nummelin <jussi.nummelin@gmail.com>
Updated docs for the new fluentd opts
Signed-off-by: Jussi Nummelin <jussi.nummelin@gmail.com>
this prevents the copier from sending messages in the buffer to the closed
driver. If the copied took longer than the timeout to drain the buffer, this
aborts the copier read loop and return back so we can cleanup resources
properly.
Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
- Move time json marshaling to the jsonlog package: this is a docker
internal hack that we should not promote as a library.
- Move Timestamp encoding/decoding functions to the API types: This is
only used there. It could be a standalone library but I don't this
it's worth having a separated repo for this. It could introduce more
complexity than it solves.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Add support of `tag`, `env` and `labels` for Splunk logging driver.
Removed from message `containerId` as it is the same as `tag`.
Signed-off-by: Denis Gladkikh <denis@gladkikh.email>