Commit graph

383 commits

Author SHA1 Message Date
Samuel Karp
183cac25f9
awslogs: fix flaky TestLogBlocking unit test
TestLogBlocking is intended to test that the Log method blocks by
default.  It does this by mocking out the internals of the
awslogs.logStream and replacing one of its internal channels with one
that is controlled by the test.  The call to Log occurs inside a
goroutine.  Go may or may not schedule the goroutine immediately and the
blocking may or may not be observed outside the goroutine immediately
due to decisions made by the Go runtime.  This change adds a small
timeout for test failure so that the Go runtime has the opportunity to
run the goroutine before the test fails.

Signed-off-by: Samuel Karp <skarp@amazon.com>
(cherry picked from commit fd94bae0b8)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-25 21:43:08 +02:00
Sebastiaan van Stijn
de3a04a65d
Windows: skip flaky TestLogBlocking
This test frequently fails on Windows RS1 (mainly), so skipping it
for now on Windows;

```
ok  	github.com/docker/docker/daemon/logger	0.525s	coverage: 43.0% of statements
time="2019-09-09T20:37:35Z" level=info msg="Trying to get region from EC2 Metadata"
time="2019-09-09T20:37:36Z" level=info msg="Log stream already exists" errorCode=ResourceAlreadyExistsException logGroupName= logStreamName= message= origError="<nil>"
--- FAIL: TestLogBlocking (0.02s)
    cloudwatchlogs_test.go:313: Expected to be able to read from stream.messages but was unable to
time="2019-09-09T20:37:36Z" level=error msg=Error
time="2019-09-09T20:37:36Z" level=error msg="Failed to put log events" errorCode=InvalidSequenceTokenException logGroupName=groupName logStreamName=streamName message="use token token" origError="<nil>"
time="2019-09-09T20:37:36Z" level=error msg="Failed to put log events" errorCode=DataAlreadyAcceptedException logGroupName=groupName logStreamName=streamName message="use token token" origError="<nil>"
time="2019-09-09T20:37:36Z" level=info msg="Data already accepted, ignoring error" errorCode=DataAlreadyAcceptedException logGroupName=groupName logStreamName=streamName message="use token token"
FAIL
coverage: 78.2% of statements
FAIL	github.com/docker/docker/daemon/logger/awslogs	0.630s
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6c75c86240)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-25 21:36:11 +02:00
Deep Debroy
7e76438537
Be more conservative for Windows in TestFrequency for Splunk
Signed-off-by: Deep Debroy <ddebroy@docker.com>
(cherry picked from commit a5c420ac54)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-25 17:31:30 +02:00
Kir Kolyshkin
1e13f66fbb logger: fix follow logs for max-file=1
In case jsonlogfile is used with max-file=1 and max-size set,
the log rotation is not perfomed; instead, the log file is closed
and re-open with O_TRUNC.

This situation is not handled by the log reader in follow mode,
leading to an issue of log reader being stuck forever.

This situation (file close/reopen) could be handled in waitRead(),
but fsnotify library chose to not listen to or deliver this event
(IN_CLOSE_WRITE in inotify lingo).

So, we have to handle this by checking the file size upon receiving
io.EOF from the log reader, and comparing the size with the one received
earlier. In case the new size is less than the old one, the file was
truncated and we need to seek to its beginning.

Fixes #39235.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 9cd24ba605)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-09-23 11:29:52 -07:00
Kir Kolyshkin
dd7ef76474 journald/read: fix/unify errors
1. Use "in-place" variables for if statements to limit their scope to
   the respectful `if` block.

2. Report the error returned from sd_journal_* by using CErr().

3. Use errors.New() instead of fmt.Errorf().

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
 (cherry picked from commit 20a0e58a79)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:50:39 -07:00
Kir Kolyshkin
0375566412 journald: fix for --tail 0
From the first glance, `docker logs --tail 0` does not make sense,
as it is supposed to produce no output, but `tail -n 0` from GNU
coreutils is working like that, plus there is even a test case
(`TestLogsTail` in integration-cli/docker_cli_logs_test.go).

Now, something like `docker logs --follow --tail 0` makes total
sense, so let's make it work.

(NOTE if --tail is not used, config.Tail is set to -1)

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit dd4bfe30a8)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:43 -07:00
Kir Kolyshkin
3678438dd8 journald/read: avoid piling up open files
If we take a long time to process log messages, and during that time
journal file rotation occurs, the journald client library will keep
those rotated files open until sd_journal_process() is called.

By periodically calling sd_journal_process() during the processing
loop we shrink the window of time a client instance has open file
descriptors for rotated (deleted) journal files.

This code is modelled after that of journalctl [1]; the above explanation
as well as the value of 1024 is taken from there.

[v2: fix CErr() argument]

[1] https://github.com/systemd/systemd/blob/dc16327c48d/src/journal/journalctl.c#L2676
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit b73fb8fd5d)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:38 -07:00
Kir Kolyshkin
1cc7b3881d journald/read: simplify/fix followJournal()
TL;DR: simplify the code, fix --follow hanging indefinitely

Do the following to simplify the followJournal() code:

1. Use Go-native select instead of C-native polling.

2. Use Watch{Producer,Consumer}Gone(), eliminating the need
to have journald.closed variable, and an extra goroutine.

3. Use sd_journal_wait(). In the words of its own man page:

> A synchronous alternative for using sd_journal_get_fd(),
> sd_journal_get_events(), sd_journal_get_timeout() and
> sd_journal_process() is sd_journal_wait().

Unfortunately, the logic is still not as simple as it
could be; the reason being, once the container has exited,
journald might still be writing some logs from its internal
buffers onto journal file(s), and there is no way to
figure out whether it's done so we are guaranteed to
read all of it back. This bug can be reproduced with
something like

> $ ID=$(docker run -d busybox seq 1 150000); docker logs --follow $ID
> ...
> 128123
> $

(The last expected output line should be `150000`).

To avoid exiting from followJournal() early, add the
following logic: once the container is gone, keep trying
to drain the journal until there's no new data for at
least `waitTimeout` time period.

Should fix https://github.com/docker/for-linux/issues/575

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f091febc94)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:34 -07:00
Kir Kolyshkin
03b1b078f9 Call sd_journal_get_fd() earlier, only if needed
1. The journald client library initializes inotify watch(es)
during the first call to sd_journal_get_fd(), and it make sense
to open it earlier in order to not lose any journal file rotation
events.

2. It only makes sense to call this if we're going to use it
later on -- so add a check for config.Follow.

3. Remove the redundant call to sd_journal_get_fd().

NOTE that any subsequent calls to sd_journal_get_fd() return
the same file descriptor, so there's no real need to save it
for later use in wait_for_data_cancelable().

Based on earlier patch by Nalin Dahyabhai <nalin@redhat.com>.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 981c01665b)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:34 -07:00
Kir Kolyshkin
5067389c36 journald/read: avoid being blocked on send
In case the LogConsumer is gone, the code that sends the message can
stuck forever. Wrap the code in select case, as all other loggers do.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 79039720c8)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:29 -07:00
Kir Kolyshkin
6d98ef8c69 journald/read: simplify walking backwards
In case Tail=N parameter is requested, we need to show N lines.
It does not make sense to walk backwards one by one if we can
do it at once. Now, if Since=T is also provided, make sure we
haven't jumped too far (before T), and if we did, move forward.

The primary motivation for this was to make the code simpler.

This also fixes a tiny bug in the "since" implementation.

Before this commit:
> $ docker logs -t --tail=6000 --since="2019-03-10T03:54:25.00" $ID | head
> 2019-03-10T03:54:24.999821000Z 95981

After:
> $ docker logs -t --tail=6000 --since="2019-03-10T03:54:25.00" $ID | head
> 2019-03-10T03:54:25.000013000Z 95982

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit ff3cd167ea)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:23 -07:00
Kir Kolyshkin
d5088c1488 journald/read: simplify code
Minor code simplification.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit e8f6166791)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:23 -07:00
Nalin Dahyabhai
df3689f8d0 Small journal cleanup
Clean up a deferred function call in the journal reading logic.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
(cherry picked from commit 1ada3e85bf)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2019-08-09 16:47:23 -07:00
Justin Cormack
510e79ebe9
Entropy cannot be saved
Remove non cryptographic randomness.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
(cherry picked from commit 2df693e533)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-06-11 17:40:09 +02:00
Sebastiaan van Stijn
688e67e1d3
bump fluent/fluent-logger-golang v1.4.0
- Add RequestAck to enable at-least-once message transferring
- Add Async option to update sending message in asynchronous way
- Deprecate AsyncConnect (Use Async instead)

full diff: https://github.com/fluent/fluent-logger-golang/compare/v1.3.0...v1.4.0

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-04-16 11:06:30 +02:00
Sebastiaan van Stijn
c524e15f30
Merge pull request #38952 from alexei38/master
fluentd log driver. failed parse last partial message in fluentd #38951
2019-04-15 20:40:57 +02:00
Alexei Margasov
4a9836a20b Adds PartialLogMetadata to encode protobuf for logger plugins
Signed-off-by: Alexei Margasov <alexei38@yandex.ru>
2019-04-09 16:14:33 +05:00
Alexei Margasov
8997b90c2c fluentd log driver. failed parse last partial message in fluentd #38951
Signed-off-by: Alexei Margasov <alexei38@yandex.ru>
2019-04-09 15:21:08 +05:00
Sebastiaan van Stijn
3449b12cc7
Use assert.NilError() instead of assert.Assert()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-21 13:16:02 +01:00
Yong Tang
0281db99a9 Follow up to PR 38407
This fix is a follow up to PR 38407 to use assert.Error
and assert.NilError when appropriate

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2019-01-03 01:23:24 +00:00
Yong Tang
626022d0f6
Merge pull request #38407 from maximilianomaccanti/master
Add two configurable options to awslogs driver
2019-01-03 08:48:25 +08:00
Joao Trindade
a7020454ca
Add options validation to syslog logger test
Adds the following validations to the syslog logger test:

 1. Only supported options are valid
 2. Log option syslog-address has to be a valid URI
 3. Log option syslog-address if is file has to exist
 4. Log option syslog-address if udp/tcp scheme, default to port 513
 5. Log-option syslog-facility has to be a valid facility
 6. Log-option syslog-format has to be a valid format

Signed-off-by: Joao Trindade <trindade.joao@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2018-12-24 20:43:41 +01:00
Maximiliano Maccanti
ad8a8e8a9e NewStreamConfig UTest fixes
Signed-off-by: Maximiliano Maccanti <maccanti@amazon.com>
2018-12-21 22:24:40 +00:00
Maximiliano Maccanti
687cbfa739 Split StreamConfig from New, Utest table driven
Signed-off-by: Maximiliano Maccanti <maccanti@amazon.com>
2018-12-21 20:45:11 +00:00
Maximiliano Maccanti
512ac778bf Add two configurable options to awslogs driver
Add awslogs-force-flush-interval-seconds and awslogs-max-buffered-events configurable options to aswlogs driver to replace hardcoded values of repsectively 5 seconds and 4K.

Signed-off-by: Maximiliano Maccanti <maccanti@amazon.com>
2018-12-21 20:45:11 +00:00
Yong Tang
fa6dabf876 Add zero padding for RFC5424 syslog format
This fix tries to address the issue raised in 38258
where current RFC5424 sys log format does not zero pad
the time (trailing zeros are removed)

This fix apply the patch to fix the issue. This fix fixes 38258.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-12-08 22:40:02 +00:00
Sebastiaan van Stijn
758255791e
Merge pull request #38177 from mooncak/fix_duplicate
Cleanup duplication in daemon files
2018-11-13 09:55:51 +01:00
mooncake
345d1fd089 Cleanup duplication in daemon files
Signed-off-by: Bily Zhang <xcoder@tenxcloud.com>
2018-11-13 10:42:57 +08:00
Sebastiaan van Stijn
b48bf39a79
Merge pull request #37944 from IRCody/awslogs_error_context
Return more context on awslogs create failure
2018-10-24 21:00:15 +02:00
Anusha Ragunathan
6611ab1c6f
Merge pull request #37986 from samuelkarp/moby/moby-37747
awslogs: account for UTF-8 normalization in limits
2018-10-18 10:17:24 -07:00
Yong Tang
533e07afbe
Merge pull request #38032 from RohitK89/21497-log-image-name
Add IMAGE_NAME attribute to journald log events
2018-10-17 12:18:05 -07:00
Colin Panisset
5cd2bb315a Only add CONTAINER_PARTIAL_MESSAGE if not the last partial
Addresses #38045

Signed-off-by: Colin Panisset <colin.panisset@cevo.com.au>
2018-10-17 07:51:59 +11:00
Cody Roseborough
7a5c813d9c Return more context on awslogs create failure
Signed-off-by: Cody Roseborough <crrosebo@amazon.com>
2018-10-16 11:36:52 -07:00
Rohit Kapur
5f7e102df7 Add IMAGE_NAME as a key to journald log messages
Signed-off-by: Rohit Kapur <rkapur@flatiron.com>
2018-10-12 16:16:31 -04:00
Samuel Karp
1e8ef38627 awslogs: account for UTF-8 normalization in limits
The CloudWatch Logs API defines its limits in terms of bytes, but its
inputs in terms of UTF-8 encoded strings.  Byte-sequences which are not
valid UTF-8 encodings are normalized to the Unicode replacement
character U+FFFD, which is a 3-byte sequence in UTF-8.  This replacement
can cause the input to grow, exceeding the API limit and causing failed
API calls.

This commit adds logic for counting the effective byte length after
normalization and splitting input without splitting valid UTF-8
byte-sequences into two invalid byte-sequences.

Fixes https://github.com/moby/moby/issues/37747

Signed-off-by: Samuel Karp <skarp@amazon.com>
2018-10-10 14:45:06 -07:00
mooncake
ea60a87fcf Remove duplicated words in daemon files
Signed-off-by: mooncake <xcoder@tenxcloud.com>
2018-10-06 00:06:38 +08:00
Kir Kolyshkin
b2b169f13f daemon/logger/journald: simplify readers field
As in other similar drivers (jsonlog, local), use a set
(i.e. `map[whatever]struct{}`), making the code simpler.

While at it, make sure we remove the reader from the set
after calling `ProducerGone()` on it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-10 16:17:05 -07:00
Sebastiaan van Stijn
77faf158f5
Merge pull request #37576 from kolyshkin/logs-f-leak
daemon.ContainerLogs(): fix resource leak on follow
2018-09-10 14:26:43 +02:00
Kir Kolyshkin
9b0097a699 Format code with gofmt -s from go-1.11beta1
This should eliminate a bunch of new (go-1.11 related) validation
errors telling that the code is not formatted with `gofmt -s`.

No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).

Patch generated with:

> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 15:24:16 -07:00
Kir Kolyshkin
f845d76d04 TestFollowLogsProducerGone: add
This should test that
 - all the messages produced are delivered (i.e. not lost)
 - followLogs() exits

Loosely based on the test having the same name by Brian Goff, see
https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:48:37 -07:00
Kir Kolyshkin
916eabd459 daemon.ContainerLogs(): fix resource leak on follow
When daemon.ContainerLogs() is called with options.follow=true
(as in "docker logs --follow"), the "loggerutils.followLogs()"
function never returns (even then the logs consumer is gone).
As a result, all the resources associated with it (including
an opened file descriptor for the log file being read, two FDs
for a pipe, and two FDs for inotify watch) are never released.

If this is repeated (such as by running "docker logs --follow"
and pressing Ctrl-C a few times), this results in DoS caused by
either hitting the limit of inotify watches, or the limit of
opened files. The only cure is daemon restart.

Apparently, what happens is:

1. logs producer (a container) is gone, calling (*LogWatcher).Close()
for all its readers (daemon/logger/jsonfilelog/jsonfilelog.go:175).

2. WatchClose() is properly handled by a dedicated goroutine in
followLogs(), cancelling the context.

3. Upon receiving the ctx.Done(), the code in followLogs()
(daemon/logger/loggerutils/logfile.go#L626-L638) keeps to
send messages _synchronously_ (which is OK for now).

4. Logs consumer is gone (Ctrl-C is pressed on a terminal running
"docker logs --follow"). Method (*LogWatcher).Close() is properly
called (see daemon/logs.go:114). Since it was called before and
due to to once.Do(), nothing happens (which is kinda good, as
otherwise it will panic on closing a closed channel).

5. A goroutine (see item 3 above) keeps sending log messages
synchronously to the logWatcher.Msg channel. Since the
channel reader is gone, the channel send operation blocks forever,
and resource cleanup set up in defer statements at the beginning
of followLogs() never happens.

Alas, the fix is somewhat complicated:

1. Distinguish between close from logs producer and logs consumer.
To that effect,
 - yet another channel is added to LogWatcher();
 - {Watch,}Close() are renamed to {Watch,}ProducerGone();
 - {Watch,}ConsumerGone() are added;

*NOTE* that ProducerGone()/WatchProducerGone() pair is ONLY needed
in order to stop ConsumerLogs(follow=true) when a container is stopped;
otherwise we're not interested in it. In other words, we're only
using it in followLogs().

2. Code that was doing (logWatcher*).Close() is modified to either call
ProducerGone() or ConsumerGone(), depending on the context.

3. Code that was waiting for WatchClose() is modified to wait for
either ConsumerGone() or ProducerGone(), or both, depending on the
context.

4. followLogs() are modified accordingly:
 - context cancellation is happening on WatchProducerGone(),
and once it's received the FileWatcher is closed and waitRead()
returns errDone on EOF (i.e. log rotation handling logic is disabled);
 - due to this, code that was writing synchronously to logWatcher.Msg
can be and is removed as the code above it handles this case;
 - function returns once ConsumerGone is received, freeing all the
resources -- this is the bugfix itself.

While at it,

1. Let's also remove the ctx usage to simplify the code a bit.
It was introduced by commit a69a59ffc7 ("Decouple removing the
fileWatcher from reading") in order to fix a bug. The bug was actually
a deadlock in fsnotify, and the fix was just a workaround. Since then
the fsnofify bug has been fixed, and a new fsnotify was vendored in.
For more details, please see
https://github.com/moby/moby/pull/27782#issuecomment-416794490

2. Since `(*filePoller).Close()` is fixed to remove all the files
being watched, there is no need to explicitly call
fileWatcher.Remove(name) anymore, so get rid of the extra code.

Should fix https://github.com/moby/moby/issues/37391

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:47:42 -07:00
Brian Goff
d37a11bfba daemon/logger/loggerutils: add TestFollowLogsClose
This test case checks that followLogs() exits once the reader is gone.
Currently it does not (i.e. this test is supposed to fail) due to #37391.

[kolyshkin@: test case Brian Goff, changelog and all bugs are by me]
Source: https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-09-06 11:46:34 -07:00
SeungUkLee
a79f8b48d4 fixed typo (becuase -> because)
Signed-off-by: SeungUkLee <lsy931106@gmail.com>
2018-08-26 17:30:40 +09:00
Phil Estes
f962bd06ed
Fix incorrect spelling in error message
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
2018-08-22 11:28:11 -04:00
Brian Goff
5da8bc2e5b Remove now unused multireader.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-08-20 09:42:19 -07:00
Sebastiaan van Stijn
e0ad6d045c
Merge pull request #37092 from cpuguy83/local_logger
Add "local" log driver
2018-08-20 07:01:41 +01:00
Brian Goff
a351b38e72 Add new local log driver
This driver uses protobuf to store log messages and has better defaults
for log file handling (e.g. compression and file rotation enabled by
default).

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-08-17 09:36:56 -07:00
Sebastiaan van Stijn
678d4b3a6d
Merge pull request #37412 from AzureCR/naduggar/logissue
Select polling based watcher for docker log file watcher on Windows
2018-08-14 14:40:44 +02:00
Brian Goff
94a10150f6 Decouple logfile from tailfile.
This makes it so consumers of `LogFile` should pass in how to get an
io.Reader to the requested number of lines to tail.

This is also much more efficient when tailing a large number of lines.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2018-08-10 21:02:19 -07:00
Kir Kolyshkin
2c6fbd864a loggerutils: fix a typo
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-07-31 15:44:46 +03:00