Age | Commit message (Collapse) | Author |
|
Logs constantly show
systemd-journald[395]: Failed to set file attributes: Inappropriate ioctl for device
This is because ext4 does not support FS_NOCOW_FL.
[zj: fold into one conditional as suggested on the ML and
fix (preexisting) r/errno confusion in error message.]
|
|
Commit 668c965af "journal: skipping of exhausted journal files is bad if
direction changed" fixed a correctness issue, but it also significantly
limited the cases where the optimization that skips exhausted journal
files could apply.
As a result, some journalctl queries are much slower in v219 than in v218.
(e.g. queries where a "--since" cutoff should have quickly eliminated
older journal files from consideration, but didn't.)
If already in the initial iteration find_location_with_matches() finds
no entry, the journal file's location is not updated. This is fine,
except that:
- We must update at least f->last_direction. The optimization relies on
it. Let's separate that from journal_file_save_location() and update
it immediately after the direction checks.
- The optimization was conditional on "f->current_offset > 0", but it
would always be 0 in this scenario. This check is unnecessary for the
optimization.
|
|
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
|
|
include-what-you-use automatically does this and it makes finding
unnecessary harder to spot. The only content of poll.h is a include
of sys/poll.h so should be harmless.
|
|
This fixes various issues found by globally reordering the include
sections of all .c files.
|
|
Leave it to the compiler to figure out whether it shall inline stuff or
not.
Only place where using static inline is OK to use is in in header
files, really.
|
|
|
|
After all it is now much more like strjoin() than strappend(). At the
same time, add support for NULL sentinels, even if they are normally not
necessary.
|
|
If we scale our buffer to be wide enough for the format string, we
should expect that the calculation was correct.
char_array_0() invocations are removed, since snprintf nul-terminates
the output in any case.
A similar wrapper is used for strftime calls, but only in timedatectl.c.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=87354
|
|
This reverts commit b914ea8d379b446c4c9fac4ba181771676ef38cd.
We really need to put a limit on all our resources, everywhere, and in
particular if we operate on external data.
Hence, let's reintroduce the limit, but bump it substantially, so that
it is guaranteed to be higher than any realistic RLIMIT_NOFILE setting.
|
|
Otherwise they can be optimized away with -DNDEBUG
|
|
|
|
Types used for pids and uids in various interfaces are unpredictable.
Too bad.
|
|
|
|
client to it
The old "systemd-import" binary is now an internal tool. We still use it
as asynchronous backend for systemd-importd. Since the import tool might
require some IO and CPU resources (due to qcow2 explosion, and
decompression), and because we might want to run it with more minimal
priviliges we still keep it around as the worker binary to execute as
child process of importd.
machinectl now has verbs for pulling down images, cancelling them and
listing them.
|
|
In case CAP_SYS_ADMIN is missing (like in containers), one cannot fake pid in
struct ucred (uid/gid are fine if CAP_SETUID/CAP_SETGID are present).
Ensure that journald will try again to forward the messages to syslog without
faking the SCM_CREDENTIALS pid (which isn't guaranteed to succeed anyway, since
it also does the same thing if the process has already exited).
With this patch, journald will no longer silently discard messages
that are supposed to be sent to syslog in these situations.
https://bugs.debian.org/775067
|
|
Terminals tend to be 80 columns wide by default, and the help
text is only supposed to be a terse reminder anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1183771
|
|
This remove the need for various header files to include the
(relatively heavyweight) util.h.
|
|
|
|
This undoes a small part of 13790add4bf648fed816361794d8277a75253410
which was erroneously added, given that zero length datagrams are OK,
and hence zero length reads on a SOCK_DGRAM be no means mean EOF.
|
|
Now that we bump rlimit, we do not really know how many files
we can open. Remove the check.
https://bugzilla.redhat.com/show_bug.cgi?id=1179980
|
|
When there are a lot of split out journal files, we might run out of fds
quicker then we want. Hence: bump RLIMIT_NOFILE to 16K if possible.
Do these even for journalctl. On Fedora the soft RLIMIT_NOFILE is at 1K,
the hard at 4K by default for normal user processes, this code hence
bumps this up for users to 4K.
https://bugzilla.redhat.com/show_bug.cgi?id=1179980
|
|
fd_setcrtime()
|
|
btrfs' COW logic results in heavily fragment journal files, which is
detrimental for perfomance. Hence, turn off COW for journal files as we
create them.
Turning off COW comes at the cost of data integrity guarantees, but this
should be acceptable, given that we do our own checksumming, and
generally have a pretty conservative write pattern.
Also see discussion on linux-btrfs:
http://www.spinics.net/lists/linux-btrfs/msg41001.html
|
|
|
|
Our write pattern is quite awful for CoW file systems (btrfs...), as we
keep updating file parts in the beginning of the file. This results in
fragmented journal files. Hence: when rotating files, defragment them,
since at that point we know that no further write accesses will be made.
|
|
LOG_DEBUG is already a log level, there is no need to use LOG_PRI which
is for filtering out the facility.
|
|
Making use of the fd storage capability of the previous commit, allow
restarting journald by serilizing stream state to /run, and pushing open
fds to PID 1.
|
|
|
|
deleted, rotate
https://bugzilla.redhat.com/show_bug.cgi?id=1171719
|
|
journal file headers
Since the file headers might be replaced by zeroed pages now due to
sigbus we should make sure we don't end up dividing by zero because we
don't check values read from journal file headers for changes.
|
|
arguments should be prefixed with "arg_"
|
|
This makes them robust regarding truncation. Ideally, we'd export this
as an API, but given how messy SIGBUS handling is, and the uncertain
ownership logic of signal handlers we should not do this (unless libc
one day invents a scheme how to sanely install SIGBUS handlers for
specific memory areas only). However, for now we can still make all our
own tools robust.
Note that external tools will only have read-access to the journal
anyway, where SIGBUS is much more unlikely, given that only writes are
subject to disk full problems.
|
|
|
|
|
|
Even though we use fallocate() it appears that file systems like btrfs
will trigger SIGBUS on certain low-disk-space situation. We should
handle that, hence catch the signal, add it to a list of invalidated
pages, and replace the page with an empty memory area. After each write
check if SIGBUS was triggered, and consider the write invalid if it was.
This should make journald a lot more robust with file systems where
fallocate() is not reliable, for example all CoW file systems
(btrfs...), where changing written data can fail with disk full errors.
https://bugzilla.redhat.com/show_bug.cgi?id=1045810
|
|
https://github.com/vlajos/misspell_fixer
https://github.com/torstehu/systemd/commit/b6fdeb618cf2f3ce1645b3315f15f482710c7ffa
Thanks to Torstein Husebo <torstein@huseboe.net>.
|
|
If OBJECT_PID= came as the last field, we would not reallocate the iovec to bigger size,
and fail the assertion later on in dispatch_message_real().
|
|
https://bugzilla.redhat.com/show_bug.cgi?id=1177184
|
|
|
|
There is alot of cleanup that will have to happen to turn on
-fstrict-aliasing, but I think our code should be "correct" to the rule.
|
|
EOF is meaningless if the direction of iteration changes.
Move the EOF optimization under the direction check.
This fixes test-journal-interleaving for me.
Thanks to Filipe Brandenburger for telling me about the failure.
|
|
next_with_matches() is odd in that its "unit64_t *offset" parameter is
both input and output. In other it's purely for output.
The function is called from two places in next_beyond_location(). In
both of them "&cp" is used as the argument and in both cases cp is
guaranteed to equal f->current_offset.
Let's just have next_with_matches() ignore "*offset" on input and
operate with f->current_offset.
I did not investigate why it is, but it makes my usual benchmark run
reproducibly faster:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.032s
user 0m3.896s
sys 0m0.135s
(Compare to preceding commit, where real was 4.4s.)
|
|
I accidentally broke the detection of duplicate entries in 7943f42275
"journal: optimize iteration by returning previously found candidate
entry".
When we have a known location of a candidate entry, we must not return
from next_beyond_location() immediately. We must go through the
duplicates detection to make sure the candidate differs from the
already iterated entry.
This fix slows down iteration a bit, but it's still faster than it
was before the rework.
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.448s
user 0m4.298s
sys 0m0.149s
(Compare with results from commit 7943f42275, where real was 5.3s before
the rework.)
|
|
Now that journal_file_next_entry() does not need a pointer to the
current object, next_with_matches() does not need it either.
|
|
The current offset is sufficient information.
|
|
In next_beyond_location() when the JournalFile's location type is
LOCATION_SEEK, it means there's nothing to do, because we already have
the location of the candidate entry. Do an early return. Note that now
next_beyond_location() does not anymore guarantee on return that the
entry is mapped, but previous patches made sure the caller does not
care.
This optimization is at least as good as "journal: optimize iteration:
skip files that cannot improve current candidate entry" was.
Timing results on my workstation, using:
$ time ./journalctl -q --since=2014-06-01 --until=2014-07-01 > /dev/null
Before "Revert "journal: optimize iteration: skip files that cannot
improve current candidate entry":
real 0m5.349s
user 0m5.166s
sys 0m0.181s
Now:
real 0m3.901s
user 0m3.724s
sys 0m0.176s
|
|
If from a previous iteration we know we are at the end of a journal
file, don't bother looking into the file again. This is complicated by
the fact that the EOF does not have to be permanent (think of
"journalctl -f"). So we also check if the number of entries in the
journal file changed.
This optimization has a similar effect as "journal: optimize iteration:
skip whole files behind current location" had.
|
|
offset is redundant, because the caller can rely on f->current_offset.
The object pointer the function saves in *ret is thrown away by the caller.
|