Age | Commit message (Collapse) | Author |
|
|
|
Types used for pids and uids in various interfaces are unpredictable.
Too bad.
|
|
|
|
client to it
The old "systemd-import" binary is now an internal tool. We still use it
as asynchronous backend for systemd-importd. Since the import tool might
require some IO and CPU resources (due to qcow2 explosion, and
decompression), and because we might want to run it with more minimal
priviliges we still keep it around as the worker binary to execute as
child process of importd.
machinectl now has verbs for pulling down images, cancelling them and
listing them.
|
|
In case CAP_SYS_ADMIN is missing (like in containers), one cannot fake pid in
struct ucred (uid/gid are fine if CAP_SETUID/CAP_SETGID are present).
Ensure that journald will try again to forward the messages to syslog without
faking the SCM_CREDENTIALS pid (which isn't guaranteed to succeed anyway, since
it also does the same thing if the process has already exited).
With this patch, journald will no longer silently discard messages
that are supposed to be sent to syslog in these situations.
https://bugs.debian.org/775067
|
|
Terminals tend to be 80 columns wide by default, and the help
text is only supposed to be a terse reminder anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1183771
|
|
This remove the need for various header files to include the
(relatively heavyweight) util.h.
|
|
|
|
This undoes a small part of 13790add4bf648fed816361794d8277a75253410
which was erroneously added, given that zero length datagrams are OK,
and hence zero length reads on a SOCK_DGRAM be no means mean EOF.
|
|
Now that we bump rlimit, we do not really know how many files
we can open. Remove the check.
https://bugzilla.redhat.com/show_bug.cgi?id=1179980
|
|
When there are a lot of split out journal files, we might run out of fds
quicker then we want. Hence: bump RLIMIT_NOFILE to 16K if possible.
Do these even for journalctl. On Fedora the soft RLIMIT_NOFILE is at 1K,
the hard at 4K by default for normal user processes, this code hence
bumps this up for users to 4K.
https://bugzilla.redhat.com/show_bug.cgi?id=1179980
|
|
fd_setcrtime()
|
|
btrfs' COW logic results in heavily fragment journal files, which is
detrimental for perfomance. Hence, turn off COW for journal files as we
create them.
Turning off COW comes at the cost of data integrity guarantees, but this
should be acceptable, given that we do our own checksumming, and
generally have a pretty conservative write pattern.
Also see discussion on linux-btrfs:
http://www.spinics.net/lists/linux-btrfs/msg41001.html
|
|
|
|
Our write pattern is quite awful for CoW file systems (btrfs...), as we
keep updating file parts in the beginning of the file. This results in
fragmented journal files. Hence: when rotating files, defragment them,
since at that point we know that no further write accesses will be made.
|
|
LOG_DEBUG is already a log level, there is no need to use LOG_PRI which
is for filtering out the facility.
|
|
Making use of the fd storage capability of the previous commit, allow
restarting journald by serilizing stream state to /run, and pushing open
fds to PID 1.
|
|
|
|
deleted, rotate
https://bugzilla.redhat.com/show_bug.cgi?id=1171719
|
|
journal file headers
Since the file headers might be replaced by zeroed pages now due to
sigbus we should make sure we don't end up dividing by zero because we
don't check values read from journal file headers for changes.
|
|
arguments should be prefixed with "arg_"
|
|
This makes them robust regarding truncation. Ideally, we'd export this
as an API, but given how messy SIGBUS handling is, and the uncertain
ownership logic of signal handlers we should not do this (unless libc
one day invents a scheme how to sanely install SIGBUS handlers for
specific memory areas only). However, for now we can still make all our
own tools robust.
Note that external tools will only have read-access to the journal
anyway, where SIGBUS is much more unlikely, given that only writes are
subject to disk full problems.
|
|
|
|
|
|
Even though we use fallocate() it appears that file systems like btrfs
will trigger SIGBUS on certain low-disk-space situation. We should
handle that, hence catch the signal, add it to a list of invalidated
pages, and replace the page with an empty memory area. After each write
check if SIGBUS was triggered, and consider the write invalid if it was.
This should make journald a lot more robust with file systems where
fallocate() is not reliable, for example all CoW file systems
(btrfs...), where changing written data can fail with disk full errors.
https://bugzilla.redhat.com/show_bug.cgi?id=1045810
|
|
https://github.com/vlajos/misspell_fixer
https://github.com/torstehu/systemd/commit/b6fdeb618cf2f3ce1645b3315f15f482710c7ffa
Thanks to Torstein Husebo <torstein@huseboe.net>.
|
|
If OBJECT_PID= came as the last field, we would not reallocate the iovec to bigger size,
and fail the assertion later on in dispatch_message_real().
|
|
https://bugzilla.redhat.com/show_bug.cgi?id=1177184
|
|
|
|
There is alot of cleanup that will have to happen to turn on
-fstrict-aliasing, but I think our code should be "correct" to the rule.
|
|
EOF is meaningless if the direction of iteration changes.
Move the EOF optimization under the direction check.
This fixes test-journal-interleaving for me.
Thanks to Filipe Brandenburger for telling me about the failure.
|
|
next_with_matches() is odd in that its "unit64_t *offset" parameter is
both input and output. In other it's purely for output.
The function is called from two places in next_beyond_location(). In
both of them "&cp" is used as the argument and in both cases cp is
guaranteed to equal f->current_offset.
Let's just have next_with_matches() ignore "*offset" on input and
operate with f->current_offset.
I did not investigate why it is, but it makes my usual benchmark run
reproducibly faster:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.032s
user 0m3.896s
sys 0m0.135s
(Compare to preceding commit, where real was 4.4s.)
|
|
I accidentally broke the detection of duplicate entries in 7943f42275
"journal: optimize iteration by returning previously found candidate
entry".
When we have a known location of a candidate entry, we must not return
from next_beyond_location() immediately. We must go through the
duplicates detection to make sure the candidate differs from the
already iterated entry.
This fix slows down iteration a bit, but it's still faster than it
was before the rework.
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.448s
user 0m4.298s
sys 0m0.149s
(Compare with results from commit 7943f42275, where real was 5.3s before
the rework.)
|
|
Now that journal_file_next_entry() does not need a pointer to the
current object, next_with_matches() does not need it either.
|
|
The current offset is sufficient information.
|
|
In next_beyond_location() when the JournalFile's location type is
LOCATION_SEEK, it means there's nothing to do, because we already have
the location of the candidate entry. Do an early return. Note that now
next_beyond_location() does not anymore guarantee on return that the
entry is mapped, but previous patches made sure the caller does not
care.
This optimization is at least as good as "journal: optimize iteration:
skip files that cannot improve current candidate entry" was.
Timing results on my workstation, using:
$ time ./journalctl -q --since=2014-06-01 --until=2014-07-01 > /dev/null
Before "Revert "journal: optimize iteration: skip files that cannot
improve current candidate entry":
real 0m5.349s
user 0m5.166s
sys 0m0.181s
Now:
real 0m3.901s
user 0m3.724s
sys 0m0.176s
|
|
If from a previous iteration we know we are at the end of a journal
file, don't bother looking into the file again. This is complicated by
the fact that the EOF does not have to be permanent (think of
"journalctl -f"). So we also check if the number of entries in the
journal file changed.
This optimization has a similar effect as "journal: optimize iteration:
skip whole files behind current location" had.
|
|
offset is redundant, because the caller can rely on f->current_offset.
The object pointer the function saves in *ret is thrown away by the caller.
|
|
The file's current_offset is already updated at this point, so let's use
it.
|
|
When comparing the locations of candidate entries, we can rely on the
location information stored in struct JournalFile.
|
|
set_location() is called from real_journal_next() when a winning entry
has been picked from among the candidates in journal files.
The location type is always set to LOCATION_DISCRETE. No need to pass
it as a parameter.
The per-JournalFile location information is already updated at this
point. No need for having the direction and offset here.
|
|
In next_beyond_location() when we find a candidate entry in a journal
file, save its location information in struct JournalFile.
The purpose of remembering the locations of candidate entries is to be
able to save work in the next iteration. This patch does only the
remembering part.
LOCATION_SEEK means the location identifies a candidate entry.
When a winner is picked from among candidates, it becomes
LOCATION_DISCRETE.
LOCATION_TAIL here signifies we've iterated the file to the end (or the
beginning in the case of reversed direction).
|
|
|
|
In preparation for individual JournalFiles maintaining a location
of their own.
|
|
This reverts commit b7c88ab8cc7d55a43450bf3dea750f95f2e910d6.
This optimization will be made redundant by the following patches.
|
|
candidate entry"
This reverts commit f8b5a3b75fb55f0acb85c21424b3893c822742e9.
This optimization will be made redundant by the following patches.
|
|
Its only caller is a test.
|
|
|
|
try_context() is such a hot path that the hashmap lookup is expensive.
The number of contexts is small - it is the number of object types.
Using a hashmap is overkill. A plain array will do.
Before:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m9.445s
user 0m9.228s
sys 0m0.213s
After:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m5.438s
user 0m5.266s
sys 0m0.170s
|
|
This never had any callers. Contexts are freed when the MMapCache is
freed.
|