Age | Commit message (Collapse) | Author |
|
There is alot of cleanup that will have to happen to turn on
-fstrict-aliasing, but I think our code should be "correct" to the rule.
|
|
EOF is meaningless if the direction of iteration changes.
Move the EOF optimization under the direction check.
This fixes test-journal-interleaving for me.
Thanks to Filipe Brandenburger for telling me about the failure.
|
|
next_with_matches() is odd in that its "unit64_t *offset" parameter is
both input and output. In other it's purely for output.
The function is called from two places in next_beyond_location(). In
both of them "&cp" is used as the argument and in both cases cp is
guaranteed to equal f->current_offset.
Let's just have next_with_matches() ignore "*offset" on input and
operate with f->current_offset.
I did not investigate why it is, but it makes my usual benchmark run
reproducibly faster:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.032s
user 0m3.896s
sys 0m0.135s
(Compare to preceding commit, where real was 4.4s.)
|
|
I accidentally broke the detection of duplicate entries in 7943f42275
"journal: optimize iteration by returning previously found candidate
entry".
When we have a known location of a candidate entry, we must not return
from next_beyond_location() immediately. We must go through the
duplicates detection to make sure the candidate differs from the
already iterated entry.
This fix slows down iteration a bit, but it's still faster than it
was before the rework.
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.448s
user 0m4.298s
sys 0m0.149s
(Compare with results from commit 7943f42275, where real was 5.3s before
the rework.)
|
|
Now that journal_file_next_entry() does not need a pointer to the
current object, next_with_matches() does not need it either.
|
|
The current offset is sufficient information.
|
|
In next_beyond_location() when the JournalFile's location type is
LOCATION_SEEK, it means there's nothing to do, because we already have
the location of the candidate entry. Do an early return. Note that now
next_beyond_location() does not anymore guarantee on return that the
entry is mapped, but previous patches made sure the caller does not
care.
This optimization is at least as good as "journal: optimize iteration:
skip files that cannot improve current candidate entry" was.
Timing results on my workstation, using:
$ time ./journalctl -q --since=2014-06-01 --until=2014-07-01 > /dev/null
Before "Revert "journal: optimize iteration: skip files that cannot
improve current candidate entry":
real 0m5.349s
user 0m5.166s
sys 0m0.181s
Now:
real 0m3.901s
user 0m3.724s
sys 0m0.176s
|
|
If from a previous iteration we know we are at the end of a journal
file, don't bother looking into the file again. This is complicated by
the fact that the EOF does not have to be permanent (think of
"journalctl -f"). So we also check if the number of entries in the
journal file changed.
This optimization has a similar effect as "journal: optimize iteration:
skip whole files behind current location" had.
|
|
offset is redundant, because the caller can rely on f->current_offset.
The object pointer the function saves in *ret is thrown away by the caller.
|
|
The file's current_offset is already updated at this point, so let's use
it.
|
|
When comparing the locations of candidate entries, we can rely on the
location information stored in struct JournalFile.
|
|
set_location() is called from real_journal_next() when a winning entry
has been picked from among the candidates in journal files.
The location type is always set to LOCATION_DISCRETE. No need to pass
it as a parameter.
The per-JournalFile location information is already updated at this
point. No need for having the direction and offset here.
|
|
In next_beyond_location() when we find a candidate entry in a journal
file, save its location information in struct JournalFile.
The purpose of remembering the locations of candidate entries is to be
able to save work in the next iteration. This patch does only the
remembering part.
LOCATION_SEEK means the location identifies a candidate entry.
When a winner is picked from among candidates, it becomes
LOCATION_DISCRETE.
LOCATION_TAIL here signifies we've iterated the file to the end (or the
beginning in the case of reversed direction).
|
|
|
|
This reverts commit b7c88ab8cc7d55a43450bf3dea750f95f2e910d6.
This optimization will be made redundant by the following patches.
|
|
candidate entry"
This reverts commit f8b5a3b75fb55f0acb85c21424b3893c822742e9.
This optimization will be made redundant by the following patches.
|
|
Note that numbers 0 and -1 are both replaced with OBJECT_UNUSED,
because they are treated the same everywhere (e.g. type_to_context()
translates them both to 0).
|
|
The only user is sd_journal_enumerate_unique() and, as explained in
the previous commit (fed67c38e3 "journal: map objects to context set by
caller, not by actual object type"), the use of them there is now
superfluous. Let's remove them.
This reverts major parts of commits:
ae97089d49 journal: fix access to munmapped memory in
sd_journal_enumerate_unique
06cc69d44c sd-journal: fix sd_journal_enumerate_unique skipping values
Tested with an "--enable-debug" build and "journalctl --list-boots".
It gives the expected number of results. Additionally, if I then revert
the previous commit ("journal: map objects to context set by caller, not
to actual object type"), it crashes with SIGSEGV, as expected.
|
|
Let's add some syntactic sugar for iterating through inotify events, and
use it everywhere.
|
|
candidate entry
Suppose that while iterating we have already looked into a journal file
and got a candidate for the next entry. And we are considering to look
into another journal file because it may contain an entry that is nearer
to the current location than the candidate.
We should skip the whole journal file if we can tell by looking at its
header that none of its entries can precede the candidate.
Before:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m20.518s
user 0m19.989s
sys 0m0.328s
After:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m9.445s
user 0m9.228s
sys 0m0.213s
|
|
Interleaving of entries from many journal files is expensive. But there
is room for optimization.
We can skip looking into journal files whose entries all lie before the
current iterating location. We can tell if that's the case from looking
at the journal file header. This saves a huge amount of work if one has
many of mostly not interleaved journal files.
On my workstation with 90 journal files in /var/log/journal/ID/
totalling 3.4 GB I get these results:
Before:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 5m54.258s
user 2m4.263s
sys 3m48.965s
After:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m20.518s
user 0m19.989s
sys 0m0.328s
The high "sys" time in the original was caused by putting more stress on
the mmap-cache than it could handle. With the patch the working set
now consists of fewer mmap windows and mmap-cache is not thrashing.
|
|
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.
Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'
Plus some whitespace, linewrap, and indent adjustments.
|
|
Basically:
find . -name '*.[ch]' | while read f; do perl -i.mmm -e \
'local $/;
local $_=<>;
s/log_(debug|info|notice|warning|error|emergency)\("([^"]*)%s"([^;]*),\s*strerror\(-?([->a-zA-Z_]+)\)\);/log_\1_errno(\4, "\2%m"\3);/gms;print;' \
$f; done
Plus manual indentation fixups.
|
|
It corrrectly handles both positive and negative errno values.
|
|
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
|
|
Anything that uses hashmap_next() almost certainly cares about the order
and needs to be an OrderedHashmap.
|
|
string_is_safe()
After all, we know have this as generic validator, so let's be correct
and use it wherver applicable.
|
|
|
|
sd_journal_enumerate_unique will lock its mmap window to prevent it
from being released by calling mmap_cache_get with keep_always=true.
This call may return windows that are wider, but compatible with the
parameters provided to it.
This can result in a mismatch where the window to be released cannot
properly be selected, because we have more than one window matching the
parameters of mmap_cache_release. Therefore, introduce a release_cookie
to be used when releasing the window.
https://bugs.freedesktop.org/show_bug.cgi?id=79380
|
|
systemctl would call sd_j_enumerate_unique() interleaved with
sd_j_next(). But the latter can remove a file if it detects an
error in it. In those circumstances sd_j_enumerate_unique would
restart with the first file in hashmap. With many corrupted files
sd_j_enumerate_unique might iterate over the list multiple times.
Avoid this by jumping to the next file in unique list if possible,
or setting a flag that tells sd_j_enumerate_unique that it is done
otherwise.
|
|
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.
systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
|
|
If the journal is corrupted, we might return an object that does
not start with the expected field name and/or is shorter than it
should.
|
|
|
|
They have different size on 32 bit, so they are really not interchangable.
|
|
String which ended in an unfinished quote were accepted, potentially
with bad memory accesses.
Reject anything which ends in a unfished quote, or contains
non-whitespace characters right after the closing quote.
_FOREACH_WORD now returns the invalid character in *state. But this return
value is not checked anywhere yet.
Also, make 'word' and 'state' variables const pointers, and rename 'w'
to 'word' in various places. Things are easier to read if the same name
is used consistently.
mbiebl_> am I correct that something like this doesn't work
mbiebl_> ExecStart=/usr/bin/encfs --extpass='/bin/systemd-ask-passwd "Unlock EncFS"'
mbiebl_> systemd seems to strip of the quotes
mbiebl_> systemctl status shows
mbiebl_> ExecStart=/usr/bin/encfs --extpass='/bin/systemd-ask-password Unlock EncFS $RootDir $MountPoint
mbiebl_> which is pretty weird
|
|
Also modify the function itself to be a bit simpler to read.
|
|
|
|
|
|
Add liblz4 as an optional dependency when requested with --enable-lz4,
and use it in preference to liblzma for journal blob and coredump
compression. To retain backwards compatibility, XZ is used to
decompress old blobs.
Things will function correctly only with lz4-119.
Based on the benchmarks found on the web, lz4 seems to be the best
choice for "quick" compressors atm.
For pkg-config status, see http://code.google.com/p/lz4/issues/detail?id=135.
|
|
|
|
No functional change expected :)
|
|
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:
fd = safe_close(fd);
Which will close an fd if it is open, and reset the fd variable
correctly.
By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
|
|
If we encounter an inconsistency in a file, let's just
ignore it. Otherwise, after previous patch, we would try,
and fail, to use this file in every invocation of sd_journal_next
or sd_journal_previous that happens afterwards.
|
|
As pointed-out by clang -Wunreachable-code.
No behaviour changes.
|
|
gcc (4.8.2, arm) does not understand that next_beyond_location() will
always set 'p' when it returns > 0. Initialize p in order to fix this.
|
|
sd_journal_get_cutoff_realtime_usec() on failure
|
|
If -flto is used then gcc will generate a lot more warnings than before,
among them a number of use-without-initialization warnings. Most of them
without are false positives, but let's make them go away, because it
doesn't really matter.
|
|
Resolve spotted issues related to missing or extraneous commas, dashes.
|
|
sd_j_e_u needs to keep a reference to an object while comparing it
with possibly duplicate objects in other files. Because the size of
mmap cache is limited, with enough files and object to compare to,
at some point the object being compared would be munmapped, resulting
in a segmentation fault.
Fix this issue by turning keep_always into a reference count that can
be increased and decreased. Other callers which set keep_always=true
are unmodified: their references are never released but are ignored
when the whole file is closed, which happens at some point. keep_always
is increased in sd_j_e_u and later on released.
|
|
This commit also adds error handling for failures during
directory reading.
|