path: root/src/journal/journal-file.c
Age | Commit message | Author
2016-01-23Merge pull request #2318 from vcaputo/coalesce-ftruncates-reduxZbigniew Jędrzejewski-Szmek
journal: coalesce ftruncate()s in 250ms windows
2016-01-14journal: coalesce ftruncate()s in 250ms windowsVito Caputo
Prior to this change every journal append causes an ftruncate() for the sake of inotify propagation of the mmap-based writes. With this change the notification is deferred up to ~250ms, coalescing any repeated journal writes during the deferred period into a single ftruncate().

The ftruncate() call isn't free and doing it on every append adds unnecessary overhead and latency in the journald event loop.

Introduces journal_file_enable_post_change_timer() which manages a timer on the provided sd-event instance for scheduling coalesced ftruncates. The ftruncate() behavior is unchanged unless journal_file_enable_post_change_timer() is called on the JournalFile.

While not a tremendous improvement, profiling systemd-journald event loop latencies using instrumentation as introduced by 34b8751 it was observed that coalescing the ftruncates was low-hanging fruit worth pursuing.

Note orders 12 and 13 shifting left into order 11 and order 6 dipping into order 5:

Unmodified:
log2(us)          1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19
-------------------------------------------------------------------------------------------------------------
[10685.414572]    0    0    0    0   38  602   61    2  290   60 1643 2554   13    1    4    1    0    0    1
[10690.415114]    0    0    0    0    0  646   54    7  309   44 2073 2148   17    1    3    0    0    0    1
[10695.415509]    0    0    0    0    1  650   73    3  324   37 2071 2270    9    0    0    1    0    1    0
[10700.416297]    0    0    0    0    0  659   50    4  318   38 2111 2152    6    0    1    0    0    1    1
[10705.417136]    0    0    0    0    2  660   48    4  320   38 2129 2146   12    1    1    0    0    1    1
[10710.489114]    0    0    0    0    0  673   38    3  321   37 1925 2339    7    0    0    0    0    1    1
[10715.489613]    0    0    0    0    3  656   64    8  317   48 2365 2007    7    0    0    0    0    0    1

Coalesced:
log2(us)          1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19
-------------------------------------------------------------------------------------------------------------
[ 6169.161360]    0    0    0    1   24  786   54   11  389   24 4192  771    6    4    0    0    1    0    1
[ 6174.161705]    0    0    0    1   18  800   35    6  380   27 3977  893    3    1    0    0    1    0    1
[ 6179.162741]    0    0    0    1   28  768   51    4  391   16 3998  831    5    3    0    0    0    0    2
[ 6184.162856]    0    0    0    0   19  770   60    2  376   26 3795 1004    9    5    1    0    1    0    1
[ 6189.163279]    0    0    0    0   28  761   49    7  372   27 3729 1056    3    2    0    0    1    0    1
[ 6194.164255]    0    0    0    0   25  785   49    7  394   19 3996  908    6    3    2    0    0    0    1
[ 6199.164658]    0    0    0    0   29  797   35    5  389   18 3995  898    3    4    1    1    1    0    1

The remaining high-order delays are a result of the synchronous fsyncs in systemd-journald, beyond the scope of this commit.
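The deferral described in this commit message can be sketched without sd-event at all. The following is a minimal model, assuming a caller-supplied monotonic clock in microseconds; CoalesceState, post_change_deferred() and post_change_dispatch() are hypothetical names for illustration and are not the functions added by the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The ~250ms window mentioned in the commit message, in microseconds. */
#define POST_CHANGE_DELAY_USEC (250 * 1000ULL)

/* Hypothetical per-file state for this sketch; not journald's JournalFile. */
typedef struct {
        bool timer_armed;          /* a deferred notification is pending */
        uint64_t deadline_usec;    /* when the pending notification fires */
        unsigned n_notifications;  /* how often we would have called ftruncate() */
} CoalesceState;

/* Called after every append: arm the timer only if none is pending. */
static void post_change_deferred(CoalesceState *s, uint64_t now_usec) {
        assert(s);

        if (s->timer_armed)
                return; /* already scheduled, so this append is coalesced */

        s->timer_armed = true;
        s->deadline_usec = now_usec + POST_CHANGE_DELAY_USEC;
}

/* Called from the event loop: fires at most once per armed window. */
static bool post_change_dispatch(CoalesceState *s, uint64_t now_usec) {
        assert(s);

        if (!s->timer_armed || now_usec < s->deadline_usec)
                return false;

        s->timer_armed = false;
        s->n_notifications++; /* real code would issue the ftruncate() here */
        return true;
}
```

In this model any number of appends inside one window produces a single notification, which is the effect the histograms above attribute to the change.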
2015-12-23Merge pull request #2158 from keszybz/journal-decompressionLennart Poettering
Journal decompression fixes
2015-12-13journal: add dst_allocated_size parameter for compress_blobZbigniew Jędrzejewski-Szmek
compress_blob took src, src_size, dst and *dst_size, but dst_size wasn't used as an input parameter with the size of dst, but only as an output parameter. dst was implicitly assumed to be at least src_size-1. This code wasn't *wrong*, because the only real caller in journal-file.c got it right. But it was misleading, and the tests in test-compress.c got it wrong, and worked only because the output buffer happened to be the same size as the input buffer. So add a separate dst_allocated_size parameter to make it explicit what the size of the buffer is, and to allow tests to proceed with different output buffer sizes.
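To illustrate the calling convention only, here is a sketch with a toy run-length encoder. toy_compress_blob() is a hypothetical function, not the compress_blob() from the source tree, and the -ENOBUFS return value is an assumption of this sketch. The point is that the capacity check reads dst_allocated_size, while dst_size is written and never read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy run-length encoder emitting (count, byte) pairs. It is NOT the
 * compressor journald uses; it only demonstrates the parameter layout. */
static int toy_compress_blob(const uint8_t *src, size_t src_size,
                             uint8_t *dst, size_t dst_allocated_size,
                             size_t *dst_size) {
        size_t in = 0, out = 0;

        assert(src || src_size == 0);
        assert(dst);
        assert(dst_size);

        while (in < src_size) {
                uint8_t byte = src[in];
                size_t run = 1;

                while (in + run < src_size && src[in + run] == byte && run < 255)
                        run++;

                /* Capacity comes from the explicit buffer size, never from src_size. */
                if (dst_allocated_size - out < 2)
                        return -ENOBUFS;

                dst[out++] = (uint8_t) run;
                dst[out++] = byte;
                in += run;
        }

        *dst_size = out; /* purely an output parameter */
        return 0;
}
```

With this shape a test can pass an output buffer smaller or larger than the input and observe a clean error or a correct size, rather than relying on both buffers happening to match.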
2015-12-10journal: make mmap_cache_unref() a NOP when NULL is passed, like all other destructorsLennart Poettering
2015-11-06journal: reduce minimum journal file size to 512 KiBMichael Olbrich
For low end embedded systems 4 MiB for each journal file is a lot of memory. Journald will use at least 512 KiB even if JOURNAL_FILE_SIZE_MIN is set to less than that so just use 512 KiB.
2015-11-03journal: return better error for empty filesZbigniew Jędrzejewski-Szmek
When reading stuff, we should only return EIO when an actual read error occurred, not when we don't like the data for whatever reason. We already return ENODATA for all other kinds of file truncation, hence do the same for the most obvious kind, so that callers know what ENODATA means.
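A small sketch of that distinction, assuming the caller has just tried to read a fixed-size header. toy_classify_header_read() and its parameters are hypothetical and do not mirror the actual code in journal-file.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* n_read is the return value of a read of the header; header_size is the
 * number of bytes a valid file must at least contain (assumed by caller). */
static int toy_classify_header_read(ssize_t n_read, size_t header_size) {
        if (n_read < 0)
                return -EIO;     /* the read itself failed */

        if ((size_t) n_read < header_size)
                return -ENODATA; /* empty or truncated file: I/O worked, data is missing */

        return 0;
}
```

An empty file then reports the same error as any other truncated file, so a caller can treat ENODATA uniformly.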
2015-10-27util-lib: split out allocation calls into alloc-util.[ch]Lennart Poettering
2015-10-27util-lib: split out file attribute calls to chattr-util.[ch]Lennart Poettering
2015-10-27util-lib: split xattr-related calls into xattr-util.[ch]Lennart Poettering
2015-10-27util-lib: split string parsing related calls from util.[ch] into parse-util.[ch]Lennart Poettering
2015-10-25Merge pull request #1654 from poettering/util-libTom Gundersen
Various changes to src/basic/
2015-10-25util-lib: split out fd-related operations into fd-util.[ch]Lennart Poettering
There are more than enough to deserve their own .c file, hence move them over.
2015-10-24util-lib: split our string related calls from util.[ch] into its own file string-util.[ch]Lennart Poettering
There are more than enough calls doing string manipulations to deserve its own files, hence do something about it. This patch also sorts the #include blocks of all files that needed to be updated, according to the sorting suggestions from CODING_STYLE. Since pretty much every file needs our string manipulation functions this effectively means that most files have sorted #include blocks now. Also touches a few unrelated include files.
2015-10-24journal: irrelevant coding style fixesLennart Poettering
2015-10-24journal: fix error handling when compressing journal objectsLennart Poettering
Let's make sure we handle compression errors properly, and don't misunderstand an error for success. Also, let's actually compress things if lz4 is enabled. Fixes #1662.
2015-10-02journal: rework vacuuming logicLennart Poettering
Implement a maximum limit on the number of journal files to keep around. Enforcing such a limit is useful since our performance when viewing pays a heavy penalty for each journal file to interleave. This setting is turned on now by default, and set to 100. Also, actually implement what 348ced909724a1331b85d57aede80a102a00e428 promised: use whatever we find on disk at startup as lower bound on how much disk space we can use. That commit introduced some provisions to implement this, but actually never did. This also adds "journalctl --vacuum-files=" to vacuum files on disk by their number explicitly.
2015-10-02journal: improve some messagesLennart Poettering
Indicate that we are ignoring errors, when we ignore them, and log that at LOG_WARNING level. Use the right error code for the log message.
2015-10-02journal: simplify things by using the LESS_BY() macroLennart Poettering
2015-10-02journal: make journal_file_close() return NULLLennart Poettering
The way it is customary everywhere else in our sources.
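The destructor convention referred to here (tolerate NULL, always return NULL) can be sketched as follows. ToyFile and its functions are hypothetical stand-ins, not the JournalFile API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object for this sketch. */
typedef struct ToyFile {
        char *path;
} ToyFile;

static ToyFile *toy_file_new(void) {
        return calloc(1, sizeof(ToyFile));
}

/* NULL-tolerant destructor that always returns NULL, so callers can write
 * "f = toy_file_close(f);" and never keep a dangling pointer around. */
static ToyFile *toy_file_close(ToyFile *f) {
        if (!f)
                return NULL;

        free(f->path); /* free(NULL) is a no-op, so no check is needed */
        free(f);
        return NULL;
}
```

Assigning the return value back to the variable makes double-close harmless, since the second call sees NULL and does nothing.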
2015-09-10tree-wide: never use the off_t unless glibc makes us use itLennart Poettering
off_t is a really weird type as it is usually 64bit these days (at least in sane programs), but could theoretically be 32bit. We don't support off_t as 32bit builds though, but still constantly deal with safely converting from off_t to other types and back for no point. Hence, never use the type anymore. Always use uint64_t instead. This has various benefits, including that we can expose these values directly as D-Bus properties, and also that the values parse the same in all cases.
2015-08-17Bug #944: Deletion of unnecessary checks before calls of the function "free"Markus Elfring
The function "free" is documented in the way that no action shall occur for a passed null pointer. It is therefore not needed that a function caller repeats a corresponding check. http://stackoverflow.com/questions/18775608/free-a-null-pointer-anyway-or-check-first This issue was fixed by using the software Coccinelle 1.0.1.
2015-07-24journal: avoid mapping empty data and field hash tablesLennart Poettering
When a new journal file is created we write the header first, then sync and only then create the data and field hash tables in it. That means to other processes it might appear that the files have a valid header but no data and field hash tables. Our reader code should be able to deal with this. With this change we'll not map the two hash tables right away after opening a file for reading anymore (because that will of course fail if the objects are missing), but delay this until the first time we access them. On top of that, when we want to look something up in the hash tables and we notice they aren't initialized yet, we consider them empty. This improves handling of some journal files reported in #487.
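A minimal sketch of the "map lazily, treat missing as empty" behavior, assuming the table is a flat array of hashes. ToyReader, toy_map_table_if_needed() and toy_lookup() are hypothetical and much simpler than the real hash table objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
        const uint64_t *on_disk_table; /* NULL until the writer has created the table */
        size_t on_disk_n;
        const uint64_t *mapped_table;  /* set lazily, on the first lookup that finds a table */
        size_t mapped_n;
} ToyReader;

/* Map the table on first use; return false if it does not exist yet. */
static bool toy_map_table_if_needed(ToyReader *r) {
        assert(r);

        if (r->mapped_table)
                return true;
        if (!r->on_disk_table || r->on_disk_n == 0)
                return false;

        r->mapped_table = r->on_disk_table;
        r->mapped_n = r->on_disk_n;
        return true;
}

/* Returns 1 if the hash is present, 0 otherwise. A missing table is
 * treated as an empty one instead of being reported as an error. */
static int toy_lookup(ToyReader *r, uint64_t hash) {
        if (!toy_map_table_if_needed(r))
                return 0;

        for (size_t i = 0; i < r->mapped_n; i++)
                if (r->mapped_table[i] == hash)
                        return 1;

        return 0;
}
```

Opening the file no longer fails when the tables are absent, and a later lookup picks them up once the writer has created them.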
2015-04-22journal: don't force FS_NOCOW_FL on new journal files, but warn if it is missingLennart Poettering
This way users have the freedom to set or unset the FS_NOCOW_FL flag on their journal files by setting it on the journal directory. Since our default tmpfiles configuration now sets this flag on the directory the flag is set by default on new files, however people can opt-out of this by masking the tmpfiles file for it.
2015-04-11shared: add random-util.[ch]Ronny Chevalier
2015-04-08util: merge change_attr_fd() and chattr_fd()Lennart Poettering
2015-03-27fix gcc warnings about uninitialized variablesHarald Hoyer
like:

src/shared/install.c: In function ‘unit_file_lookup_state’:
src/shared/install.c:1861:16: warning: ‘r’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         return r < 0 ? r : state;
                ^
src/shared/install.c:1796:13: note: ‘r’ was declared here
         int r;
             ^
2015-03-09journal: fix return codeZbigniew Jędrzejewski-Szmek
Introduced in fa6ac76083b8ff. Might be related to CID #1261724, but I don't know if coverity can recurse this deep.
2015-03-09journal-file: update format string to remove castZbigniew Jędrzejewski-Szmek
2015-03-09journal: align comments to make them more legibleZbigniew Jędrzejewski-Szmek
2015-03-02journal: fix Inappropriate ioctl for device on ext4Cristian Rodríguez
Logs constantly show

systemd-journald[395]: Failed to set file attributes: Inappropriate ioctl for device

This is because ext4 does not support FS_NOCOW_FL.

[zj: fold into one conditional as suggested on the ML and fix (preexisting) r/errno confusion in error message.]
2015-02-25journal: make skipping of exhausted journal files effective againMichal Schmidt
Commit 668c965af "journal: skipping of exhausted journal files is bad if direction changed" fixed a correctness issue, but it also significantly limited the cases where the optimization that skips exhausted journal files could apply. As a result, some journalctl queries are much slower in v219 than in v218. (e.g. queries where a "--since" cutoff should have quickly eliminated older journal files from consideration, but didn't.)

If already in the initial iteration find_location_with_matches() finds no entry, the journal file's location is not updated. This is fine, except that:

- We must update at least f->last_direction. The optimization relies on it. Let's separate that from journal_file_save_location() and update it immediately after the direction checks.
- The optimization was conditional on "f->current_offset > 0", but it would always be 0 in this scenario. This check is unnecessary for the optimization.
2015-02-23remove unused includesThomas Hindoe Paaboel Andersen
This patch removes includes that are not used. The removals were found with include-what-you-use which checks if any of the symbols from a header is in use.
2015-02-10journald: don't specify inline in local functionsLennart Poettering
Leave it to the compiler to figure out whether it shall inline stuff or not. The only place where using static inline is OK is in header files, really.
2015-01-22Fix some format strings for enums, they are signedZbigniew Jędrzejewski-Szmek
2015-01-08util: make it easy to initialize the crtime from the current time in fd_setcrtime()Lennart Poettering
2015-01-08journald: turn off COW for journal files on btrfsLennart Poettering
btrfs' COW logic results in heavily fragmented journal files, which is detrimental for performance. Hence, turn off COW for journal files as we create them. Turning off COW comes at the cost of data integrity guarantees, but this should be acceptable, given that we do our own checksumming, and generally have a pretty conservative write pattern. Also see discussion on linux-btrfs: http://www.spinics.net/lists/linux-btrfs/msg41001.html
2015-01-06journal: consider file deletion errors a reason for rotationLennart Poettering
2015-01-06journald: whenever we rotate a file, btrfs defrag itLennart Poettering
Our write pattern is quite awful for CoW file systems (btrfs...), as we keep updating file parts in the beginning of the file. This results in fragmented journal files. Hence: when rotating files, defragment them, since at that point we know that no further write accesses will be made.
2015-01-05journald: when we detect the journal file we are about to write to has been deleted, rotateLennart Poettering
https://bugzilla.redhat.com/show_bug.cgi?id=1171719
2015-01-05journald: add some additional checks before we divide by values read from journal file headersLennart Poettering
Since the file headers might be replaced by zeroed pages now due to sigbus we should make sure we don't end up dividing by zero because we don't check values read from journal file headers for changes.
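The kind of guard this describes can be sketched as below. toy_hash_bucket() is a hypothetical helper, and returning -EBADMSG for a zero divisor is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* n_buckets stands for a value read from a memory-mapped file header. If
 * the page was replaced by zeroes after a SIGBUS, it may now read as 0. */
static int toy_hash_bucket(uint64_t hash, uint64_t n_buckets, uint64_t *ret) {
        assert(ret);

        if (n_buckets == 0)
                return -EBADMSG; /* refuse to divide by a zeroed header field */

        *ret = hash % n_buckets;
        return 0;
}
```

The check has to happen at each use, because the header value can change between the time the file was opened and the time the division runs.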
2015-01-05journald: process SIGBUS for the memory maps we set upLennart Poettering
Even though we use fallocate() it appears that file systems like btrfs will trigger SIGBUS in certain low-disk-space situations. We should handle that, hence catch the signal, add it to a list of invalidated pages, and replace the page with an empty memory area. After each write check if SIGBUS was triggered, and consider the write invalid if it was. This should make journald a lot more robust with file systems where fallocate() is not reliable, for example all CoW file systems (btrfs...), where changing written data can fail with disk full errors. https://bugzilla.redhat.com/show_bug.cgi?id=1045810
2014-12-24util: make creation time xattr logic more genericLennart Poettering
2014-12-18journal: journal_file_next_entry() does not need pointer to current ObjectMichal Schmidt
The current offset is sufficient information.
2014-12-18journal: compare candidate entries using JournalFiles' locationsMichal Schmidt
When comparing the locations of candidate entries, we can rely on the location information stored in struct JournalFile.
2014-12-18journal: keep per-JournalFile location info during iterationMichal Schmidt
In next_beyond_location() when we find a candidate entry in a journal file, save its location information in struct JournalFile. The purpose of remembering the locations of candidate entries is to be able to save work in the next iteration. This patch does only the remembering part. LOCATION_SEEK means the location identifies a candidate entry. When a winner is picked from among candidates, it becomes LOCATION_DISCRETE. LOCATION_TAIL here signifies we've iterated the file to the end (or the beginning in the case of reversed direction).
2014-12-18journal: abstract the resetting of JournalFile's locationMichal Schmidt
2014-12-18journal: delete unused function journal_file_skip_entry()Michal Schmidt
Its only caller is a test.
2014-12-18journal: delete unused function journal_file_move_to_entry_by_offset()Michal Schmidt
2014-12-13journal: replace contexts hashmap with a plain arrayMichal Schmidt
try_context() is such a hot path that the hashmap lookup is expensive. The number of contexts is small - it is the number of object types. Using a hashmap is overkill. A plain array will do.

Before:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m9.445s
user 0m9.228s
sys 0m0.213s

After:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m5.438s
user 0m5.266s
sys 0m0.170s
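The idea of replacing a keyed lookup with direct indexing can be sketched like this. The enum values, ToyContext and ToyCache are hypothetical and far smaller than the real mmap cache structures; the sketch only assumes that the key space is a small, dense set of object types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced set of object types for this sketch. */
typedef enum {
        TOY_OBJECT_DATA,
        TOY_OBJECT_FIELD,
        TOY_OBJECT_ENTRY,
        TOY_OBJECT_TYPE_MAX
} ToyObjectType;

typedef struct {
        unsigned n_windows; /* placeholder payload */
} ToyContext;

typedef struct {
        /* One slot per object type: lookup is a bounds check plus an index,
         * with no hashing and no bucket traversal. */
        ToyContext contexts[TOY_OBJECT_TYPE_MAX];
} ToyCache;

static ToyContext *toy_context_get(ToyCache *c, ToyObjectType t) {
        assert(c);

        if ((unsigned) t >= TOY_OBJECT_TYPE_MAX)
                return NULL;

        return &c->contexts[t];
}
```

The array costs a fixed, small amount of memory per cache, which is the trade the commit message accepts in exchange for a cheaper hot path.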