Age | Commit message (Collapse) | Author |
|
|
|
When we rotate journals, we must set offline and close the current one,
but don't generally need to wait for this to complete.
Instead, we'll initiate an asynchronous offline via
journal_file_set_offline(oldfile, false), and add the file to a
per-server set of deferred closes to be closed later when they
won't block.
There's one complication however; journal_file_open() via
journal_file_verify_header() assumes that any writable journal in the
online state is the product of an unclean shutdown or other form of
corruption.
Thus there's a need for journal_file_open() to be aware of deferred
closes and synchronize with their completion when opening preexisting
journals for writing. To facilitate this the deferred closes set is
supplied to the journal_file_open() function where the deferred closes
may be closed synchronously before verifying the header in such
circumstances.
|
|
This adds a wait flag to journal_file_set_offline(), when false the offline is
performed asynchronously in a separate thread.
When wait is true, if an asynchronous offline is already in-progress it is
restarted and waited for. Otherwise the offline is performed synchronously
without the use of a thread.
journal_file_set_online() cancels or waits for the asynchronous offline to
complete if in-flight, depending on where in the offline process the thread
happens to be. If the thread is in the fsync() phase, it is cancelled and
waiting is unnecessary. Otherwise, the thread is joined before proceeding.
A new offline_state member is added to JournalFile which is used via
atomic operations for communicating between the offline thread and the
journal_file_set_{offline,online}() functions.
|
|
|
|
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
|
|
None of the callers take advantage of this parameter, it's always NULL,
this is just a private helper function to simplify the call sites so
drop the template parameter altogether. If a caller emerges later who
needs it, it can be restored.
|
|
journald: minor fixes
|
|
Whenever we include a log level or facility in a journal string field, make sure the compiler checks for us that that's
actually the right thing to do.
|
|
Journald disk usage
|
|
This primarily contains some minor coding style fixups for 7a24f3bf2fb181243a1957a0cdd54cd919396793 and earlier changes. Specifically:
* Don't log at log levels above LOG_DEBUG from "library" code like journal-file.c
* Don't negate errno values before passing them to log_debug_errno(), as the call can handle this fine anyway
* Cast some calls we knowingly ignore the return values of to (void)
* Don't clobber function call-by-ref return values on failure
* Don't mix function calls and variable declarations in one line
There's also one more relevant change: when failing to enqueue a journal change fs event, we'll run it immediately.
|
|
v2:
- use xsprintf
|
|
journal: coalesce ftruncate()s in 250ms windows
|
|
The format of the journald disk usage log entry was changed back and
forth a few times. It is annoying to have a very verbose message, but
if it is short it is hard to understand. But we have a tool for this,
the catalogue.
$ journalctl -x -u systemd-journald
Jan 23 18:48:50 rawhide systemd-journald[891]: Runtime journal (/run/log/journal/) is 8.0M, max 196.2M, 188.2M free.
-- Subject: Disk space used by the journal
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Runtime journal (/run/log/journal/) is currently using 8.0M.
-- Maximum allowed usage is set to 196.2M.
-- Leaving at least 294.3M free (of currently available 1.9G of disk space).
-- Enforced usage limit is thus 196.2M, of which 188.2M are still available.
--
-- The limits controlling how much disk space is used by the journal may
-- be configured with SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=,
-- RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize= settings in
-- /etc/systemd/journald.conf. See journald.conf(5) for details.
Jan 23 18:48:50 rawhide systemd-journald[891]: System journal (/var/log/journal/) is 480.1M, max 1.6G, 1.2G free.
-- Subject: Disk space used by the journal
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- System journal (/var/log/journal/) is currently using 480.1M.
-- Maximum allowed usage is set to 1.6G.
-- Leaving at least 2.5G free (of currently available 5.8G of disk space).
-- Enforced usage limit is thus 1.6G, of which 1.2G are still available.
--
-- The limits controlling how much disk space is used by the journal may
-- be configured with SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=,
-- RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize= settings in
-- /etc/systemd/journald.conf. See journald.conf(5) for details.
|
|
The code to format the iovec is shared with log.c. All call sites to
server_driver_message are changed to include the additional "MESSAGE="
part, but the new functionality is not used and change in functionality
is not expected.
iovec is preallocated, so the maximum number of messages is limited.
In server_driver_message N_IOVEC_PAYLOAD_FIELDS is currently set to 1.
New code is not oom safe, it will fail if memory cannot be allocated.
This will be fixed in subsequent commit.
|
|
|
|
Prior to this change every journal append causes an ftruncate() for the
sake of inotify propagation of the mmap-based writes.
With this change the notification is deferred up to ~250ms, coalescing
any repeated journal writes during the deferred period into a single
ftruncate(). The ftruncate() call isn't free and doing it on every
append adds unnecessary overhead and latency in the journald event loop.
Introduces journal_file_enable_post_change_timer() which manages a
timer on the provided sd-event instance for scheduling coalesced
ftruncates. The ftruncate() behavior is unchanged unless
journal_file_enable_post_change_timer() is called on the JournalFile.
While not a tremendous improvement, profiling systemd-journald event loop
latencies using instrumentation as introduced by 34b8751 it was observed that
coalescing the ftruncates was low-hanging fruit worth pursuing.
Note orders 12 and 13 shifting left into order 11 and order 6 dipping into
order 5:
Unmodified:
log2(us) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
-----------------------------------------------------------
[10685.414572] 0 0 0 0 38 602 61 2 290 60 1643 2554 13 1 4 1 0 0 1
[10690.415114] 0 0 0 0 0 646 54 7 309 44 2073 2148 17 1 3 0 0 0 1
[10695.415509] 0 0 0 0 1 650 73 3 324 37 2071 2270 9 0 0 1 0 1 0
[10700.416297] 0 0 0 0 0 659 50 4 318 38 2111 2152 6 0 1 0 0 1 1
[10705.417136] 0 0 0 0 2 660 48 4 320 38 2129 2146 12 1 1 0 0 1 1
[10710.489114] 0 0 0 0 0 673 38 3 321 37 1925 2339 7 0 0 0 0 1 1
[10715.489613] 0 0 0 0 3 656 64 8 317 48 2365 2007 7 0 0 0 0 0 1
Coalesced:
log2(us) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
-----------------------------------------------------------
[ 6169.161360] 0 0 0 1 24 786 54 11 389 24 4192 771 6 4 0 0 1 0 1
[ 6174.161705] 0 0 0 1 18 800 35 6 380 27 3977 893 3 1 0 0 1 0 1
[ 6179.162741] 0 0 0 1 28 768 51 4 391 16 3998 831 5 3 0 0 0 0 2
[ 6184.162856] 0 0 0 0 19 770 60 2 376 26 3795 1004 9 5 1 0 1 0 1
[ 6189.163279] 0 0 0 0 28 761 49 7 372 27 3729 1056 3 2 0 0 1 0 1
[ 6194.164255] 0 0 0 0 25 785 49 7 394 19 3996 908 6 3 2 0 0 0 1
[ 6199.164658] 0 0 0 0 29 797 35 5 389 18 3995 898 3 4 1 1 1 0 1
The remaining high-order delays are a result of the synchronous fsyncs in
systemd-journald, beyond the scope of this commit.
|
|
Two unrelated fixes
|
|
Most of the function is moved to acl-util.c to make it possible to
add tests in subsequent commit.
Setting of the mode in server_fix_perms is removed:
- we either just created the file ourselves, and the permission be better right,
- or the file was already there, and we should not modify the permissions.
server_fix_perms is renamed to server_fix_acls to better reflect new
meaning, and made static because it is only used in one file.
|
|
Let's distuingish the cases where our code takes an active role in
selinux management, or just passively reports whatever selinux
properties are set.
mac_selinux_have() now checks whether selinux is around for the passive
stuff, and mac_selinux_use() for the active stuff. The latter checks the
former, plus also checks UID == 0, under the assumption that only when
we run priviliged selinux management really makes sense.
Fixes: #1941
|
|
tree-wide: group include of libudev.h with sd-*
|
|
|
|
|
|
|
|
Sort the includes accoding to the new coding style.
|
|
Adding 3/4th of the watchdog frequency as accuracy on top of 1/2 of the
watchdog frequency means we might end up at 5/4th of the frequency which
means we might miss the message from time to time.
Maybe fixes #1804
|
|
Previously, we'd rely on the mtime timestamps of the touch files to see
if our sync/rotation requests were already suppressed. This means we
rely on CLOCK_REALTIME timestamps. With this patch we instead store the
CLOCK_MONOTONIC timestamp *in* the touch files, and avoid relying on
mtime.
This should make things more reliable when the clock or underlying mtime
granularity is not very good.
This also adds warning messages if writing any of the flag files fails.
|
|
Of course, ideally we'd just use normal synchronous bus calls, but this
is out of the question as long as we rely on dbus-daemon (which logs to
journald, and thus cannot use to avoid cyclic sync loops). Hence,
instead, reuse the wait logic already implemented for --sync, and use a
signal in one direction, and a mtime watch file for the reply.
|
|
With this new "--sync" switch we add a synchronous way to sync
everything queued to disk, and return only after that's complete. This
command gives the guarantee that anything queued before has hit the disk
before the command returns.
While we are at it, also improve the man pages and help text for
journalctl a bit.
|
|
The event might be flagged with stuff we don't expect, hence don't
be needlessly picky, just rely on the kernel passing us sensible events.
|
|
Let's make sure to process all queued log data before exiting, so that
we don't unnecessary lose messages when shutting down.
https://github.com/systemd/systemd/pull/1812#issuecomment-155149871
|
|
The macro is generically useful for putting together search paths, hence
let's make it truly generic, by dropping the implicit ".d" appending it
does, and leave that to the caller. Also rename it from
CONF_DIRS_NULSTR() to CONF_PATHS_NULSTR(), since it's not strictly about
dirs that way, but any kind of file system path.
Also, mark CONF_DIR_SPLIT_USR() as internal macro by renaming it to
_CONF_PATHS_SPLIT_USR() so that the leading underscore indicates that
it's internal.
|
|
Our functions return negative error codes.
Do not rely on errno being set after calling our own functions.
|
|
|
|
Otherwise we might run into deadlocks, when journald blocks on the
notify socket on PID 1, and PID 1 blocks on IPC to dbus-daemon and
dbus-daemon blocks on logging to journald. Break this cycle by making
sure that journald never ever blocks on PID 1.
Note that this change disables support for event loop watchdog support,
as these messages are sent in blocking style by sd-event. That should
not be a big loss though, as people reported frequent problems with the
watchdog hitting journald on excessively slow IO.
Fixes: #1505.
|
|
|
|
|
|
|
|
capability-util.[ch]
The files are named too generically, so that they might conflict with
the upstream project headers. Hence, let's add a "-util" suffix, to
clarify that this are just our utility headers and not any official
upstream headers.
|
|
|
|
|
|
Also, move a couple of more path-related functions to path-util.c.
|
|
|
|
|
|
journald-server: port to extract_first_word
|
|
|
|
There are more than enough to deserve their own .c file, hence move them
over.
|
|
string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
|
|
|
|
|
|
Make journald audit socket maskable
|