Age | Commit message (Collapse) | Author |
|
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.
The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.
The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.
The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.
Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.
This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().
PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.
A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.
Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
|
|
|
|
We are already attaching the system slice information to log messages, now add
theuser slice info too, as well as the object slice info.
|
|
journal_rate_limit_test() (#4291)
Currently, the ratelimit does not handle the number of suppressed messages accurately.
Even though the number of messages reaches the limit, it still allows to add one extra messages to journal.
This patch fixes the problem.
|
|
When s->length is zero this function doesn't do anything, note that in a
comment.
|
|
This patch fixes wrong calculation of burst_modulate(), which now calculates
the values smaller than really expected ones if available disk space is
strictly more than 1MB.
In particular, if available disk space is strictly more than 1MB and strictly
less than 16MB, the resulted value becomes smaller than its original one.
>>> (math.log2(1*1024**2)-16) / 4
1.0
>>> (math.log2(16*1024**2)-16) / 4
2.0
>>> (math.log2(256*1024**2)-16) / 4
3.0
→ This matches the comment in the function.
|
|
Since commit 5996c7c295e073ce21d41305169132c8aa993ad0 (v190 !), the
calculation of the HMAC is broken because the hash for a data object
including a field is done in the wrong order: the field object is
hashed before the data object is.
However during verification, the hash is done in the opposite order as
objects are scanned sequentially.
|
|
We shouldn't silently fail when appending the tag to a journal file
since FSS protection will simply be disabled in this case.
|
|
|
|
Network file dropins
|
|
In preparation for adding a version which takes a strv.
|
|
The key used to be jammed next to the local file path. Based on the format string on line 1675, I determined that the order of arguments was written incorrectly, and updated the function based on that assumption.
Before:
```
Please write down the following secret verification key. It should be stored
at a safe location and should not be saved locally on disk.
/var/log/journal/9b47c1a5b339412887a197b7654673a7/fss8f66d6-f0a998-f782d0-1fe522/18fdb8-35a4e900
The sealing key is automatically changed every 15min.
```
After:
```
Please write down the following secret verification key. It should be stored
at a safe location and should not be saved locally on disk.
d53ed4-cc43d6-284e10-8f0324/18fdb8-35a4e900
The sealing key is automatically changed every 15min.
```
|
|
Strerror removal and other janitorial cleanups
|
|
|
|
|
|
According to its manual page, flags given to mkostemp(3) shouldn't include
O_RDWR, O_CREAT or O_EXCL flags as these are always included. Beyond
those, the only flag that all callers (except a few tests where it
probably doesn't matter) use is O_CLOEXEC, so set that unconditionally.
|
|
Minor cleanup suggested by Lennart.
|
|
When the system journal becomes re-opened post-flush with the runtime
journal open, it implies we've recovered from something like an ENOSPC
situation where the system journal rotate had failed, leaving the system
journal closed, causing the runtime journal to be opened post-flush.
For the duration of the unavailable system journal, we log to the
runtime journal. But when the system journal gets opened (space made
available, for example), we need to close the runtime journal before new
journal writes will go to the system journal. Calling
server_flush_to_var() after opening the system journal with a runtime
journal present, post-flush, achieves this while preserving the runtime
journal's contents in the system journal.
The combination of the present flushed flag file and the runtime journal
being open is a state where we should be logging to the system journal,
so it's appropriate to resume doing so once we've successfully opened
the system journal.
|
|
Dynamic users should be treated like system users, and their logs
should end up in the main system journal.
|
|
Make journalctl more flexible
|
|
If journals get into a closed state like when rotate fails due to
ENOSPC, when space is made available it currently goes unnoticed leaving
the journals in a closed state indefinitely.
By calling system_journal_open() on entry to find_journal() we ensure
the journal has been opened/created if possible.
Also moved system_journal_open() up to after open_journal(), before
find_journal().
Fixes https://github.com/systemd/systemd/issues/3968
|
|
It is useful to look at a (possibly inactive) container or other os tree
with --root=/path/to/container. This is similar to specifying
--directory=/path/to/container/var/log/journal --directory=/path/to/container/run/systemd/journal
(if using --directory multiple times was allowed), but doesn't require
as much typing.
|
|
The directory argument that is given to sd_j_o_d was ignored when
SD_JOURNAL_OS_ROOT was given, and directories relative to the root of the host
file system were used. With that flag, sd_j_o_d should do the same as
sd_j_open_container: use the path as "prefix", i.e. the directory relative to
which everything happens.
Instead of touching sd_j_o_d, journal_new is fixed to do what sd_j_o_c
was doing, and treat the specified path as prefix when SD_JOURNAL_OS_ROOT is
specified.
|
|
There is no reason not to. This makes journalctl -D ... --system work,
useful for example when viewing files from a deactivated container.
|
|
… in preparation for future changes.
|
|
the /) (#3934)
Fixes #3927.
|
|
Fixed (master) versions of libtool pass -fsanitize=address correctly
into CFLAGS and LDFLAGS allowing ASAN to be used without any special
configure tricks..however ASAN triggers in lookup3.c for the same
reasons valgrind does. take the alternative codepath if
__SANITIZE_ADDRESS__ is defined as well.
|
|
Beef up the existing var_tmp() call, rename it to var_tmp_dir() and add a
matching tmp_dir() call (the former looks for the place for /var/tmp, the
latter for /tmp).
Both calls check $TMPDIR, $TEMP, $TMP, following the algorithm Python3 uses.
All dirs are validated before use. secure_getenv() is used in order to limite
exposure in suid binaries.
This also ports a couple of users over to these new APIs.
The var_tmp() return parameter is changed from an allocated buffer the caller
will own to a const string either pointing into environ[], or into a static
const buffer. Given that environ[] is mostly considered constant (and this is
exposed in the very well-known getenv() call), this should be OK behaviour and
allows us to avoid memory allocations in most cases.
Note that $TMPDIR and friends override both /var/tmp and /tmp usage if set.
|
|
|
|
…since 4de282cf9324ab.
|
|
In this mode, messages from processes which are not part of the session
land in the main journal file, and only output of processes which are
properly part of the session land in the user's journal. This is
confusing, in particular because systemd-coredump runs outside of the
login session.
"Deprecate" SplitMode=login by removing it from documentation, to
discourage people from using it.
|
|
It's a bit easier to read because shorter. Also, most likely a tiny bit faster.
|
|
Let's make sure our logging APIs is in sync with how stdout/stderr logging
works.
|
|
When converting log messages from human readable text into binary records to
send off to journald in sd_journal_print(), strip trailing whitespace in the
log message. This way, handling of logs made via syslog(), stdout/stderr and
sd_journal_print() are treated the same way: trailing (but not leading)
whitespace is automatically removed, in particular \n and \r. Note that in case
of syslog() and stdout/stderr based logging the stripping takes place
server-side though, while for the native protocol based transport this takes
place client-side. This is because in the former cases conversion from
free-form human-readable strings into structured, binary log records takes
place on the server-side while for journal-native logging it happens on the
client side, and after conversion into binary records we probably shouldn't
alter the data anymore.
See: #3416
|
|
https://github.com/SELinuxProject/selinux/commit/9eb9c9327563014ad6a807814e7975424642d5b9
deprecated selinux_context_t. Replace with a simple char* everywhere.
Alternative fix for #3719.
|
|
|
|
journalctl: Use env variable TMPDIR to save temporary files
|
|
--boot=0 magically meant "this boot", but when used with --file/--directory it
should simply refer to the last boot found in the specified journal. This way,
--boot and --list-boots are consistent.
Fixes #3603.
|
|
Those are just local variables and ref_boot_offset is especially
obnoxious.
|
|
Before --this-boot was deprecated in a331b5e6d47243, it did not take
any arguments.
|
|
It works mostly fine, and can be quite useful to examine data from another
system.
OTOH, a single boot id doesn't make sense with --merge, so mixing with --merge
is still not allowed.
|
|
variables
|
|
|
|
Let's make sure SYSTEMD_COLORS is honour by more tools
|
|
No need to check whether alloca() failed...
|
|
The macro determines the right length of a AF_UNIX "struct sockaddr_un" to pass to
connect() or bind(). It automatically figures out if the socket refers to an
abstract namespace socket, or a socket in the file system, and properly handles
the full length of the path field.
This macro is not only safer, but also simpler to use, than the usual
offsetof() + strlen() logic.
|
|
|
|
chattr_path() takes two bitmasks, and no booleans. Fix the various invocations
to do this properly.
|
|
We generally follow the rule that for time settings we suffix the setting name
with "Sec" to indicate the default unit if none is specified. The only
exception was the rate limiting interval settings. Fix this, and keep the old
names for compatibility.
Do the same for journald's RateLimitInterval= setting
|
|
As suggested by:
https://github.com/systemd/systemd/pull/3126#discussion_r61125474
|