Age | Commit message (Collapse) | Author |
|
Fixes #1188.
|
|
Let's store the invocation ID in the per-service keyring as a root-owned key,
with strict access rights. This has the advantage over the environment-based ID
passing that it also works from SUID binaries (as the key cannot be overridden
by unprivileged code starting them), in contrast to the secure_getenv() based
mode.
The invocation ID is now passed in three different ways to a service:
- As environment variable $INVOCATION_ID. This is easy to use, but may be
overridden by unprivileged code (which might be a bad or a good thing), which
means it's incompatible with SUID code (see above).
- As extended attribute on the service cgroup. This cannot be overridden by
unprivileged code, and may be queried safely from "outside" of a service.
However, it is incompatible with containers right now, as unprivileged
containers generally cannot set xattrs on cgroupfs.
- As "invocation_id" key in the kernel keyring. This has the benefit that the
key cannot be changed by unprivileged service code, and thus is safe to
access from SUID code (see above). But do note that service code can replace
the session keyring with a fresh one that lacks the key. However in that case
the key will not be owned by root, which is easily detectable. The keyring is
also incompatible with containers right now, as it is not properly namespace
aware (but this is being worked on), and thus most container managers mask
the keyring-related system calls.
Ideally we'd only have one way to pass the invocation ID, but the different
ways all have limitations. The invocation ID hookup in journald is currently
only available on the host but not in containers, due to the mentioned
limitations.
How to verify the new invocation ID in the keyring:
# systemd-run -t /bin/sh
Running as unit: run-rd917366c04f847b480d486017f7239d6.service
Press ^] three times within 1s to disconnect TTY.
# keyctl show
Session Keyring
680208392 --alswrv 0 0 keyring: _ses
250926536 ----s-rv 0 0 \_ user: invocation_id
# keyctl request user invocation_id
250926536
# keyctl read 250926536
16 bytes of data in key:
9c96317c ac64495a a42b9cd7 4f3ff96b
# echo $INVOCATION_ID
9c96317cac64495aa42b9cd74f3ff96b
# ^D
This creates a new transient service running a shell. It then shows the
contents of the keyring, requests the invocation ID key, and reads its payload.
For comparison, the invocation ID as passed via the environment variable is also
displayed.
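The comparison between the keyring payload and the environment variable can also be done mechanically. The following is a minimal sketch, not part of the commit: the helper name keyctl_hex_to_id() is hypothetical, and the two sample strings are copied from the transcript above.

```python
def keyctl_hex_to_id(dump: str) -> str:
    """Join the space-separated hex groups printed by `keyctl read`
    into one 32-digit lowercase hex string (hypothetical helper)."""
    hex_str = "".join(dump.split()).lower()
    if len(hex_str) != 32:
        raise ValueError("expected 16 bytes, i.e. 32 hex digits")
    bytes.fromhex(hex_str)  # raises ValueError if the input is not hex
    return hex_str


# Values copied from the transcript above.
keyring_payload = "9c96317c ac64495a a42b9cd7 4f3ff96b"
environment_id = "9c96317cac64495aa42b9cd74f3ff96b"

# True when both transports carry the same invocation ID.
ids_match = keyctl_hex_to_id(keyring_payload) == environment_id
```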
|
|
|
|
Udev property ordering
|
|
|
|
Add new "khash" API and add new sd_id128_get_machine_app_specific() function
|
|
Let's use chase_symlinks() everywhere, and stop using GNU
canonicalize_file_name() everywhere. For most cases this should not change
behaviour, but it increases exposure of our function so that it gets better tested. Most
importantly in a few cases (most notably nspawn) it can take the correct root
directory into account when chasing symlinks.
|
|
We have only two callers, and for neither this "optimization" is useful.
So let's drop it and save some code and a malloc.
|
|
We cannot compare filenames directly, because paths are not sortable
lexicographically, e.g. /etc/udev is "later" (has higher priority)
than /usr/lib/udev.
The on-disk format is changed to have a separate field for "file priority",
which is stored when writing the binary file, and then loaded and used in
comparisons. For data in the previous format (as generated by systemd 232),
this information is not available, and we use a trick where the offset into the
string table is used as a proxy for priority. Most of the time strings are
stored in the order in which the files were processed. This is not entirely
reliable, but is good enough to properly order /usr/lib and /etc/, which are
the two most common cases. This hack is included because it allows proper
parsing of files until the binary hwdb is regenerated.
Instead of adding a new field, I reduced the size of line_number from 64 to 32
bits, and added a 16 bit priority field, and 16 bits of padding. Adding a new
field of 16 bits would significantly screw up alignment and increase file
size, and line numbers realistically don't need more than ~20 bits.
Fixes #4750.
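The size argument can be checked with a small sketch. This is an illustration only, not the on-disk definition: little-endian encoding and the field order (line number, priority, padding) are assumptions, and the sample values are made up.

```python
import struct

# Old tail of the record: a single 64-bit line number.
OLD_TAIL = "<Q"
# New tail: 32-bit line number, 16-bit file priority, 16 bits of padding.
# Byte order and field order are assumptions made for this sketch.
NEW_TAIL = "<IHH"

old_size = struct.calcsize(OLD_TAIL)
new_size = struct.calcsize(NEW_TAIL)

# Round-trip a made-up entry: line 1234 in a file with priority 7.
packed = struct.pack(NEW_TAIL, 1234, 7, 0)
line_number, file_priority, _padding = struct.unpack(NEW_TAIL, packed)

# About 20 bits of line number fit comfortably into the 32-bit field.
fits = (1 << 20) < (1 << 32)
```

Both layouts occupy the same number of bytes, which is why the record size and the alignment of the following entries stay unchanged.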
|
|
|
|
This adds an API for retrieving an app-specific machine ID to sd-id128.
Internally it calculates HMAC-SHA256 with a 128bit app-specific ID as payload
and the machine ID as key.
(An alternative would have been to use siphash for this, which is also
cryptographically strong. However, as it only generates 64bit hashes it's not
an obvious choice for generating 128bit IDs.)
Fixes: #4667
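The derivation described in this commit can be sketched as follows. This is not the sd-id128 implementation: the function name is hypothetical, and truncating the 256-bit MAC to its first 16 bytes is an assumption about how a 128bit ID is obtained from it. Any further post-processing of the result is not covered here.

```python
import hashlib
import hmac


def app_specific_machine_id(machine_id: bytes, app_id: bytes) -> bytes:
    """Derive a 128-bit ID from a machine ID and an app-specific ID.

    The machine ID is the HMAC key and the app ID is the payload, as the
    commit message states. Truncation to 16 bytes is an assumption.
    """
    if len(machine_id) != 16 or len(app_id) != 16:
        raise ValueError("both IDs must be 16 bytes (128 bits)")
    mac = hmac.new(key=machine_id, msg=app_id, digestmod=hashlib.sha256)
    return mac.digest()[:16]


# Made-up IDs for illustration only.
machine = bytes(range(16))
app_a = bytes.fromhex("00112233445566778899aabbccddeeff")
app_b = bytes.fromhex("ffeeddccbbaa99887766554433221100")

id_a = app_specific_machine_id(machine, app_a)
id_b = app_specific_machine_id(machine, app_b)
```

The same machine and app ID always yield the same result, while a different app ID yields an unrelated one, so an application can use a stable identifier without exposing the machine ID itself.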
|
|
This patch handles the custom MTU field in IPv6 RA.
fixes RFE #4464
|
|
Fixes: #4721
|
|
To properly store priority in passed in pointer and return 0 for success.
Also add a test for verifying that it works correctly.
|
|
extract_first_words deals fine with the string being NULL, so drop the upfront
check for that.
|
|
busctl introspect: accept direction="out" for signals.
|
|
|
|
According to the D-Bus spec (v0.29),
| The direction element on <arg> may be omitted, in which case it
| defaults to "in" for method calls and "out" for signals. Signals only
| allow "out" so while direction may be specified, it's pointless.
Therefore we still should accept a 'direction' attribute, even if it's
useless in reality.
Closes: #4616
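The rule quoted from the spec can be sketched like this. It is an illustration, not busctl's parser: the function name and the sample introspection XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def arg_direction(member_kind: str, arg: ET.Element) -> str:
    """Return the effective direction of an <arg> element.

    member_kind is "method" or "signal". A missing attribute defaults to
    "in" for methods and "out" for signals; signals accept only "out".
    """
    default = "in" if member_kind == "method" else "out"
    direction = arg.get("direction", default)
    if direction not in ("in", "out"):
        raise ValueError(f"invalid direction: {direction!r}")
    if member_kind == "signal" and direction != "out":
        raise ValueError('signal arguments only allow direction="out"')
    return direction


# Hypothetical introspection fragment.
signal = ET.fromstring(
    '<signal name="Changed">'
    '<arg name="value" type="s" direction="out"/>'
    '<arg name="count" type="u"/>'
    "</signal>"
)
signal_directions = [arg_direction("signal", a) for a in signal.findall("arg")]
```

An explicit direction="out" and an omitted attribute are treated the same for signals, which is the behaviour this fix restores.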
|
|
Format string tweaks (and a small fix on 32bit)
|
|
The .so symlinks got moved to rootlibdir in 082210c7.
|
|
According to comments in <asm/types.h>, __u64 is always defined as unsigned
long long. Those casts should be superfluous.
|
|
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
|
|
This makes strjoin and strjoina more similar and avoids the useless final
argument.
spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)
git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'
This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
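For reference, the effect of the sed expression above on a single line can be reproduced with the same regular expression. This is a sketch for illustration: the helper name and the sample C lines are made up.

```python
import re

# Same pattern as the sed call above: drop a trailing ", NULL" argument.
STRJOIN_NULL = re.compile(r"strjoin\((.*), NULL\)")


def drop_trailing_null(line: str) -> str:
    """Rewrite strjoin(..., NULL) to strjoin(...) on one source line."""
    # count=1 mirrors sed without the "g" flag (first match per line only).
    return STRJOIN_NULL.sub(r"strjoin(\1)", line, count=1)


before = 'p = strjoin(prefix, "/", name, NULL);'
after = drop_trailing_null(before)

# strjoina() calls are not touched, as the pattern requires "strjoin(".
untouched = drop_trailing_null('q = strjoina(prefix, "/", name);')
```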
|
|
|
|
|
|
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from an inactive to an activating or active
state.
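The lifecycle rule above can be sketched as a tiny state machine. This is an illustration and not the service manager's code: the class, its attributes and the state strings are hypothetical, and os.urandom() merely stands in for whatever randomness source is used.

```python
import os

# States that count as "running" for this sketch.
ACTIVE_STATES = {"activating", "active"}


class Unit:
    """Toy unit that gets a fresh invocation ID per runtime cycle."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "inactive"
        self.invocation_id = None  # no runtime cycle has started yet

    def set_state(self, new_state: str) -> None:
        # A new 128-bit ID is generated only when leaving "inactive"
        # for an activating or active state.
        if self.state == "inactive" and new_state in ACTIVE_STATES:
            self.invocation_id = os.urandom(16)
        self.state = new_state


unit = Unit("example.service")
unit.set_state("activating")
first_id = unit.invocation_id
unit.set_state("active")       # same runtime cycle, ID unchanged
same_cycle_id = unit.invocation_id
unit.set_state("inactive")
unit.set_state("active")       # new runtime cycle, new ID
second_id = unit.invocation_id
```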
The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.
The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.
The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.
Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.
This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().
PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.
A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycle of it.
Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.
|
|
We generate these, hence we should also add errno translations for them.
|
|
These were forgotten, let's add some useful mappings for all errors we define.
|
|
As suggested here:
https://github.com/systemd/systemd/pull/4296#issuecomment-251911349
Let's try AF_INET first as socket, but let's fall back to AF_NETLINK, so that
we can use a protocol-independent socket here if possible. This has the benefit
that our code will still work even if AF_INET/AF_INET6 is made unavailable (for
example via seccomp), at least on current kernels.
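The fallback order described here can be sketched as follows. This is not the actual helper: the function name is hypothetical, and the socket types and the netlink protocol used are assumptions, since the message only names the address families. The injectable socket_factory parameter exists only so the fallback path can be exercised without blocking AF_INET for real.

```python
import socket

# Linux values are used as fallbacks when the constants are not exposed.
AF_NETLINK = getattr(socket, "AF_NETLINK", 16)
NETLINK_PROTO = getattr(socket, "NETLINK_ROUTE", 0)  # assumed protocol


def open_ioctl_socket(socket_factory=socket.socket):
    """Try AF_INET first; fall back to AF_NETLINK if that is unavailable."""
    try:
        return socket_factory(socket.AF_INET, socket.SOCK_DGRAM)
    except OSError:
        # AF_INET may be blocked, for example by a seccomp filter.
        return socket_factory(AF_NETLINK, socket.SOCK_RAW, NETLINK_PROTO)


# Simulate an environment where AF_INET sockets cannot be created.
attempted = []


def blocked_inet_factory(family, type_, proto=0):
    attempted.append(family)
    if family == socket.AF_INET:
        raise OSError("AF_INET unavailable")
    return "netlink-socket"


result = open_ioctl_socket(blocked_inet_factory)
```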
|
|
hwdb: return conflicts in a well-defined order
|
|
This test sometimes fails in semaphore, but not when run interactively,
so it's hard to debug.
|
|
|
|
If we find duplicates in a property-lookup, make sure to order them by
their origin. That is, matches defined "later" take precedence over
earlier matches. The "later"-order is defined by file-name + line-number
combination. That is, if a match is defined below another one in the
same hwdb file, it takes precedence, same as if it is defined in a file
ordered after another one.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
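The precedence rule of this commit can be sketched as a comparison on (file name, line number) pairs. This is an illustration only: the function name, file names, line numbers and values are made up.

```python
def pick_winner(matches):
    """Return the value of the match that was defined "latest".

    matches is an iterable of (file_name, line_number, value) tuples.
    A later file wins over an earlier one; within the same file the
    higher line number wins.
    """
    return max(matches, key=lambda m: (m[0], m[1]))[2]


# Made-up duplicates of one property.
duplicates = [
    ("60-keyboard.hwdb", 10, "first"),
    ("60-keyboard.hwdb", 40, "second"),
    ("70-local.hwdb", 3, "third"),
]

winner = pick_winner(duplicates)
same_file_winner = pick_winner(duplicates[:2])
```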
|
|
Extend the hwdb to store the source file-name and line-number for each
property. We simply extend the stored value struct with the new
information. It is fully backwards compatible and old readers will
continue to work.
The libudev/sd-hwdb reader is updated in a followup.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
|
|
It is not legal to use hard-coded types to calculate offsets. We must
always use the offsets of the hwdb header to calculate those. Otherwise,
we will break horribly if run on hwdb files written by other
implementations or written with future extensions.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
|
|
1. add support for kind vcan
2. fix up indentation in netlink-types.c, networkd-netdev.c
|
|
|
|
Let's bump it further, as the current limit turns out to be problematic
IRL. Let's bump it to more than twice what we know is needed.
Fixes: #4068
|
|
Old libdbus has a feature that the process is terminated whenever the bus
connection receives a disconnect. This is pretty useful on desktop apps (where
a disconnect indicates session termination), as well as on command line apps
(where we really shouldn't stay hanging in most cases if dbus daemon goes
down).
Add a similar feature to sd-bus, but make it opt-in rather than opt-out as
it is in libdbus. Also, if the bus is attached to an event loop, just exit the
event loop rather than the whole process.
|
|
This tests in particular that disconnecting results in the tracking object's
handlers being called.
|
|
objects immediately
If the server side kicks us from the bus, from our view no names are on the bus
anymore, hence let's make sure to dispatch all tracking objects immediately.
|
|
In order to add a name to a bus tracking object we need to do some bus
operations: we need to check if the name already exists and add match for it.
Both are synchronous bus calls. While processing those we need to make sure
that the tracking object is not dispatched yet, as it might still be empty, but
is not going to be empty for very long.
Hence, block dispatching by removing the object from the dispatch queue while
adding the name, and re-adding it on error.
|
|
When a bus connection is closed we dispatch all reply callbacks. Do so in a new
function of its own.
No behaviour changes.
|
|
This adds two (privileged) bus calls Ref() and Unref() to the Unit interface.
The two calls may be used by clients to pin a unit into memory, so that various
runtime properties aren't flushed out by the automatic GC. This is necessary
to permit clients to race-freely acquire runtime results (such as process exit
status/code or accumulated CPU time) on successful service termination.
Ref() and Unref() are fully recursive, hence act like the usual reference
counting concept in C. Taking a reference is a privileged operation, as this
allows pinning units into memory which consumes resources.
Transient units may also gain a reference at the time of creation, via the new
AddRef property (that is only defined for transient units at the time of
creation).
|
|
This adds an optional "recursive" counting mode to sd_bus_track. If enabled
adding the same name multiple times to an sd_bus_track object is counted
individually, so that it also has to be removed the same number of times before
it is gone again from the tracking object.
This functionality is useful for implementing local ref counted objects that
peers may take references on.
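The difference between the two counting modes can be sketched with a small stand-in class. This is not the sd_bus_track API: the class, its methods and the bus name used are hypothetical.

```python
class NameTracker:
    """Toy tracker with an optional recursive counting mode."""

    def __init__(self, recursive: bool = False) -> None:
        self.recursive = recursive
        self._counts = {}

    def add(self, name: str) -> None:
        if self.recursive:
            # Every add is counted individually.
            self._counts[name] = self._counts.get(name, 0) + 1
        else:
            # Adding the same name again has no further effect.
            self._counts[name] = 1

    def remove(self, name: str) -> bool:
        if name not in self._counts:
            return False
        self._counts[name] -= 1
        if self._counts[name] == 0:
            del self._counts[name]
        return True

    def contains(self, name: str) -> bool:
        return name in self._counts


flat = NameTracker(recursive=False)
flat.add(":1.42")
flat.add(":1.42")
flat.remove(":1.42")
flat_still_tracked = flat.contains(":1.42")

counted = NameTracker(recursive=True)
counted.add(":1.42")
counted.add(":1.42")
counted.remove(":1.42")
counted_still_tracked = counted.contains(":1.42")
```

In the non-recursive tracker one removal is enough, while the recursive one keeps the name until it has been removed as many times as it was added.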
|
|
A following patch will update cgroup handling so that the systemd controller
(/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel
resource controllers are on the legacy hierarchies. This would require
distinguishing whether all controllers are on cgroup v2 or only the systemd
controller is. In preparation, this patch renames cg_unified() to
cg_all_unified().
This patch doesn't cause any functional changes.
|
|
|
|
Accept both files with and without trailing newlines. Apparently some rkt
releases generated them incorrectly, missing the trailing newlines, and we
shouldn't break that.
|
|
service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).
For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.
A simple way to test this is:
systemd-run -p DynamicUser=1 /bin/sleep 99999
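To check whether a given UID falls into the dynamic range mentioned above, something like the following could be used. It is a sketch only: the constant and function names are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
# Range taken from the commit message; inclusive bounds are assumed.
DYNAMIC_UID_MIN = 61184
DYNAMIC_UID_MAX = 65519


def is_dynamic_uid(uid: int) -> bool:
    """Return True if uid lies in the dynamically allocated range."""
    return DYNAMIC_UID_MIN <= uid <= DYNAMIC_UID_MAX


# Number of UIDs available for dynamic allocation under that assumption.
pool_size = DYNAMIC_UID_MAX - DYNAMIC_UID_MIN + 1
```

Such a check can help spot files on disk that are owned by a UID which may later be handed to a different service.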
|
|
If the return parameter is NULL, simply validate the string, and return no
error.
|