Age | Commit message (Collapse) | Author |
|
We want systemd --user to get its own keyring as usual, even if the
barebones PAM snippet we ship upstream is used. If we don't do this, we get the
basic keyring systemd --system sets up for us.
|
|
Let's store the invocation ID in the per-service keyring as a root-owned key,
with strict access rights. This has the advantage over the environment-based ID
passing that it also works from SUID binaries (as the key cannot be overridden
by unprivileged code starting them), in contrast to the secure_getenv() based
mode.
The invocation ID is now passed in three different ways to a service:
- As environment variable $INVOCATION_ID. This is easy to use, but may be
overridden by unprivileged code (which might be a bad or a good thing), which
means it's incompatible with SUID code (see above).
- As extended attribute on the service cgroup. This cannot be overridden by
unprivileged code, and may be queried safely from "outside" of a service.
However, it is incompatible with containers right now, as unprivileged
containers generally cannot set xattrs on cgroupfs.
- As "invocation_id" key in the kernel keyring. This has the benefit that the
key cannot be changed by unprivileged service code, and thus is safe to
access from SUID code (see above). But do note that service code can replace
the session keyring with a fresh one that lacks the key. However in that case
the key will not be owned by root, which is easily detectable. The keyring is
also incompatible with containers right now, as it is not properly namespace
aware (but this is being worked on), and thus most container managers mask
the keyring-related system calls.
Ideally we'd only have one way to pass the invocation ID, but the different
ways all have limitations. The invocation ID hookup in journald is currently
only available on the host but not in containers, due to the mentioned
limitations.
How to verify the new invocation ID in the keyring:
# systemd-run -t /bin/sh
Running as unit: run-rd917366c04f847b480d486017f7239d6.service
Press ^] three times within 1s to disconnect TTY.
# keyctl show
Session Keyring
680208392 --alswrv 0 0 keyring: _ses
250926536 ----s-rv 0 0 \_ user: invocation_id
# keyctl request user invocation_id
250926536
# keyctl read 250926536
16 bytes of data in key:
9c96317c ac64495a a42b9cd7 4f3ff96b
# echo $INVOCATION_ID
9c96317cac64495aa42b9cd74f3ff96b
# ^D
This creates a new transient service running a shell. It then verifies the
contents of the keyring, requests the invocation ID key, and reads its payload.
For comparison, the invocation ID as passed via the environment variable is also
displayed.
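The comparison in the transcript above can also be done programmatically. The following is a minimal sketch, assuming the payload has already been obtained as the hex dump that "keyctl read" prints; the function names are hypothetical and not part of systemd or keyutils.

```python
def keyctl_payload_to_id(hex_dump: str) -> str:
    """Join the whitespace-separated hex words printed by `keyctl read`."""
    joined = "".join(hex_dump.split()).lower()
    # An invocation ID is 128 bits, i.e. 32 hexadecimal digits.
    if len(joined) != 32 or any(c not in "0123456789abcdef" for c in joined):
        raise ValueError(f"not a 128-bit hex ID: {hex_dump!r}")
    return joined


def ids_match(hex_dump: str, env_value: str) -> bool:
    """Compare the keyring payload with the value of $INVOCATION_ID."""
    return keyctl_payload_to_id(hex_dump) == env_value.strip().lower()
```

With the values from the transcript, ids_match("9c96317c ac64495a a42b9cd7 4f3ff96b", "9c96317cac64495aa42b9cd74f3ff96b") holds.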
|
|
This patch ensures that each system service gets its own session kernel keyring
automatically, and implicitly. Without this a keyring is allocated for it
on-demand, but is then linked with the user's kernel keyring, which is OK
behaviour for logged in users, but not so much for system services.
With this change each service gets a session keyring that is specific to the
service and ceases to exist when the service is shut down. The session keyring
is not linked up with the user keyring, and key searches hence only look within
the session boundaries by default.
(This is useful in a later commit to store per-service material in the keyring,
for example the invocation ID)
(With input from David Howells)
|
|
|
|
|
|
|
|
When getting SIGCHLD we should not assume that it was the first
child forked from systemd-nspawn that has died, as it may also be coming
from an orphan process. This change adds a signal handler that ignores
SIGCHLD unless it came from the first containerized child - the real
child.
Before this change the problem can be reproduced as follows:
$ sudo systemd-nspawn --directory=/container-root --share-system
Press ^] three times within 1s to kill container.
[root@andreyu-coreos ~]# { true & } &
[1] 22201
[root@andreyu-coreos ~]#
Container root-fedora-latest terminated by signal KILL
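The idea of reacting only to the real child can be sketched as follows. This is not the signal handler the change adds; it is a simplified reaping loop, with a hypothetical function name, that shows the same rule: exits of any process other than the first child are ignored.

```python
import os
from typing import Callable, Tuple


def wait_for_main_child(
    main_pid: int,
    wait: Callable[[], Tuple[int, int]] = os.wait,
) -> int:
    """Block until main_pid exits and return its wait status.

    `wait` returns (pid, status) pairs like os.wait(); it is a parameter
    only so that the loop can be exercised without forking.
    """
    while True:
        pid, status = wait()
        if pid == main_pid:
            return status
        # Some other process (for example a re-parented orphan) exited:
        # it has been reaped, but it must not end the container.
```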
|
|
Udev property ordering
|
|
Catalog message improvements
|
|
This is also an error, but it wasn't caught.
[/tmp/tmp.YWeKax4fMI/etc/udev/hwdb.d/10-bad.hwdb:26] Property expected, ignoring record with no properties
|
|
systemd.journal-fields(7) documents CODE_FUNC=. Internally, we were
inconsistent: sd_journal_print uses CODE_FUNC=, log.h has CODE_FUNCTION=,
python-systemd and bootchart also used CODE_FUNC=, when they were internal.
Most external projects use sd_journal_* functions, so CODE_FUNC=,
python-systemd still uses CODE_FUNC=, as does systemd-bootchart, and
independent reimplementations in golang-github-coreos-go-systemd, qtbase,
network manager, glib, pulseaudio. Hence, I don't think there's much
choice.
|
|
Those square brackets don't match how our other messages look; we use colons
everywhere else. The "[a:b]" format was originally added in
ed5bcfbe3c3b68e59242c03649eea03a9707d318, and remained unchanged for 7 years,
but in the meantime other conventions evolved.
The new version is also one character shorter.
[/etc/systemd/system/systemd-networkd.service.d/override.conf:2] Failed to parse sec value, ignoring: ...
↓
/etc/systemd/system/systemd-networkd.service.d/override.conf:2: Failed to parse sec value, ignoring: ...
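For tools that still match on the old prefix, the conversion between the two formats can be sketched as below. This is an illustration only, with a hypothetical helper name; it assumes the prefix has exactly the "[file:line] " shape shown above.

```python
import re

# Matches the old "[file:line] " prefix at the start of a message.
_OLD_PREFIX = re.compile(r"^\[(?P<file>[^\]]+):(?P<line>\d+)\] ")


def to_colon_style(message: str) -> str:
    """Rewrite "[file:line] text" as "file:line: text"; leave other input alone."""
    return _OLD_PREFIX.sub(
        lambda m: f"{m['file']}:{m['line']}: ", message, count=1
    )
```

Two brackets are dropped and one colon is added, which is where the one-character saving comes from.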
|
|
We can take advantage of the fact a NULL argument terminates the list.
|
|
Networkd man page update and fixes for the fallout
|
|
Fix some build issues and warnings
|
|
A prettification of the dissect code, mkosi and TODO updates
|
|
This adds a new message id for the end of user instance startup.
User manager startup is a different beast than the system startup.
Their descriptions are completely different too. Let's just separate
them.
Partially fixes #3351.
Also remove "successful" from the description, since we don't know if
the startup was successful or not.
|
|
Our warning message was misleading, because we wouldn't "correct" anything,
we'd just ignore unknown escapes. Update the message.
Also, print just the extracted word (which contains the offending sequences) in
the message, instead of the whole line.
Fixes #4697.
|
|
The loop must terminate after at most three iterations anyway.
|
|
This is already fixed upstream, so the warning is not useful.
Let's keep the workaround until the fix has percolated downstream.
|
|
|
|
We define those macros, and there's no reason to have one without
the other.
|
|
Completely untested. Fixes #4862.
|
|
Various specifier resolution fixes.
|
|
Generalize image dissection logic of nspawn, and make it useful for other tools.
|
|
|
|
Otherwise we'd fail with an assertion:
Assertion 't->family == AF_INET' failed at ../src/network/netdev/tunnel.c:244, function netdev_vti_fill_message_create(). Aborting.
|
|
When assigning addresses, we'd set the family, and later
verify that the address on the other end has the same family.
But when the address was specified as "any", we'd simply unset
the family. Instead, only unset the family if both addresses
are wiped.
Also, don't bother setting family = AF_UNSPEC, since it's the default (0).
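The rule described above can be sketched as follows. The class and function names are hypothetical, and the address parsing is deliberately crude; the point is only that "any" clears the family solely when both ends are unset.

```python
import socket
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tunnel:
    # AF_UNSPEC is 0 and thus the default; no need to set it explicitly.
    family: int = socket.AF_UNSPEC
    local: Optional[str] = None
    remote: Optional[str] = None


def set_address(t: Tunnel, which: str, value: str) -> None:
    """Assign t.local or t.remote, keeping t.family consistent."""
    if value == "any":
        setattr(t, which, None)
        # Only forget the family once both addresses are wiped.
        if t.local is None and t.remote is None:
            t.family = socket.AF_UNSPEC
        return
    # Crude family detection, good enough for a sketch.
    family = socket.AF_INET6 if ":" in value else socket.AF_INET
    if t.family != socket.AF_UNSPEC and t.family != family:
        raise ValueError("address family does not match the other end")
    t.family = family
    setattr(t, which, value)
```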
|
|
|
|
%m isn't useful in success path.
|
|
Generally non-inverted conditions are nicer, and ternary operators
with complex conditions are a bit hard to read.
No functional change.
|
|
|
|
Add new "khash" API and add new sd_id128_get_machine_app_specific() function
|
|
Follow up for #4809.
|
|
It might happen that resolv.conf is missing in a minimal rootfs, and in this
case the following warning is emitted:
Failed to mount n/a on /mnt/etc/resolv.conf (MS_BIND ""): No such file or directory
This patch fixes this case.
|
|
Go through stop_post on failure (#4770)
|
|
This makes the code to set arg_flags much more readable.
|
|
|
|
|
|
%c and %r rely on settings made in the unit files themselves and hence resolve
to different values depending on whether they are used before or after Slice=.
Let's simply deprecate them and drop them from the documentation, as that's not
really possible to fix. Moreover they are actually redundant, as the same
information may always be queried from /proc/self/cgroup and /proc/1/cgroup.
(Accurately speaking, %R is actually not broken like this as it is constant.
However, let's remove all cgroup-related specifiers at once, as it is also
redundant, and doesn't really make much sense alone.)
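As an illustration of querying the same information from /proc, the following sketch parses the "hierarchy-ID:controller-list:path" lines of a cgroup file. The function names are hypothetical.

```python
from typing import Dict


def parse_proc_cgroup(text: str) -> Dict[str, str]:
    """Map each controller list to its cgroup path.

    On the unified hierarchy the controller list is empty, so the path
    is stored under the key "".
    """
    result: Dict[str, str] = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split at most twice: the path itself may contain colons.
        _hierarchy_id, controllers, path = line.split(":", 2)
        result[controllers] = path
    return result


def own_cgroups(proc_file: str = "/proc/self/cgroup") -> Dict[str, str]:
    """Read and parse the cgroup membership of the calling process."""
    with open(proc_file) as f:
        return parse_proc_cgroup(f.read())
```

Passing "/proc/1/cgroup" to own_cgroups() gives the membership of PID 1 instead.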
|
|
|
|
Expanding specifiers here definitely makes sense.
Also simplifies the loop a bit, as there's no reason to keep "prev" around...
|
|
This might be useful for some people, for example to pull in mounts for paths
including the machine ID or hostname.
|
|
Let's permit specifier expansion at a number of additional fields, where
arbitrary strings might be passed where this might be useful one day. (Or at
least where there's no clear reason where it wouldn't make sense to have.)
|
|
unit_name_printf() before
For settings that are not taking unit names there's no reason to use
unit_name_printf(). Use unit_full_printf() instead, as the names are validated
anyway in one form or another after expansion.
|
|
unit_name_printf() is usually what we use when the resulting string shall
qualify as unit name, and it hence avoids resolving specifiers that almost
certainly won't result in valid unit names.
Add a couple of more specifiers that unit_full_printf() resolves also to the
list unit_name_printf() resolves, as they are likely to be useful in valid unit
names too. (Note that there might be cases where this doesn't hold, but we
should still permit this, as more often than not they are safe, and if people
want to use them that way, they should be able to.)
|
|
This monopolizes unit file specifier expansion in load-fragment.c, and removes
it from socket.c + service.c. This way expansion becomes an operation done
exclusively at time of loading unit files.
Previously specifiers were resolved for all settings during loading of unit
files with the exception of ExecStart= and friends which were resolved in
socket.c and service.c. With this change the latter is also moved to the
loading of unit files.
Fixes: #3061
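To illustrate what a single expansion pass at load time looks like, here is a minimal sketch. It is not the code in load-fragment.c; the function name and the choice to keep unknown specifiers verbatim are assumptions made for the example.

```python
from typing import Dict


def expand_specifiers(text: str, table: Dict[str, str]) -> str:
    """Replace %x sequences using `table`; "%%" yields a literal "%"."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "%" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "%":
                out.append("%")
            elif nxt in table:
                out.append(table[nxt])
            else:
                # Assumption for this sketch: unknown specifiers stay as-is.
                out.append(ch + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, with a table of {"n": "foo@bar.service", "i": "bar"} the string "/run/%n/%i" expands to "/run/foo@bar.service/bar".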
|
|
This adds support for discovering and making use of properly tagged dm-verity
data integrity partitions. This extends both systemd-nspawn and systemd-dissect
with a new --root-hash= switch that takes the root hash to use for the root
partition, and is otherwise fully automatic.
Verity partitions are discovered automatically by GPT table type UUIDs, as
listed in
https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/
(which I updated prior to this change, to include new UUIDs for this purpose).
mkosi with https://github.com/systemd/mkosi/pull/39 applied may generate images
that carry the necessary integrity data. With that PR and this commit, the
following simple lines suffice to boot up an integrity-protected container image:
```
# mkdir test
# cd test
# mkosi --verity
# systemd-nspawn -i ./image.raw -bn
```
Note that mkosi writes the image file to "image.raw" next to a file
"image.roothash" that contains the root hash. systemd-nspawn will look for that
file and use it if it exists, in case --root-hash= is not specified explicitly.
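The lookup order described above can be sketched as follows. This is an illustration under assumptions, not nspawn's code: the function name is hypothetical, and the sidecar file is assumed to share the image's base name with a ".roothash" suffix, as in the mkosi example.

```python
from pathlib import Path
from typing import Optional


def find_root_hash(image: str, explicit: Optional[str] = None) -> Optional[str]:
    """Return the root hash to use for `image`, or None if there is none.

    An explicitly given hash (as with --root-hash=) always wins; otherwise
    "image.raw" is looked up as "image.roothash" in the same directory.
    """
    if explicit is not None:
        return explicit
    candidate = Path(image).with_suffix(".roothash")
    if candidate.is_file():
        return candidate.read_text().strip()
    return None
```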
|
|
Let's prettify the machine name we generate for image-based containers: let's
chop off the .raw suffix before using it as machine name.
|
|
This adds support to the image dissector to deal with encrypted images (only
LUKS). Given that we now have a neatly isolated image dissector codebase, let's
add a new feature to it: support for automatically dealing with encrypted
images. This is then exposed in systemd-dissect and nspawn.
It's pretty basic: only support for passphrase-based encryption.
In order to ensure that "systemd-dissect --mount" results in mount points whose
backing LUKS DM devices are cleaned up automatically we use the DM_DEV_REMOVE
ioctl() directly on the device (in DM_DEFERRED_REMOVE mode). libcryptsetup at
the moment doesn't provide a proper API for this. Thankfully, the ioctl() API
is pretty easy to use.
|