Age | Commit message (Collapse) | Author |
|
CID #1368243.
|
|
|
|
|
|
This substantially reworks the seccomp code, to ensure better
compatibility with some architectures, including i386.
So far we relied on libseccomp's internal handling of the multiple
syscall ABIs supported on Linux. This is problematic however, as it does
not define clear semantics if an ABI is not able to support specific
seccomp rules we install.
This rework hence changes a couple of things:
- We no longer use seccomp_rule_add(), but only
seccomp_rule_add_exact(), and fail the installation of a filter if the
architecture doesn't support it.
- We no longer rely on adding multiple syscall architectures to a single filter,
but instead install a separate filter for each syscall architecture
supported. This way, we can install a strict filter for x86-64, while
permitting a less strict filter for i386.
- All high-level filter additions are now moved from execute.c to
seccomp-util.c, so that we can test them independently of the service
execution logic.
- Tests have been added for all types of our seccomp filters.
- SystemCallFilters= and SystemCallArchitectures= are now implemented in
independent filters and installation logic, as they semantically are
very much independent of each other.
Fixes: #4575
|
|
instance might be "", and that string would be leaked.
CID #1368264.
|
|
|
|
This is a follow-up for dc7dd61de610e9330
|
|
Fixes:
```
touch hola.service
systemctl link $(pwd)/hola.service $(pwd)/hola.service
```
```
==1==ERROR: AddressSanitizer: attempting double-free on 0x60300002c560 in thread T0 (systemd):
#0 0x7fc8c961cb00 in free (/lib64/libasan.so.3+0xc6b00)
#1 0x7fc8c90ebd3b in strv_clear src/basic/strv.c:83
#2 0x7fc8c90ebdb6 in strv_free src/basic/strv.c:89
#3 0x55637c758c77 in strv_freep src/basic/strv.h:37
#4 0x55637c763ba9 in method_enable_unit_files_generic src/core/dbus-manager.c:1960
#5 0x55637c763d16 in method_link_unit_files src/core/dbus-manager.c:2001
#6 0x7fc8c92537ec in method_callbacks_run src/libsystemd/sd-bus/bus-objects.c:418
#7 0x7fc8c9258830 in object_find_and_run src/libsystemd/sd-bus/bus-objects.c:1255
#8 0x7fc8c92594d7 in bus_process_object src/libsystemd/sd-bus/bus-objects.c:1371
#9 0x7fc8c91e7553 in process_message src/libsystemd/sd-bus/sd-bus.c:2563
#10 0x7fc8c91e78ce in process_running src/libsystemd/sd-bus/sd-bus.c:2605
#11 0x7fc8c91e8f61 in bus_process_internal src/libsystemd/sd-bus/sd-bus.c:2837
#12 0x7fc8c91e90d2 in sd_bus_process src/libsystemd/sd-bus/sd-bus.c:2856
#13 0x7fc8c91ea8f9 in io_callback src/libsystemd/sd-bus/sd-bus.c:3126
#14 0x7fc8c928333b in source_dispatch src/libsystemd/sd-event/sd-event.c:2268
#15 0x7fc8c9285cf7 in sd_event_dispatch src/libsystemd/sd-event/sd-event.c:2627
#16 0x7fc8c92865fa in sd_event_run src/libsystemd/sd-event/sd-event.c:2686
#17 0x55637c6b5257 in manager_loop src/core/manager.c:2274
#18 0x55637c6a2194 in main src/core/main.c:1920
#19 0x7fc8c7ac7400 in __libc_start_main (/lib64/libc.so.6+0x20400)
#20 0x55637c697339 in _start (/usr/lib/systemd/systemd+0xcd339)
0x60300002c560 is located 0 bytes inside of 19-byte region [0x60300002c560,0x60300002c573)
freed by thread T0 (systemd) here:
#0 0x7fc8c961cb00 in free (/lib64/libasan.so.3+0xc6b00)
#1 0x7fc8c90ee320 in strv_remove src/basic/strv.c:630
#2 0x7fc8c90ee190 in strv_uniq src/basic/strv.c:602
#3 0x7fc8c9180533 in unit_file_link src/shared/install.c:1996
#4 0x55637c763b25 in method_enable_unit_files_generic src/core/dbus-manager.c:1985
#5 0x55637c763d16 in method_link_unit_files src/core/dbus-manager.c:2001
#6 0x7fc8c92537ec in method_callbacks_run src/libsystemd/sd-bus/bus-objects.c:418
#7 0x7fc8c9258830 in object_find_and_run src/libsystemd/sd-bus/bus-objects.c:1255
#8 0x7fc8c92594d7 in bus_process_object src/libsystemd/sd-bus/bus-objects.c:1371
#9 0x7fc8c91e7553 in process_message src/libsystemd/sd-bus/sd-bus.c:2563
#10 0x7fc8c91e78ce in process_running src/libsystemd/sd-bus/sd-bus.c:2605
#11 0x7fc8c91e8f61 in bus_process_internal src/libsystemd/sd-bus/sd-bus.c:2837
#12 0x7fc8c91e90d2 in sd_bus_process src/libsystemd/sd-bus/sd-bus.c:2856
#13 0x7fc8c91ea8f9 in io_callback src/libsystemd/sd-bus/sd-bus.c:3126
#14 0x7fc8c928333b in source_dispatch src/libsystemd/sd-event/sd-event.c:2268
#15 0x7fc8c9285cf7 in sd_event_dispatch src/libsystemd/sd-event/sd-event.c:2627
#16 0x7fc8c92865fa in sd_event_run src/libsystemd/sd-event/sd-event.c:2686
#17 0x55637c6b5257 in manager_loop src/core/manager.c:2274
#18 0x55637c6a2194 in main src/core/main.c:1920
#19 0x7fc8c7ac7400 in __libc_start_main (/lib64/libc.so.6+0x20400)
previously allocated by thread T0 (systemd) here:
#0 0x7fc8c95b0160 in strdup (/lib64/libasan.so.3+0x5a160)
#1 0x7fc8c90edf32 in strv_extend src/basic/strv.c:552
#2 0x7fc8c923ae41 in bus_message_read_strv_extend src/libsystemd/sd-bus/bus-message.c:5578
#3 0x7fc8c923b0de in sd_bus_message_read_strv src/libsystemd/sd-bus/bus-message.c:5600
#4 0x55637c7639d1 in method_enable_unit_files_generic src/core/dbus-manager.c:1969
#5 0x55637c763d16 in method_link_unit_files src/core/dbus-manager.c:2001
#6 0x7fc8c92537ec in method_callbacks_run src/libsystemd/sd-bus/bus-objects.c:418
#7 0x7fc8c9258830 in object_find_and_run src/libsystemd/sd-bus/bus-objects.c:1255
#8 0x7fc8c92594d7 in bus_process_object src/libsystemd/sd-bus/bus-objects.c:1371
#9 0x7fc8c91e7553 in process_message src/libsystemd/sd-bus/sd-bus.c:2563
#10 0x7fc8c91e78ce in process_running src/libsystemd/sd-bus/sd-bus.c:2605
#11 0x7fc8c91e8f61 in bus_process_internal src/libsystemd/sd-bus/sd-bus.c:2837
#12 0x7fc8c91e90d2 in sd_bus_process src/libsystemd/sd-bus/sd-bus.c:2856
#13 0x7fc8c91ea8f9 in io_callback src/libsystemd/sd-bus/sd-bus.c:3126
#14 0x7fc8c928333b in source_dispatch src/libsystemd/sd-event/sd-event.c:2268
#15 0x7fc8c9285cf7 in sd_event_dispatch src/libsystemd/sd-event/sd-event.c:2627
#16 0x7fc8c92865fa in sd_event_run src/libsystemd/sd-event/sd-event.c:2686
#17 0x55637c6b5257 in manager_loop src/core/manager.c:2274
#18 0x55637c6a2194 in main src/core/main.c:1920
#19 0x7fc8c7ac7400 in __libc_start_main (/lib64/libc.so.6+0x20400)
SUMMARY: AddressSanitizer: double-free (/lib64/libasan.so.3+0xc6b00) in free
==1==ABORTING
```
Closes #5015
|
|
Easily reproducible:
1) systemctl mask foo
2) systemctl unmask foo foo
The problem here is that the *i that is put into todo[] is later freed
in strv_uniq(), which is not directly visible from this patch. Somewhere
further in the code, the string that *i pointed to is freed again. That
happens only when multiple services with the same name/path are specified.
|
|
|
|
This permits "systemctl enable" and "systemctl add-wants" on template
units without any specifications of an instance name, neither specified
on the command line, nor specified in DefaultInstance= field of the
[install] section.
Fixes: #3473
|
|
The system call is obsolete after all.
|
|
|
|
These groupe reboot()/kexec() and swapon()/swapoff() respectively
|
|
.roothash file too
Since nspawn looks for them, importd now downloads them, and mkosi
generates them, let's make sure they also processed correctly on all
machined operations.
|
|
In preparation for reusing the image dissector in the GPT auto-discovery
logic, only optionally fail the dissection when we can't identify a root
partition.
In the GPT auto-discovery we are completely fine with any kind of root,
given that we run when it is already mounted and all we do is find some
additional auxiliary partitions on the same disk.
|
|
This is useful as we can match up the EFI UUID with the one the firmware
supposedly used.
|
|
This adds support for a new kernel command line option "systemd.volatile=" that
provides the same functionality that systemd-nspawn's --volatile= switch
provides, but for host systems (i.e. systems booting with a kernel).
It takes the same parameter and has the same effect.
In order to implement systemd.volatile=yes a new service
systemd-volatile-root.service is introduced that only runs in the initrd and
rearranges the root directory as needed to become a tmpfs instance. Note that
systemd.volatile=state is implemented different: it simply generates a
var.mount unit file that is part of the normal boot and has no effect on the
initrd execution.
The way this is implemented ensures that other explicit configuration for /var
can always override the effect of these options. Specifically, the var.mount
unit is generated in the "late" generator directory, so that it only is in
effect if nothing else overrides it.
|
|
Let's follow symlinks before invoking mount() on arbitrary paths, so that we
won't get confused if directories are prepared with absolute symlinks.
Use FOREACH_STRING() instead of NULSTR_FOREACH() as it is more readable.
Don't use snprintf() for concatenating strings, let chase_symlinks() to that.
Replace homegrown mount check with path_is_mount_point(). Also, change the
behaviour when we encounter this: instead of unmounting the old mount point,
simply leave it around and don't replace it, so that initrds can mount stuff
there with different settings than we would apply. This is in-line with how we
handle automatic mounts in nspawn for example.
Use umount_recursive() instead of a simple umount2() for unmounting the old
root, so that we actually cover really all mounts, not just the top-level one.
|
|
This is useful for reusing the dissector logic in the gpt-auto-discovery logic:
there we really don't want to use MBR or naked file systems as root device.
|
|
Let's make sure O_CLOEXEC is set for the file descriptor.
|
|
|
|
This moves the VolatileMode enum and its helper functions to src/shared/. This
is useful to then reuse them to implement systemd.volatile= in a later commit.
|
|
This means that callers can distiguish an error from flags==0,
and don't have to special-case the empty string.
|
|
Otherwise we might get started too early.
|
|
Fixes: #4402
|
|
This adds two new settings BindPaths= and BindReadOnlyPaths=. They allow
defining arbitrary bind mounts specific to particular services. This is
particularly useful for services with RootDirectory= set as this permits making
specific bits of the host directory available to chrooted services.
The two new settings follow the concepts nspawn already possess in --bind= and
--bind-ro=, as well as the .nspawn settings Bind= and BindReadOnly= (and these
latter options should probably be renamed to BindPaths= and BindReadOnlyPaths=
too).
Fixes: #3439
|
|
This makes "systemd-run -p MountFlags=shared -t /bin/sh" work, by making
MountFlags= to the list of properties that may be accessed transiently.
|
|
Fix some build issues and warnings
|
|
A prettification of the dissect code, mkosi and TODO updates
|
|
This is already fixed upstream, so warning is not useful.
Let's keep the workaround until the fix has percolated downstream.
|
|
We define those macros, and there's no reason to have one without
the other.
|
|
Generalize image dissection logic of nspawn, and make it useful for other tools.
|
|
|
|
This makes the code to set arg_flags much more readable.
|
|
This adds support for discovering and making use of properly tagged dm-verity
data integrity partitions. This extends both systemd-nspawn and systemd-dissect
with a new --root-hash= switch that takes the root hash to use for the root
partition, and is otherwise fully automatic.
Verity partitions are discovered automatically by GPT table type UUIDs, as
listed in
https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/
(which I updated prior to this change, to include new UUIDs for this purpose.
mkosi with https://github.com/systemd/mkosi/pull/39 applied may generate images
that carry the necessary integrity data. With that PR and this commit, the
following simply lines suffice to boot up an integrity-protected container image:
```
# mkdir test
# cd test
# mkosi --verity
# systemd-nspawn -i ./image.raw -bn
```
Note that mkosi writes the image file to "image.raw" next to a a file
"image.roothash" that contains the root hash. systemd-nspawn will look for that
file and use it if it exists, in case --root-hash= is not specified explicitly.
|
|
This adds support to the image dissector to deal with encrypted images (only
LUKS). Given that we now have a neatly isolated image dissector codebase, let's
add a new feature to it: support for automatically dealing with encrypted
images. This is then exposed in systemd-dissect and nspawn.
It's pretty basic: only support for passphrase-based encryption.
In order to ensure that "systemd-dissect --mount" results in mount points whose
backing LUKS DM devices are cleaned up automatically we use the DM_DEV_REMOVE
ioctl() directly on the device (in DM_DEFERRED_REMOVE mode). libgcryptsetup at
the moment doesn't provide a proper API for this. Thankfully, the ioctl() API
is pretty easy to use.
|
|
This adds a small tool that may be used to look into OS images, and mount them
to any place. This is mostly a friendlier version of test-dissect-image.c. I am
not sure this should really become a proper command of systemd, hence for now
do not install it into bindir, but simply libexecdir.
This tool is already pretty useful since you can mount image files with it,
honouring the various partitions correctly. I figure this is going to become
more interesting if the dissctor learns luks and verity support.
|
|
This adds two new APIs to systemd:
- loop-util.h is a simple internal API for allocating, setting up and releasing
loopback block devices.
- dissect-image.h is an internal API for taking apart disk images and figuring
out what the purpose of each partition is.
Both APIs are basically refactored versions of similar code in nspawn. This
rework should permit us to reuse this in other places than just nspawn in the
future. Specifically: to implement RootImage= in the service image, similar to
RootDirectory=, but operating on a disk image; to unify the gpt-auto-discovery
generator code with the discovery logic in nspawn; to add new API to machined
for determining the OS version of a disk image (i.e. not just running
containers). This PR does not make any such changes however, it just provides
the new reworked API.
The reworked code is also slightly more powerful than the nspawn original one.
When pointing it to an image or block device with a naked file system (i.e. no
partition table) it will simply make it the root device.
|
|
Let's use chase_symlinks() everywhere, and stop using GNU
canonicalize_file_name() everywhere. For most cases this should not change
behaviour, however increase exposure of our function to get better tested. Most
importantly in a few cases (most notably nspawn) it can take the correct root
directory into account when chasing symlinks.
|
|
|
|
Various networkd/DHCP fixes.
|
|
Given that other file systems (notably: xfs) support reflinks these days, let's
extend the file system snapshotting logic to fall back to plan copies or
reflinks when full btrfs subvolume snapshots are not available.
This essentially makes "systemd-nspawn --ephemeral" and "systemd-nspawn
--template=" available on non-btrfs subvolumes. Of course, both operations will
still be slower on non-btrfs than on btrfs (simply because reflinking each file
individually in a directory tree is still slower than doing this in one step
for a whole subvolume), but it's probably good enough for many cases, and we
should provide the users with the tools, they have to figure out what's good
for them.
Note that "machinectl clone" already had a fallback like this in place, this
patch generalizes this, and adds similar support to our other cases.
|
|
This adds a new undocumented env var $SYSTEMD_NSPAWN_LOCK. When set to "0",
nspawn will not attempt to lock the image.
Fixes: #4037
|
|
on success
We forgot to initialize the "global" return parameter in one case. Fix that.
|
|
@filesystem groups various file system operations, such as opening files and
directories for read/write and stat()ing them, plus renaming, deleting,
symlinking, hardlinking.
|
|
|
|
Confirm spawn fixes/enhancements
|
|
Support on the server side has already been in place for quite some time, let's
also add support on the client side for this.
|
|
In contrast to all other unit types device units when queued just track
external state, they cannot effect state changes on their own. Hence unless a
client or other job waits for them there's no reason to keep them in the job
queue. This adds a concept of GC'ing jobs of this type as soon as no client or
other job waits for them anymore.
To ensure this works correctly we need to track which clients actually
reference a job (i.e. which ones enqueued it). Unfortunately that's pretty
nasty to do for direct connections, as sd_bus_track doesn't work for
them. For now, work around this, by simply remembering in a boolean that a job
was requested by a direct connection, and reset it when we notice the direct
connection is gone. This means the GC logic works fine, except that jobs are
not immediately removed when direct connections disconnect.
In the longer term, a rework of the bus logic should fix this properly. For now
this should be good enough, as GC works for fine all cases except this one, and
thus is a clear improvement over the previous behaviour.
Fixes: #1921
|