Age | Commit message | Author |
|
core: add new RestrictNamespaces= unit file setting
Merging, not rebasing, because this touches many files and there were tree-wide cleanups in the meantime.
|
|
Format string tweaks (and a small fix on 32bit)
|
|
Remove FOREACH_WORD_QUOTED
|
|
|
|
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
|
|
It's the default, and NULL is shorter.
|
|
journalctl: fix memleak
|
|
This new setting permits restricting whether namespaces may be created and
managed by processes started by a unit. It installs a seccomp filter blocking
certain invocations of unshare(), clone() and setns().
RestrictNamespaces=no is the default, and does not restrict namespaces in any
way. RestrictNamespaces=yes takes away the ability to create or manage any kind
of namespace. "RestrictNamespaces=mnt ipc" restricts the creation of namespaces
so that only mount and IPC namespaces may be created/managed, but no other
kind of namespaces.
This setting should improve security quite a bit as in particular user
namespacing was a major source of CVEs in the kernel in the past, and is
accessible to unprivileged processes. With this setting the entire attack
surface may be removed for system services that do not make use of namespaces.
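
A minimal sketch of how the three forms of the setting could map to a mask of
namespace types that remain permitted. The function name, the NS_* constants
and the word list are assumptions made for illustration (the values mirror the
kernel's CLONE_NEW* flags); this is not the parser from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative constants; the values mirror the kernel's CLONE_NEW* flags. */
enum {
        NS_MNT    = 0x00020000, /* CLONE_NEWNS */
        NS_CGROUP = 0x02000000, /* CLONE_NEWCGROUP */
        NS_UTS    = 0x04000000, /* CLONE_NEWUTS */
        NS_IPC    = 0x08000000, /* CLONE_NEWIPC */
        NS_USER   = 0x10000000, /* CLONE_NEWUSER */
        NS_PID    = 0x20000000, /* CLONE_NEWPID */
        NS_NET    = 0x40000000, /* CLONE_NEWNET */
        NS_ALL    = NS_MNT|NS_CGROUP|NS_UTS|NS_IPC|NS_USER|NS_PID|NS_NET
};

static const struct {
        const char *name;
        unsigned long flag;
} ns_table[] = {
        { "cgroup", NS_CGROUP },
        { "ipc",    NS_IPC    },
        { "net",    NS_NET    },
        { "mnt",    NS_MNT    },
        { "pid",    NS_PID    },
        { "user",   NS_USER   },
        { "uts",    NS_UTS    },
};

/* Hypothetical helper: turns "no", "yes" or a list such as "mnt ipc" into the
 * mask of namespace types that stay allowed. Returns 0 on success and -EINVAL
 * if a word is not a known namespace type. */
int restrict_namespaces_parse(const char *s, unsigned long *ret_allowed) {
        const size_t n_entries = sizeof(ns_table) / sizeof(ns_table[0]);
        unsigned long allowed = 0;
        const char *p;

        assert(s);
        assert(ret_allowed);

        if (strcmp(s, "no") == 0) {        /* default: nothing is restricted */
                *ret_allowed = NS_ALL;
                return 0;
        }
        if (strcmp(s, "yes") == 0) {       /* every namespace type is blocked */
                *ret_allowed = 0;
                return 0;
        }

        for (p = s;;) {
                size_t n, i;

                p += strspn(p, " \t");     /* skip separators */
                if (*p == '\0')
                        break;
                n = strcspn(p, " \t");     /* length of the current word */

                for (i = 0; i < n_entries; i++)
                        if (strlen(ns_table[i].name) == n &&
                            memcmp(ns_table[i].name, p, n) == 0) {
                                allowed |= ns_table[i].flag;
                                break;
                        }
                if (i == n_entries)
                        return -EINVAL;
                p += n;
        }

        *ret_allowed = allowed;
        return 0;
}
```

A seccomp filter built from this mask would then reject unshare(), clone() and
setns() calls that request any flag outside of it.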
|
|
systemd-analyze syscall-filter
|
|
Fixes:
$ ./libtool --mode execute valgrind --leak-check=full ./journalctl >/dev/null
==22309== Memcheck, a memory error detector
==22309== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==22309== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==22309== Command: /home/vagrant/systemd/.libs/lt-journalctl
==22309==
Hint: You are currently not seeing messages from other users and the system.
Users in groups 'adm', 'systemd-journal', 'wheel' can see all messages.
Pass -q to turn off this notice.
==22309==
==22309== HEAP SUMMARY:
==22309== in use at exit: 8,680 bytes in 4 blocks
==22309== total heap usage: 5,543 allocs, 5,539 frees, 9,045,618 bytes allocated
==22309==
==22309== 488 (56 direct, 432 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 4
==22309== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==22309== by 0x6F37A0A: __new_var_obj_p (__libobj.c:36)
==22309== by 0x6F362F7: __acl_init_obj (acl_init.c:28)
==22309== by 0x6F37731: __acl_from_xattr (__acl_from_xattr.c:54)
==22309== by 0x6F36087: acl_get_file (acl_get_file.c:69)
==22309== by 0x4F15752: acl_search_groups (acl-util.c:172)
==22309== by 0x113A1E: access_check_var_log_journal (journalctl.c:1836)
==22309== by 0x113D8D: access_check (journalctl.c:1889)
==22309== by 0x115681: main (journalctl.c:2236)
==22309==
==22309== LEAK SUMMARY:
==22309== definitely lost: 56 bytes in 1 blocks
==22309== indirectly lost: 432 bytes in 1 blocks
==22309== possibly lost: 0 bytes in 0 blocks
==22309== still reachable: 8,192 bytes in 2 blocks
==22309== suppressed: 0 bytes in 0 blocks
|
|
bash-4.3# journalctl --no-hostname >/dev/null
=================================================================
==288==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48492 byte(s) in 2694 object(s) allocated from:
#0 0x7fb4aba13e60 in malloc (/lib64/libasan.so.3+0xc6e60)
#1 0x7fb4ab5b2cc4 in malloc_multiply src/basic/alloc-util.h:70
#2 0x7fb4ab5b3194 in parse_field src/shared/logs-show.c:98
#3 0x7fb4ab5b4918 in output_short src/shared/logs-show.c:347
#4 0x7fb4ab5b7cb7 in output_journal src/shared/logs-show.c:977
#5 0x5650e29cd83d in main src/journal/journalctl.c:2581
#6 0x7fb4aabdb730 in __libc_start_main (/lib64/libc.so.6+0x20730)
SUMMARY: AddressSanitizer: 48492 byte(s) leaked in 2694 allocation(s).
Closes: #4568
|
|
Tree wide cleanups
|
|
Just to make the whole thing easier for users.
|
|
Now that the list is user-visible, @default should be first.
|
|
If we encounter the (unlikely) situation where the path to the new root
combined with the path of a mount to be moved exceeds the maximum path length,
we shouldn't crash, but fail this path instead.
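
The idea can be sketched as a length check that reports an error instead of
overflowing or asserting. The helper name and the fixed PATH_MAX buffer are
assumptions for illustration, not the code from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical helper: writes root followed by path into buf, which must hold
 * PATH_MAX bytes. Returns 0 on success, or -ENAMETOOLONG if the combined path
 * (including the trailing NUL) would not fit, so that the caller can skip
 * this one mount rather than crash. */
int prefix_root_checked(const char *root, const char *path, char *buf) {
        assert(root);
        assert(path);
        assert(buf);

        if (strlen(root) + strlen(path) >= PATH_MAX)
                return -ENAMETOOLONG;

        snprintf(buf, PATH_MAX, "%s%s", root, path);
        return 0;
}
```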
|
|
@resources contains various syscalls that alter resource limits and memory and
scheduling parameters of processes. As such they are good candidates to block
for most services.
@basic-io contains a number of basic syscalls for I/O, similar to the list
seccomp v1 permitted but slightly more complete. It should be useful for
building basic whitelisting for minimal sandboxes.
|
|
These system calls clearly fall in the @ipc category, hence should be listed
there, simply to avoid confusing or surprising the user.
|
|
The system call is already part of @default hence implicitly allowed anyway.
Also, if it is actually blocked then systemd couldn't execute the service in
question anymore, since the application of seccomp is immediately followed by
it.
|
|
Timing and sleep are such basic operations that it makes very little sense to
ever block them, hence don't.
|
|
"Secondary arch" table for mips is entirely speculative…
|
|
detect-virt: add --private-users switch to check if a userns is active; add Condition=private-users
|
|
Rewrite the function to be slightly simpler. In particular, if a specific
match is found (like ConditionVirtualization=yes), simply return an answer
immediately, instead of relying on "yes" not being matched by any of
the virtualization names below.
No functional change.
|
|
This can be useful to silence warnings about units which fail in userns
container.
|
|
This validates the system call set table and many of our seccomp-util.c APIs.
|
|
This allows us to unify most of the code in apply_protect_kernel_modules() and
apply_private_devices().
|
|
"oldumount()" is not a syscall, but simply a wrapper for it, the actual syscall
nr is called "umount" (and the nr of umount() is called umount2 internally).
"sysctl()" is not a syscall, but "_sysctl()" is. Fix this in the table.
Without these changes libseccomp cannot actually translate the tables in full.
This wasn't noticed before as the code was written defensively for this case.
|
|
This adds a new seccomp_init_conservative() helper call that is mostly just a
wrapper around seccomp_init(), but turns off NNP and adds in all secondary
archs, for best compatibility with everything else.
Pretty much all of our code used the very same constructs for these three
steps, hence unifying this in one small function makes things a lot shorter.
This also changes incorrect usage of the "scmp_filter_ctx" type at various
places. libseccomp defines it as typedef to "void*", i.e. it is a pointer type
(pretty poor choice already!) that casts implicitly to and from all other
pointer types (even poorer choice: you defined a confusing type now, and don't
even gain any bit of type safety through it...). A lot of the code assumed the
type would refer to a structure, and hence added additional "*" here and there.
Remove that.
|
|
A variety of fixes:
- rename the SystemCallFilterSet structure to SyscallFilterSet. So far the main
instance of it (the syscall_filter_sets[] array) used to abbreviate
"SystemCall" as "Syscall". Let's stick to one of the two syntaxes, and not
mix and match too wildly. Let's pick the shorter name in this case, as it is
sufficiently well established to not confuse hackers reading this.
- Export explicit indexes into the syscall_filter_sets[] array via an enum.
This way, code that wants to make use of a specific filter set, can index it
directly via the enum, instead of having to search for it. This makes
apply_private_devices() in particular a lot simpler.
- Provide two new helper calls in seccomp-util.c: syscall_filter_set_find() to
find a set by its name, seccomp_add_syscall_filter_set() to add a set to a
seccomp object.
- Update SystemCallFilter= parser to use extract_first_word(). Let's work on
deprecating FOREACH_WORD_QUOTED().
- Simplify apply_private_devices() using this functionality
|
|
This is a follow-up for fb8b0869a7bc30e23be175cf978df23192d59118, and makes a
couple of minor clean-up changes:
- The field name in the timestamp file is changed from "TimestampNSec=" to
"TIMESTAMP_NSEC=". This is done simply to reflect the fact that we parse the
file with the env var file parser, and hence the contents should better
follow the usual capitalization of env vars, i.e. be all uppercase.
- Needless negation of the errno parameter of log_error_errno() and friends has
been removed.
- Instead of manually calculating the nsec remainder of the timestamp, use
timespec_store().
- We now check whether we were able to write the timestamp file in full with
fflush_and_check() the way we usually do it.
|
|
It may be desired by users to know what targets a particular service is
installed into. Improve user friendliness by teaching the is-enabled
command to show such information when used with --full.
This patch makes use of the newly added UnitFileFlags and adds a
UNIT_FILE_DRY_RUN flag to it. Since the API had already been modified,
it's now easy to add the new dry-run feature for other commands as
well. As a next step, --dry-run could be added to systemctl, which in
turn might pave the way for a long requested dry-run feature when
running systemctl start.
|
|
Introduce a new enum to get rid of some boolean arguments of unit_file_*
functions. It unifies the code and makes it a bit cleaner and more extensible.
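
A sketch of the general pattern: several boolean parameters collapse into one
flags argument that can grow without changing every signature. Only the names
UnitFileFlags and UNIT_FILE_DRY_RUN appear in this log; the other two flag
names and both consumer functions are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum UnitFileFlags {
        UNIT_FILE_RUNTIME = 1 << 0, /* assumed name: operate below /run */
        UNIT_FILE_FORCE   = 1 << 1, /* assumed name: replace existing links */
        UNIT_FILE_DRY_RUN = 1 << 2  /* named in this log: change nothing */
} UnitFileFlags;

#define UNIT_FILE_FLAGS_ALL (UNIT_FILE_RUNTIME|UNIT_FILE_FORCE|UNIT_FILE_DRY_RUN)

/* Hypothetical consumer: formerly this kind of function would have taken a
 * separate "bool runtime" parameter. */
const char *unit_file_config_dir(UnitFileFlags flags) {
        assert((flags & ~UNIT_FILE_FLAGS_ALL) == 0);

        return (flags & UNIT_FILE_RUNTIME) ? "/run/systemd/system"
                                           : "/etc/systemd/system";
}

/* Hypothetical consumer: a later flag is added without touching callers that
 * do not care about it. */
bool unit_file_may_modify(UnitFileFlags flags) {
        assert((flags & ~UNIT_FILE_FLAGS_ALL) == 0);

        return !(flags & UNIT_FILE_DRY_RUN);
}
```

A call such as unit_file_config_dir(UNIT_FILE_RUNTIME | UNIT_FILE_FORCE) is
also easier to read at the call site than a row of true/false arguments.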
|
|
This makes strjoin and strjoina more similar and avoids the useless final
argument.
spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)
git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'
This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
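
One way to drop the trailing NULL from call sites is to have a macro append
the sentinel. The sketch below uses the hypothetical names xstrjoin_real() and
xstrjoin() and is only an illustration of the technique, not the tree's
implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates all arguments up to the NULL sentinel. Returns a newly
 * allocated string, or NULL if memory allocation fails. */
char *xstrjoin_real(const char *first, ...) {
        va_list ap;
        size_t len = 0;
        const char *s;
        char *ret, *p;

        /* First pass: sum up the lengths. */
        va_start(ap, first);
        for (s = first; s; s = va_arg(ap, const char *))
                len += strlen(s);
        va_end(ap);

        ret = malloc(len + 1);
        if (!ret)
                return NULL;

        /* Second pass: copy the pieces one after another. */
        p = ret;
        va_start(ap, first);
        for (s = first; s; s = va_arg(ap, const char *)) {
                size_t n = strlen(s);

                memcpy(p, s, n);
                p += n;
        }
        va_end(ap);

        *p = '\0';
        return ret;
}

/* The macro supplies the sentinel, so callers no longer write it. It needs
 * at least two arguments. */
#define xstrjoin(a, ...) xstrjoin_real((a), __VA_ARGS__, (const char *) NULL)
```

With this, xstrjoin("/etc", "/systemd", "/system") replaces the older
four-argument form that ended in an explicit NULL.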
|
|
Test case:
[Install]
DefaultInstance=bond1
WantedBy= foobar-U-%U.device
WantedBy= foobar-u-%u.device
$ sudo systemctl --root=/ enable testing4@.service
(before)
Created symlink /etc/systemd/system/foobar-U-0.device.wants/testing4@bond1.service → /etc/systemd/system/testing4@.service.
Created symlink /etc/systemd/system/foobar-u-zbyszek.device.wants/testing4@bond1.service → /etc/systemd/system/testing4@.service.
(after)
Created symlink /etc/systemd/system/foobar-U-0.device.wants/testing4@bond1.service → /etc/systemd/system/testing4@.service.
Created symlink /etc/systemd/system/foobar-u-root.device.wants/testing4@bond1.service → /etc/systemd/system/testing4@.service.
It doesn't make much sense to use a different user for %U and %u.
|
|
We should substitute DefaultInstance if the instance is not specified.
Test case:
[Install]
DefaultInstance=bond1
WantedBy= foobar-n-%n.device
WantedBy= foobar-N-%N.device
$ systemctl --root=/ enable testing4@.service
Created symlink /etc/systemd/system/foobar-n-testing4@bond1.service.device.wants/testing4@bond1.service → /etc/systemd/system/testing4@.service.
Created symlink /etc/systemd/system/foobar-N-testing4@bond1.device.wants/testing4@bond1.service → /etc/systemd/system/testing4@.service.
(before, the symlink would be created with empty %n, %N parts).
|
|
We should substitute DefaultInstance if the instance is not specified.
Test case:
[Install]
DefaultInstance=bond1
WantedBy= foobar-i-%i.device
$ systemctl --root=/ enable testing4@.service
Created symlink /etc/systemd/system/foobar-i-bond1.device.wants/testing4@bond1.service
→ /etc/systemd/system/testing4@.service.
(before, the symlink would be created as
/etc/systemd/system/foobar-i-.device.wants/testing4@bond1.service)
Fixes #4411.
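
The fallback can be sketched as follows. The function is a hypothetical,
heavily simplified stand-in for the real specifier code: it only knows "%i"
and copies everything else verbatim.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Expands every "%i" in format into out (n bytes). If instance is NULL or
 * empty, default_instance is used instead. Returns 0 on success, or -ENOBUFS
 * if the result does not fit. */
int expand_instance(const char *format,
                    const char *instance,
                    const char *default_instance,
                    char *out, size_t n) {
        const char *inst, *p;
        size_t used = 0, ilen;

        assert(format);
        assert(out);
        assert(n > 0);

        inst = (instance && *instance) ? instance : default_instance;
        if (!inst)
                inst = "";
        ilen = strlen(inst);

        for (p = format; *p;) {
                if (p[0] == '%' && p[1] == 'i') {
                        if (used + ilen >= n)   /* keep room for the NUL */
                                return -ENOBUFS;
                        memcpy(out + used, inst, ilen);
                        used += ilen;
                        p += 2;
                } else {
                        if (used + 1 >= n)
                                return -ENOBUFS;
                        out[used++] = *p++;
                }
        }

        out[used] = '\0';
        return 0;
}
```

For the test case above, the template has no instance of its own, so
"foobar-i-%i.device" expands with "bond1" rather than with an empty string.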
|
|
Various install-related tweaks
|
|
When a unit file is invalid, we'd return an error without any details:
$ systemctl --root=/ enable testing@instance.service
Failed to enable: Invalid argument.
Fix things to at least print the offending file name:
$ systemctl enable testing@instance.service
Failed to enable unit: File testing@instance.service: Invalid argument
$ systemctl --root=/ enable testing@instance.service
Failed to enable unit, file testing@instance.service: Invalid argument.
A real fix would be to pass back a proper error message from conf-parser.
But this would require major surgery, since conf-parser functions now
simply print log errors, but we would need to return them over the bus.
So let's just print the file name, to indicate where the error is.
(Incomplete) fix for #4210.
|
|
Test case:
[Install]
WantedBy= default.target
Also=getty@%p.service
$ ./systemctl --root=/ enable testing@instance.service
Created symlink /etc/systemd/system/default.target.wants/testing@instance.service → /etc/systemd/system/testing@.service.
Created symlink /etc/systemd/system/getty.target.wants/getty@testing.service → /usr/lib/systemd/system/getty@.service.
$ ./systemctl --root=/ disable testing@instance.service
Removed /etc/systemd/system/getty.target.wants/getty@testing.service.
Removed /etc/systemd/system/default.target.wants/testing@instance.service.
Fixes part of #4210.
Resolving specifiers in DefaultInstance seems to work too:
[Install]
WantedBy= default.target
DefaultInstance=%u
$ systemctl --root=/ enable testing3@instance.service
Created symlink /etc/systemd/system/default.target.wants/testing3@instance.service → /etc/systemd/system/testing3@.service.
$ systemctl --root=/ enable testing3@.service
Created symlink /etc/systemd/system/default.target.wants/testing3@zbyszek.service → /etc/systemd/system/testing3@.service.
|
|
Test case:
[Install]
WantedBy= default.target
Also=foobar-unknown.service
Before:
$ systemctl --root=/ enable testing2@instance.service
Failed to enable: No such file or directory.
After:
$ ./systemctl --root=/ enable testing2@instance.service
Failed to enable unit, file foobar-unknown.service: No such file or directory.
|
|
With the following test case:
[Install]
WantedBy= default.target
Also=foobar-unknown.service
disabling would fail with:
$ ./systemctl --root=/ disable testing.service
Cannot find unit foobar-unknown.service. # this is level debug
Failed to disable: No such file or directory. # this is the error
After the change we proceed:
$ ./systemctl --root=/ disable testing.service
Cannot find unit foobar-unknown.service.
Removed /etc/systemd/system/default.target.wants/testing.service.
This does not affect specifying a missing unit directly:
$ ./systemctl --root=/ disable nosuch.service
Failed to disable: No such file or directory.
|
|
We should ignore that unit, but otherwise continue.
|
|
It's a common pattern, so add a helper for it. A macro is necessary
because a function that takes a pointer to a pointer would be type specific,
similarly to cleanup functions. Seems better to use a macro.
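
The message does not name the pattern. Assuming it is "free the old pointer,
take over the new one, and clear the source", a type-agnostic macro could look
like the sketch below. The macro name, the struct and the setter are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Works for any pointer type, which a function taking "void **" could not do
 * without casts at every call site. Note that the arguments are evaluated
 * more than once, so they should be free of side effects. */
#define FREE_AND_REPLACE(a, b)          \
        do {                            \
                free(a);                \
                (a) = (b);              \
                (b) = NULL;             \
        } while (0)

struct config {
        char *name;
};

/* Hypothetical setter: the new value is prepared first, and the old one is
 * only released once nothing can fail anymore. */
int config_set_name(struct config *c, const char *name) {
        size_t n;
        char *copy;

        assert(c);
        assert(name);

        n = strlen(name) + 1;
        copy = malloc(n);
        if (!copy)
                return -ENOMEM;
        memcpy(copy, name, n);

        FREE_AND_REPLACE(c->name, copy);
        return 0;
}
```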
|
|
|
|
Also rewrap some comments so that they don't have a very long line and a very
short line.
|
|
This is useful to turn off explicit module load and unload operations on modular
kernels. This option removes CAP_SYS_MODULE from the capability bounding set for
the unit, and installs a system call filter to block module system calls.
This option will not prevent the kernel from loading modules using the module
auto-load feature which is a system wide operation.
|
|
Allowed paths are unified between the configuration file parser and the bus
property checker. The biggest change is that the bus code now allows "block-"
and "char-" classes. In addition, path_startswith("/dev") was used in the bus
code, and startswith("/dev") was used in the config file code. It seems
reasonable to use path_startswith() which allows a slightly broader class of
strings.
Fixes #3935.
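
To illustrate the difference between the two kinds of prefix check, here is a
simplified sketch with hypothetical function names. The component-wise variant
only approximates what a path-aware comparison does: it ignores repeated
slashes and compares whole path components.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Plain string prefix check: "/device" also "starts with" "/dev". */
bool plain_startswith(const char *s, const char *prefix) {
        assert(s);
        assert(prefix);

        return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Component-wise prefix check: "//dev///sda" is below "/dev", while
 * "/device" is not. */
bool component_startswith(const char *path, const char *prefix) {
        assert(path);
        assert(prefix);

        /* Both must be absolute, or both relative. */
        if ((path[0] == '/') != (prefix[0] == '/'))
                return false;

        for (;;) {
                size_t a, b;

                path += strspn(path, "/");      /* skip runs of slashes */
                prefix += strspn(prefix, "/");

                if (*prefix == '\0')            /* prefix fully consumed */
                        return true;
                if (*path == '\0')              /* path ended too early */
                        return false;

                a = strcspn(path, "/");         /* current component lengths */
                b = strcspn(prefix, "/");
                if (a != b || memcmp(path, prefix, a) != 0)
                        return false;

                path += a;
                prefix += b;
        }
}
```

The component-wise check accepts spellings such as "//dev/sda" that a plain
string comparison against "/dev" rejects, which matches the "slightly broader
class of strings" mentioned above.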
|
|
Various smaller documentation fixes.
|
|
Add an "invocation ID" concept to the service manager
|
|
|
|
|