Age | Commit message (Collapse) | Author |
|
This allows us to unify most of the code in apply_protect_kernel_modules() and
apply_private_devices().
|
|
"oldumount()" is not a syscall, but simply a wrapper for it, the actual syscall
nr is called "umount" (and the nr of umount() is called umount2 internally).
"sysctl()" is not a syscall, but "_syscall()" is. Fix this in the table.
Without these changes libseccomp cannot actually translate the tables in full.
This wasn't noticed before as the code was written defensively for this case.
|
|
This adds a new seccomp_init_conservative() helper call that is mostly just a
wrapper around seccomp_init(), but turns off NNP and adds in all secondary
archs, for best compatibility with everything else.
Pretty much all of our code used the very same constructs for these three
steps, hence unifying this in one small function makes things a lot shorter.
This also changes incorrect usage of the "scmp_filter_ctx" type at various
places. libseccomp defines it as typedef to "void*", i.e. it is a pointer type
(pretty poor choice already!) that casts implicitly to and from all other
pointer types (even poorer choice: you defined a confusing type now, and don't
even gain any bit of type safety through it...). A lot of the code assumed the
type would refer to a structure, and hence aded additional "*" here and there.
Remove that.
|
|
seccomp_add_syscall_filter_set()
Let's simplify this call, by making use of the new infrastructure.
This is actually more in line with Djalal's original patch but instead of
search the filter set in the array by its name we can now use the set index and
jump directly to it.
|
|
A variety of fixes:
- rename the SystemCallFilterSet structure to SyscallFilterSet. So far the main
instance of it (the syscall_filter_sets[] array) used to abbreviate
"SystemCall" as "Syscall". Let's stick to one of the two syntaxes, and not
mix and match too wildly. Let's pick the shorter name in this case, as it is
sufficiently well established to not confuse hackers reading this.
- Export explicit indexes into the syscall_filter_sets[] array via an enum.
This way, code that wants to make use of a specific filter set, can index it
directly via the enum, instead of having to search for it. This makes
apply_private_devices() in particular a lot simpler.
- Provide two new helper calls in seccomp-util.c: syscall_filter_set_find() to
find a set by its name, seccomp_add_syscall_filter_set() to add a set to a
seccomp object.
- Update SystemCallFilter= parser to use extract_first_word(). Let's work on
deprecating FOREACH_WORD_QUOTED().
- Simplify apply_private_devices() using this functionality
|
|
|
|
Let's prefer early-exit over deep-indented if blocks. Not behavioural change.
|
|
This is a follow-up for fb8b0869a7bc30e23be175cf978df23192d59118, and makes a
couple of minor clean-up changes:
- The field name in the timestamp file is changed from "TimestampNSec=" to
"TIMESTAMP_NSEC=". This is done simply to reflect the fact that we parse the
file with the env var file parser, and hence the contents should better
follow the usual capitalization of env vars, i.e. be all uppercase.
- Needless negation of the errno parameter log_error_errno() and friends has
been removed.
- Instead of manually calculating the nsec remainder of the timestamp, use
timespec_store().
- We now check whether we were able to write the timestamp file in full with
fflush_and_check() the way we usually do it.
|
|
Commandline parsing simplification and udev fix
|
|
test: lets add more tests to cover SupplementaryGroups= cases.
|
|
shared, systemctl: teach is-enabled to show install targets
|
|
When systemd-networkd is run on the same IPv6 enabled interface where
radvd is announcing prefixes, a route is being set up pointing to the
interface address. As this will fail with an invalid argument error,
the link is marked as failed and the following message like the
following will appear in in the logs:
systemd-networkd[21459]: eth1: Could not set NDisc route or address: Invalid argument
systemd-networkd[21459]: eth1: Failed
Should the interface be required by systemd-networkd-wait-online,
network-online.target will wait until its timeout hits thereby
significantly delaying system startup.
The fix is to check whether the gateway address obtained from NDisc
messages is equal to any of the interface addresses on the same link
and not set the NDisc route in that case.
|
|
Remove the assert and check the return code of sysconf(_SC_NGROUPS_MAX).
_SC_NGROUPS_MAX maps to NGROUPS_MAX which is defined in <limits.h> to
65536 these days. The value is a sysctl read-only
/proc/sys/kernel/ngroups_max and the kernel assumes that it is always
positive otherwise things may break. Follow this and support only
positive values for all other case return either -errno or -EOPNOTSUPP.
Now if there are systems that want to re-write NGROUPS_MAX then they
should not pass SupplementaryGroups= in units even if it is empty, in
this case nothing fails and we just ignore supplementary groups. However
if SupplementaryGroups= is passed even if it is empty we have to assume
that there will be groups manipulation from our side or the kernel and
since the kernel always assumes that NGROUPS_MAX is positive, then
follow that and support only positive values.
|
|
|
|
It may be desired by users to know what targets a particular service is
installed into. Improve user friendliness by teaching the is-enabled
command to show such information when used with --full.
This patch makes use of the newly added UnitFileFlags and adds
UNIT_FILE_DRY_RUN flag into it. Since the API had already been modified,
it's now easy to add the new dry-run feature for other commands as
well. As a next step, --dry-run could be added to systemctl, which in
turn might pave the way for a long requested dry-run feature when
running systemctl start.
|
|
Introduce a new enum to get rid of some boolean arguments of unit_file_*
functions. It unifies the code, makes it a bit cleaner and extensible.
|
|
|
|
https://github.com/torvalds/linux/commit/036d523641c66bef713042894a17f4335f199e49
> vfs: Don't create inodes with a uid or gid unknown to the vfs
It is expected that filesystems can not represent uids and gids from
outside of their user namespace. Keep things simple by not even
trying to create filesystem nodes with non-sense uids and gids.
So, we actually should `reset_uid_gid` early to prevent https://github.com/systemd/systemd/pull/4223#issuecomment-252522955
$ sudo UNIFIED_CGROUP_HIERARCHY=no LD_LIBRARY_PATH=.libs .libs/systemd-nspawn -D /var/lib/machines/fedora-rawhide -U -b systemd.unit=multi-user.target
Spawning container fedora-rawhide on /var/lib/machines/fedora-rawhide.
Press ^] three times within 1s to kill container.
Child died too early.
Selected user namespace base 1073283072 and range 65536.
Failed to mount to /sys/fs/cgroup/systemd: No such file or directory
Details: https://github.com/systemd/systemd/pull/4223#issuecomment-253046519
Fixes: #4352
|
|
https://github.com/systemd/systemd/pull/4372#issuecomment-253723849:
* `mount_all (outer_child)` creates `container_dir/sys/fs/selinux`
* `mount_all (outer_child)` doesn't patch `container_dir/sys/fs` and so on.
* `mount_sysfs (inner_child)` tries to create `/sys/fs/cgroup`
* This fails
370 stat("/sys/fs", {st_dev=makedev(0, 28), st_ino=13880, st_mode=S_IFDIR|0755, st_nlink=3, st_uid=65534, st_gid=65534, st_blksize=4096, st_blocks=0, st_size=60, st_atime=2016/10/14-05:16:43.398665943, st_mtime=2016/10/14-05:16:43.399665943, st_ctime=2016/10/14-05:16:43.399665943}) = 0
370 mkdir("/sys/fs/cgroup", 0755) = -1 EACCES (Permission denied)
* `mount_syfs (inner_child)` ignores that error and
mount(NULL, "/sys", NULL, MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_BIND, NULL) = 0
* `mount_cgroups` finally fails
|
|
https://github.com/systemd/systemd/pull/4372#discussion_r83354107:
I get `open("/proc/self/fdinfo/13", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)`
327 mkdir("/proc", 0755 <unfinished ...>
327 <... mkdir resumed> ) = -1 EEXIST (File exists)
327 stat("/proc", <unfinished ...>
327 <... stat resumed> {st_dev=makedev(8, 1), st_ino=28585, st_mode=S_IFDIR|0755, st_nlink=2, st_uid=0, st_gid=0, st_blksize=1024, st_blocks=4, st_size=1024, st_atime=2016/10/14-02:55:32, st_mtime=2016/
327 mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC, NULL <unfinished ...>
327 <... mount resumed> ) = 0
327 lstat("/proc", <unfinished ...>
327 <... lstat resumed> {st_dev=makedev(0, 34), st_ino=1, st_mode=S_IFDIR|0555, st_nlink=75, st_uid=65534, st_gid=65534, st_blksize=1024, st_blocks=0, st_size=0, st_atime=2016/10/14-03:13:35.971031263,
327 lstat("/proc/sys", {st_dev=makedev(0, 34), st_ino=4026531855, st_mode=S_IFDIR|0555, st_nlink=1, st_uid=65534, st_gid=65534, st_blksize=1024, st_blocks=0, st_size=0, st_atime=2016/10/14-03:13:39.1630
327 openat(AT_FDCWD, "/proc", O_RDONLY|O_DIRECTORY|O_CLOEXEC|O_PATH) = 11</proc>
327 name_to_handle_at(11</proc>, "sys", {handle_bytes=128}, 0x7ffe3a238604, AT_SYMLINK_FOLLOW) = -1 EOPNOTSUPP (Operation not supported)
327 name_to_handle_at(11</proc>, "", {handle_bytes=128}, 0x7ffe3a238608, AT_EMPTY_PATH) = -1 EOPNOTSUPP (Operation not supported)
327 openat(11</proc>, "sys", O_RDONLY|O_CLOEXEC|O_PATH) = 13</proc/sys>
327 open("/proc/self/fdinfo/13", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
327 close(13</proc/sys> <unfinished ...>
327 <... close resumed> ) = 0
327 close(11</proc> <unfinished ...>
327 <... close resumed> ) = 0
-bash-4.3# ls -ld /proc/
dr-xr-xr-x 76 65534 65534 0 Oct 14 02:57 /proc/
-bash-4.3# ls -ld /proc/1
dr-xr-xr-x 9 root root 0 Oct 14 02:57 /proc/1
-bash-4.3# ls -ld /proc/1/fdinfo
dr-x------ 2 65534 65534 0 Oct 14 03:00 /proc/1/fdinfo
|
|
This is minor but lets try to split and move bit by bit cgroups and
portable environment setup before applying the security context.
|
|
|
|
|
|
This fixes: https://github.com/systemd/systemd/issues/4357
Let's lookup and cache creds then apply them. We also switch from
getgroups() to getgrouplist().
|
|
If SyscallFilter was set, and subsequently cleared, the no_new_privileges flag
was not reset properly. We don't need to set this flag here, it will be
set automatically in unit_patch_contexts() if syscall_filter is set.
|
|
rename failure-action to emergency-action and use it for ctrl+alt+del burst
|
|
|
|
This stripping is contolled by a new boolean parameter. When the parameter
is true, it means that the caller does not care about the distinction between
initrd and real root, and wants to act on both rd-dot-prefixed and unprefixed
parameters in the initramfs, and only on the unprefixed parameters in real
root. If the parameter is false, behaviour is the same as before.
Changes by caller:
log.c (systemd.log_*): changed to accept rd-dot-prefix params
pid1: no change, custom logic
cryptsetup-generator: no change, still accepts rd-dot-prefix params
debug-generator: no change, does not accept rd-dot-prefix params
fsck: changed to accept rd-dot-prefix params
fstab-generator: no change, custom logic
gpt-auto-generator: no change, custom logic
hibernate-resume-generator: no change, does not accept rd-dot-prefix params
journald: changed to accept rd-dot-prefix params
modules-load: no change, still accepts rd-dot-prefix params
quote-check: no change, does not accept rd-dot-prefix params
udevd: no change, still accepts rd-dot-prefix params
I added support for "rd." params in the three cases where I think it's
useful: logging, fsck options, journald forwarding options.
|
|
- do not crash if an option without value is specified on the kernel command
line, e.g. "udev.log-priority" :P
- simplify the code a bit
- warn about unknown "udev.*" options — this should make it easier to spot
typos and reduce user confusion
|
|
This makes journald use the common option parsing functionality.
One behavioural change is implemented:
"systemd.journald.forward_to_syslog" is now equivalent to
"systemd.journald.forward_to_syslog=1".
I think it's nicer to use this way.
|
|
No functional change.
|
|
|
|
The log forward levels can be configured through kernel command line.
|
|
sed -i 's|Linux Boot Manager|Systemd Boot Manager|' src/boot/bootctl.c
|
|
|
|
|
|
|
|
Show is documented to be program-parseable, and printing the warning about
about a non-existent unit, while useful for humans, broke a lot of scripts.
Restore previous behaviour of returning success and printing empty or useless
stuff for units which do not exist, and printing empty values for properties
which do not exists.
With SYSTEMD_LOG_LEVEL=debug, hints are printed, but the return value is
still 0.
This undoes parts of e33a06a and 3dced37b7 and fixes #3856.
We might consider adding an explicit switch to fail on missing units/properties
(e.g. --ensure-exists or similar), and make -P foobar equivalent to
--ensure-exists --property=foobar.
(cherry picked from commit bd5b9f0a12dd9c1947b11534e99c395ddf44caa9)
|
|
This reverts commit affd7ed1a923b0df8479cff1bd9eafb625fdaa66.
> So it looks like make_console_stdio() has bad side effect. More specifically it
> does a TIOCSCTTY ioctl (via acquire_terminal()) which sees to disturb the
> process which was using/owning the console.
Fixes #3842.
https://bugs.debian.org/834367
https://bugzilla.redhat.com/show_bug.cgi?id=1367766
(cherry picked from commit bd64d82c1c0e3fe2a5f9b3dd9132d62834f50b2d)
|
|
If manager_dispatch_notify_fd() fails and returns an error then the handling of
service notifications will be disabled entirely leading to a compromised system.
For example pid1 won't be able to receive the WATCHDOG messages anymore and
will kill all services supposed to send such messages.
(cherry picked from commit 9987750e7a4c62e0eb8473603150596ba7c3a015)
|
|
This undoes 531ac2b234. I acked that patch without looking at the code
carefully enough. There are two problems:
- we want to process the fds anyway
- in principle empty notification messages are valid, and we should
process them as usual, including logging using log_unit_debug().
(cherry picked from commit 8523bf7dd514a3a2c6114b7b8fb8f308b4f09fc4)
|
|
Fixes #4234.
Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
(cherry picked from commit 531ac2b2349da02acc9c382849758e07eb92b020)
|
|
|
|
|
|
In particular, the font copying has no chance of succeeding as
the required functionality is not implemented, see:
drivers/video/console/dummycon.c
|
|
core: if the start command vanishes during runtime don't hit an assert
|
|
Fixes #4306
|
|
|
|
Fix expansion of %i, %u, %N, %n install specifiers
|
|
Fixes:
Oct 20 09:10:49 systemd-sysusers[144]: Direct leak of 20 byte(s) in 1 object(s) allocated from:
Oct 20 09:10:49 systemd-sysusers[144]: #0 0x7f3565a13e60 in malloc (/lib64/libasan.so.3+0xc6e60)
Oct 20 09:10:49 systemd-sysusers[144]: #1 0x7f3565526bd0 in malloc_multiply src/basic/alloc-util.h:70
Oct 20 09:10:49 systemd-sysusers[144]: #2 0x7f356552cb55 in tempfn_xxxxxx src/basic/fileio.c:1116
Oct 20 09:10:49 systemd-sysusers[144]: #3 0x7f356552c4f0 in fopen_temporary src/basic/fileio.c:1042
Oct 20 09:10:49 systemd-sysusers[144]: #4 0x7f356555e00e in fopen_temporary_label src/basic/fileio-label.c:63
Oct 20 09:10:49 systemd-sysusers[144]: #5 0x56197c4a1766 in make_backup src/sysusers/sysusers.c:209
Oct 20 09:10:49 systemd-sysusers[144]: #6 0x56197c4a6335 in write_files src/sysusers/sysusers.c:710
Oct 20 09:10:49 systemd-sysusers[144]: #7 0x56197c4ae571 in main src/sysusers/sysusers.c:1817
Oct 20 09:10:49 systemd-sysusers[144]: #8 0x7f3564dee730 in __libc_start_main (/lib64/libc.so.6+0x20730)
|