Age | Commit message (Collapse) | Author |
|
is set
Make sure that when DynamicUser= is set that we intialize the user
supplementary groups and that we also support SupplementaryGroups=
Fixes: https://github.com/systemd/systemd/issues/4539
Thanks Evgeny Vereshchagin (@evverx)
|
|
This reverts some changes introduced in d054f0a4d4.
xsprintf should be used in cases where we calculated the right buffer
size by hand (using DECIMAL_STRING_MAX and such), and never in cases where
we are printing externally specified strings of arbitrary length.
Fixes #4534.
|
|
Add "perpetual" unit concept, sysctl fixes, networkd fixes, systemctl color fixes, nspawn discard.
|
|
|
|
overly long description strings
This essentially reverts one part of d054f0a4d451120c26494263fc4dc175bfd405b1.
(We might also choose to use proper ellipsation here, but I wasn't sure the
memory allocation this requires wouöld be a good idea here...)
Fixes: #4534
|
|
Preserve stored fds over service restart
|
|
more seccomp fixes, and change of order of selinux/aa/smack and seccomp application on exec
|
|
Since service_add_fd_store() already does the check, remove the redundant check
from service_add_fd_store_set().
Also, print a warning when repopulating FDStore after daemon-reexec and we hit
the limit. This is a user visible issue, so we should not discard fds silently.
(Note that service_deserialize_item is impacted by the return value from
service_add_fd_store(), but we rely on the general error message, so the caller
does not need to be modified, and does not show up in the diff.)
|
|
Let's propagate the error here, instead of eating it up early.
In a later change we should probably also change mount_enumerate() to propagate
errors up, but that would mean we'd have to change the unit vtable, and thus
change all unit types, hence is quite an invasive change.
|
|
|
|
Now that have a proper concept of "perpetual" units, let's make the root mount
one too, since it also cannot go away.
|
|
So far "no_gc" was set on -.slice and init.scope, to units that are always
running, cannot be stopped and never exist in an "inactive" state. Since these
units are the only users of this flag, let's remodel it and rename it
"perpetual" and let's derive more funcitonality off it. Specifically, refuse
enqueing stop jobs for these units, and report that they are "unstoppable" in
the CanStop bus property.
|
|
(#4533)
Always initialize the supplementary groups of caller before checking the
unit SupplementaryGroups= option.
Fixes https://github.com/systemd/systemd/issues/4531
|
|
Seccomp is generally an unprivileged operation, changing security contexts is
most likely associated with some form of policy. Moreover, while seccomp may
influence our own flow of code quite a bit (much more than the security context
change) make sure to apply the seccomp filters immediately before executing the
binary to invoke.
This also moves enforcement of NNP after the security context change, so that
NNP cannot affect it anymore. (However, the security policy now has to permit
the NNP change).
This change has a good chance of breaking current SELinux/AA/SMACK setups, because
the policy might not expect this change of behaviour. However, it's technically
the better choice I think and should hence be applied.
Fixes: #3993
|
|
We would close all the stored fds in service_release_resources(), which of
course broke the whole concept of storing fds over service restart.
Fixes #4408.
|
|
Should help with debugging #4408.
|
|
If it was a duplicate, log nothing.
|
|
seccomp: also block shmat(..., SHM_EXEC) for MemoryDenyWriteExecute
|
|
Document NoNewPrivileges default value
|
|
|
|
This makes applying groups after applying the working directory, this
may allow some flexibility but at same it is not a big deal since we
don't execute or do anything between applying working directory and
droping groups.
|
|
Improve apply_working_directory() and lets get the current working directory
inside of it.
|
|
|
|
|
|
shmat(..., SHM_EXEC) can be used to create writable and executable
memory, so let's block it when MemoryDenyWriteExecute is set.
|
|
Various seccomp fixes and NEWS update.
|
|
callbacks
Previously, we'd synthesize the root slice unit and the init scope unit in the
enumerator callbacks for the unit type. This is problematic if either of them
is already referenced from a unit that is loaded as result of another unit
type's enumerator logic.
Let's clean this up and simply create the two objects from the enumerator
callbacks, if they are not around yet. Do the actual filling in of the settings
from the unit_load() callbacks, to match how other units are loaded.
Fixes: #4322
|
|
This allows us to unify most of the code in apply_protect_kernel_modules() and
apply_private_devices().
|
|
This adds a new seccomp_init_conservative() helper call that is mostly just a
wrapper around seccomp_init(), but turns off NNP and adds in all secondary
archs, for best compatibility with everything else.
Pretty much all of our code used the very same constructs for these three
steps, hence unifying this in one small function makes things a lot shorter.
This also changes incorrect usage of the "scmp_filter_ctx" type at various
places. libseccomp defines it as typedef to "void*", i.e. it is a pointer type
(pretty poor choice already!) that casts implicitly to and from all other
pointer types (even poorer choice: you defined a confusing type now, and don't
even gain any bit of type safety through it...). A lot of the code assumed the
type would refer to a structure, and hence aded additional "*" here and there.
Remove that.
|
|
seccomp_add_syscall_filter_set()
Let's simplify this call, by making use of the new infrastructure.
This is actually more in line with Djalal's original patch but instead of
search the filter set in the array by its name we can now use the set index and
jump directly to it.
|
|
A variety of fixes:
- rename the SystemCallFilterSet structure to SyscallFilterSet. So far the main
instance of it (the syscall_filter_sets[] array) used to abbreviate
"SystemCall" as "Syscall". Let's stick to one of the two syntaxes, and not
mix and match too wildly. Let's pick the shorter name in this case, as it is
sufficiently well established to not confuse hackers reading this.
- Export explicit indexes into the syscall_filter_sets[] array via an enum.
This way, code that wants to make use of a specific filter set, can index it
directly via the enum, instead of having to search for it. This makes
apply_private_devices() in particular a lot simpler.
- Provide two new helper calls in seccomp-util.c: syscall_filter_set_find() to
find a set by its name, seccomp_add_syscall_filter_set() to add a set to a
seccomp object.
- Update SystemCallFilter= parser to use extract_first_word(). Let's work on
deprecating FOREACH_WORD_QUOTED().
- Simplify apply_private_devices() using this functionality
|
|
|
|
Let's prefer early-exit over deep-indented if blocks. Not behavioural change.
|
|
Commandline parsing simplification and udev fix
|
|
shared, systemctl: teach is-enabled to show install targets
|
|
Remove the assert and check the return code of sysconf(_SC_NGROUPS_MAX).
_SC_NGROUPS_MAX maps to NGROUPS_MAX which is defined in <limits.h> to
65536 these days. The value is a sysctl read-only
/proc/sys/kernel/ngroups_max and the kernel assumes that it is always
positive otherwise things may break. Follow this and support only
positive values for all other case return either -errno or -EOPNOTSUPP.
Now if there are systems that want to re-write NGROUPS_MAX then they
should not pass SupplementaryGroups= in units even if it is empty, in
this case nothing fails and we just ignore supplementary groups. However
if SupplementaryGroups= is passed even if it is empty we have to assume
that there will be groups manipulation from our side or the kernel and
since the kernel always assumes that NGROUPS_MAX is positive, then
follow that and support only positive values.
|
|
It may be desired by users to know what targets a particular service is
installed into. Improve user friendliness by teaching the is-enabled
command to show such information when used with --full.
This patch makes use of the newly added UnitFileFlags and adds
UNIT_FILE_DRY_RUN flag into it. Since the API had already been modified,
it's now easy to add the new dry-run feature for other commands as
well. As a next step, --dry-run could be added to systemctl, which in
turn might pave the way for a long requested dry-run feature when
running systemctl start.
|
|
Introduce a new enum to get rid of some boolean arguments of unit_file_*
functions. It unifies the code, makes it a bit cleaner and extensible.
|
|
This is minor but lets try to split and move bit by bit cgroups and
portable environment setup before applying the security context.
|
|
This fixes: https://github.com/systemd/systemd/issues/4357
Let's lookup and cache creds then apply them. We also switch from
getgroups() to getgrouplist().
|
|
If SyscallFilter was set, and subsequently cleared, the no_new_privileges flag
was not reset properly. We don't need to set this flag here, it will be
set automatically in unit_patch_contexts() if syscall_filter is set.
|
|
rename failure-action to emergency-action and use it for ctrl+alt+del burst
|
|
This stripping is contolled by a new boolean parameter. When the parameter
is true, it means that the caller does not care about the distinction between
initrd and real root, and wants to act on both rd-dot-prefixed and unprefixed
parameters in the initramfs, and only on the unprefixed parameters in real
root. If the parameter is false, behaviour is the same as before.
Changes by caller:
log.c (systemd.log_*): changed to accept rd-dot-prefix params
pid1: no change, custom logic
cryptsetup-generator: no change, still accepts rd-dot-prefix params
debug-generator: no change, does not accept rd-dot-prefix params
fsck: changed to accept rd-dot-prefix params
fstab-generator: no change, custom logic
gpt-auto-generator: no change, custom logic
hibernate-resume-generator: no change, does not accept rd-dot-prefix params
journald: changed to accept rd-dot-prefix params
modules-load: no change, still accepts rd-dot-prefix params
quote-check: no change, does not accept rd-dot-prefix params
udevd: no change, still accepts rd-dot-prefix params
I added support for "rd." params in the three cases where I think it's
useful: logging, fsck options, journald forwarding options.
|
|
No functional change.
|
|
Fixes #4306
|
|
|
|
This can happen when the configuration is changed and reloaded while we are
executing a service. Let's not hit an assert in this case.
Fixes: #4444
|
|
Just to make sure the next one reading this isn't surprised that the fd isn't
kept open. SAK and stuff...
Fix suggested:
https://github.com/systemd/systemd/pull/4366#issuecomment-253659162
|
|
Two unrelated patches: man page tweaks and rlimit log levels
|
|
Since we ignore the result anyway, downgrade errors to warning.
log_oom() will still emit an error, but that's mostly theoretical, so it
is not worth complicating the code to avoid the small inconsistency
|