Age | Commit message (Collapse) | Author |
|
overly long description strings
This essentially reverts one part of d054f0a4d451120c26494263fc4dc175bfd405b1.
(We might also choose to use proper ellipsation here, but I wasn't sure the
memory allocation this requires wouöld be a good idea here...)
Fixes: #4534
|
|
|
|
Preserve stored fds over service restart
|
|
Reported by @evverx in #4493.
|
|
more seccomp fixes, and change of order of selinux/aa/smack and seccomp application on exec
|
|
Since service_add_fd_store() already does the check, remove the redundant check
from service_add_fd_store_set().
Also, print a warning when repopulating FDStore after daemon-reexec and we hit
the limit. This is a user visible issue, so we should not discard fds silently.
(Note that service_deserialize_item is impacted by the return value from
service_add_fd_store(), but we rely on the general error message, so the caller
does not need to be modified, and does not show up in the diff.)
|
|
(#4533)
Always initialize the supplementary groups of caller before checking the
unit SupplementaryGroups= option.
Fixes https://github.com/systemd/systemd/issues/4531
|
|
This is a follow-up for 6309e51ea32d64524431ee65c49eecd44390da8f and makes sure
we compare test results with the right user identifier.
|
|
If execve() or socket() is filtered the service manager might get into trouble
executing the service binary, or handling any failures when this fails. Mention
this in the documentation.
The other option would be to implicitly whitelist all system calls that are
required for these codepaths. However, that appears less than desirable as this
would mean socket() and many related calls have to be whitelisted
unconditionally. As writing system call filters requires a certain level of
expertise anyway it sounds like the better option to simply document these
issues and suggest that the user disables system call filters in the service
temporarily in order to debug any such failures.
See: #3993.
|
|
Seccomp is generally an unprivileged operation, changing security contexts is
most likely associated with some form of policy. Moreover, while seccomp may
influence our own flow of code quite a bit (much more than the security context
change) make sure to apply the seccomp filters immediately before executing the
binary to invoke.
This also moves enforcement of NNP after the security context change, so that
NNP cannot affect it anymore. (However, the security policy now has to permit
the NNP change).
This change has a good chance of breaking current SELinux/AA/SMACK setups, because
the policy might not expect this change of behaviour. However, it's technically
the better choice I think and should hence be applied.
Fixes: #3993
|
|
@resources contains various syscalls that alter resource limits and memory and
scheduling parameters of processes. As such they are good candidates to block
for most services.
@basic-io contains a number of basic syscalls for I/O, similar to the list
seccomp v1 permitted but slightly more complete. It should be useful for
building basic whitelisting for minimal sandboxes
|
|
|
|
These system calls clearly fall in the @ipc category, hence should be listed
there, simply to avoid confusion and surprise by the user.
|
|
The system call is already part in @default hence implicitly allowed anyway.
Also, if it is actually blocked then systemd couldn't execute the service in
question anymore, since the application of seccomp is immediately followed by
it.
|
|
Timing and sleep are so basic operations, it makes very little sense to ever
block them, hence don't.
|
|
|
|
Switch drivers uses phys_port_name attribute to pass front panel port
name to user. Use it to generate netdev names.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
|
|
This test fails before previous commit, and passes with it.
|
|
We would close all the stored fds in service_release_resources(), which of
course broke the whole concept of storing fds over service restart.
Fixes #4408.
|
|
"Secondary arch" table for mips is entirely speculative…
|
|
Lustre is also a remote file system that wants the network to be up before it is mounted.
|
|
|
|
This introduces a new option, `tcrypt-veracrypt`, that sets the
corresponding VeraCrypt flag in the flags passed to cryptsetup.
|
|
A pendant for #4481.
|
|
systemd-escape manpage improvements
|
|
The first example wasn't phrased with "To ..." as the other three are,
and the last example was lacking the colon.
|
|
|
|
The option does more than the documentation gave it credit for.
|
|
Let's say that this was not obvious from our man page.
|
|
Should help with debugging #4408.
|
|
If it was a duplicate, log nothing.
|
|
Not sure since when this is the default behavior, but my local tree is full
of such files. Let's ignore them for clarity.
|
|
seccomp: also block shmat(..., SHM_EXEC) for MemoryDenyWriteExecute
|
|
Document NoNewPrivileges default value
|
|
|
|
Suggested by @keszybz in #4488.
|
|
core: improve mount namespace and working directory setup
|
|
detect-virt: add --private-users switch to check if a userns is active; add Condition=private-users
|
|
|
|
This makes applying groups after applying the working directory, this
may allow some flexibility but at same it is not a big deal since we
don't execute or do anything between applying working directory and
droping groups.
|
|
Improve apply_working_directory() and lets get the current working directory
inside of it.
|
|
|
|
|
|
We updated 'fn' but checked 'v' instead.
From 698c5a17
Spotted with PVS
|
|
Fix some formatting details in the merge.
|
|
The mount fails, even though CAP_SYS_ADMIN is granted.
|
|
Rewrite the function to be slightly simpler. In particular, if a specific
match is found (like ConditionVirtualization=yes), simply return an answer
immediately, instead of relying that "yes" will not be matched by any of
the virtualization names below.
No functional change.
|
|
|
|
This can be useful to silence warnings about units which fail in userns
container.
|
|
Various things don't work when we're running in a user namespace, but it's
pretty hard to reliably detect if that is true.
A function is added which looks at /proc/self/uid_map and returns false
if the default "0 0 UINT32_MAX" is found, and true if it finds anything else.
This misses the case where an 1:1 mapping with the full range was used, but
I don't know how to distinguish this case.
'systemd-detect-virt --private-users' is very similar to
'systemd-detect-virt --chroot', but we check for a user namespace instead.
|