summaryrefslogtreecommitdiff
path: root/src/core
AgeCommit message (Collapse)Author
2016-11-01core: when restarting services, don't close fdsZbigniew Jędrzejewski-Szmek
We would close all the stored fds in service_release_resources(), which of course broke the whole concept of storing fds over service restart. Fixes #4408.
2016-10-28pid1: nicely log when doing operation on stored fdsZbigniew Jędrzejewski-Szmek
Should help with debugging #4408.
2016-10-28pid1: only log about added fd if it was really addedZbigniew Jędrzejewski-Szmek
If it was a duplicate, log nothing.
2016-10-28Merge pull request #4495 from topimiettinen/block-shmat-execDjalal Harouni
seccomp: also block shmat(..., SHM_EXEC) for MemoryDenyWriteExecute
2016-10-28Merge pull request #4458 from keszybz/man-nonewprivilegesMartin Pitt
Document NoNewPrivileges default value
2016-10-27core: make unit argument const for apply seccomp functionsDjalal Harouni
2016-10-27core: lets apply working directory just after mount namespacesDjalal Harouni
This makes applying groups after applying the working directory, this may allow some flexibility but at same it is not a big deal since we don't execute or do anything between applying working directory and droping groups.
2016-10-27core: get the working directory value inside apply_working_directory()Djalal Harouni
Improve apply_working_directory() and lets get the current working directory inside of it.
2016-10-27core: move apply working directory code into its own apply_working_directory()Djalal Harouni
2016-10-27core: move the code that setups namespaces on its own functionDjalal Harouni
2016-10-26seccomp: also block shmat(..., SHM_EXEC) for MemoryDenyWriteExecuteTopi Miettinen
shmat(..., SHM_EXEC) can be used to create writable and executable memory, so let's block it when MemoryDenyWriteExecute is set.
2016-10-24Merge pull request #4450 from poettering/seccompfixesZbigniew Jędrzejewski-Szmek
Various seccomp fixes and NEWS update.
2016-10-24core: move initialization of -.slice and init.scope into the unit_load() ↵Lennart Poettering
callbacks Previously, we'd synthesize the root slice unit and the init scope unit in the enumerator callbacks for the unit type. This is problematic if either of them is already referenced from a unit that is loaded as result of another unit type's enumerator logic. Let's clean this up and simply create the two objects from the enumerator callbacks, if they are not around yet. Do the actual filling in of the settings from the unit_load() callbacks, to match how other units are loaded. Fixes: #4322
2016-10-24seccomp: add new helper call seccomp_load_filter_set()Lennart Poettering
This allows us to unify most of the code in apply_protect_kernel_modules() and apply_private_devices().
2016-10-24seccomp: add new seccomp_init_conservative() helperLennart Poettering
This adds a new seccomp_init_conservative() helper call that is mostly just a wrapper around seccomp_init(), but turns off NNP and adds in all secondary archs, for best compatibility with everything else. Pretty much all of our code used the very same constructs for these three steps, hence unifying this in one small function makes things a lot shorter. This also changes incorrect usage of the "scmp_filter_ctx" type at various places. libseccomp defines it as typedef to "void*", i.e. it is a pointer type (pretty poor choice already!) that casts implicitly to and from all other pointer types (even poorer choice: you defined a confusing type now, and don't even gain any bit of type safety through it...). A lot of the code assumed the type would refer to a structure, and hence aded additional "*" here and there. Remove that.
2016-10-24core: rework apply_protect_kernel_modules() to use ↵Lennart Poettering
seccomp_add_syscall_filter_set() Let's simplify this call, by making use of the new infrastructure. This is actually more in line with Djalal's original patch but instead of search the filter set in the array by its name we can now use the set index and jump directly to it.
2016-10-24core: rework syscall filter set handlingLennart Poettering
A variety of fixes: - rename the SystemCallFilterSet structure to SyscallFilterSet. So far the main instance of it (the syscall_filter_sets[] array) used to abbreviate "SystemCall" as "Syscall". Let's stick to one of the two syntaxes, and not mix and match too wildly. Let's pick the shorter name in this case, as it is sufficiently well established to not confuse hackers reading this. - Export explicit indexes into the syscall_filter_sets[] array via an enum. This way, code that wants to make use of a specific filter set, can index it directly via the enum, instead of having to search for it. This makes apply_private_devices() in particular a lot simpler. - Provide two new helper calls in seccomp-util.c: syscall_filter_set_find() to find a set by its name, seccomp_add_syscall_filter_set() to add a set to a seccomp object. - Update SystemCallFilter= parser to use extract_first_word(). Let's work on deprecating FOREACH_WORD_QUOTED(). - Simplify apply_private_devices() using this functionality
2016-10-24core: move misplaced comment to the right placeLennart Poettering
2016-10-24core: simplify skip_seccomp_unavailable() a bitLennart Poettering
Let's prefer early-exit over deep-indented if blocks. Not behavioural change.
2016-10-24Merge pull request #4459 from keszybz/commandline-parsingLennart Poettering
Commandline parsing simplification and udev fix
2016-10-24Merge pull request #4406 from jsynacek/jsynacek-is-enabledLennart Poettering
shared, systemctl: teach is-enabled to show install targets
2016-10-24core: do not assert when sysconf(_SC_NGROUPS_MAX) fails (#4466)Djalal Harouni
Remove the assert and check the return code of sysconf(_SC_NGROUPS_MAX). _SC_NGROUPS_MAX maps to NGROUPS_MAX which is defined in <limits.h> to 65536 these days. The value is a sysctl read-only /proc/sys/kernel/ngroups_max and the kernel assumes that it is always positive otherwise things may break. Follow this and support only positive values for all other case return either -errno or -EOPNOTSUPP. Now if there are systems that want to re-write NGROUPS_MAX then they should not pass SupplementaryGroups= in units even if it is empty, in this case nothing fails and we just ignore supplementary groups. However if SupplementaryGroups= is passed even if it is empty we have to assume that there will be groups manipulation from our side or the kernel and since the kernel always assumes that NGROUPS_MAX is positive, then follow that and support only positive values.
2016-10-24shared, systemctl: teach is-enabled to show installation targetsJan Synacek
It may be desired by users to know what targets a particular service is installed into. Improve user friendliness by teaching the is-enabled command to show such information when used with --full. This patch makes use of the newly added UnitFileFlags and adds UNIT_FILE_DRY_RUN flag into it. Since the API had already been modified, it's now easy to add the new dry-run feature for other commands as well. As a next step, --dry-run could be added to systemctl, which in turn might pave the way for a long requested dry-run feature when running systemctl start.
2016-10-24install: introduce UnitFileFlagsJan Synacek
Introduce a new enum to get rid of some boolean arguments of unit_file_* functions. It unifies the code, makes it a bit cleaner and extensible.
2016-10-23core: lets move the setup of working directory before group enforceDjalal Harouni
This is minor but lets try to split and move bit by bit cgroups and portable environment setup before applying the security context.
2016-10-23core: first lookup and cache creds then apply them after namespace setupDjalal Harouni
This fixes: https://github.com/systemd/systemd/issues/4357 Let's lookup and cache creds then apply them. We also switch from getgroups() to getgrouplist().
2016-10-22core: do not set no_new_privileges flag in config_parse_syscall_filterZbigniew Jędrzejewski-Szmek
If SyscallFilter was set, and subsequently cleared, the no_new_privileges flag was not reset properly. We don't need to set this flag here, it will be set automatically in unit_patch_contexts() if syscall_filter is set.
2016-10-22Merge pull request #4428 from lnykryn/ctrl_v2Zbigniew Jędrzejewski-Szmek
rename failure-action to emergency-action and use it for ctrl+alt+del burst
2016-10-22tree-wide: make parse_proc_cmdline() strip "rd." prefix automaticallyZbigniew Jędrzejewski-Szmek
This stripping is contolled by a new boolean parameter. When the parameter is true, it means that the caller does not care about the distinction between initrd and real root, and wants to act on both rd-dot-prefixed and unprefixed parameters in the initramfs, and only on the unprefixed parameters in real root. If the parameter is false, behaviour is the same as before. Changes by caller: log.c (systemd.log_*): changed to accept rd-dot-prefix params pid1: no change, custom logic cryptsetup-generator: no change, still accepts rd-dot-prefix params debug-generator: no change, does not accept rd-dot-prefix params fsck: changed to accept rd-dot-prefix params fstab-generator: no change, custom logic gpt-auto-generator: no change, custom logic hibernate-resume-generator: no change, does not accept rd-dot-prefix params journald: changed to accept rd-dot-prefix params modules-load: no change, still accepts rd-dot-prefix params quote-check: no change, does not accept rd-dot-prefix params udevd: no change, still accepts rd-dot-prefix params I added support for "rd." params in the three cases where I think it's useful: logging, fsck options, journald forwarding options.
2016-10-22tree-wide: allow state to be passed through to parse_proc_cmdline_itemZbigniew Jędrzejewski-Szmek
No functional change.
2016-10-21core: use emergency_action for ctr+alt+del burstLukas Nykryn
Fixes #4306
2016-10-21failure-action: generalize failure action to emergency actionLukas Nykryn
2016-10-21core: if the start command vanishes during runtime don't hit an assertLennart Poettering
This can happen when the configuration is changed and reloaded while we are executing a service. Let's not hit an assert in this case. Fixes: #4444
2016-10-20journald,core: add short comments we we keep reopening /dev/console all the timeLennart Poettering
Just to make sure the next one reading this isn't surprised that the fd isn't kept open. SAK and stuff... Fix suggested: https://github.com/systemd/systemd/pull/4366#issuecomment-253659162
2016-10-20Merge pull request #4417 from keszybz/man-and-rlimitLennart Poettering
Two unrelated patches: man page tweaks and rlimit log levels
2016-10-19pid1: downgrade some rlimit warningsZbigniew Jędrzejewski-Szmek
Since we ignore the result anyway, downgrade errors to warning. log_oom() will still emit an error, but that's mostly theoretical, so it is not worth complicating the code to avoid the small inconsistency
2016-10-19core: let's upgrade the log level for service processes dying of signal (#4415)Lennart Poettering
As suggested in https://github.com/systemd/systemd/pull/4367#issuecomment-253670328
2016-10-17core/exec: add a named-descriptor option ("fd") for streams (#4179)Luca Bruno
This commit adds a `fd` option to `StandardInput=`, `StandardOutput=` and `StandardError=` properties in order to connect standard streams to externally named descriptors provided by some socket units. This option looks for a file descriptor named as the corresponding stream. Custom names can be specified, separated by a colon. If multiple name-matches exist, the first matching fd will be used.
2016-10-17Merge pull request #4392 from keszybz/running-timersLennart Poettering
Fix for display of elapsed timers
2016-10-17core/timer: reset next_elapse_*time when timer is not waitingZbigniew Jędrzejewski-Szmek
When the unit that is triggered by a timer is started and running, we transition to "running" state, and the timer will not elapse again until the unit has finished running. In this state "systemctl list-timers" would display the previously calculated next elapse time, which would now of course be in the past, leading to nonsensical values. Simply set the next elapse to infinity, which causes list-timers to show n/a. We cannot specify when the next elapse will happen, possibly never. Fixes #4031.
2016-10-17pid1: do not use mtime==0 as sign of masking (#4388)Zbigniew Jędrzejewski-Szmek
It is allowed for unit files to have an mtime==0, so instead of assuming that any file that had mtime==0 was masked, use the load_state to filter masked units. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1384150.
2016-10-16tree-wide: introduce free_and_replace helperZbigniew Jędrzejewski-Szmek
It's a common pattern, so add a helper for it. A macro is necessary because a function that takes a pointer to a pointer would be type specific, similarly to cleanup functions. Seems better to use a macro.
2016-10-16tree-wide: use mfree moreZbigniew Jędrzejewski-Szmek
2016-10-14core: make settings for unified cgroup hierarchy supersede the ones for ↵Tejun Heo
legacy hierarchy (#4269) There are overlapping control group resource settings for the unified and legacy hierarchies. To help transition, the settings are translated back and forth. When both versions of a given setting are present, the one matching the cgroup hierarchy type in use is used. Unfortunately, this is more confusing to use and document than necessary because there is no clear static precedence. Update the translation logic so that the settings for the unified hierarchy are always preferred. systemd.resource-control man page is updated to reflect the change and reorganized so that the deprecated settings are at the end in its own section.
2016-10-12core: make sure to dump ProtectKernelModules= valueDjalal Harouni
2016-10-12core: check protect_kernel_modules and private_devices in order to setup NNPDjalal Harouni
2016-10-12core:sandbox: lets make /lib/modules/ inaccessible on ProtectKernelModules=Djalal Harouni
Lets go further and make /lib/modules/ inaccessible for services that do not have business with modules, this is a minor improvment but it may help on setups with custom modules and they are limited... in regard of kernel auto-load feature. This change introduce NameSpaceInfo struct which we may embed later inside ExecContext but for now lets just reduce the argument number to setup_namespace() and merge ProtectKernelModules feature.
2016-10-12core:sandbox: remove CAP_SYS_RAWIO on PrivateDevices=yesDjalal Harouni
The rawio system calls were filtered, but CAP_SYS_RAWIO allows to access raw data through /proc, ioctl and some other exotic system calls...
2016-10-12core:sandbox: Add ProtectKernelModules= optionDjalal Harouni
This is useful to turn off explicit module load and unload operations on modular kernels. This option removes CAP_SYS_MODULE from the capability bounding set for the unit, and installs a system call filter to block module system calls. This option will not prevent the kernel from loading modules using the module auto-load feature which is a system wide operation.
2016-10-12Allow block and char classes in DeviceAllow bus properties (#4353)Zbigniew Jędrzejewski-Szmek
Allowed paths are unified betwen the configuration file parses and the bus property checker. The biggest change is that the bus code now allows "block-" and "char-" classes. In addition, path_startswith("/dev") was used in the bus code, and startswith("/dev") was used in the config file code. It seems reasonable to use path_startswith() which allows a slightly broader class of strings. Fixes #3935.