Age | Commit message (Collapse) | Author |
|
Fix preset-all
|
|
A masked unit is listed in Also=:
$ systemctl cat test1 test2
→# /etc/systemd/system/test1.service
[Unit]
Description=test service 1
[Service]
Type=oneshot
ExecStart=/usr/bin/true
[Install]
WantedBy=multi-user.target
Also=test2.service
Alias=alias1.service
→# /dev/null
$ systemctl --root=/ enable test1
(before)
Created symlink /etc/systemd/system/alias1.service → /etc/systemd/system/test1.service.
Created symlink /etc/systemd/system/multi-user.target.wants/test1.service → /etc/systemd/system/test1.service.
The unit files have no installation config (WantedBy, RequiredBy, Also, Alias
settings in the [Install] section, and DefaultInstance for template units).
This means they are not meant to be enabled using systemctl.
Possible reasons for having this kind of units are:
1) A unit may be statically enabled by being symlinked from another unit's
.wants/ or .requires/ directory.
2) A unit's purpose may be to act as a helper for some other unit which has
a requirement dependency on it.
3) A unit may be started when needed via activation (socket, path, timer,
D-Bus, udev, scripted systemctl call, ...).
4) In case of template units, the unit is meant to be enabled with some
instance name specified.
(after)
Created symlink /etc/systemd/system/alias1.service → /etc/systemd/system/test1.service.
Created symlink /etc/systemd/system/multi-user.target.wants/test1.service → /etc/systemd/system/test1.service.
Unit /etc/systemd/system/test2.service is masked, ignoring.
|
|
Running preset-all on a system installed from rpms or even created
using make install would remove and recreate a lot of symlinks, changing
relative to absolute symlinks. In general relative symlinks are nicer,
so there is no reason to change them, and those spurious changes were
obscuring more interesting stuff.
$ make install DESTDIR=/var/tmp/inst1
$ systemctl preset-all --root=/var/tmp/inst1
(before)
Removed /var/tmp/inst1/etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service.
Created symlink /var/tmp/inst1/etc/systemd/system/ctrl-alt-del.target → /usr/lib/systemd/system/exit.target.
Removed /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/remote-fs.target.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/remote-fs.target → /usr/lib/systemd/system/remote-fs.target.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/machines.target → /usr/lib/systemd/system/machines.target.
Created symlink /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-journal-remote.socket → /usr/lib/systemd/system/systemd-journal-remote.socket.
Removed /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-networkd.socket.
Created symlink /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-networkd.socket → /usr/lib/systemd/system/systemd-networkd.socket.
Removed /var/tmp/inst1/etc/systemd/system/getty.target.wants/getty@tty1.service.
Created symlink /var/tmp/inst1/etc/systemd/system/getty.target.wants/getty@tty1.service → /usr/lib/systemd/system/getty@.service.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-journal-upload.service → /usr/lib/systemd/system/systemd-journal-upload.service.
Removed /var/tmp/inst1/etc/systemd/system/sysinit.target.wants/systemd-timesyncd.service.
Created symlink /var/tmp/inst1/etc/systemd/system/sysinit.target.wants/systemd-timesyncd.service → /usr/lib/systemd/system/systemd-timesyncd.service.
Removed /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-resolved.service.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-resolved.service → /usr/lib/systemd/system/systemd-resolved.service.
Removed /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-networkd.service.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-networkd.service → /usr/lib/systemd/system/systemd-networkd.service.
(after)
Removed /var/tmp/inst1/etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service.
Created symlink /var/tmp/inst1/etc/systemd/system/ctrl-alt-del.target → /usr/lib/systemd/system/exit.target.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/machines.target → /usr/lib/systemd/system/machines.target.
Created symlink /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-journal-remote.socket → /usr/lib/systemd/system/systemd-journal-remote.socket.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-journal-upload.service → /usr/lib/systemd/system/systemd-journal-upload.service.
|
|
No functional change intended.
|
|
Before, when interating over unit files during preset-all, behaviour was the
following:
- if we hit the real unit name first, presets were queried for that name, and
that unit was enabled or disabled accordingly,
- if we hit an alias first (one of the symlinks chaining to the real unit), we
checked the presets using the symlink name, and then proceeded to enable or
disable the real unit.
E.g. for systemd-networkd.service we have the alias dbus-org.freedesktop.network1.service
(/usr/lib/systemd/system/dbus-org.freedesktop.network1.service), but the preset
is only for the systemd-networkd.service name. The service would be enabled or
disabled pseudorandomly depending on the order of iteration.
For "preset", behaviour was analogous: preset on the alias name disabled the
service (following the default disable policy), preset on the "real" name
applied the presets.
With the patch, for "preset" and "preset-all" we silently skip symlinks. This
gives mostly the right behaviour, with the limitation that presets on aliases
are ignored. I think that presets on aliases are not that common (at least my
preset files on Fedora don't exhibit any such usage), and should not be
necessary, since whoever installs the preset can just refer to the real unit
file. It would be possible to overcome this limitation by gathering a list of
names of a unit first, and then checking whether *any* of the names matches the
presets list. That would require a significant redesign of the code, and be
a lot slower (since we would have to fully read all unit directories to preset
one unit) to so I'm not doing that for now.
With this patch, two properties are satisfied:
- preset-all and preset are idempotent, and the second and subsequent invocations
do not produce any changes,
- preset-all and preset for a specific name produce the same state for that unit.
Fixes #3616.
|
|
|
|
If the directory is missing, we can assume that those pesky symlinks are gone too.
|
|
|
|
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e. all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.
This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.
In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
|
|
|
|
|
|
core: add cgroup CPU controller support on the unified hierarchy
(zj: merging not squashing to make it clear against which upstream this patch was developed.)
|
|
Under NixOS, the config_path /etc/systemd/system is a symlink to
/etc/static/systemd/system. Commands such as `systemctl list-unit-files`
and `systemctl is-enabled` did not work as the symlink was not followed.
This does not affect how symlinks are treated within the config_path
directory.
|
|
Unfortunately, due to the disagreements in the kernel development community,
CPU controller cgroup v2 support has not been merged and enabling it requires
applying two small out-of-tree kernel patches. The situation is explained in
the following documentation.
https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu
While it isn't clear what will happen with CPU controller cgroup v2 support,
there are critical features which are possible only on cgroup v2 such as
buffered write control making cgroup v2 essential for a lot of workloads. This
commit implements systemd CPU controller support on the unified hierarchy so
that users who choose to deploy CPU controller cgroup v2 support can easily
take advantage of it.
On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max"
replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us". [Startup]CPUWeight config
options are added with the usual compat translation. CPU quota settings remain
unchanged and apply to both legacy and unified hierarchies.
v2: - Error in man page corrected.
- CPU config application in cgroup_context_apply() refactored.
- CPU accounting now works on unified hierarchy.
|
|
|
|
This adds parse_nice() that parses a nice level and ensures it is in the right
range, via a new nice_is_valid() helper. It then ports over a number of users
to this.
No functional changes.
|
|
This permits CPUQuota to accept greater values as documented.
|
|
This new output mode formats all timestamps using the usual format_timestamp()
call we use pretty much everywhere else. Timestamps formatted this way are some
ways more useful than traditional syslog timestamps as they include weekday,
month and timezone information, while not being much longer. They are also not
locale-dependent. The primary advantage however is that they may be passed
directly to journalctl's --since= and --until= switches as soon as #3869 is
merged.
While we are at it, let's also add "short-unix" to shell completion.
|
|
This setting adds minimal user namespacing support to a service. When set the invoked
processes will run in their own user namespace. Only a trivial mapping will be
set up: the root user/group is mapped to root, and the user/group of the
service will be mapped to itself, everything else is mapped to nobody.
If this setting is used the service runs with no capabilities on the host, but
configurable capabilities within the service.
This setting is particularly useful in conjunction with RootDirectory= as the
need to synchronize /etc/passwd and /etc/group between the host and the service
OS tree is reduced, as only three UID/GIDs need to match: root, nobody and the
user of the service itself. But even outside the RootDirectory= case this
setting is useful to substantially reduce the attack surface of a service.
Example command to test this:
systemd-run -p PrivateUsers=1 -p User=foobar -t /bin/sh
This runs a shell as user "foobar". When typing "ps" only processes owned by
"root", by "foobar", and by "nobody" should be visible.
|
|
|
|
User expectations are broken when "systemctl enable /some/path/service.service"
behaves differently to "systemctl link ..." followed by "systemctl enable".
From user's POV, "enable" with the full path just combines the two steps into
one.
Fixes #3010.
|
|
|
|
service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).
For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.
A simple way to test this is:
systemd-run -p DynamicUser=1 /bin/sleep 99999
|
|
That way, we can neatly keep this in line with the new TasksMaxScale= option.
Note that we didn't release a version with MemoryLimitByPhysicalMemory= yet,
hence this change should be unproblematic without breaking API.
|
|
This adds support for a TasksMax=40% syntax for specifying values relative to
the system's configured maximum number of processes. This is useful in order to
neatly subdivide the available room for tasks within containers.
|
|
|
|
namespace: unify limit behavior on non-directory paths
|
|
This patch renames Read{Write,Only}Directories= and InaccessibleDirectories=
to Read{Write,Only}Paths= and InaccessiblePaths=, previous names are kept
as aliases but they are not advertised in the documentation.
Renamed variables:
`read_write_dirs` --> `read_write_paths`
`read_only_dirs` --> `read_only_paths`
`inaccessible_dirs` --> `inaccessible_paths`
|
|
json output
With this change, binary record data is formatted as string if --all is
specified when using json output. This is inline with the effect of --all on
the other available output modes.
Fixes: #3416
|
|
strv_make_nulstr was creating a nulstr which was not a valid nulstr,
because it was missing the terminating NUL. This didn't cause any issues,
because strv_parse_nulstr correctly parsed the result, using the
separately specified length.
But it's confusing to have something called nulstr which really isn't.
It is likely that somebody will try to use strv_make_nulstr() in
some other place, incorrectly.
This patch changes strv_parse_nulstr() to produce a valid nulstr, and
changes the output length parameter to be the minimum number of bytes
which can be later on parsed by strv_parse_nulstr(). This allows the
only user in ask-password-api to be slightly simplified.
Based-on-patch-by: Jean-Sébastien Bour <jean-sebastien@bour.name>
Fixes #3689.
|
|
|
|
* networkd: condition_test() can return a negative error, handle that
If a condition check fails with an error we should not consider the check
successful. Fix that.
We should probably also improve logging in this case, but for now, let's just
unbreak this breakage.
Fixes: #3236
* condition: handle unrecognized architectures nicer
When we encounter a check for an architecture we don't know we should not
let the condition check fail with an error code, but instead simply return
false. After all the architecture might just be newer than the ones we know, in
which case it's certainly not our local one.
Fixes: #3236
|
|
resolved: more fixes, among them "systemctl-resolve --status" to see DNS configuration in effect, and a local DNS stub listener on 127.0.0.53
|
|
It makes use of the sd_listen_fds() call, and as such should live in
src/shared, as the distinction between src/basic and src/shared is that the
latter may use libsystemd APIs, the former does not.
Note that btrfs-util.[ch] and log.[ch] also include header files from
libsystemd, but they only need definitions, they do not invoke any function
from it. Hence they may stay in src/basic.
|
|
Do not ellipsize cgroups when showing slices in --full mode
|
|
sd-bus generally exposes bools as "int" instead of "bool" in the public API.
This is relevant when unmarshaling booleans, as the relevant functions expect
an int* pointer and no bool* pointer. Since sizeof(bool) is not necessarily the
same as sizeof(int) this is problematic and might result in memory corruption.
Let's fix this, and make sure bus_map_all_properties() handles booleans as
ints, as the rest of sd-bus, and make all users of it expect the right thing.
|
|
Delete the dbus1 generator and some critical wiring. This prevents
kdbus from being loaded or detected. As such, it will never be used,
even if the user still has a useful kdbus module loaded on their system.
Sort of fixes #3480. Not really, but it's better than the current state.
|
|
various changes, most importantly regarding memory metrics
|
|
If for whatever reason the file system is "corrupted", we want
to be resilient and ignore the error, as long as we can load the units
from a different place.
Arch bug https://bugs.archlinux.org/task/49547.
A user had an ntfs symlink (essentially a file) instead of a directory after
restoring from backup. We should just ignore that like we would treat a missing
directory, for general resiliency.
We should treat permission errors similarly. For example an unreadable
/usr/local/lib directory would prevent (user) instances of systemd from
loading any units. It seems better to continue.
|
|
The unit files already accept relative, percent-based memory limit
specification, let's make sure "systemctl set-property" support this too.
Since we want the physical memory size of the destination machine to apply we
pass the percentage in a new set of properties that only exist for this
purpose, and can only be set.
|
|
And port a couple of users over to it.
|
|
This adds three new seccomp syscall groups: @keyring for kernel keyring access,
@cpu-emulation for CPU emulation features, for exampe vm86() for dosemu and
suchlike, and @debug for ptrace() and related calls.
Also, the @clock group is updated with more syscalls that alter the system
clock. capset() is added to @privileged, and pciconfig_iobase() is added to
@raw-io.
Finally, @obsolete is a cleaned up. A number of syscalls that never existed on
Linux and have no number assigned on any architecture are removed, as they only
exist in the man pages and other operating sytems, but not in code at all.
create_module() is moved from @module to @obsolete, as it is an obsolete system
call. mem_getpolicy() is removed from the @obsolete list, as it is not
obsolete, but simply a NUMA API.
|
|
Let's add a generic parser for VLAN ids, which should become handy as
preparation for PR #3428. Let's also make sure we use uint16_t for the vlan ID
type everywhere, and that validity checks are already applied at the time of
parsing, and not only whne we about to prepare a netdev.
Also, establish a common definition VLANID_INVALID we can use for
non-initialized VLAN id fields.
|
|
Now we don't support parsing double at map_basic.
when trying to use bus_message_map_all_properties with a double
this fails. Let's add it.
|
|
Assorted stuff
|
|
New exec boolean MemoryDenyWriteExecute, when set, installs
a seccomp filter to reject mmap(2) with PAGE_WRITE|PAGE_EXEC
and mprotect(2) with PAGE_EXEC.
|
|
Recently added cgroup unified hierarchy support uses "max" in configurations
for no upper limit. While consistent with what the kernel uses for no upper
limit, it is inconsistent with what systemd uses for other controllers such as
memory or pids. There's no point in introducing another term. Update cgroup
unified hierarchy support so that "infinity" is the only term that systemd
uses for no upper limit.
|
|
Implement sets of system calls to help constructing system call
filters. A set starts with '@' to distinguish from a system call.
Closes: #3053, #3157
|
|
As suggested here:
https://bugs.freedesktop.org/show_bug.cgi?id=64737#c8
This adds a new call terminal_is_dumb() and makes use of this where
appropriate.
|
|
|