Age | Commit message (Collapse) | Author |
|
ProtectControlGroups=
If enabled, these will block write access to /sys, /proc/sys and
/proc/sys/fs/cgroup.
|
|
Network file dropins
|
|
This way we don't have to create a nulstr just to unpack it in a moment.
|
|
In preparation for adding a version which takes a strv.
|
|
condition: ignore nanoseconds in timestamps for ConditionNeedsUpdate=
Fixes #4130.
|
|
to prevent false-positives
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=90192 and #4130
for real. Also, remove timestamp check in update-done.c altogether since
the whole operation is idempotent.
|
|
According to its manual page, flags given to mkostemp(3) shouldn't include
O_RDWR, O_CREAT or O_EXCL flags as these are always included. Beyond
those, the only flag that all callers (except a few tests where it
probably doesn't matter) use is O_CLOEXEC, so set that unconditionally.
|
|
https://bugzilla.redhat.com/show_bug.cgi?id=1374371
When root was empty or equal to "/", chroot_symlinks_same was called with
root==NULL, and strjoina returned "", so the code thought both paths are equal
even if they were not. Fix that by always providing a non-null first argument
to strjoina.
|
|
One trailing dot is valid, but more than one isn't. This also fixes glibc's
posix/tst-getaddrinfo5 test.
Fixes #3978.
|
|
In https://github.com/systemd/systemd/pull/4004 , a runtime detection
method for seccomp was added. However, it does not detect the case
where CONFIG_SECCOMP=y but CONFIG_SECCOMP_FILTER=n. This is possible
if the architecture does not support filtering yet.
Add a check for that case too.
While at it, change get_proc_field usage to use PR_GET_SECCOMP prctl,
as that should save a few system calls and (unnecessary) allocations.
Previously, reading of /proc/self/stat was done as recommended by
prctl(2) as safer. However, given that we need to do the prctl call
anyway, lets skip opening, reading and parsing the file.
Code for checking inspired by
https://outflux.net/teach-seccomp/autodetect.html
|
|
This patch allows to configure AgeingTimeSec, Priority and DefaultPVID for
bridge interfaces.
|
|
|
|
permit bus clients to pin units to avoid automatic GC
|
|
Fixes #3882
|
|
bus_connect_transport() is exclusively used from our command line tools, hence
let's set exit-on-disconnect for all of them, making behaviour a bit nicer in
case dbus-daemon goes down.
|
|
|
|
Make sure we return proper errors for types not understood yet.
|
|
Instead of ignoring empty strings retrieved via the bus, treat them as NULL, as
it's customary in systemd.
|
|
Let's make sure we can read the exit code/status properties exposed by PID 1
properly. Let's reuse the existing code for unsigned fields, as we just use it
to copy words around, and don't calculate it.
|
|
A lot of basic code wants to know the stack size, and it is safe if they do,
hence let's permit getrlimit() (but not setrlimit()) by default.
See: #3970
|
|
When told to enable a template unit, and the DefaultInstance specified in that
unit was masked, we would do this. Such a unit cannot be started or loaded, so
reporting successful enabling is misleading and unexpected.
$ systemctl mask getty@tty1
Created symlink /etc/systemd/system/getty@tty1.service → /dev/null.
$ systemctl --root=/ enable getty@tty1
(unchanged)
Failed to enable unit, unit /etc/systemd/system/getty@tty1.service is masked.
$ systemctl --root=/ enable getty@
(before)
Created symlink /etc/systemd/system/getty.target.wants/getty@tty1.service → /usr/lib/systemd/system/getty@.service.
(now)
Failed to enable unit, unit /etc/systemd/system/getty@tty1.service is masked.
The same error is emitted for enable and preset. And an error is emmited, not a
warning, so the failure to enable DefaultInstance is treated the same as if the
instance was specified on the command line. I think that this makes most sense,
for most template units.
Fixes #2513.
|
|
add a new tool for creating transient mount and automount units
|
|
Fix preset-all
|
|
A masked unit is listed in Also=:
$ systemctl cat test1 test2
→# /etc/systemd/system/test1.service
[Unit]
Description=test service 1
[Service]
Type=oneshot
ExecStart=/usr/bin/true
[Install]
WantedBy=multi-user.target
Also=test2.service
Alias=alias1.service
→# /dev/null
$ systemctl --root=/ enable test1
(before)
Created symlink /etc/systemd/system/alias1.service → /etc/systemd/system/test1.service.
Created symlink /etc/systemd/system/multi-user.target.wants/test1.service → /etc/systemd/system/test1.service.
The unit files have no installation config (WantedBy, RequiredBy, Also, Alias
settings in the [Install] section, and DefaultInstance for template units).
This means they are not meant to be enabled using systemctl.
Possible reasons for having this kind of units are:
1) A unit may be statically enabled by being symlinked from another unit's
.wants/ or .requires/ directory.
2) A unit's purpose may be to act as a helper for some other unit which has
a requirement dependency on it.
3) A unit may be started when needed via activation (socket, path, timer,
D-Bus, udev, scripted systemctl call, ...).
4) In case of template units, the unit is meant to be enabled with some
instance name specified.
(after)
Created symlink /etc/systemd/system/alias1.service → /etc/systemd/system/test1.service.
Created symlink /etc/systemd/system/multi-user.target.wants/test1.service → /etc/systemd/system/test1.service.
Unit /etc/systemd/system/test2.service is masked, ignoring.
|
|
Running preset-all on a system installed from rpms or even created
using make install would remove and recreate a lot of symlinks, changing
relative to absolute symlinks. In general relative symlinks are nicer,
so there is no reason to change them, and those spurious changes were
obscuring more interesting stuff.
$ make install DESTDIR=/var/tmp/inst1
$ systemctl preset-all --root=/var/tmp/inst1
(before)
Removed /var/tmp/inst1/etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service.
Created symlink /var/tmp/inst1/etc/systemd/system/ctrl-alt-del.target → /usr/lib/systemd/system/exit.target.
Removed /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/remote-fs.target.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/remote-fs.target → /usr/lib/systemd/system/remote-fs.target.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/machines.target → /usr/lib/systemd/system/machines.target.
Created symlink /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-journal-remote.socket → /usr/lib/systemd/system/systemd-journal-remote.socket.
Removed /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-networkd.socket.
Created symlink /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-networkd.socket → /usr/lib/systemd/system/systemd-networkd.socket.
Removed /var/tmp/inst1/etc/systemd/system/getty.target.wants/getty@tty1.service.
Created symlink /var/tmp/inst1/etc/systemd/system/getty.target.wants/getty@tty1.service → /usr/lib/systemd/system/getty@.service.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-journal-upload.service → /usr/lib/systemd/system/systemd-journal-upload.service.
Removed /var/tmp/inst1/etc/systemd/system/sysinit.target.wants/systemd-timesyncd.service.
Created symlink /var/tmp/inst1/etc/systemd/system/sysinit.target.wants/systemd-timesyncd.service → /usr/lib/systemd/system/systemd-timesyncd.service.
Removed /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-resolved.service.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-resolved.service → /usr/lib/systemd/system/systemd-resolved.service.
Removed /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-networkd.service.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-networkd.service → /usr/lib/systemd/system/systemd-networkd.service.
(after)
Removed /var/tmp/inst1/etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service.
Created symlink /var/tmp/inst1/etc/systemd/system/ctrl-alt-del.target → /usr/lib/systemd/system/exit.target.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/machines.target → /usr/lib/systemd/system/machines.target.
Created symlink /var/tmp/inst1/etc/systemd/system/sockets.target.wants/systemd-journal-remote.socket → /usr/lib/systemd/system/systemd-journal-remote.socket.
Created symlink /var/tmp/inst1/etc/systemd/system/multi-user.target.wants/systemd-journal-upload.service → /usr/lib/systemd/system/systemd-journal-upload.service.
|
|
No functional change intended.
|
|
Before, when interating over unit files during preset-all, behaviour was the
following:
- if we hit the real unit name first, presets were queried for that name, and
that unit was enabled or disabled accordingly,
- if we hit an alias first (one of the symlinks chaining to the real unit), we
checked the presets using the symlink name, and then proceeded to enable or
disable the real unit.
E.g. for systemd-networkd.service we have the alias dbus-org.freedesktop.network1.service
(/usr/lib/systemd/system/dbus-org.freedesktop.network1.service), but the preset
is only for the systemd-networkd.service name. The service would be enabled or
disabled pseudorandomly depending on the order of iteration.
For "preset", behaviour was analogous: preset on the alias name disabled the
service (following the default disable policy), preset on the "real" name
applied the presets.
With the patch, for "preset" and "preset-all" we silently skip symlinks. This
gives mostly the right behaviour, with the limitation that presets on aliases
are ignored. I think that presets on aliases are not that common (at least my
preset files on Fedora don't exhibit any such usage), and should not be
necessary, since whoever installs the preset can just refer to the real unit
file. It would be possible to overcome this limitation by gathering a list of
names of a unit first, and then checking whether *any* of the names matches the
presets list. That would require a significant redesign of the code, and be
a lot slower (since we would have to fully read all unit directories to preset
one unit) to so I'm not doing that for now.
With this patch, two properties are satisfied:
- preset-all and preset are idempotent, and the second and subsequent invocations
do not produce any changes,
- preset-all and preset for a specific name produce the same state for that unit.
Fixes #3616.
|
|
|
|
If the directory is missing, we can assume that those pesky symlinks are gone too.
|
|
|
|
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e. all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.
This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.
In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
|
|
|
|
|
|
This is done exactly the same way a couple of times at various places, let's
unify this into one version.
|
|
core: add cgroup CPU controller support on the unified hierarchy
(zj: merging not squashing to make it clear against which upstream this patch was developed.)
|
|
Under NixOS, the config_path /etc/systemd/system is a symlink to
/etc/static/systemd/system. Commands such as `systemctl list-unit-files`
and `systemctl is-enabled` did not work as the symlink was not followed.
This does not affect how symlinks are treated within the config_path
directory.
|
|
Unfortunately, due to the disagreements in the kernel development community,
CPU controller cgroup v2 support has not been merged and enabling it requires
applying two small out-of-tree kernel patches. The situation is explained in
the following documentation.
https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu
While it isn't clear what will happen with CPU controller cgroup v2 support,
there are critical features which are possible only on cgroup v2 such as
buffered write control making cgroup v2 essential for a lot of workloads. This
commit implements systemd CPU controller support on the unified hierarchy so
that users who choose to deploy CPU controller cgroup v2 support can easily
take advantage of it.
On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max"
replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us". [Startup]CPUWeight config
options are added with the usual compat translation. CPU quota settings remain
unchanged and apply to both legacy and unified hierarchies.
v2: - Error in man page corrected.
- CPU config application in cgroup_context_apply() refactored.
- CPU accounting now works on unified hierarchy.
|
|
|
|
This adds parse_nice() that parses a nice level and ensures it is in the right
range, via a new nice_is_valid() helper. It then ports over a number of users
to this.
No functional changes.
|
|
This permits CPUQuota to accept greater values as documented.
|
|
This new output mode formats all timestamps using the usual format_timestamp()
call we use pretty much everywhere else. Timestamps formatted this way are some
ways more useful than traditional syslog timestamps as they include weekday,
month and timezone information, while not being much longer. They are also not
locale-dependent. The primary advantage however is that they may be passed
directly to journalctl's --since= and --until= switches as soon as #3869 is
merged.
While we are at it, let's also add "short-unix" to shell completion.
|
|
This setting adds minimal user namespacing support to a service. When set the invoked
processes will run in their own user namespace. Only a trivial mapping will be
set up: the root user/group is mapped to root, and the user/group of the
service will be mapped to itself, everything else is mapped to nobody.
If this setting is used the service runs with no capabilities on the host, but
configurable capabilities within the service.
This setting is particularly useful in conjunction with RootDirectory= as the
need to synchronize /etc/passwd and /etc/group between the host and the service
OS tree is reduced, as only three UID/GIDs need to match: root, nobody and the
user of the service itself. But even outside the RootDirectory= case this
setting is useful to substantially reduce the attack surface of a service.
Example command to test this:
systemd-run -p PrivateUsers=1 -p User=foobar -t /bin/sh
This runs a shell as user "foobar". When typing "ps" only processes owned by
"root", by "foobar", and by "nobody" should be visible.
|
|
|
|
User expectations are broken when "systemctl enable /some/path/service.service"
behaves differently to "systemctl link ..." followed by "systemctl enable".
From user's POV, "enable" with the full path just combines the two steps into
one.
Fixes #3010.
|
|
|
|
service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).
For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.
A simple way to test this is:
systemd-run -p DynamicUser=1 /bin/sleep 99999
|
|
That way, we can neatly keep this in line with the new TasksMaxScale= option.
Note that we didn't release a version with MemoryLimitByPhysicalMemory= yet,
hence this change should be unproblematic without breaking API.
|
|
This adds support for a TasksMax=40% syntax for specifying values relative to
the system's configured maximum number of processes. This is useful in order to
neatly subdivide the available room for tasks within containers.
|
|
|
|
namespace: unify limit behavior on non-directory paths
|