Age | Commit message (Collapse) | Author |
|
There is a slight change in behaviour: the user manager for root will create a
temporary file in /run/systemd, not /tmp. I don't think this matters, but
simplifies implementation.
|
|
CID #778045.
|
|
CID #1368238.
|
|
hierarchy
Currently the hybrid mode mounts cgroup v2 on /sys/fs/cgroup instead of the v1
name=systemd hierarchy. While this works fine for systemd itself, it breaks
tools which expect cgroup v1 hierarchy on /sys/fs/cgroup/systemd.
This patch updates the hybrid mode so that it mounts v2 hierarchy on
/sys/fs/cgroup/unified and keeps v1 "name=systemd" hierarchy on
/sys/fs/cgroup/systemd for compatibility. systemd itself doesn't depend on the
"name=systemd" hierarchy at all. All operations take place on the v2 hierarchy
as before but the v1 hierarchy is kept in sync so that any tools which expect
it to be there can keep doing so. This allows systemd to take advantage of
cgroup v2 process management without requiring other tools to be aware of the
hybrid mode.
The hybrid mode is implemented by mapping the special systemd controller to
/sys/fs/cgroup/unified and making the basic cgroup utility operations -
cg_attach(), cg_create(), cg_rmdir() and cg_trim() - also operate on the
/sys/fs/cgroup/systemd hierarchy whenever the cgroup2 hierarchy is updated.
While a bit messy, this will allow dropping complications from using cgroup v1
for process management a lot sooner than otherwise possible which should make
it a net gain in terms of maintainability.
v2: Fixed !cgns breakage reported by @evverx and renamed the unified mount
point to /sys/fs/cgroup/unified as suggested by @brauner.
v3: chown the compat hierarchy too on delegation. Suggested by @evverx.
v4: [zj]
- drop the change to default, full "legacy" is still the default.
|
|
SYSTEMD_CGROUP_CONTROLLER is currently defined as "name=systemd" which cgroup
utility functions interpret as a named cgroup hierarchy with the specified
named. With the planned cgroup hybrid mode changes, SYSTEMD_CGROUP_CONTROLLER
would map to different hierarchy names.
This patch makes SYSTEMD_CGROUP_CONTROLLER a special string "_systemd" which is
substituted to "name=systemd" by the cgroup utility functions. This allows the
callers to address the systemd hierarchy without actually specifying the
hierarchy name allowing the cgroup utility functions to map it to whatever is
appropriate.
Note that SYSTEMD_CGROUP_CONTROLLER was already special on full unified cgroup
hierarchy even before this patch.
|
|
cg_[all_]unified() test whether a specific controller or all controllers are on
the unified hierarchy. While what's being asked is a simple binary question,
the callers must assume that the functions may fail any time, which
unnecessarily complicates their usages. This complication is unnecessary.
Internally, the test result is cached anyway and there are only a few places
where the test actually needs to be performed.
This patch simplifies cg_[all_]unified().
* cg_[all_]unified() are updated to return bool. If the result can't be
decided, assertion failure is triggered. Error handlings from their callers
are dropped.
* cg_unified_flush() is updated to calculate the new result synchrnously and
return whether it succeeded or not. Places which need to flush the test
result are updated to test for failure. This ensures that all the following
cg_[all_]unified() tests succeed.
* Places which expected possible cg_[all_]unified() failures are updated to
call and test cg_unified_flush() before calling cg_[all_]unified(). This
includes functions used while setting up mounts during boot and
manager_setup_cgroup().
|
|
machined userns fixes
|
|
This adds a unified "copy_flags" parameter to all copy_xyz() function
calls, replacing the various boolean flags so far used. This should make
many invocations more readable as it is clear what behaviour is
precisely requested. This also prepares ground for adding support for
more modes later on.
|
|
When /etc/hostname isn't set, default to the configured compile-time
fallback hostname instead of "localhost" for the kernel hostname.
|
|
Collect interpreter backtraces in systemd-coredump
|
|
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
"conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.
Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:
$ size .libs/systemd{,-logind,-journald}{.old,}
text data bss dec hex filename
1265204 149564 4808 1419576 15a938 .libs/systemd.old
1260268 149564 4808 1414640 1595f0 .libs/systemd
246805 13852 209 260866 3fb02 .libs/systemd-logind.old
240973 13852 209 255034 3e43a .libs/systemd-logind
146839 4984 34 151857 25131 .libs/systemd-journald.old
146391 4984 34 151409 24f71 .libs/systemd-journald
It is also much easier to check if a certain binary uses a certain MESSAGE_ID:
$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
|
|
start operation of a unit
Let's make sure we verify that all BindsTo= are in order before we actually go
and dispatch a start operation to a unit. Normally the job queue should already
have made sure all deps are in order, but this might not have been sufficient
in two cases: a) when the user changes deps during runtime and reloads the
daemon, and b) when the user placed BindsTo= dependencies without matching
After= dependencies, so that we don't actually wait for the bound to unit to be
up before upping also the binding unit.
See: #4725
|
|
Silence gcc warnings
|
|
a good number of resolved fixes
|
|
src/core/dbus.c: In function 'find_unit':
src/core/dbus.c:334:15: warning: 'u' may be used uninitialized in this function [-Wmaybe-uninitialized]
*unit = u;
^
src/core/dbus.c:301:15: note: 'u' was declared here
Unit *u;
^
|
|
At -O3, this was printed a hundred times for various callers of
manager_add_job_by_name(). AFAICT, there is no error and `unit` is always
intialized. Nevertheless, add explicit initialization to silence the noise.
src/core/manager.c: In function 'manager_start_target':
src/core/manager.c:1413:16: warning: 'unit' may be used uninitialized in this function [-Wmaybe-uninitialized]
return manager_add_job(m, type, unit, mode, e, ret);
^
src/core/manager.c:1401:15: note: 'unit' was declared here
Unit *unit;
^
|
|
We were inconsistent, manager_load_unit_prepare() would crash if _ret was ever NULL.
But none of the callers use NULL. So simplify things and require it to be non-NULL.
|
|
PermissionsStartOnly= (#5309)
ReadOnlyPaths=, ProtectHome=, InaccessiblePaths= and ProtectSystem= are
about restricting access and little more, hence they should be disabled
if PermissionsStartOnly= is used or ExecStart= lines are prefixed with a
"+". Do that.
(Note that we will still create namespaces and stuff, since that's about
a lot more than just permissions. We'll simply disable the effect of
the four options mentioned above, but nothing else mount related.)
This also adds a test for this, to ensure this works as intended.
No documentation updates, as the documentation are already vague enough
to support the new behaviour ("If true, the permission-related execution
options…"). We could clarify this further, but I think we might want to
extend the switches' behaviour a bit more in future, hence leave it at
this for now.
Fixes: #5308
|
|
a small number of install and unit management related fixes
|
|
We would warn and continue after failure in manager_startup, but there's no
way we can continue. We must fail.
|
|
It's a fairly specialized function. Let's make new files for it and the tests.
|
|
In some cases there might be unit symlinks in .wants/ or .requires/
directories even though the unit is otherwise fully removed. In this
case, don't fail removal, but still remove the symlinks.
This reworks the symlink marking logic to always add unit files that we
are missing to the changes list, but proceed with any symlink removal
for them. This way we'll still generate useful hints that a unit is
missing if you invoke "systemctl disable idontexist.service", but also
still remove any link to it.
Fixes: #4995
|
|
We protect less interetsing stuff with selinux "status", let's do that
here too.
|
|
off the bus (#5294)
Fixes: #4528
|
|
Previously, we'd refuse the GetUnitProcesses() bus call if the unit file
couldn't be loaded. Which is wrong, as admins should be able to inspect
services whose unit files was deleted. Change this logic, so that we
permit introspecting the processes of any unit that is loaded,
regardless if it has a unit file or not.
(Note that we won't load unit files in GetUnitProcess(), but only
operate on already loaded ones. That's because only loaded units can
have processes — as that's how our GC logic works — and hence loading
the unit just for the process tree is pointless, as it would be empty).
See: #4995
|
|
|
|
Fixes: #5125
|
|
|
|
WorkingDirectory=~ is
Or actually, try to to do the right thing depending on what is
available:
- If we know $HOME from User=, then use that.
- If the UID for the service is 0, hardcode that WorkingDirectory=~ means WorkingDirectory=/root
- In any other case (which will be the unprivileged --user case), use
get_home_dir() to find the $HOME of the user we are running as.
- Otherwise fail.
Fixes: #5246 #5124
|
|
This reverts commit 8b89628a10af3863bfc97872912e9da4076a5929.
This broke #5246
|
|
Add new MountAPIVFS= boolean unit file setting + RootImage=
|
|
Mask individual .wants/.requires symlinks
|
|
Feb 04 22:35:42 systemd[1462]: foo.service: Wants dependency dropin /home/zbyszek/.config/systemd/user/foo.service.wants/diffname.service target ../barbar.service has different name
Feb 04 22:35:42 systemd[1462]: foo.service: Wants dependency dropin /home/zbyszek/.config/systemd/user/foo.service.wants/wrongname is not a valid unit name, ignoring
|
|
Fixes #1169.
Fixes #4830.
Example log errors:
Feb 04 22:13:28 systemd[1462]: foo.service: Wants dependency on empty_file.service is masked by /home/zbyszek/.config/systemd/user/foo.service.wants/empty_file.service, ignoring
Feb 04 22:13:28 systemd[1462]: foo.service: Wants dependency on masked.service is masked by /home/zbyszek/.config/systemd/user/foo.service.wants/masked.service, ignoring
|
|
dropins
Essentially, instead of sequentially adding deps based on all symlinks
encountered in .wants and .requires dirs for each name and each unit file load
path, iteratate over the load paths and unit names gathering symlinks, then
order them based on priority, and then iterate over the final list, adding
dependencies.
This patch doesn't change the logic too much, except that the order in which
dependencies are applied might be different. It wasn't defined before, so that
not really a change. Adding filtering on the symlinks is left for later
patches.
|
|
This makes nspawn's logic of automatically discovering the root hash of
an image file generic, and then reuses it in systemd-dissect and in
PID1's RootImage= logic, so that verity is automatically set up whenever
we can.
|
|
directory for a service
This is similar to RootDirectory= but mounts the root file system from a
block device or loopback file instead of another directory.
This reuses the image dissector code now used by nspawn and
gpt-auto-discovery.
|
|
ReadWritablePaths= work
5327c910d2fc1ae91bd0b891be92b30379c7467b claimed to add support for "+"
for prefixing paths with the configured RootDirectory=. But actually it
only implemented it in the backend, it did not add support for it to the
configuration file parsers. Fix that now.
|
|
conjunction with RootDirectory=
This adds a boolean unit file setting MountAPIVFS=. If set, the three
main API VFS mounts will be mounted for the service. This only has an
effect on RootDirectory=, which it makes a ton times more useful.
(This is basically the /dev + /proc + /sys mounting code posted in the
original #4727, but rebased on current git, and with the automatic logic
replaced by explicit logic controlled by a unit file setting)
|
|
The source_malloc field wants to be freed, too.
|
|
If we can, use a memfd for serializing state during a daemon reload or
reexec. Fall back to a file in /run/systemd or /tmp only if memfds are
not available.
See: #5016
|
|
Let's add an extra safety check: before entering a reload/reexec, let's
verify that there's enough room in /run for it.
Fixes: #5016
|
|
Fix WorkDir=~ with empty User=
|
|
This seems like something that shouldn't be higher then debug level, even
if it does not get emitted too often.
Fixes #5228.
|
|
Before previous commit, username would be NULL for root, and set only
for other users. So the argument passed to utmp_put_init_process()
would be "root" for other users and NULL for root. Seems strange.
Instead, always pass the username if available.
|
|
This changes the environment for services running as root from:
LANG=C.utf8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
INVOCATION_ID=ffbdec203c69499a9b83199333e31555
JOURNAL_STREAM=8:1614518
to
LANG=C.utf8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
HOME=/root
LOGNAME=root
USER=root
SHELL=/bin/sh
INVOCATION_ID=15a077963d7b4ca0b82c91dc6519f87c
JOURNAL_STREAM=8:1616718
Making the environment special for the root user complicates things
unnecessarily. This change simplifies both our logic (by making the setting
of the variables unconditional), and should also simplify the logic in
services (particularly scripts).
Fixes #5124.
|
|
The general rule is:
- code in shared/ should take an "original_root" argument (possibly NULL)
and pass it along down to chase_symlinks
- code in core/ should always use specify original_root==NULL, since we
don't support running the manager from non-root directory
- code in systemctl and other tools should pass arg_root.
For any code that is called from tools which support --root, chase_symlinks
must be used to look up paths.
|
|
|
|
Cleanup of error code mismatch for masked units
|
|
units
The warning "Cannot add dependency job, ignoring" was downgraded to info in one
place, but not in the other.
C.f. #5179.
|