Age | Commit message (Collapse) | Author |
|
Also, make it slightly more powerful, by accepting a flags argument, and
make it safe for handling if more than one cmsg attribute happens to be
attached.
|
|
|
|
Tested with a dummy service running 'sleep', modifying its CPUAffinity,
restarting the service and checking the ^Cpus_allowed entries in the
/proc/PID/status file.
|
|
Use the new code in config_parse_cpu_affinity2.
Tested by modifying CPUAffinity=... setting in /etc/systemd/system.conf
and reloading the daemon, then checking ^Cpus_allowed in /proc/1/status
to confirm the correct CPU mask is in place.
|
|
core: make setup_pam() synchronous
|
|
This cleans up exec_child() function by moving mac_smack_apply_pid()
and setup_pam() to the same condition block, since both of them have
the same condition (i.e params->apply_permissions). It improves
readability without changing its operation.
|
|
When 'SmackProcessLabel=' is used in user@.service file, all processes
launched in systemd user session should be labeled as the designated name
of 'SmackProcessLabel' directive. However, if systemd has its own smack
label using '--with-smack-run-label' configuration, '(sd-pam)' is
labeled as the specific name of '--with-smack-run-label'. If
'SmackProcessLabel=' is used in user@.service file without
'--with-smack-run-label' configuration, (sd-pam) is labeled as "_" since
systemd (i.e. pid=1) is labeled as "_".
This is mainly because setup_pam() function is called before applying
smack label to child process. This patch fixes it by calling setup_pam()
after setting the smack label.
|
|
Hook more properties for transient units
|
|
systemd-run can now launch units with WorkingDirectory, RootDirectory set.
|
|
If we spawn a unit with a non-empty 'PAMName=', we fork off a
child-process _inside_ the unit, known as '(sd-pam)', which watches the
session. It waits for the main-process to exit and then finishes it via
pam_close_session(3).
However, the '(sd-pam)' setup is highly asynchronous. There is no
guarantee that process gets spawned before we finish the unit setup.
Therefore, there might be a root-owned process inside of the cgroup of
the unit, thus causing cg_migrate() to error-out with EPERM.
This patch makes setup_pam() synchronous and waits for the '(sd-pam)'
setup to finish before continuing. This guarantees that setresuid(2) was
at least tried before we continue with the child setup of the real unit.
Note that if setresuid(2) fails, we already warn loudly about it. You
really must make sure that you own the passed user if using 'PAMName='.
It seems very plausible to rely on that assumption.
|
|
Shutting down a user session currently fails with:
Sep 22 22:35:38 david-t2 systemd[640]: Reached target Shutdown.
Sep 22 22:35:38 david-t2 systemd[640]: Starting Exit the Session...
Sep 22 22:35:38 david-t2 systemd[640]: Received SIGRTMIN+24 from PID 659 (kill).
Sep 22 22:35:38 david-t2 systemd[640]: Shutting down.
Sep 22 22:35:38 david-t2 systemd[640]: Not executed by init (PID 1).
Sep 22 22:35:38 david-t2 systemd[640]: Critical error while doing system shutdown: Operation not permitted
This is a regression from:
commit 287419c119ef961db487a281162ab037eba70c61
Author: Alban Crequy <alban.crequy@gmail.com>
Date: Fri Sep 18 13:37:34 2015 +0200
containers: systemd exits with non-zero code
Make sure we never ever execute systemd-shutdown from within a
user-manager. Restore the previous behavior by partially reverting given
commit.
|
|
A variety of mostly unrelated fixes
|
|
core: add support for usb functionfs v3
|
|
By using these parameters functionfs service can specify ffs descriptors
and strings which should be written to ep0.
|
|
For handling functionfs endpoints additional socket type is added.
|
|
Let's underline the header line of the table shown by cgtop, how it is
customary for tables. In order to do this, let's introduce new ANSI
underline macros, and clean up the existing ones as side effect.
|
|
|
|
mount: use libmount to monitor mountinfo & utab
|
|
Some additional files related to single socket may appear in the
filesystem and they should be opened and passed to related service.
This commit adds optional list of file descriptors, which are
dynamically discovered and opened.
|
|
Make sure to propagate error codes from mount-loops correctly. Right now,
we return the return-code of the first mount that did _something_. This is
not what we want. Make sure we return an error if _any_ mount fails (and
then make sure to return the first error to not hide proper errors due to
consequential errors like -ENOTDIR).
Reported by cee1 <fykcee1@gmail.com>.
|
|
When Group is set in the unit, the runtime directories are owned by
this group and not the default group of the user (same for cgroup paths
and standard outputs)
Fix #1231
|
|
When a systemd service running in a container exits with a non-zero
code, it can be useful to terminate the container immediately and get
the exit code back to the host, when systemd-nspawn returns. This was
not possible to do. This patch adds the following to make it possible:
- Add a read-only "ExitCode" property on PID 1's "Manager" bus object.
By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when
called on baremetal: it is only allowed in containers or in user
session.
- Add support in systemctl to call "systemctl exit 42". It reuses the
existing code for user session.
- Add exit.target and systemd-exit.service to the system instance.
- Change main() to actually call systemd-shutdown to exit() with the
correct value.
- Add verb 'exit' in systemd-shutdown with parameter --exit-code
- Update systemctl manpage.
I used the following to test it:
| $ sudo rkt --debug --insecure-skip-verify run \
| --mds-register=false --local docker://busybox \
| --exec=/bin/chroot -- /proc/1/root \
| systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42
Fixes https://github.com/systemd/systemd/issues/1290
|
|
|
|
|
|
cgroup: add support for net_cls controllers
|
|
Add a new config directive called NetClass= to CGroup enabled units.
Allowed values are positive numbers for fix assignments and "auto" for
picking a free value automatically, for which we need to keep track of
dynamically assigned net class IDs of units. Introduce a hash table for
this, and also record the last ID that was given out, so the allocator
can start its search for the next 'hole' from there. This could
eventually be optimized with something like an irb.
The class IDs up to 65536 are considered reserved and won't be
assigned automatically by systemd. This barrier can be made a config
directive in the future.
Values set in unit files are stored in the CGroupContext of the
unit and considered read-only. The actually assigned number (which
may have been chosen dynamically) is stored in the unit itself and
is guaranteed to remain stable as long as the unit is active.
In the CGroup controller, set the configured CGroup net class to
net_cls.classid. Multiple unit may share the same net class ID,
and those which do are linked together.
|
|
Hook more properties for transient units
|
|
The current implementation directly monitor /proc/self/mountinfo and
/run/mount/utab files. It's really not optimal because utab file is
private libmount stuff without any official guaranteed semantic.
The libmount since v2.26 provides API to monitor mount kernel &
userspace changes and since v2.27 the monitor is usable for
non-root users too.
This patch replaces the current implementation with libmount based
solution.
Signed-off-by: Karel Zak <kzak@redhat.com>
|
|
Let's make sure that we follow the same codepaths when adjusting a
cgroup property via the dbus SetProperty() call, and when we execute the
StartupCPUShares= effect.
|
|
value for a reason
|
|
In all occasions when this function is called we do so anyway, so let's
move this inside, to make things easier.
|
|
|
|
Not strictly necessary, but makes clear the fds are invalidated. Make
sure we do the same here as in most other cases.
|
|
There's a good chance we never needs these sets, hence allocate them
only when needed.
|
|
Let's stop using the "unsigned long" type for weights/shares, and let's
just use uint64_t for this, as that's what we expose on the bus.
Unify parsers, and always validate the range for these fields.
Correct the default blockio weight to 500, since that's what the kernel
actually uses.
When parsing the weight/shares settings from unit files accept the empty
string as a way to reset the weight/shares value. When getting it via
the bus, uniformly map (uint64_t) -1 to unset.
Open up StartupCPUShares= and StartupBlockIOWeight= to transient units.
|
|
systemd-run can now launch units with PrivateTmp, PrivateDevices,
PrivateNetwork, NoNewPrivileges set.
|
|
|
|
core: add support for the "pids" cgroup controller
|
|
This adds support for the new "pids" cgroup controller of 4.3 kernels.
It allows accounting the number of tasks in a cgroup and enforcing
limits on it.
This adds two new setting TasksAccounting= and TasksMax= to each unit,
as well as a gloabl option DefaultTasksAccounting=.
This also updated "cgtop" to optionally make use of the new
kernel-provided accounting.
systemctl has been updated to show the number of tasks for each service
if it is available.
This patch also adds correct support for undoing memory limits for units
using a MemoryLimit=infinity syntax. We do the same for TasksMax= now
and hence keep things in sync here.
|
|
off_t is a really weird type as it is usually 64bit these days (at least
in sane programs), but could theoretically be 32bit. We don't support
off_t as 32bit builds though, but still constantly deal with safely
converting from off_t to other types and back for no point.
Hence, never use the type anymore. Always use uint64_t instead. This has
various benefits, including that we can expose these values directly as
D-Bus properties, and also that the values parse the same in all cases.
|
|
And set_free() too.
Another Coccinelle patch.
|
|
Another Coccinelle patch.
|
|
util: introduce safe_fclose() and port everything over to it
|
|
Adds a coccinelle script to port things over automatically.
|
|
Coccinelle fixes 2
|
|
Let's also clean up single-line while and for blocks.
|
|
The previous coccinelle semantic patch that improved usage of
log_error_errno()'s return value, only looked for log_error_errno()
invocations with a single parameter after the error parameter. Update
the patch to handle arbitrary numbers of additional arguments.
|
|
core: freeze execution if /etc/mtab exists
|
|
The mount monitor that was added to libmount v2.27 requires /etc/mtab to be
non-existant. As systemd now uses that functionality, we cannot monitor any
mounts anymore, and hence not support .mount units.
Systems that have /etc/mtab around as regular file are unsupported by
systemd since a long time.
This patch makes that condition fatal, so we do not boot up with
non-working mount monitor support.
|
|
Even though systemd has its own smack label since
'--with-smack-run-label' configuration is set, the smack label of each
CGROUP root directory should have the star (i.e. *) label. This is
mainly because current Linux Kernel set the label in this way.
(Refer to smack_d_instantiate() in security/smack/smack_lsm.c)
However, if systemd has its own smack label and arg_join_controllers is
explicitly set or initialized by initialize_join_controllers() function,
current systemd creates the symlink in CGROUP root directory with its
own smack label as below.
lrwxrwxrwx. 1 root root System 11 Dec 31 16:00 cpu -> cpu,cpuacct
dr-xr-xr-x. 4 root root * 0 Dec 31 16:01 cpu,cpuacct
lrwxrwxrwx. 1 root root System 11 Dec 31 16:00 cpuacct -> cpu,cpuacct
This patch fixes that bug by copying the smack label from the origin.
|