Age | Commit message (Collapse) | Author |
|
|
|
Network file dropins
|
|
In preparation for adding a version which takes a strv.
|
|
When a docker container is confined with AppArmor [1] and happens to run
on top of a kernel that supports mount mediation [2], e.g. any Ubuntu
kernel, mount(2) returns EACCES instead of EPERM. This then leads to:
systemd-logind[33]: Failed to mount per-user tmpfs directory /run/user/1000: Permission denied
login[42]: pam_systemd(login:session): Failed to create session: Access denied
and user sessions don't start.
This also applies to selinux that too returns EACCES on mount denial.
[1] https://github.com/docker/docker/blob/master/docs/security/apparmor.md#understand-the-policies
[2] http://bazaar.launchpad.net/~apparmor-dev/apparmor/master/view/head:/kernel-patches/4.7/0025-UBUNTU-SAUCE-apparmor-Add-the-ability-to-mediate-mou.patch
|
|
Add documentation to #3924
|
|
The parsing functions for [User]TasksMax were inconsistent. Empty string and
"infinity" were interpreted as no limit for TasksMax but not accepted for
UserTasksMax. Update them so that they're consistent with other knobs.
* Empty string indicates the default value.
* "infinity" indicates no limit.
While at it, replace opencoded (uint64_t) -1 with CGROUP_LIMIT_MAX in TasksMax
handling.
v2: Update empty string to indicate the default value as suggested by Zbigniew
Jędrzejewski-Szmek.
v3: Fixed empty UserTasksMax handling.
|
|
This adds the boolean RemoveIPC= setting to service, socket, mount and swap
units (i.e. all unit types that may invoke processes). if turned on, and the
unit's user/group is not root, all IPC objects of the user/group are removed
when the service is shut down. The life-cycle of the IPC objects is hence bound
to the unit life-cycle.
This is particularly relevant for units with dynamic users, as it is essential
that no objects owned by the dynamic users survive the service exiting. In
fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set.
In order to communicate the UID/GID of an executed process back to PID 1 this
adds a new "user lookup" socket pair, that is inherited into the forked
processes, and closed before the exec(). This is needed since we cannot do NSS
from PID 1 due to deadlock risks, However need to know the used UID/GID in
order to clean up IPC owned by it if the unit shuts down.
|
|
This reverts commit 8121f4d209eca85dcb11830800483cdfafbef9b7.
The special 'key handling' inhibitors should always work regardless of
any *IgnoreInhibited settings – otherwise they're nearly useless.
Reverts: #3470
Fixes: #3897
|
|
config_parse_user_tasks_max() was incorrectly accepting percentage value
between 1 and 99. Update it to accept 0% and 100%. This brings it in line
with TasksMax handling in systemd.
|
|
|
|
Let's change from a fixed value of 12288 tasks per user to a relative value of
33%, which with the kernel's default of 32768 translates to 10813. This is a
slight decrease of the limit, for no other reason than "33%" sounding like a nice
round number that is close enough to 12288 (which would translate to 37.5%).
(Well, it also has the nice effect of still leaving a bit of room in the PID
space if there are 3 cooperating evil users that try to consume all PIDs...
Also, I like my bikesheds blue).
Since the new value is taken relative, and machined's TasksMax= setting
defaults to 16384, 33% inside of containers is usually equivalent to 5406,
which should still be ample space.
To summarize:
| on the host | in the container
old default | 12288 | 12288
new default | 10813 | 5406
|
|
This way systemd is informed that we consider everything inside the scope as
"left-over", and systemd can log about killing it.
With this change systemd will log about all processes killed due to the session
clean-up on KillUserProcesses=yes.
|
|
|
|
|
|
sd-bus generally exposes bools as "int" instead of "bool" in the public API.
This is relevant when unmarshaling booleans, as the relevant functions expect
an int* pointer and no bool* pointer. Since sizeof(bool) is not necessarily the
same as sizeof(int) this is problematic and might result in memory corruption.
Let's fix this, and make sure bus_map_all_properties() handles booleans as
ints, as the rest of sd-bus, and make all users of it expect the right thing.
|
|
Delete the dbus1 generator and some critical wiring. This prevents
kdbus from being loaded or detected. As such, it will never be used,
even if the user still has a useful kdbus module loaded on their system.
Sort of fixes #3480. Not really, but it's better than the current state.
|
|
the pager (#3550)
If "systemctl -H" is used, let's make sure we first terminate the bus
connection, and only then close the pager. If done in this order ssh will get
an EOF on stdin (as we speak D-Bus through ssh's stdin/stdout), and then
terminate. This makes sure the standard error we were invoked on is released by
ssh, and only that makes sure we don't deadlock on the pager which waits for
all clients closing its input pipe.
(Similar fixes for the various other xyzctl tools that support both pagers and
-H)
Fixes: #3543
|
|
The various bits of code did the scaling all different, let's unify this,
given that the code is not trivial.
|
|
And port a couple of users over to it.
|
|
|
|
|
|
We need to explicitly define authorizations for allow_inactive and
allow_active. Otherwise one is getting "Access denied" when run from a
local console:
$ loginctl enable-linger
Could not enable linger: Access denied
|
|
This reverts commit 483d8bbb4c0190f419bf9fba57fb0feb1a56bea6.
In [1] Michel Dänzer and Daniel Vetter wrote:
>> The scenario you describe isn't possible if the Wayland compositor
>> directly uses the KMS API of /dev/dri/card*, but it may be possible if
>> the Wayland compositor uses the fbdev API of /dev/fb* instead (e.g. if
>> weston uses its fbdev backend).
>
> Yeah, if both weston and your screen grabber uses native fbdev API you can
> now screenshot your desktop. And since fbdev has no concept of "current
> owner of the display hw" like the drm master, I think this is not fixable.
> At least not just in userspace. Also even with native KMS compositors
> fbdev still doesn't have the concept of ownership, which is why it doesn't
> bother clearing it's buffer before KMS takes over. I agree that this
> should be reverted or at least hidden better.
TBH, I think that privilege separation between processes running under the same
UID is tenuous. Even with drm, in common setups any user process can ptrace the
"current owner of the display" and call DROP_MASTER or do whatever. It *is*
possible to prevent that, e.g. by disabling ptrace using yama.ptrace_scope, or
selinux, and so on, but afaik this is not commonly done. E.g. all Fedora
systems pull in elfutils-default-yama-scope.rpm through dependencies which sets
yama.ptrace_scope=0. And even assuming that ptrace was disabled, it is trivial
to modify files on disk, communicate through dbus, etc; there is just to many
ways for a non-sandboxed process to interact maliciously with the display shell
to close them all off. To achieve real protection, some sort of sandboxing
must be implemented, and in that case there is no need to rely on access mode
on the device files, since much more stringent measures have to be implemented
anyway.
The situation is similar for framebuffer devices. It is common to add
framebuffer users to video group to allow them unlimited access to /dev/fb*.
Using uaccess would be better solution in that case. Also, since there is no
"current owner" limitation like in DRM, processes running under the same UID
should be able to access /proc/<pid-of-display-server>/fd/* and gain access to
the devices. Nevertheless, weston implements a suid wrapper to access the
devices and then drop privileges, and this patch would make this daemon
pointless. So if the weston developers feel that this change reduces security,
I prefer to revert it.
[1] https://lists.freedesktop.org/archives/wayland-devel/2016-May/029017.html
|
|
Desktop environments can keep this property up to date to allow
applications to easily track session's Lock status.
|
|
That function doesn't draw anything on it's own, just returns a string, which
sometimes is more than one character. Also remove "DRAW_" prefix from character
names, TREE_* and ARROW and BLACK_CIRCLE are unambigous on their own, don't
draw anything, and are always used as an argument to special_glyph().
Rename "DASH" to "MDASH", as there's more than one type of dash.
|
|
core: use an AF_UNIX/SOCK_DGRAM socket for cgroup agent notification
|
|
|
|
For similar reasons as the recent addition of a limit on sessions.
Note that we don't enforce a limit on inhibitors per-user currently, but
there's an implicit one, since each inhibitor takes up one fd, and fds are
limited via RLIMIT_NOFILE, and the limit on the number of processes per user.
|
|
|
|
If we have a lot of simultaneous sessions we really shouldn't send the full
list of active sessions with each PropertyChanged message for user and seat
objects, as that can become quite substantial data, we probably shouldn't dump
on the bus on each login and logout.
Note that the global list of sessions doesn't send out changes like this
either, it only supports requesting the session list with ListSessions().
If cients want to get notified about sessions coming and going they should
subscribe to SessionNew and SessionRemoved signals, and clients generally do
that already.
This is kind of an API break, but then again the fact that this was included
was never documented.
|
|
Let's make sure we process session and inhibitor pipe fds (that signal
sessions/inhibtors going away) at a higher priority
than new bus calls that might create new sessions or inhibitors. This helps
ensuring that the number of open sessions stays minimal.
|
|
We really should put limits on all resources we manage, hence add one to the
number of concurrent sessions, too. This was previously unbounded, hence set a
relatively high limit of 8K by default.
Note that most PAM setups will actually invoke pam_systemd prefixed with "-",
so that the return code of pam_systemd is ignored, and the login attempt
succeeds anyway. On systems like this the session will be created but is not
tracked by systemd.
|
|
The macro determines the right length of a AF_UNIX "struct sockaddr_un" to pass to
connect() or bind(). It automatically figures out if the socket refers to an
abstract namespace socket, or a socket in the file system, and properly handles
the full length of the path field.
This macro is not only safer, but also simpler to use, than the usual
offsetof() + strlen() logic.
|
|
This way the user service will have a loginuid, and it will be inherited by
child services. This shouldn't change anything as far as systemd itself is
concerned, but is nice for various services spawned from by systemd --user
that expect a loginuid.
pam_loginuid(8) says that it should be enabled for "..., crond and atd".
user@.service should behave similarly to those two as far as audit is
concerned.
https://bugzilla.redhat.com/show_bug.cgi?id=1328947#c28
|
|
Make this an output flag instead, so that our function prototypes can lose one
parameter
|
|
This ports over machinectl and loginctl to also use the new GetProcesses() bus
call to show the process tree of a container or login session. This is similar
to how systemctl already has been ported over in a previous commit.
|
|
Kill user session scope by default
|
|
zbyszek (1002)
Since: Tue 2016-04-12 23:11:46 EDT; 23min ago
State: active
Sessions: *3
Linger: yes
Unit: user-1002.slice
├─user@1002.service
│ └─init.scope
│ ├─38 /usr/lib/systemd/systemd --user
│ └─39 (sd-pam)
└─session-3.scope
├─ 31 login -- zbyszek
├─ 44 -bash
├─15076 loginctl user-status zbyszek
└─15077 less
|
|
We enable lingering for anyone who wants this. It is still disabled by
default to avoid keeping long-running processes accidentally.
Admins might want to customize this policy on multi-user sites.
|
|
Instead of KillOnlyUsers being a filter for KillUserProcesses, it can now be
used to specify users to kill, independently of the KillUserProcesses
setting. Having the settings orthogonal seems to make more sense. It also
makes KillOnlyUsers symmetrical to KillExcludeUsers.
|
|
|
|
This ensures that users sessions are properly cleaned up after.
The admin can still enable or disable linger for specific users to allow
them to run processes after they log out. Doing that through the user
session is much cleaner and provides better control.
dbus daemon can now be run in the user session (with --enable-user-session,
added in 1.10.2), and most distributions opted to pick this configuration.
In the normal case it makes a lot of sense to kill remaining processes.
The exception is stuff like screen and tmux. But it's easy enough to
work around, a simple example was added to the man page in previous
commit. In the long run those services should integrate with the systemd
users session on their own.
https://bugs.freedesktop.org/show_bug.cgi?id=94508
https://github.com/systemd/systemd/issues/2900
|
|
v2:
- fix setting of kill_user_processes and
*_ignore_inhibited settings
|
|
|
|
The coccinelle patch didn't work in some places, I have no idea why.
|
|
numbers
And port all code over to use it.
|
|
Add `--value` option to systemctl and loginctl to only print values
|
|
|
|
With this option, systemctl will only print the rhs in show:
$ systemctl show -p Wants,After systemd-journald --value
systemd-journald.socket ...
systemd-journald-dev-log.socket ...
This is useful in scripts, because the need to call awk or similar
is removed.
|
|
It's possible that sd_bus_creds_get_tty() fails and thus
scheduled_shutdown_tty is NULL in method_schedule_shutdown().
Fix logind_wall_tty_filter() to get along with that, by showing the message on
all TTYs, instead of crashing in strcmp().
https://launchpad.net/bugs/1553040
|