Age | Commit message (Collapse) | Author |
|
Let's make this an excercise in dogfooding: let's turn on more security
features for all our long-running services.
Specifically:
- Turn on RestrictRealtime=yes for all of them
- Turn on ProtectKernelTunables=yes and ProtectControlGroups=yes for most of
them
- Turn on RestrictAddressFamilies= for all of them, but different sets of
address families for each
Also, always order settings in the unit files, that the various sandboxing
features are close together.
Add a couple of missing, older settings for a numbre of unit files.
Note that this change turns off AF_INET/AF_INET6 from udevd, thus effectively
turning of networking from udev rule commands. Since this might break stuff
(that is already broken I'd argue) this is documented in NEWS.
|
|
Specifically "machinectl shell" (or its OpenShell() bus call) is implemented by
entering the file system namespace of the container and opening a TTY there.
In order to enter the file system namespace, chroot() is required, which is
filtered by SystemCallFilter='s @mount group. Hence, let's make this work again
and drop @mount from the filter list.
|
|
Take away kernel keyring access, CPU emulation system calls and various debug
system calls from the various daemons we have.
|
|
Add a line
SystemCallFilter=~@clock @module @mount @obsolete @raw-io ptrace
for daemons shipped by systemd. As an exception, systemd-timesyncd
needs @clock system calls and systemd-localed is not privileged.
ptrace(2) is blocked to prevent seccomp escapes.
|
|
Secure daemons shipped by systemd by enabling MemoryDenyWriteExecute.
Closes: #3459
|
|
Container images from Debian or suchlike contain device nodes in /dev. Let's
make sure we can clone them properly, hence pass CAP_MKNOD to machined.
Fixes: #2867 #465
|
|
Apparently, disk IO issues are more frequent than we hope, and 1min
waiting for disk IO happens, so let's increase the watchdog timeout a
bit, for all our services.
See #1353 for an example where this triggers.
|
|
Otherwise copying full directory trees between container and host won't
work, as we cannot access some fiels and cannot adjust the ownership
properly on the destination.
Of course, adding these many caps to the daemon kinda defeats the
purpose of the caps lock-down... but well...
Fixes #433
|
|
machined
This extends the bus interface, adding BindMountMachine() for bind
mounting directories from the host into the container.
|
|
This reverts commit 6a716208b346b742053cfd01e76f76fb27c4ea47.
Apparently this doesn't work.
http://lists.freedesktop.org/archives/systemd-devel/2015-February/028212.html
|
|
No setuid programs are expected to be executed, so add
SecureBits=noroot noroot-locked
to unit files.
|
|
This adds a new bus call to machined that enumerates /var/lib/container
and returns all trees stored in it, distuingishing three types:
- GPT disk images, which are files suffixed with ".gpt"
- directory trees
- btrfs subvolumes
|
|
|
|
|
|
|
|
also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.
With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
|
|
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.
ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.
This patch also enables these settings for all our long-running services.
Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
|
|
this is useful
|
|
long-running daemons
|
|
then
|
|
Adds a new call sd_event_set_watchdog() that can be used to hook up the
event loop with the watchdog supervision logic of systemd. If enabled
and $WATCHDOG_USEC is set the event loop will ping the invoking systemd
daemon right after coming back from epoll_wait() but not more often than
$WATCHDOG_USEC/4. The epoll_wait() will sleep no longer than
$WATCHDOG_USEC/4*3, to make sure the service manager is called in time.
This means that setting WatchdogSec= in a .service file and calling
sd_event_set_watchdog() in your daemon is enough to hook it up with the
watchdog logic.
|
|
|
|
|
|
Embedded folks don't need the machine registration stuff, hence it's
nice to make this optional. Also, I'd expect that machinectl will grow
additional commands quickly, for example to join existing containers and
suchlike, hence it's better keeping that separate from loginctl.
|