Age | Commit message (Collapse) | Author |
|
Let's make this an excercise in dogfooding: let's turn on more security
features for all our long-running services.
Specifically:
- Turn on RestrictRealtime=yes for all of them
- Turn on ProtectKernelTunables=yes and ProtectControlGroups=yes for most of
them
- Turn on RestrictAddressFamilies= for all of them, but different sets of
address families for each
Also, always order settings in the unit files, that the various sandboxing
features are close together.
Add a couple of missing, older settings for a numbre of unit files.
Note that this change turns off AF_INET/AF_INET6 from udevd, thus effectively
turning of networking from udev rule commands. Since this might break stuff
(that is already broken I'd argue) this is documented in NEWS.
|
|
Take away kernel keyring access, CPU emulation system calls and various debug
system calls from the various daemons we have.
|
|
Add a line
SystemCallFilter=~@clock @module @mount @obsolete @raw-io ptrace
for daemons shipped by systemd. As an exception, systemd-timesyncd
needs @clock system calls and systemd-localed is not privileged.
ptrace(2) is blocked to prevent seccomp escapes.
|
|
Secure daemons shipped by systemd by enabling MemoryDenyWriteExecute.
Closes: #3459
|
|
|
|
Otherwise we might run into deadlocks, when journald blocks on the
notify socket on PID 1, and PID 1 blocks on IPC to dbus-daemon and
dbus-daemon blocks on logging to journald. Break this cycle by making
sure that journald never ever blocks on PID 1.
Note that this change disables support for event loop watchdog support,
as these messages are sent in blocking style by sd-event. That should
not be a big loss though, as people reported frequent problems with the
watchdog hitting journald on excessively slow IO.
Fixes: #1505.
|
|
Apparently, disk IO issues are more frequent than we hope, and 1min
waiting for disk IO happens, so let's increase the watchdog timeout a
bit, for all our services.
See #1353 for an example where this triggers.
|
|
This reverts commit 6a716208b346b742053cfd01e76f76fb27c4ea47.
Apparently this doesn't work.
http://lists.freedesktop.org/archives/systemd-devel/2015-February/028212.html
|
|
No setuid programs are expected to be executed, so add
SecureBits=noroot noroot-locked
to unit files.
|
|
When there are a lot of split out journal files, we might run out of fds
quicker then we want. Hence: bump RLIMIT_NOFILE to 16K if possible.
Do these even for journalctl. On Fedora the soft RLIMIT_NOFILE is at 1K,
the hard at 4K by default for normal user processes, this code hence
bumps this up for users to 4K.
https://bugzilla.redhat.com/show_bug.cgi?id=1179980
|
|
Making use of the fd storage capability of the previous commit, allow
restarting journald by serilizing stream state to /run, and pushing open
fds to PID 1.
|
|
It already calls sd_notify(), so it looks like an oversight.
Without it, its ordering to systemd-journal-flush.service is
non-deterministic and the SIGUSR1 from flushing may kill journald before
it has its signal handlers set up.
https://bugs.freedesktop.org/show_bug.cgi?id=85871
https://bugzilla.redhat.com/show_bug.cgi?id=1159641
|
|
|
|
systemd-journald check the cgroup id to support rate limit option for
every messages. so journald should be available to access cgroup node in
each process send messages to journald.
In system using SMACK, cgroup node in proc is assigned execute label
as each process's execute label.
so if journald don't want to denied for every process, journald
should have all of access rule for all process's label.
It's too heavy. so we could give special smack label for journald te get
all accesses's permission.
'^' label.
When assign '^' execute smack label to systemd-journald,
systemd-journald need to add CAP_MAC_OVERRIDE capability to get that smack privilege.
so I want to notice this information and set default capability to
journald whether system use SMACK or not.
because that capability affect to only smack enabled kernel
|
|
also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.
With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
|
|
This way we can make the socket also available for sandboxed apps that
have their own private /dev. They can now simply symlink the socket from
/dev.
|
|
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.
ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.
This patch also enables these settings for all our long-running services.
Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
|
|
|
|
In the initrd we don't need the flush service hence don't attempt to
pull it in.
|
|
boot phase
|
|
These services should be restarted as quickly as possible if they fail,
and the extra safety net of the holdoff time is not necessary.
|
|
The old automatism that the flushing of the journal from /run to /var
was triggered by the appearance of /var/log/journal is broken if that
directory is mounted from another host and hence always available to be
useful as mount point. To avoid probelsm with this, introduce a new unit
that is explicitly orderer after all mounte files systems and triggers
the flushing.
|
|
|
|
|
|
|
|
This should help making the boot process a bit easier to explore and
understand for the administrator. The simple idea is that "systemctl
status" now shows a link to documentation alongside the other status and
decriptionary information of a service.
This patch adds the necessary fields to all our shipped units if we have
proper documentation for them.
|
|
We finally got the OK from all contributors with non-trivial commits to
relicense systemd from GPL2+ to LGPL2.1+.
Some udev bits continue to be GPL2+ for now, but we are looking into
relicensing them too, to allow free copy/paste of all code within
systemd.
The bits that used to be MIT continue to be MIT.
The big benefit of the relicensing is that closed source code may now
link against libsystemd-login.so and friends.
|
|
we can fake SCM_CREDENTIALS
|
|
|
|
socket queues syslog messages from early boot on
|
|
|