Age | Commit message (Collapse) | Author |
|
It's not safe invoking NSS from PID 1, hence fork off worker processes
that upload the policy into the kernel for busnames.
|
|
This would otherwise unconditionally trigger any /boot autofs mount,
which we probably should avoid.
ProtectSystem= will now only cover /usr and (optionally) /etc, both of
which cannot be autofs anyway.
ProtectHome will continue to cover /run/user and /home. The former
cannot be autofs either. /home could be, however is frequently enough
used (unlikey /boot) so that it isn't too problematic to simply trigger
it unconditionally via ProtectHome=.
|
|
system
This is relatively complex, as we cannot invoke NSS from PID 1, and thus
need to fork a helper process temporarily.
|
|
|
|
Otherwise .netwrok matching on MAC address will not work.
Based on patch by Dave Reisner, and bug originally reported by Max Pray.
|
|
also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.
With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
|
|
|
|
Now that we moved the actual syslog socket to
/run/systemd/journal/dev-log we can actually make /dev/log a symlink to
it, when PrivateDevices= is used, thus making syslog available to
services using PrivateDevices=.
|
|
This way we can make the socket also available for sandboxed apps that
have their own private /dev. They can now simply symlink the socket from
/dev.
|
|
|
|
With Symlinks= we can manage one or more symlinks to AF_UNIX or FIFO
nodes in the file system, with the same lifecycle as the socket itself.
This has two benefits: first, this allows us to remove /dev/log and
/dev/initctl from /dev, thus leaving only symlinks, device nodes and
directories in the /dev tree. More importantly however, this allows us
to move /dev/log out of /dev, while still making it accessible there, so
that PrivateDevices= can provide /dev/log too.
|
|
The kernel will return 0 for REREADPT when no partition table
is found, we have to send out "change" ourselves.
|
|
|
|
mounted partitions:
# dd if=/dev/zero of=/dev/sda bs=1 count=1
UDEV [4157.369250] change .../0:0:0:0/block/sda (block)
UDEV [4157.375059] change .../0:0:0:0/block/sda/sda1 (block)
UDEV [4157.397088] change .../0:0:0:0/block/sda/sda2 (block)
UDEV [4157.404842] change .../0:0:0:0/block/sda/sda4 (block)
unmounted partitions:
# dd if=/dev/zero of=/dev/sdb bs=1 count=1
UDEV [4163.450217] remove .../target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV [4163.593167] change .../target6:0:0/6:0:0:0/block/sdb (block)
UDEV [4163.713982] add .../target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
|
|
|
|
Reported by Kay.
|
|
This should make sure that fdisk-like programs will automatically
cause an update of all partitions, just like mkfs-like programs cause
an update of the partition.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=79576#c5
|
|
|
|
Either become uid/gid of the client we have been forked for, or become
the "systemd-bus-proxy" user if the client was root. We retain
CAP_IPC_OWNER so that we can tell kdbus we are actually our own client.
|
|
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.
ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.
This patch also enables these settings for all our long-running services.
Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
|
|
Configuration will be in
root:root /run/systemd/network
and state will be in
systemd-network:systemd-network /run/systemd/netif
This matches what we do for logind's seat/session state.
|
|
|
|
|
|
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=79576
|
|
On systems which cannot receive unicast packets until its IP stack has been configured
we need to request broadcast packets. We are currently not able to reliably detect when
this is necessary, so set it unconditionally for now.
This is set on all packets, but the DHCP server will only broadcast the packets that are
necessary, and unicast the rest.
For more information please refer to this thread in CoreOS: https://github.com/coreos/bugs/issues/12
[tomegun: rephrased commit message]
|
|
This service is not yet network facing, but let's prepare nonetheless.
Currently all caps are dropped, but some may need to be kept in the
future.
|
|
Rely on modules being built-in or autoloaded on-demand.
As networkd is a network facing service, we want to limits its capabilities,
as much as possible. Also, we may not have CAP_SYS_MODULE in a container,
and we want networkd to work the same there.
Module autoloading does not always work, but should be fixed by the kernel
patch f98f89a0104454f35a: 'net: tunnels - enable module autoloading', which
is currently in net-next and which people may consider backporting if they
want tunneling support without compiling in the modules.
Early adopters may also use a module-load.d snippet and order
systemd-modules-load.service before networkd to force the module
loading of tunneling modules.
This sholud fix the various build issues people have reported.
|
|
This patch enables vti tunnel support.
example conf:
file : vti.netdev
[NetDev]
Name=vti-tun
Kind=vti
MTUBytes=1480
[Tunnel]
Local=X.X.X.X
Remote=X.X.X.X
file: vti.network
[Match]
Name=em1
[Network]
Tunnel=vti-tun
TODO:
Add more attributes for vti tunnel
IFLA_VTI_IKEY
IFLA_VTI_OKEY
|
|
This patch adds path of mtu discovery for sit tunnel.
To enable/disable DiscoverPathMTU is introduced.
Example configuration
file: sit.netdev
[NetDev]
Name=sit-tun
Kind=sit
MTUBytes=1480
[Tunnel]
DiscoverPathMTU=1
Local=X.X.X.X
Remote=X.X.X.X
By default pmtudisc is turned on , if DiscoverPathMTU
is missing from the config. To turn it off
DiscoverPathMTU=0 needs to be set.
|
|
This patch enables gre tunnel support.
example conf:
file : gre.netdev
[NetDev]
Name=gre-tun
Kind=gre
MTUBytes=1480
[Tunnel]
Local=X.X.X.X
Remote=X.X.X.X
file: gre.network
[Match]
Name=em1
[Network]
Tunnel=gre-tun
TODO:
Add more attributes for gre tunnel
IFLA_GRE_IFLAGS
IFLA_GRE_IFLAGS
IFLA_GRE_IKEY
IFLA_GRE_OKEY
|
|
|
|
This patch adds veth device support to networkd.
Example conf:
File: veth.netdev
[NetDev]
Name=veth-test
Kind=veth
[Peer]
Name=veth-peer
|
|
|
|
This allows us to run networkd mostly unpriviliged with the exception of
CAP_NET_* and CAP_SYS_MODULE. I'd really like to get rid of the latter
though...
|
|
make use of it from other daemons too
This is preparation to make networkd work as unpriviliged user.
|
|
|
|
I am getting
"Error calling EVIOCSKEYCODE (scan code 0xc022d, key code 418): Invalid
argument", the error message does not tell on which specific device the
problem is, add that info.
|
|
ignore_file currently allows any file ending with '~' while it
seems that the opposite was intended:
a228a22fda4faa9ecb7c5a5e499980c8ae5d2a08
|
|
|
|
Instead of accessing /proc/1/environ directly, trying to read the
$container variable from it, let's make PID 1 save the contents of that
variable to /run/systemd/container. This allows us to detect containers
without the need for CAP_SYS_PTRACE, which allows us to drop it from a
number of daemons and from the file capabilities of systemd-detect-virt.
Also, don't consider chroot a container technology anymore. After all,
we don't consider file system namespaces container technology anymore,
and hence chroot() should be considered a container even less.
|
|
|
|
It is almost always incorrect to allow DHCP or other sources of
transient host names to override an explicitly configured static host
name.
This commit changes things so that if a static host name is set, this
will override the transient host name (eg: provided via DHCP). Transient
host names can still be used to provide host names for machines that have
not been explicitly configured with a static host name.
The exception to this rule is if the static host name is set to
"localhost". In those cases we act as if no
static host name has been explicitly set.
As discussed elsewhere, systemd may want to have an fd based ownership
of the transient name. That part is not included in this commit.
|
|
|
|
Both systemd-analyze and systemd-run only access org.freedesktop.systemd1
on the bus. This patch allows using systemd-run --user and systemd-analyze
--user even if the user session's bus is not properly integrated with the
systemd user unit.
https://bugs.freedesktop.org/show_bug.cgi?id=79252 and other reports...
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=49316
|
|
|
|
nspawn and the container child use eventfd to wait and notify each other
that they are ready so the container setup can be completed.
However in its current form the wait/notify event ignore errors that
may especially affect the child (container).
On errors the child will jump to the "child_fail" label and terminate
with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd
is created without the "EFD_NONBLOCK" flag, this leaves the parent
blocking on the eventfd_read() call. The container can also be killed
at any moment before execv() and the parent will not receive
notifications.
We can fix this by using cheap mechanisms, the new high level eventfd
API and handle SIGCHLD signals:
* Keep the cheap eventfd and EFD_NONBLOCK flag.
* Introduce eventfd states for parent and child to sync.
Child notifies parent with EVENTFD_CHILD_SUCCEEDED on success or
EVENTFD_CHILD_FAILED on failure and before _exit(). This prevents the
parent from waiting on an event that will never come.
* If the child is killed before execv() or before notifying the parent,
we install a NOP handler for SIGCHLD which will interrupt blocking calls
with EINTR. This gives a chance to the parent to call wait() and
terminate in main().
* If there are no errors, parent will block SIGCHLD, restore default
handler and notify child which will do execv(), then parent will pass
control to process_pty() to do its magic.
This was exposed in part by:
https://bugs.freedesktop.org/show_bug.cgi?id=76193
Reported-by: Tobias Hunger tobias.hunger@gmail.com
|
|
Move the container wait logic into its own wait_for_container() function
and add two status codes: CONTAINER_TERMINATED or CONTAINER_REBOOTED.
The status will be stored in its argument, this way we handle:
a) Return negative on failures.
b) Return zero on success and set the status to either
CONTAINER_REBOOTED or CONTAINER_TERMINATED.
These status codes are used to terminate nspawn or loop again in case of
CONTAINER_REBOOTED.
|