Age | Commit message (Collapse) | Author |
|
|
|
|
|
mask/handlers
Also, when the child is potentially long-running make sure to set a
death signal.
Also, ignore the result of the reset operations explicitly by casting
them to (void).
|
|
No functional changes.
|
|
This makes path_is_mount_point() consistent with fd_is_mount_point() wrt.
flags.
|
|
This was a typo, swapping prefix_root() in place of prefix_roota().
Fixes CID 1299640.
|
|
Simplify the code a bit, at the cost of potentially duplicating some
memory unneccessarily.
Fixes CID 1299641.
|
|
These have no effect.
Fixes CID 1299643.
|
|
Rather than checking the return of asprintf() we are checking if buf gets allocated,
make it clear that it is ok to ignore the return value.
Fixes CID 1299644.
|
|
Allowed interface name is relatively small. Lets not make
users go in to the source code to figure out what happened.
--machine=debian-tree conflicts with
--machine=debian-tree2
ex: Failed to add new veth \
interfaces (host0, vb-debian-tree): File exists
|
|
Unless CAP_SYSLOG is explicitly passed block all access to kmg
|
|
|
|
|
|
|
|
When systemd-nspawn gets exec*()ed, it inherits the followings file
descriptors:
- 0, 1, 2: stdin, stdout, stderr
- SD_LISTEN_FDS_START, ... SD_LISTEN_FDS_START+LISTEN_FDS: file
descriptors passed by the system manager (useful for socket
activation). They are passed to the child process (process leader).
- extra lock fd: rkt passes a locked directory as an extra fd, so the
directory remains locked as long as the container is alive.
systemd-nspawn used to close all open fds except 0, 1, 2 and the
SD_LISTEN_FDS_START..SD_LISTEN_FDS_START+LISTEN_FDS. This patch delays
the close just before the exec so the nspawn process (parent) keeps the
extra fds open.
This patch supersedes the previous attempt ("cloexec extraneous fds"):
http://lists.freedesktop.org/archives/systemd-devel/2015-May/031608.html
|
|
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=90385
|
|
If a symlink to a combined cgroup hierarchy already exists and points to
the right path, skip it. This avoids an error when the cgroups are set
manually before calling nspawn.
|
|
This allows the user to set the cgroups manually before calling nspawn.
|
|
Previously all bind mount mounts were applied in the order specified,
followed by all tmpfs mounts in the order specified. This is
problematic, if bind mounts shall be placed within tmpfs mounts.
This patch hence reworks the custom mount point logic, and alwas applies
them in strict prefix-first order. This means the order of mounts
specified on the command line becomes irrelevant, the right operation
will always be executed.
While we are at it this commit also adds native support for overlayfs
mounts, as supported by recent kernels.
|
|
Let's just pass on what the user set for us.
|
|
|
|
on the command line
|
|
When --ephemeral is used there's no need to keep the image read-only, so
let's not do that then.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- Move to its own file rm-rf.c
- Change parameters into a single flags parameter
- Remove "honour sticky" logic, it's unused these days
|
|
Some systems abusively restrict mknod, even when the device node already
exists in /dev. This is unfortunate because it prevents systemd-nspawn
from creating the basic devices in /dev in the container.
This patch implements a workaround: when mknod fails, fallback on bind
mounts.
Additionally, /dev/console was created with a mknod with the same
major/minor as /dev/null before bind mounting a pts on it. This patch
removes the mknod and creates an empty regular file instead.
In order to test this patch, I used the following configuration, which I
think should replicate the system with the abusive restriction on mknod:
# grep devices /proc/self/cgroup
4:devices:/user.slice/restrict
# cat /sys/fs/cgroup/devices/user.slice/restrict/devices.list
c 1:9 r
c 5:2 rw
c 136:* rw
# systemd-nspawn --register=false -D .
v2:
- remove "bind", it is not needed since there is already MS_BIND
v3:
- fix error management when calling touch()
- fix lowercase in error message
|
|
We have no such check in any of the other tools, hence don't have one in
nspawn either.
(This should make things nicer for Rocket, among other things)
Note: removing this check does not mean that we support running nspawn
on non-systemd. We explicitly don't. It just means that we remove the
check for running it like that. You are still on your own if you do...
|
|
Try to keep syscalls as minimal as possible.
|
|
CID #1271353.
|
|
Replace ENOTSUP by EOPNOTSUPP as this is what linux actually uses.
|
|
CID #1257765.
|
|
This change makes it so all seccomp filters are mapped
to the appropriate capability and are only added if that
capability was not requested when running the container.
This unbreaks the remaining use cases broken by the
addition of seccomp filters without respecting requested
capabilities.
Co-Authored-By: Clif Houck <me@clifhouck.com>
[zj: - adapt to our coding style, make struct anonymous]
|
|
|
|
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
|
|
|
|
|
|
|
|
(This is incomplete, /proc and /sys are still owned by root from outside
the container, not inside)
|
|
Previously we always invoked the container PID 1 on /dev/console of the
container. With this change we do so only if nspawn was invoked
interactively (i.e. its stdin/stdout was connected to a TTY). In all other
cases we directly pass through the fds unmodified.
This has the benefit that nspawn can be added into shell pipelines.
https://bugs.freedesktop.org/show_bug.cgi?id=87732
|
|
This is similar to systemd-run's --property= setting.
|
|
nspawn containers currently block module loading in all cases, with
no option to disable it. This allows an admin, specifically setting
capability=CAP_SYS_MODULE or capability=all to load modules.
|