Age | Commit message (Collapse) | Author |
|
|
|
/usr/bin/getent instead of in-process
When the container runs a different native architecture than the host we
shouldn't attempt to load the container's NSS modules with the host's
libc. Instead, resolve UID/GID by invoking /usr/bin/getent in the
container. The tool should be fairly universally available and allows us
to do resolving of the UID/GID with the container's libc in a parsable
format.
https://bugs.freedesktop.org/show_bug.cgi?id=75733
|
|
child after the parent added us to the device cgroup
|
|
We overmount /dev/console with an external pty anyway, hence there's no
point in using the real major/minor when we create the node to
overmount. Instead, use the one of /dev/null now.
This fixes a race against the cgroup device controller setup we are
using. In case /dev/console was create before the cgroup policy was
applied all was good, but if created in the opposite order the mknod()
would fail, since creating /dev/console is not allowed by it. Creating
/dev/null instances is however permitted, and hence use it.
|
|
Discoverable Partitions Specification
|
|
Running 'systemd-nspawn -D /srv/Fedora/' gave me this error:
Failed to read /proc/self/loginuid: No such file or directory
Container Fedora failed with error code 1.
This patch fixes the problem.
|
|
|
|
container
|
|
|
|
"ve-" interface name prefix
This way we can recognize the interfaces later on to apply different
host-side configuration to them.
|
|
first (or second)
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:
1. The object the new object is derived from is put first, if there is any
2. The object we are creating will be returned in the next arguments
3. This is followed by any additional arguments
Rationale:
For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.
Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.
Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
|
|
If -flto is used then gcc will generate a lot more warnings than before,
among them a number of use-without-initialization warnings. Most of them
without are false positives, but let's make them go away, because it
doesn't really matter.
|
|
processes
|
|
containers on a 64bit host
|
|
seccomp setup
|
|
And make use of it where appropriate for executing services and for
nspawn.
|
|
Arch Linux uses nspawn as a container for building packages and needs
to be able to start a 32bit chroot from a 64bit host. 24fb11120756
disrupted this feature when seccomp handling was added.
|
|
This mimics the sd-bus api, as we may need it in the future.
|
|
|
|
|
|
This adds the host side of the veth link to the given bridge.
Also refactor the creation of the veth interfaces a bit to set it up
from the host rather than the container. This simplifies the addition
to the bridge, but otherwise the behavior is unchanged.
|
|
|
|
We can always know the size based on the type, so let's do this inside the library.
|
|
|
|
When invoked without -D in an arbitrary directory we should not try to
execute anything, make some validity checks first.
|
|
containers
The kernel still doesn't support audit in containers, so let's make use
of seccomp and simply turn it off entirely. We can get rid of this big
as soon as the kernel is fixed again.
|
|
|
|
|
|
one operation
|
|
|
|
sd_rtnl_xxx_new_yyy()
So far we followed the rule to always indicate the "flavour" of
constructors after the "_new_" or "_open_" in the function name, so
let's keep things in sync here for rtnl and do the same.
|
|
The "sd_" prefix is supposed to be used on exported symbols only, and
not in the middle of names. Let's drop it from the cleanup macros hence,
to make things simpler.
The bus cleanup macros don't carry the "sd_" either, so this brings the
APIs a bit nearer.
|
|
into the container
|
|
|
|
of this
|
|
or services) as machine with machined
|
|
the container with machined
|
|
namespacing
|
|
Let's always call the security labels the same way:
SMACK: "Smack Label"
SELINUX: "SELinux Security Context"
And the low-level encapsulation is called "seclabel". Now let's hope we
stick to this vocabulary in future, too, and don't mix "label"s and
"security contexts" and so on wildly.
|
|
/etc/os-release is expected for the case for booting a full system, and
need not be required for thin container execution.
|
|
the API file systems, nothing else
|
|
|
|
|
|
|
|
- As suggested, prefix argument variables with "arg_" how we do this
usually.
- As suggested, don't involve memory allocations when storing command
line arguments.
- Break --help text at 80 chars
- man: explain that this is about SELinux
- don't do unnecessary memory allocations when putting together mount
option string
|
|
This patch adds to new options:
-Z PROCESS_LABEL
This specifies the process label to run on processes run within the container.
-L FILE_LABEL
The file label to assign to memory file systems created within the container.
For example if you wanted to wrap an container with SELinux sandbox labels, you could execute a command line the following
chcon system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -R /srv/container
systemd-nspawn -L system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -Z system_u:system_r:svirt_lxc_net_t:s0:c0,c1 -D /srv/container /bin/sh
|
|
|
|
|
|
Similar to PrivateNetwork=, PrivateTmp= introduce PrivateDevices= that
sets up a private /dev with only the API pseudo-devices like /dev/null,
/dev/zero, /dev/random, but not any physical devices in them.
|
|
namespace
On kdbus user credentials are not translated across PID namespaces, but
simply invalidated if sender and receiver namespaces don't match. This
makes it impossible to properly authenticate requests from different PID
namespaces (which is probably a good thing). Hence, register the machine
in the parent and not the client and properly synchronize this.
|