summaryrefslogtreecommitdiff
path: root/src/nspawn/nspawn.c
AgeCommit message (Collapse)Author
2014-05-25nspawn: make nspawn robust to container failureDjalal Harouni
nspawn and the container child use eventfd to wait and notify each other that they are ready so the container setup can be completed. However in its current form the wait/notify event ignore errors that may especially affect the child (container). On errors the child will jump to the "child_fail" label and terminate with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd is created without the "EFD_NONBLOCK" flag, this leaves the parent blocking on the eventfd_read() call. The container can also be killed at any moment before execv() and the parent will not receive notifications. We can fix this by using cheap mechanisms, the new high level eventfd API and handle SIGCHLD signals: * Keep the cheap eventfd and EFD_NONBLOCK flag. * Introduce eventfd states for parent and child to sync. Child notifies parent with EVENTFD_CHILD_SUCCEEDED on success or EVENTFD_CHILD_FAILED on failure and before _exit(). This prevents the parent from waiting on an event that will never come. * If the child is killed before execv() or before notifying the parent, we install a NOP handler for SIGCHLD which will interrupt blocking calls with EINTR. This gives a chance to the parent to call wait() and terminate in main(). * If there are no errors, parent will block SIGCHLD, restore default handler and notify child which will do execv(), then parent will pass control to process_pty() to do its magic. This was exposed in part by: https://bugs.freedesktop.org/show_bug.cgi?id=76193 Reported-by: Tobias Hunger tobias.hunger@gmail.com
2014-05-25nspawn: move container wait logic into wait_for_container()Djalal Harouni
Move the container wait logic into its own wait_for_container() function and add two status codes: CONTAINER_TERMINATED or CONTAINER_REBOOTED. The status will be stored in its argument, this way we handle: a) Return negative on failures. b) Return zero on success and set the status to either CONTAINER_REBOOTED or CONTAINER_TERMINATED. These status codes are used to terminate nspawn or loop again in case of CONTAINER_REBOOTED.
2014-05-25Use %m instead of strerror(errno) where appropiateCristian Rodríguez
2014-05-22nspawn: restore journal directory is empty checkLennart Poettering
This undoes part of commit e6a4a517befe559adf6d1dbbadf425c3538849c9. Instead of removing the error message about non-empty journal bind mount directories, simply downgrade the message to a warning and proceed.
2014-05-22nspawn: allow to bind mount journal on top of a non empty container journal ↵Djalal Harouni
dentry Currently if nspawn was called with --link-journal=host or --link-journal=auto and the right /var/log/journal/machine-id/ exists then the bind mount the subdirectory into the container might fail due to the ~/mycontainer/var/log/journal/machine-id/ of the container not being empty. There is no reason to check if the container journal subdir is empty since there will be a bind mount on top of it. The user asked for a bind mount so give it. Note: a next call with --link-journal=guest may fail due to the /var/log/journal/machine-id/ on the host not being empty. https://bugs.freedesktop.org/show_bug.cgi?id=76193 Reported-by: Tobias Hunger <tobias.hunger@gmail.com>
2014-05-19fix spelling of privilegeNis Martensen
2014-05-16nspawn: properly format container_uuid in UUID formatLennart Poettering
http://lists.freedesktop.org/archives/systemd-devel/2014-April/018971.html
2014-04-10nspawn: Fix erroneous OOM when building group listPhilip Lorenz
change_uid_gid() never initialises sz which may cause greedy_realloc to skip the initial buffer allocation.
2014-03-28sd-rtnl: rework rtnl type systemTom Gundersen
Use a static table with all the typing information, rather than repeated switch statements. This should make it a lot simpler to add new types. We need to keep all the type info to be able to create containers without exposing their implementation details to the users of the library. As a freebee we verify the types of appended/read attributes. The API is extended to nicely deal with unions of container types.
2014-03-24util: replace close_pipe() with new safe_close_pair()Lennart Poettering
safe_close_pair() is more like safe_close(), except that it handles pairs of fds, and doesn't make and misleading allusion, as it works similarly well for socketpairs() as for pipe()s...
2014-03-18util: replace close_nointr_nofail() by a more useful safe_close()Lennart Poettering
safe_close() automatically becomes a NOP when a negative fd is passed, and returns -1 unconditionally. This makes it easy to write lines like this: fd = safe_close(fd); Which will close an fd if it is open, and reset the fd variable correctly. By making use of this new scheme we can drop a > 200 lines of code that was required to test for non-negative fds or to reset the closed fd variable afterwards.
2014-03-16nspawn: UP the host side of the veth pair after adding it to a bridgeTom Gundersen
2014-03-13nspawn: remove unused variableDave Reisner
2014-03-14nspawn: allow -EEXIST on mkdir_safe /home/${uid}Brandon Philips
With systemd 211 nspawn attempts to create the home directory for the given uid. However, if the home directory already exists then it will fail. Don't error out on -EEXIST.
2014-03-13nspawn: make host0's MAC address persistentTom Gundersen
We still need to make sure that no two MAC addresses are the same, so we use a logic similar to what is used in udev to generate MAC addresses, and base it on a hash of the host's machine ID and thecontainer's name.
2014-03-13nspawn: honour GPT partition flags when mounting file systems following the ↵Lennart Poettering
discoverable partitions spec
2014-03-11nspawn: fix argv[0] for getentMantas Mikulėnas
2014-03-11nspawn: allow using kdbus from nspawn containersLennart Poettering
2014-03-11nspawn: fix getent fallbackLennart Poettering
2014-03-11nspawn: when resoliving UIDs/GIDs for "-u", do so in forked off ↵Lennart Poettering
/usr/bin/getent instead of in-process When the container runs a different native architecture than the host we shouldn't attempt to load the container's NSS modules with the host's libc. Instead, resolve UID/GID by invoking /usr/bin/getent in the container. The tool should be fairly universally available and allows us to do resolving of the UID/GID with the container's libc in a parsable format. https://bugs.freedesktop.org/show_bug.cgi?id=75733
2014-03-11nspawn: make sure we don't try to mount the container block device in the ↵Lennart Poettering
child after the parent added us to the device cgroup
2014-03-10nspawn: don't try mknod() of /dev/console with the correct major/minorLennart Poettering
We overmount /dev/console with an external pty anyway, hence there's no point in using the real major/minor when we create the node to overmount. Instead, use the one of /dev/null now. This fixes a race against the cgroup device controller setup we are using. In case /dev/console was create before the cgroup policy was applied all was good, but if created in the opposite order the mknod() would fail, since creating /dev/console is not allowed by it. Creating /dev/null instances is however permitted, and hence use it.
2014-03-10nspawn: add --image= switch to boot GPT disk images that follow the ↵Lennart Poettering
Discoverable Partitions Specification
2014-02-28nspawn: fix detection of missing /proc/self/loginuidTero Roponen
Running 'systemd-nspawn -D /srv/Fedora/' gave me this error: Failed to read /proc/self/loginuid: No such file or directory Container Fedora failed with error code 1. This patch fixes the problem.
2014-02-26nspawn: no need for duplicate checks against EEXISTLennart Poettering
2014-02-25nspawn: add new switch --network-macvlan= to add a macvlan device to the ↵Lennart Poettering
container
2014-02-24nspawn: make use of the devices cgroup controller by defaultLennart Poettering
2014-02-21nspawn: when adding a veth interface to a bridge, use the "vb-" rather than ↵Lennart Poettering
"ve-" interface name prefix This way we can recognize the interfaces later on to apply different host-side configuration to them.
2014-02-20api: in constructor function calls, always put the returned object pointer ↵Lennart Poettering
first (or second) Previously the returned object of constructor functions where sometimes returned as last, sometimes as first and sometimes as second parameter. Let's clean this up a bit. Here are the new rules: 1. The object the new object is derived from is put first, if there is any 2. The object we are creating will be returned in the next arguments 3. This is followed by any additional arguments Rationale: For functions that operate on an object we always put that object first. Constructors should probably not be too different in this regard. Also, if the additional parameters might want to use varargs which suggests to put them last. Note that this new scheme only applies to constructor functions, not to all other functions. We do give a lot of freedom for those. Note that this commit only changes the order of the new functions we added, for old ones we accept the wrong order and leave it like that.
2014-02-19make gcc shut upLennart Poettering
If -flto is used then gcc will generate a lot more warnings than before, among them a number of use-without-initialization warnings. Most of them without are false positives, but let's make them go away, because it doesn't really matter.
2014-02-19core: add Personality= option for units to set the personality for spawned ↵Lennart Poettering
processes
2014-02-18nspawn: add new --personality= switch to make it easier to run 32bit ↵Lennart Poettering
containers on a 64bit host
2014-02-18nspawn: x86 is special with its socketcall() semantics, be permissive in the ↵Lennart Poettering
seccomp setup
2014-02-18seccomp: add helper call to add all secondary archs to a seccomp filterLennart Poettering
And make use of it where appropriate for executing services and for nspawn.
2014-02-18nspawn: allow 32-bit chroots from 64-bit hostsDave Reisner
Arch Linux uses nspawn as a container for building packages and needs to be able to start a 32bit chroot from a 64bit host. 24fb11120756 disrupted this feature when seccomp handling was added.
2014-02-18sd-rtnl-message: store reference to the bus in the messageTom Gundersen
This mimics the sd-bus api, as we may need it in the future.
2014-02-17nspawn: netns_fd can be removed nowLennart Poettering
2014-02-16nspawn: typo fix in helpThomas Hindoe Paaboel Andersen
2014-02-16nspawn: add new --network-bridge= switchTom Gundersen
This adds the host side of the veth link to the given bridge. Also refactor the creation of the veth interfaces a bit to set it up from the host rather than the container. This simplifies the addition to the bridge, but otherwise the behavior is unchanged.
2014-02-15sd-rtnl: always include linux/rtnetlink.hTom Gundersen
2014-02-15sd-rtnl: message_open_container - don't take a 'size' argumentTom Gundersen
We can always know the size based on the type, so let's do this inside the library.
2014-02-14nspawn: if we don't find bash, try shLennart Poettering
2014-02-14nspawn: don't accept just any tree to executeLennart Poettering
When invoked without -D in an arbitrary directory we should not try to execute anything, make some validity checks first.
2014-02-13nspawn: make socket(AF_NETLINK, *, NETLINK_AUDIT) fail with EAFNOTSUPPORT in ↵Lennart Poettering
containers The kernel still doesn't support audit in containers, so let's make use of seccomp and simply turn it off entirely. We can get rid of this big as soon as the kernel is fixed again.
2014-02-13nspawn: add new --network-veth switch to add a virtual ethernet link to the hostLennart Poettering
2014-02-13nspawn: check with udev before we take possession of an interfaceLennart Poettering
2014-02-13nspawn: no need to subscribe to netlink messages if we just want to execute ↵Lennart Poettering
one operation
2014-02-13nspawn: --private-network should imply CAP_NET_ADMINLennart Poettering
2014-02-13rtnl: rename constructors from the form sd_rtnl_xxx_yyy_new() to ↵Lennart Poettering
sd_rtnl_xxx_new_yyy() So far we followed the rule to always indicate the "flavour" of constructors after the "_new_" or "_open_" in the function name, so let's keep things in sync here for rtnl and do the same.
2014-02-13rtnl: drop "sd_" prefix from cleanup macrosLennart Poettering
The "sd_" prefix is supposed to be used on exported symbols only, and not in the middle of names. Let's drop it from the cleanup macros hence, to make things simpler. The bus cleanup macros don't carry the "sd_" either, so this brings the APIs a bit nearer.