summaryrefslogtreecommitdiff
path: root/src/nspawn/nspawn.c
AgeCommit message (Collapse)Author
2014-07-04nspawn: add new --volatile switch for booting containers in volatile ↵Lennart Poettering
(ephemeral) mode Two modes are supported: --volatile=yes mounts only /usr into the container, and a tmpfs as root directory. --volatile=state mounts the full OS tree in, but overmounts /var with a tmpfs. --volatile=yes hence boots with an unpopulated /etc and /var, starting with pristine configuration and state. --volatile=state hence boots with an unpopulated /var, only starting with pristine state.
2014-07-03nspawn: when running in a service unit, use systemd for restartsLennart Poettering
THis way we can remove cgroup priviliges after setup, but get them back for the next restart, as we need it.
2014-06-30nspawn: block open_by_handle_at() and others via seccompLennart Poettering
Let's protect ourselves against the recently reported docker security issue. Our man page makes clear that we do not make any security promises anyway, but well, this one is easy to mitigate, so let's do it. While we are at it block a couple of more syscalls that are no good in containers, too.
2014-06-30nspawn: let's avoid using goto to wildly for non-cleanup purposesLennart Poettering
2014-06-30nspawn: simplify exit condition checkLennart Poettering
2014-06-30nspawn: log a warning on failure from wait_for_terminate()Luke Shumaker
This is at the suggestion of Djalal Harouni on the mailing list, and reflects the behavior of shared/util.c:wait_for_terminate_and_warn().
2014-06-30nspawn: Fix regression with exit statusLuke Shumaker
Commit 113cea8 introduced a bug that caused the exit code of systemd-nspawn to not reflect the exit code of the program executed in the container.
2014-06-24switch-root: create essential base directories at system bootupKay Sievers
This allows us to bootup a rootfs with a /usr directory only.
2014-06-24nspawn: create essential base directories at system bootupKay Sievers
This allows us to bootup a rootfs with a /usr directory only.
2014-06-22consistently order cleanup attribute before typeThomas Hindoe Paaboel Andersen
2014-06-13os-release: define /usr/lib/os-release as fallback for /etc/os-releaseLennart Poettering
The file should have been in /usr/lib/ in the first place, since it describes the OS container in /usr (and not the configuration in /etc), hence, let's support os-release files in /usr/lib as fallback if no version in /etc exists, following the usual override logic. A prior commit already enabled tmpfiles to create /etc/os-release as a symlink to /usr/lib/os-release should it be missing, thus providing nice compatibility with applications only checking in /etc. While it's probably a good idea if all apps check both locations via a fallback logic, it is only necessary in the early boot process, as long as the /etc/os-release symlink has not been restored, in case we boot with an empty /etc.
2014-06-11nspawn: add new --tmpfs= option to mount a tmpfs on specific directories, ↵Lennart Poettering
such as /var
2014-06-10tmpfiles: add new "C" line for copying files or directoriesLennart Poettering
2014-06-07nspawn: split long message into two linesZbigniew Jędrzejewski-Szmek
For names like /var/lib/container/something, the message becomes quite long. Better to split it. Also reword the message not to suggest that ^]^]^] only works in the beginning.
2014-06-06namespace: beef up read-only bind mount logicLennart Poettering
Instead of blindly creating another bind mount for read-only mounts, check if there's already one we can use, and if so, use it. Also, recursively mark all submounts read-only too. Also, ignore autofs mounts when remounting read-only unless they are already triggered.
2014-05-25nspawn: make nspawn robust to container failureDjalal Harouni
nspawn and the container child use eventfd to wait and notify each other that they are ready so the container setup can be completed. However in its current form the wait/notify event ignore errors that may especially affect the child (container). On errors the child will jump to the "child_fail" label and terminate with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd is created without the "EFD_NONBLOCK" flag, this leaves the parent blocking on the eventfd_read() call. The container can also be killed at any moment before execv() and the parent will not receive notifications. We can fix this by using cheap mechanisms, the new high level eventfd API and handle SIGCHLD signals: * Keep the cheap eventfd and EFD_NONBLOCK flag. * Introduce eventfd states for parent and child to sync. Child notifies parent with EVENTFD_CHILD_SUCCEEDED on success or EVENTFD_CHILD_FAILED on failure and before _exit(). This prevents the parent from waiting on an event that will never come. * If the child is killed before execv() or before notifying the parent, we install a NOP handler for SIGCHLD which will interrupt blocking calls with EINTR. This gives a chance to the parent to call wait() and terminate in main(). * If there are no errors, parent will block SIGCHLD, restore default handler and notify child which will do execv(), then parent will pass control to process_pty() to do its magic. This was exposed in part by: https://bugs.freedesktop.org/show_bug.cgi?id=76193 Reported-by: Tobias Hunger tobias.hunger@gmail.com
2014-05-25nspawn: move container wait logic into wait_for_container()Djalal Harouni
Move the container wait logic into its own wait_for_container() function and add two status codes: CONTAINER_TERMINATED or CONTAINER_REBOOTED. The status will be stored in its argument, this way we handle: a) Return negative on failures. b) Return zero on success and set the status to either CONTAINER_REBOOTED or CONTAINER_TERMINATED. These status codes are used to terminate nspawn or loop again in case of CONTAINER_REBOOTED.
2014-05-25Use %m instead of strerror(errno) where appropiateCristian Rodríguez
2014-05-22nspawn: restore journal directory is empty checkLennart Poettering
This undoes part of commit e6a4a517befe559adf6d1dbbadf425c3538849c9. Instead of removing the error message about non-empty journal bind mount directories, simply downgrade the message to a warning and proceed.
2014-05-22nspawn: allow to bind mount journal on top of a non empty container journal ↵Djalal Harouni
dentry Currently if nspawn was called with --link-journal=host or --link-journal=auto and the right /var/log/journal/machine-id/ exists then the bind mount the subdirectory into the container might fail due to the ~/mycontainer/var/log/journal/machine-id/ of the container not being empty. There is no reason to check if the container journal subdir is empty since there will be a bind mount on top of it. The user asked for a bind mount so give it. Note: a next call with --link-journal=guest may fail due to the /var/log/journal/machine-id/ on the host not being empty. https://bugs.freedesktop.org/show_bug.cgi?id=76193 Reported-by: Tobias Hunger <tobias.hunger@gmail.com>
2014-05-19fix spelling of privilegeNis Martensen
2014-05-16nspawn: properly format container_uuid in UUID formatLennart Poettering
http://lists.freedesktop.org/archives/systemd-devel/2014-April/018971.html
2014-04-10nspawn: Fix erroneous OOM when building group listPhilip Lorenz
change_uid_gid() never initialises sz which may cause greedy_realloc to skip the initial buffer allocation.
2014-03-28sd-rtnl: rework rtnl type systemTom Gundersen
Use a static table with all the typing information, rather than repeated switch statements. This should make it a lot simpler to add new types. We need to keep all the type info to be able to create containers without exposing their implementation details to the users of the library. As a freebee we verify the types of appended/read attributes. The API is extended to nicely deal with unions of container types.
2014-03-24util: replace close_pipe() with new safe_close_pair()Lennart Poettering
safe_close_pair() is more like safe_close(), except that it handles pairs of fds, and doesn't make and misleading allusion, as it works similarly well for socketpairs() as for pipe()s...
2014-03-18util: replace close_nointr_nofail() by a more useful safe_close()Lennart Poettering
safe_close() automatically becomes a NOP when a negative fd is passed, and returns -1 unconditionally. This makes it easy to write lines like this: fd = safe_close(fd); Which will close an fd if it is open, and reset the fd variable correctly. By making use of this new scheme we can drop a > 200 lines of code that was required to test for non-negative fds or to reset the closed fd variable afterwards.
2014-03-16nspawn: UP the host side of the veth pair after adding it to a bridgeTom Gundersen
2014-03-13nspawn: remove unused variableDave Reisner
2014-03-14nspawn: allow -EEXIST on mkdir_safe /home/${uid}Brandon Philips
With systemd 211 nspawn attempts to create the home directory for the given uid. However, if the home directory already exists then it will fail. Don't error out on -EEXIST.
2014-03-13nspawn: make host0's MAC address persistentTom Gundersen
We still need to make sure that no two MAC addresses are the same, so we use a logic similar to what is used in udev to generate MAC addresses, and base it on a hash of the host's machine ID and thecontainer's name.
2014-03-13nspawn: honour GPT partition flags when mounting file systems following the ↵Lennart Poettering
discoverable partitions spec
2014-03-11nspawn: fix argv[0] for getentMantas Mikulėnas
2014-03-11nspawn: allow using kdbus from nspawn containersLennart Poettering
2014-03-11nspawn: fix getent fallbackLennart Poettering
2014-03-11nspawn: when resoliving UIDs/GIDs for "-u", do so in forked off ↵Lennart Poettering
/usr/bin/getent instead of in-process When the container runs a different native architecture than the host we shouldn't attempt to load the container's NSS modules with the host's libc. Instead, resolve UID/GID by invoking /usr/bin/getent in the container. The tool should be fairly universally available and allows us to do resolving of the UID/GID with the container's libc in a parsable format. https://bugs.freedesktop.org/show_bug.cgi?id=75733
2014-03-11nspawn: make sure we don't try to mount the container block device in the ↵Lennart Poettering
child after the parent added us to the device cgroup
2014-03-10nspawn: don't try mknod() of /dev/console with the correct major/minorLennart Poettering
We overmount /dev/console with an external pty anyway, hence there's no point in using the real major/minor when we create the node to overmount. Instead, use the one of /dev/null now. This fixes a race against the cgroup device controller setup we are using. In case /dev/console was create before the cgroup policy was applied all was good, but if created in the opposite order the mknod() would fail, since creating /dev/console is not allowed by it. Creating /dev/null instances is however permitted, and hence use it.
2014-03-10nspawn: add --image= switch to boot GPT disk images that follow the ↵Lennart Poettering
Discoverable Partitions Specification
2014-02-28nspawn: fix detection of missing /proc/self/loginuidTero Roponen
Running 'systemd-nspawn -D /srv/Fedora/' gave me this error: Failed to read /proc/self/loginuid: No such file or directory Container Fedora failed with error code 1. This patch fixes the problem.
2014-02-26nspawn: no need for duplicate checks against EEXISTLennart Poettering
2014-02-25nspawn: add new switch --network-macvlan= to add a macvlan device to the ↵Lennart Poettering
container
2014-02-24nspawn: make use of the devices cgroup controller by defaultLennart Poettering
2014-02-21nspawn: when adding a veth interface to a bridge, use the "vb-" rather than ↵Lennart Poettering
"ve-" interface name prefix This way we can recognize the interfaces later on to apply different host-side configuration to them.
2014-02-20api: in constructor function calls, always put the returned object pointer ↵Lennart Poettering
first (or second) Previously the returned object of constructor functions where sometimes returned as last, sometimes as first and sometimes as second parameter. Let's clean this up a bit. Here are the new rules: 1. The object the new object is derived from is put first, if there is any 2. The object we are creating will be returned in the next arguments 3. This is followed by any additional arguments Rationale: For functions that operate on an object we always put that object first. Constructors should probably not be too different in this regard. Also, if the additional parameters might want to use varargs which suggests to put them last. Note that this new scheme only applies to constructor functions, not to all other functions. We do give a lot of freedom for those. Note that this commit only changes the order of the new functions we added, for old ones we accept the wrong order and leave it like that.
2014-02-19make gcc shut upLennart Poettering
If -flto is used then gcc will generate a lot more warnings than before, among them a number of use-without-initialization warnings. Most of them without are false positives, but let's make them go away, because it doesn't really matter.
2014-02-19core: add Personality= option for units to set the personality for spawned ↵Lennart Poettering
processes
2014-02-18nspawn: add new --personality= switch to make it easier to run 32bit ↵Lennart Poettering
containers on a 64bit host
2014-02-18nspawn: x86 is special with its socketcall() semantics, be permissive in the ↵Lennart Poettering
seccomp setup
2014-02-18seccomp: add helper call to add all secondary archs to a seccomp filterLennart Poettering
And make use of it where appropriate for executing services and for nspawn.
2014-02-18nspawn: allow 32-bit chroots from 64-bit hostsDave Reisner
Arch Linux uses nspawn as a container for building packages and needs to be able to start a 32bit chroot from a 64bit host. 24fb11120756 disrupted this feature when seccomp handling was added.