path: root/src/nspawn/nspawn.c
Age | Commit message | Author
2014-11-22 | nspawn: ignore EEXIST when mounting tmpfs | Richard Schütz
commit 79d80fc1466512d0ca211f4bfcd9de5f2f816a5a introduced a regression that prevents mounting a tmpfs if the mount point already exists in the container's root file system. This commit fixes the problem by ignoring EEXIST.
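The fix amounts to treating "already exists" as success when creating the mount point. A minimal sketch of that pattern follows; the helper name create_mount_point() is hypothetical and is not taken from nspawn.c.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not the nspawn.c code): create a mount point and
 * treat EEXIST as success. Returns 0 on success, or a negative errno-style
 * code for any other failure. */
static int create_mount_point(const char *path, mode_t mode) {
        assert(path);

        if (mkdir(path, mode) < 0 && errno != EEXIST)
                return -errno;

        return 0;
}
```

Note that mkdir() also reports EEXIST when the path exists but is not a directory, so a caller that needs a directory may still want to check the file type afterwards.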
2014-11-21 | nspawn: Add try-{host,guest} journal link modes | Martin Pitt
--link-journal={host,guest} fail if the host does not have persistent journalling enabled and /var/log/journal/ does not exist. Even worse, as there is no stdout/err any more, there is no error message to point that out. Introduce two new modes "try-host" and "try-guest" which don't fail in this case, and instead just silently skip the guest journal setup. Change -j to mean "try-guest" instead of "guest", and fix the wrong --help output for it (it said "host" before). Change systemd-nspawn@.service.in to use "try-guest" so that this unit works with both persistent and non-persistent journals on the host without failing. https://bugs.debian.org/770275
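One way to model the new modes is to keep the link target and the "do not fail" property separate. The sketch below is an assumption about how such parsing could look; the type and function names are hypothetical, and only the option values named in this log (host, guest, auto, try-host, try-guest) are covered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names, for illustration only. */
typedef enum { LINK_AUTO, LINK_HOST, LINK_GUEST } LinkJournal;

static int parse_link_journal(const char *s, LinkJournal *mode, bool *try_only) {
        static const struct {
                const char *name;
                LinkJournal mode;
                bool try_only;
        } table[] = {
                { "auto",      LINK_AUTO,  false },
                { "host",      LINK_HOST,  false },
                { "guest",     LINK_GUEST, false },
                { "try-host",  LINK_HOST,  true  },
                { "try-guest", LINK_GUEST, true  },
        };
        size_t i;

        assert(s);
        assert(mode);
        assert(try_only);

        for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
                if (strcmp(s, table[i].name) == 0) {
                        *mode = table[i].mode;
                        *try_only = table[i].try_only;
                        return 0;
                }

        return -EINVAL; /* unknown mode: the caller reports the error */
}
```

With try_only set, the journal setup code can skip the link silently when /var/log/journal/ is missing instead of failing.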
2014-11-13 | sd-bus: sync with kdbus upstream (ABI break) | Daniel Mack
kdbus has seen a larger update than expected lately, most notably with kdbusfs, a file system to expose the kdbus control files:
* Each time a file system of this type is mounted, a new kdbus domain is created.
* The layout inside each mount point is the same as before, except that domains are not hierarchically nested anymore.
* Domains are therefore also unnamed now.
* Unmounting a kdbusfs will automatically also destroy the associated domain.
* Hence, the action of creating a kdbus domain is now as privileged as mounting a filesystem.
* This way, we can get around creating dev nodes for everything, which is last but not least something that is not limited by 20-bit minor numbers.
The kdbus specific bits in nspawn have all been dropped now, as nspawn can rely on the container OS to set up its own kdbus domain, simply by mounting a new instance.
A new set of mounts has been added to mount things *after* the kernel modules have been loaded. For now, only kdbus is in this set, which is invoked with mount_setup_late().
2014-11-04 | barrier: explicitly ignore return values of barrier_place() | David Herrmann
The barrier implementation tracks remote states internally. There is no need to check the return value of any barrier_*() function if the caller is not interested in the result. The barrier helpers only return the state of the remote side, which is usually not interesting as later calls to barrier_sync() will catch this, anyway. Shut up coverity by explicitly ignoring return values of barrier_place() if we're not interested in it.
2014-10-31 | ptyforward: rework PTY forwarder logic used by nspawn to utilize the normal event loop | Lennart Poettering
We really should not run manual event loops anymore, but standardize on sd_event, so that we can run sd_bus connections from it eventually.
2014-10-31 | units: don't order journal flushing after remote-fs.target | Lennart Poettering
Instead, only depend on the actual file systems we need. This should solve dep loops on setups where remote-fs.target is moved into late boot.
2014-10-31 | nspawn: don't make up -1 as error code | Lennart Poettering
2014-10-29 | nspawn: ignore EEXIST when creating mount point | Dave Reisner
A combination of commits f3c80515c and 79d80fc14 causes nspawn to silently fail with a command line such as:
    # systemd-nspawn -D /build/extra-x86_64 --bind=/usr
strace shows the culprit:
    [pid 27868] writev(2, [{"Failed to create mount point /build/extra-x86_64/usr: File exists", 82}, {"\n", 1}], 2) = 83
2014-10-27 | util: introduce sethostname_idempotent | Michal Sekletar
The function queries the system hostname and applies changes only when necessary. Also, migrate all clients of sethostname to sethostname_idempotent while at it.
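The idea can be sketched as follows. This is an illustration of the described behaviour rather than the code from util.c; the `_sketch` suffix and the return convention (0 for unchanged, 1 for changed) are assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/* Illustrative sketch: set the hostname only if it differs from the
 * current one. Returns 0 if nothing had to change, 1 if the hostname
 * was changed, or a negative errno-style code on failure. */
static int sethostname_idempotent_sketch(const char *s) {
        char buf[HOST_NAME_MAX + 1] = { 0 };

        assert(s);

        if (gethostname(buf, sizeof(buf)) < 0)
                return -errno;

        if (strcmp(buf, s) == 0)
                return 0; /* already set, no privileged call needed */

        if (sethostname(s, strlen(s)) < 0)
                return -errno;

        return 1;
}
```

A side effect of this design is that the call succeeds without privileges whenever the requested name is already in place.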
2014-10-17 | nspawn: fix DeviceAllow list | Daniel Mack
Commit 864e17068 ("nspawn: actually allow access to /dev/net/tun in the container") added "/dev/net/tun" to the list of allowed devices but forgot to tweak the array length, which caused "/dev/kdbus/*" to be missed.
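This class of bug can be avoided by deriving the element count from the array itself instead of hard-coding it. The sketch below assumes a plain string array holding only the two entries named in the commit message; the real DeviceAllow list in nspawn.c is built differently and contains more entries.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ELEMENTSOF(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative subset: only the two entries named in the commit message. */
static const char *const allowed_devices[] = {
        "/dev/net/tun",
        "/dev/kdbus/*",
};

/* Visits every entry, stores the last one visited in *last and returns
 * how many entries were visited. Adding an entry to the array above
 * needs no change here. */
static size_t visit_allowed_devices(const char **last) {
        size_t i;

        assert(last);

        for (i = 0; i < ELEMENTSOF(allowed_devices); i++)
                *last = allowed_devices[i];

        return i;
}
```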
2014-10-10 | nspawn: actually allow access to /dev/net/tun in the container | Lennart Poettering
It's not sufficient to just copy the device node over, we need to update the policy for it too.
2014-10-08 | nspawn: copy /dev/net/tun from host | Tom Gundersen
This enables tuntap support in the container (assuming the necessary capabilities are in place).
2014-09-29 | nspawn: log when tearing down of loop device fails | Tom Gundersen
2014-09-25 | nspawn: check some more return values | Tom Gundersen
Most of these failures would anyway get caught later on, but now the error messages are a bit more specific.
2014-09-19 | nspawn: don't try to create veth link with too long ifname | Tom Gundersen
Reported by: James Lott <james@lottspot.com>
2014-08-28 | nspawn: fix --network-interface | Tom Gundersen
Use SETLINK when modifying an existing link.
2014-08-26 | util: make use of newly added reset_signal_mask() call wherever appropriate | Lennart Poettering
2014-08-21 | notify: send STOPPING=1 from our daemons | Lennart Poettering
2014-08-04 | nspawn: make sure that when --network-veth is used both the host and the container side get fixed MAC addresses | Lennart Poettering
2014-08-04 | bus: always explicitly close bus from main programs | Lennart Poettering
Since b5eca3a2059f9399d1dc52cbcf9698674c4b1cf0 we don't attempt to GC busses anymore when unsent messages remain that keep their reference, when they otherwise are not referenced anymore. This means that if we explicitly want connections to go away, we need to close them. With this change we will now do so explicitly wherever we connect to the bus from a main program (and thus know when the bus connection should go away), or when we create a private bus connection that really should go away after our use. This fixes connection leaks in the NSS and PAM modules.
2014-08-03 | Unify parse_argv style | Zbigniew Jędrzejewski-Szmek
getopt is usually good at printing out a nice error message when commandline options are invalid. It distinguishes between an unknown option and a known option with a missing arg. It is better to let it do its job and not use opterr=0 unless we actually want to suppress messages. So remove opterr=0 in the few places where it wasn't really useful. When an error in options is encountered, we should not print a lengthy help() and overwhelm the user, when we know precisely what is wrong with the commandline. In addition, since help() prints to stdout, it should not be used except when requested with -h or --help. Also, simplify things here and there.
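The style described here can be sketched like this. The option set, help() text and return convention (negative on error, 0 for "done, exit successfully", 1 for "continue") are assumptions made for the illustration; resetting optind is only there so the sketch can be called more than once.

```c
#include <assert.h>
#include <errno.h>
#include <getopt.h>
#include <stddef.h>
#include <stdio.h>

static const char *arg_directory = NULL;

/* help() prints to stdout, so it is only called for -h/--help. */
static void help(void) {
        printf("prog [OPTIONS...]\n"
               "  -h --help            Show this help\n"
               "  -D --directory=PATH  Root directory\n");
}

static int parse_argv(int argc, char *argv[]) {
        static const struct option options[] = {
                { "help",      no_argument,       NULL, 'h' },
                { "directory", required_argument, NULL, 'D' },
                { NULL,        0,                 NULL, 0   },
        };
        int c;

        assert(argc >= 0);
        assert(argv);

        optind = 1; /* restart scanning; opterr is left at its default */

        while ((c = getopt_long(argc, argv, "hD:", options, NULL)) >= 0)
                switch (c) {
                case 'h':
                        help();
                        return 0;
                case 'D':
                        arg_directory = optarg;
                        break;
                case '?':
                        /* getopt already printed a precise message */
                        return -EINVAL;
                default:
                        return -EINVAL;
                }

        return 1;
}
```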
2014-08-03 | nspawn: fix truncation of machine names in interface names | Zbigniew Jędrzejewski-Szmek
Based on patch by Michael Marineau <michael.marineau@coreos.com>: When deriving the network interface name from machine name strncpy was not properly null terminating the string and the maximum string size as returned by strlen() is actually IFNAMSIZ-1, not IFNAMSIZ.
2014-07-31 | Reject invalid quoted strings | Zbigniew Jędrzejewski-Szmek
Strings which ended in an unfinished quote were accepted, potentially with bad memory accesses. Reject anything which ends in an unfinished quote, or contains non-whitespace characters right after the closing quote. _FOREACH_WORD now returns the invalid character in *state. But this return value is not checked anywhere yet. Also, make 'word' and 'state' variables const pointers, and rename 'w' to 'word' in various places. Things are easier to read if the same name is used consistently.
mbiebl_> am I correct that something like this doesn't work
mbiebl_> ExecStart=/usr/bin/encfs --extpass='/bin/systemd-ask-passwd "Unlock EncFS"'
mbiebl_> systemd seems to strip of the quotes
mbiebl_> systemctl status shows
mbiebl_> ExecStart=/usr/bin/encfs --extpass='/bin/systemd-ask-password Unlock EncFS $RootDir $MountPoint
mbiebl_> which is pretty weird
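The two rejection rules (no unfinished quote, nothing but whitespace right after a closing quote) can be sketched as a standalone check. This is not the _FOREACH_WORD implementation; the function name is hypothetical and escape sequences are deliberately not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative check only: returns false if the string contains an
 * unterminated quote, or a closing quote directly followed by a
 * non-whitespace character. Backslash escapes are ignored here. */
static bool quoting_is_valid(const char *s) {
        assert(s);

        while (*s) {
                if (*s == '\'' || *s == '"') {
                        const char *end = strchr(s + 1, *s);

                        if (!end)
                                return false; /* unfinished quote */

                        s = end + 1;
                        if (*s && !isspace((unsigned char) *s))
                                return false; /* garbage after closing quote */
                } else
                        s++;
        }

        return true;
}
```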
2014-07-18 | barrier: initialize file descriptors with -1 | Zbigniew Jędrzejewski-Szmek
Explicitly initialize descriptors using explicit assignment like bus_error. This makes barriers follow the same conventions as everything else and makes things a bit simpler too. Rename barrier_init to barrier_create so it is obvious that it is not about initialization. Remove some parens, etc.
2014-07-17 | nspawn: fix barrier-destroy call | David Herrmann
I dropped the cleanup-helper before pushing so use _cleanup_() directly.
2014-07-17 | nspawn: use Barrier API instead of eventfd-util | David Herrmann
The Barrier-API simplifies cross-fork() synchronization a lot. Replace the hard-coded eventfd-util implementation and drop it. Compared to the old API, Barriers also handle exit() of the remote side as abortion. This way, segfaults will not cause the parent to deadlock. EINTR handling is currently ignored for any barrier-waits. This can easily be added, but it isn't needed so far so I dropped it. EINTR handling in general is ugly, anyway. You need to deal with pselect/ppoll/... variants and make sure not to unblock signals at the wrong times. So generally, there's little use in adding it.
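The property "exit() of the remote side counts as abortion" can be illustrated with a plain pipe: once the parent closes its copy of the write end, a child that dies without signalling produces EOF instead of a hang. This is a swapped-in technique for illustration, not the Barrier API itself, and the function name is hypothetical. Like the commit, it does not handle EINTR.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child signalled readiness, 0 if it exited without
 * doing so (treated as abort), -1 on error. */
static int wait_for_child_ready(int child_signals) {
        int fds[2];
        char byte;
        ssize_t n;
        pid_t pid;

        if (pipe(fds) < 0)
                return -1;

        pid = fork();
        if (pid < 0) {
                close(fds[0]);
                close(fds[1]);
                return -1;
        }

        if (pid == 0) {
                /* Child: either report readiness or die silently. */
                close(fds[0]);
                if (child_signals && write(fds[1], "x", 1) != 1)
                        _exit(2);
                _exit(child_signals ? 0 : 1);
        }

        /* Parent: drop our write end, otherwise read() never sees EOF. */
        close(fds[1]);
        n = read(fds[0], &byte, 1);
        close(fds[0]);
        (void) waitpid(pid, NULL, 0);

        return n < 0 ? -1 : (int) n;
}
```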
2014-07-10 | nspawn: register external network interface with machined | Lennart Poettering
2014-07-04 | nspawn: add new --volatile switch for booting containers in volatile (ephemeral) mode | Lennart Poettering
Two modes are supported:
--volatile=yes mounts only /usr into the container, and a tmpfs as root directory.
--volatile=state mounts the full OS tree in, but overmounts /var with a tmpfs.
--volatile=yes hence boots with an unpopulated /etc and /var, starting with pristine configuration and state. --volatile=state hence boots with an unpopulated /var, only starting with pristine state.
2014-07-03 | nspawn: when running in a service unit, use systemd for restarts | Lennart Poettering
This way we can remove cgroup privileges after setup, but get them back for the next restart, as we need them.
2014-06-30 | nspawn: block open_by_handle_at() and others via seccomp | Lennart Poettering
Let's protect ourselves against the recently reported docker security issue. Our man page makes clear that we do not make any security promises anyway, but well, this one is easy to mitigate, so let's do it. While we are at it block a couple of more syscalls that are no good in containers, too.
2014-06-30 | nspawn: let's avoid using goto too wildly for non-cleanup purposes | Lennart Poettering
2014-06-30 | nspawn: simplify exit condition check | Lennart Poettering
2014-06-30 | nspawn: log a warning on failure from wait_for_terminate() | Luke Shumaker
This is at the suggestion of Djalal Harouni on the mailing list, and reflects the behavior of shared/util.c:wait_for_terminate_and_warn().
2014-06-30 | nspawn: Fix regression with exit status | Luke Shumaker
Commit 113cea8 introduced a bug that caused the exit code of systemd-nspawn to not reflect the exit code of the program executed in the container.
2014-06-24 | switch-root: create essential base directories at system bootup | Kay Sievers
This allows us to boot up a rootfs with a /usr directory only.
2014-06-24 | nspawn: create essential base directories at system bootup | Kay Sievers
This allows us to boot up a rootfs with a /usr directory only.
2014-06-22 | consistently order cleanup attribute before type | Thomas Hindoe Paaboel Andersen
2014-06-13 | os-release: define /usr/lib/os-release as fallback for /etc/os-release | Lennart Poettering
The file should have been in /usr/lib/ in the first place, since it describes the OS container in /usr (and not the configuration in /etc), hence, let's support os-release files in /usr/lib as fallback if no version in /etc exists, following the usual override logic. A prior commit already enabled tmpfiles to create /etc/os-release as a symlink to /usr/lib/os-release should it be missing, thus providing nice compatibility with applications only checking in /etc. While it's probably a good idea if all apps check both locations via a fallback logic, it is only necessary in the early boot process, as long as the /etc/os-release symlink has not been restored, in case we boot with an empty /etc.
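For applications that want to check both locations, the fallback logic can be sketched as "open the first path that exists". The helper names below are hypothetical; only the two paths come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: open the first path in a NULL-terminated list
 * that can be opened for reading. On success, *used (if non-NULL)
 * points at the path that was opened. Returns NULL if none could be. */
static FILE *fopen_first(const char *const paths[], const char **used) {
        size_t i;

        assert(paths);

        for (i = 0; paths[i]; i++) {
                FILE *f = fopen(paths[i], "r");

                if (f) {
                        if (used)
                                *used = paths[i];
                        return f;
                }
        }

        return NULL;
}

/* /etc/os-release takes precedence; /usr/lib/os-release is the fallback. */
static FILE *open_os_release(const char **used) {
        static const char *const paths[] = {
                "/etc/os-release",
                "/usr/lib/os-release",
                NULL,
        };

        return fopen_first(paths, used);
}
```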
2014-06-11 | nspawn: add new --tmpfs= option to mount a tmpfs on specific directories, such as /var | Lennart Poettering
2014-06-10 | tmpfiles: add new "C" line for copying files or directories | Lennart Poettering
2014-06-07 | nspawn: split long message into two lines | Zbigniew Jędrzejewski-Szmek
For names like /var/lib/container/something, the message becomes quite long. Better to split it. Also reword the message not to suggest that ^]^]^] only works in the beginning.
2014-06-06 | namespace: beef up read-only bind mount logic | Lennart Poettering
Instead of blindly creating another bind mount for read-only mounts, check if there's already one we can use, and if so, use it. Also, recursively mark all submounts read-only too. Also, ignore autofs mounts when remounting read-only unless they are already triggered.
2014-05-25 | nspawn: make nspawn robust to container failure | Djalal Harouni
nspawn and the container child use eventfd to wait and notify each other that they are ready so the container setup can be completed. However, in its current form the wait/notify event ignores errors that may especially affect the child (container). On errors the child will jump to the "child_fail" label and terminate with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd is created without the "EFD_NONBLOCK" flag, this leaves the parent blocking on the eventfd_read() call. The container can also be killed at any moment before execv() and the parent will not receive notifications.
We can fix this by using cheap mechanisms, the new high level eventfd API and handle SIGCHLD signals:
* Keep the cheap eventfd and EFD_NONBLOCK flag.
* Introduce eventfd states for parent and child to sync. Child notifies parent with EVENTFD_CHILD_SUCCEEDED on success or EVENTFD_CHILD_FAILED on failure and before _exit(). This prevents the parent from waiting on an event that will never come.
* If the child is killed before execv() or before notifying the parent, we install a NOP handler for SIGCHLD which will interrupt blocking calls with EINTR. This gives a chance to the parent to call wait() and terminate in main().
* If there are no errors, parent will block SIGCHLD, restore default handler and notify child which will do execv(), then parent will pass control to process_pty() to do its magic.
This was exposed in part by: https://bugs.freedesktop.org/show_bug.cgi?id=76193
Reported-by: Tobias Hunger <tobias.hunger@gmail.com>
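The state-notification part of this scheme can be sketched as follows. Only the two state names come from the commit message; their numeric values and the function name are assumptions. The sketch uses a blocking eventfd and no SIGCHLD handler, so it would still hang if the child were killed before writing, which is the case the NOP SIGCHLD handler described above is meant to cover.

```c
#include <assert.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* State names from the commit message; the values are assumptions. */
enum {
        EVENTFD_CHILD_SUCCEEDED = 1,
        EVENTFD_CHILD_FAILED    = 2,
};

/* Forks a child that reports success or failure through an eventfd
 * before exiting. Returns the state read by the parent, or -1 on error. */
static int sync_with_child(int child_setup_ok) {
        eventfd_t value = 0;
        pid_t pid;
        int fd, r;

        fd = eventfd(0, EFD_CLOEXEC);
        if (fd < 0)
                return -1;

        pid = fork();
        if (pid < 0) {
                close(fd);
                return -1;
        }

        if (pid == 0) {
                /* Child: always notify the parent before _exit(). */
                (void) eventfd_write(fd, child_setup_ok ? EVENTFD_CHILD_SUCCEEDED
                                                        : EVENTFD_CHILD_FAILED);
                _exit(child_setup_ok ? 0 : 1);
        }

        /* Parent: blocks until the child has written its state. */
        r = eventfd_read(fd, &value);
        close(fd);
        (void) waitpid(pid, NULL, 0);

        return r < 0 ? -1 : (int) value;
}
```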
2014-05-25 | nspawn: move container wait logic into wait_for_container() | Djalal Harouni
Move the container wait logic into its own wait_for_container() function and add two status codes: CONTAINER_TERMINATED or CONTAINER_REBOOTED. The status will be stored in its argument, this way we handle:
a) Return negative on failures.
b) Return zero on success and set the status to either CONTAINER_REBOOTED or CONTAINER_TERMINATED.
These status codes are used to terminate nspawn or loop again in case of CONTAINER_REBOOTED.
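The calling convention (negative return on failure, zero plus an out-parameter on success) can be sketched with waitid(). The mapping used here, a clean exit meaning "terminated" and death by SIGHUP meaning "rebooted", is an assumption made for the illustration and is not taken from this log.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef enum {
        CONTAINER_TERMINATED,
        CONTAINER_REBOOTED,
} ContainerStatus;

/* Returns 0 and sets *container on success, a negative errno-style
 * code on failure. The signal-to-status mapping is an assumption. */
static int wait_for_container(pid_t pid, ContainerStatus *container) {
        siginfo_t si = { 0 };

        assert(container);

        if (waitid(P_PID, (id_t) pid, &si, WEXITED) < 0)
                return -errno;

        switch (si.si_code) {
        case CLD_EXITED:
                *container = CONTAINER_TERMINATED;
                return 0;
        case CLD_KILLED:
                if (si.si_status == SIGHUP) {
                        *container = CONTAINER_REBOOTED;
                        return 0;
                }
                return -EIO; /* killed by an unexpected signal */
        default:
                return -EIO; /* e.g. core dumped */
        }
}
```

A caller would loop and start the container again on CONTAINER_REBOOTED, and leave the loop on CONTAINER_TERMINATED or on a negative return.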
2014-05-25 | Use %m instead of strerror(errno) where appropriate | Cristian Rodríguez
2014-05-22 | nspawn: restore journal directory is empty check | Lennart Poettering
This undoes part of commit e6a4a517befe559adf6d1dbbadf425c3538849c9. Instead of removing the error message about non-empty journal bind mount directories, simply downgrade the message to a warning and proceed.
2014-05-22 | nspawn: allow to bind mount journal on top of a non empty container journal dentry | Djalal Harouni
Currently if nspawn was called with --link-journal=host or --link-journal=auto and the right /var/log/journal/machine-id/ exists, then bind mounting the subdirectory into the container might fail due to the ~/mycontainer/var/log/journal/machine-id/ of the container not being empty. There is no reason to check if the container journal subdir is empty since there will be a bind mount on top of it. The user asked for a bind mount so give it. Note: a next call with --link-journal=guest may fail due to the /var/log/journal/machine-id/ on the host not being empty.
https://bugs.freedesktop.org/show_bug.cgi?id=76193
Reported-by: Tobias Hunger <tobias.hunger@gmail.com>
2014-05-19 | fix spelling of privilege | Nis Martensen
2014-05-16 | nspawn: properly format container_uuid in UUID format | Lennart Poettering
http://lists.freedesktop.org/archives/systemd-devel/2014-April/018971.html
2014-04-10 | nspawn: Fix erroneous OOM when building group list | Philip Lorenz
change_uid_gid() never initialises sz which may cause greedy_realloc to skip the initial buffer allocation.
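Why an uninitialised size looks like an out-of-memory condition can be seen in a simplified growable-buffer helper. The function below is a sketch with a hypothetical name and signature, not the greedy_realloc() from systemd's util.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Simplified sketch: make sure *p has room for at least 'need' items.
 * *allocated tracks the current capacity and MUST start at 0 together
 * with *p == NULL. Returns the buffer, or NULL on failure. */
static void *greedy_realloc_sketch(void **p, size_t *allocated, size_t need, size_t item_size) {
        size_t n;
        void *q;

        assert(p);
        assert(allocated);
        assert(item_size > 0);

        /* With a garbage *allocated and *p == NULL, this shortcut returns
         * NULL and the caller misreads it as an allocation failure. */
        if (*allocated >= need)
                return *p;

        if (need > SIZE_MAX / 2 / item_size)
                return NULL;

        n = need * 2;
        q = realloc(*p, n * item_size);
        if (!q)
                return NULL;

        *p = q;
        *allocated = n;
        return q;
}
```

Initialising the size variable to 0 next to the NULL buffer pointer, as the fix does for sz, keeps the shortcut from triggering on the first call.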