summaryrefslogtreecommitdiff
path: root/src/shared
AgeCommit message (Collapse)Author
2015-01-17missing: add macros for OFD locksMichael Marineau
2015-01-15nspawn,machined: change default container image location from ↵Lennart Poettering
/var/lib/container to /var/lib/machines Given that this is also the place to store raw disk images which are very much bootable with qemu/kvm it sounds like a misnomer to call the directory "container". Hence, let's change this sooner rather than later, and use the generic name, in particular since we otherwise try to use the generic "machine" preferably over the more specific "container" or "vm".
2015-01-15import: rename "gpt" disk image type to "raw"Lennart Poettering
After all, nspawn can now dissect MBR partition levels, too, hence ".gpt" appears a misnomer. Moreover, the the .raw suffix for these files is already pretty popular (the Fedora disk images use it for example), hence sounds like an OK scheme to adopt.
2015-01-14nspawn: add file system locks for controlling access to container imagesLennart Poettering
This adds three kinds of file system locks for container images: a) a file system lock next to the actual image, in a .lck file in the same directory the image is located. This lock has the benefit of usually being located on the same NFS share as the image itself, and thus allows locking container images across NFS shares. b) a file system lock in /run, named after st_dev and st_ino of the root of the image. This lock has the advantage that it is unique even if the same image is bind mounted to two different places at the same time, as the ino/dev stays constant for them. c) a file system lock that is only taken when a new disk image is about to be created, that ensures that checking whether the name is already used across the search path, and actually placing the image is not interrupted by other code taking the name. a + b are read-write locks. When a container is booted in read-only mode a read lock is taken, otherwise a write lock. Lock b is always taken after a, to avoid ABBA problems. Lock c is mostly relevant when renaming or cloning images.
2015-01-14pty: minor modernizationLennart Poettering
We initialize structs during declartion if possible
2015-01-14machined: use the FS_IMMUTABLE_FL file flag, if available, to implement a ↵Lennart Poettering
"read-only" concept for raw disk images, too
2015-01-14util: the chattr flags field is actually unsigned, judging by kernel sourcesLennart Poettering
Unlike some client code suggests...
2015-01-14ptyfw: add missing error checkLennart Poettering
2015-01-13networkd: make IP forwarding for IPv4 and IPv6 individually configurableLennart Poettering
2015-01-13fw-util: fix errno typo for !HAVE_LIBIPTCDaniel Mack
2015-01-13networkd: add minimal IP forwarding and masquerading support to .network filesLennart Poettering
This adds two new settings to networkd's .network files: IPForwarding=yes and IPMasquerade=yes. The former controls the "forwarding" sysctl setting of the interface, thus controlling whether IP forwarding shall be enabled on the specific interface. The latter controls whether a firewall rule shall be installed that exposes traffic coming from the interface as coming from the local host to all other interfaces. This also enables both options by default for container network interfaces, thus making "systemd-nspawn --network-veth" have network connectivity out of the box.
2015-01-13shared: add minimal firewall manipulation helpers for establishing NAT ↵Lennart Poettering
rules, using libiptc
2015-01-11fstab-util: fix priority parsing and add testZbigniew Jędrzejewski-Szmek
2015-01-11shared/util: respect buffer boundary on incomplete escape sequencesZbigniew Jędrzejewski-Szmek
cunescape_length_with_prefix() is called with the length as an argument, so it cannot rely on the buffer being NUL terminated. Move the length check before accessing the memory. When an incomplete escape sequence was given at the end of the buffer, c_l_w_p() would read past the end of the buffer. Fix this and add a test.
2015-01-11Support negated fstab optionsZbigniew Jędrzejewski-Szmek
We would ignore options like "fail" and "auto", and for any option which takes a value the first assignment would win. Repeated and options equivalent to the default are rarely used, but they have been documented forever, and people might use them. Especially on the kernel command line it is easier to append a repeated or negated option at the end.
2015-01-11fstab-util: detect out-of-range pri= assignmentsZbigniew Jędrzejewski-Szmek
We would silently ignore them. One would have to be crazy to do assign an out of range value, but simply ignoring it bothers me.
2015-01-11Add new function to filter fstab optionsZbigniew Jędrzejewski-Szmek
This fixes parsing of options in shared/generator.c. Existing code had some issues: - it would treate whitespace and semicolons as seperators. fstab(5) is pretty clear that only commas matter. And the syntax does not allow for spaces to be inserted in the field in fstab. Whitespace might be escaped, but then it should not seperate options. Treat whitespace and semicolons as any other character. - it assumed that x-systemd.device-timeout would always be followed by "=". But this is not guaranteed, hasmntopt will return this option even if there's no value. Uninitialized memory could be read. - some error paths would log, and inconsistently, some would just return an error code. Filtering is split out to a separate function and tests are added. Similar code paths in other places are adjusted to use the new function.
2015-01-11shared/list: add LIST_APPENDZbigniew Jędrzejewski-Szmek
2015-01-11path-lookup: allow /run to override /etc in generator searchZbigniew Jędrzejewski-Szmek
Generators are different than unit files: they are never automatically generated, so there's no point in allowing /etc to override /run. On the other hand, overriding /etc might be useful in some cases.
2015-01-11Implement masking and overriding of generatorsZbigniew Jędrzejewski-Szmek
Sometimes it is necessary to stop a generator from running. Either because of a bug, or for testing, or some other reason. The only way to do that would be to rename or chmod the generator binary, which is inconvenient and does not survive upgrades. Allow masking and overriding generators similarly to units and other configuration files. For the systemd instance, masking would be more common, rather than overriding generators. For the user instances, it may also be useful for users to have generators in $XDG_CONFIG_HOME to augment or override system-wide generators. Directories are searched according to the usual scheme (/usr/lib, /usr/local/lib, /run, /etc), and files with the same name in higher priority directories override files with the same name in lower priority directories. Empty files and links to /dev/null mask a given name. https://bugs.freedesktop.org/show_bug.cgi?id=87230
2015-01-11Simplify execute_directory()Zbigniew Jędrzejewski-Szmek
Remove the optional sepearate opening of the directory, it would be just too complicated with the change to multiple directories. Move the middle of execute_directory() to a seperate function to make it easier to grok.
2015-01-11log: fix log_full_errno() with custom facilitiesDavid Herrmann
Make sure to extract the log-priority when comparing against max-log-level, otherwise, we will always drop those messages. This fixes bus-proxyd to properly send warnings on policy blocks.
2015-01-09logind: unify how we cast between uid_t and pointers for hashmap keysLennart Poettering
2015-01-08machined: when cloning a raw disk image, also set the NOCOW flagLennart Poettering
2015-01-08loginctl: show the 10 most recent log user/session log lines in "loginctl ↵Lennart Poettering
user-status" and "loginctl session-status"
2015-01-08path-util: plug leakTom Gundersen
2015-01-08journal: bump RLIMIT_NOFILE when journal files to 16K (if possible)Lennart Poettering
When there are a lot of split out journal files, we might run out of fds quicker then we want. Hence: bump RLIMIT_NOFILE to 16K if possible. Do these even for journalctl. On Fedora the soft RLIMIT_NOFILE is at 1K, the hard at 4K by default for normal user processes, this code hence bumps this up for users to 4K. https://bugzilla.redhat.com/show_bug.cgi?id=1179980
2015-01-08util: make it easy to initialize the crtime from the current time in ↵Lennart Poettering
fd_setcrtime()
2015-01-08journald: turn off COW for journal files on btrfsLennart Poettering
btrfs' COW logic results in heavily fragment journal files, which is detrimental for perfomance. Hence, turn off COW for journal files as we create them. Turning off COW comes at the cost of data integrity guarantees, but this should be acceptable, given that we do our own checksumming, and generally have a pretty conservative write pattern. Also see discussion on linux-btrfs: http://www.spinics.net/lists/linux-btrfs/msg41001.html
2015-01-07util: upgrade default $TERM from vt102 to vt220 if we have no idea about the ↵Lennart Poettering
connected terminal So far, if we had no knowledge about the correct $TERM we defaulted to v102, as a safe, conservative choice. However, the terminfo data for vt102 is not aware of pageup/pagedown, which makes "less" much harder work with than necessary. Setting vt220 allows them to work correctly. "vt220" should be a sufficiently safe choice too, given that xterm, gnome-terminal and the linux console all strive to implement vt220 as baseline, already to pass pageup/pagedown correctly to apps. Effectively, with this change "journalctl -e" run inside a "systemd-nspawn" terminal will now run a pager where pageup/pagedown works, which is quite an improvement of usability for containers.
2015-01-07conf-parse: make syntax logging functions behave more like other log functonsLennart Poettering
In particular, don't patch the error number to EINVAL if 0, and don't negate it. (Also, add do {} while (false) around multi-line macro)
2015-01-07ptyfwd: simplify how we handle vhangups a bitLennart Poettering
2015-01-07btrfs-util: rework how we iterate through the results of the TREE_SEARCH resultsLennart Poettering
Let's introduce some syntactic sugar with iteration macros, and add correct key increment calls.
2015-01-07machinectl: make sure that "machinectl login" exits immediately when the ↵Lennart Poettering
machine it is connected to dies
2015-01-07util: make use of kcmp() to compare fds, if it is availableLennart Poettering
2015-01-07util: don't fail recursive bind mounting if we cannot read the mount flags ↵Lennart Poettering
from an obstructed mounted
2015-01-06journald: whenever we rotate a file, btrfs defrag itLennart Poettering
Our write pattern is quite awful for CoW file systems (btrfs...), as we keep updating file parts in the beginning of the file. This results in fragmented journal files. Hence: when rotating files, defragment them, since at that point we know that no further write accesses will be made.
2015-01-06tree-wide: remove unnecessary LOG_PRIZbigniew Jędrzejewski-Szmek
LOG_DEBUG is already a log level, there is no need to use LOG_PRI which is for filtering out the facility.
2015-01-06core: add new logic for services to store file descriptors in PID 1Lennart Poettering
With this change it is possible to send file descriptors to PID 1, via sd_pid_notify_with_fds() which PID 1 will store individually for each service, and pass via the usual fd passing logic on next invocation. This is useful for enable daemon reload schemes where daemons serialize their state to /run, push their fds into PID 1 and terminate, restoring their state on next start from the data in /run and passed in from PID 1. The fds are kept by PID 1 as long as no POLLHUP or POLLERR is seen on them, and the service they belong to are either not dead or failed, or have a job queued.
2015-01-05path-lookup, systemctl: export lookup_paths_init_from_scope() from ↵Ivan Shapovalov
shared/install.c and use it
2015-01-05util: Do not clear parent mount flags when setting up namespacesTopi Miettinen
When setting up a namespace, mount flags like noexec, nosuid and nodev are cleared, so the mounts always have exec, suid and dev flags enabled. Copy source directory mount flags to target mount when remounting the bind mounts.
2015-01-05util: Fix signedness error in lines(), match implementationsColin Walters
Regression introduced by ed757c0cb03eef50e8d9aeb4682401c3e9486f0b Mirror the implementation of columns(), since the fd_columns() functions returns a negative integer for errors. Also fix columns() to return the unsigned variable instead of the signed intermediary (they're the same, but better to be explicit).
2015-01-05journald: process SIGBUS for the memory maps we set upLennart Poettering
Even though we use fallocate() it appears that file systems like btrfs will trigger SIGBUS on certain low-disk-space situation. We should handle that, hence catch the signal, add it to a list of invalidated pages, and replace the page with an empty memory area. After each write check if SIGBUS was triggered, and consider the write invalid if it was. This should make journald a lot more robust with file systems where fallocate() is not reliable, for example all CoW file systems (btrfs...), where changing written data can fail with disk full errors. https://bugzilla.redhat.com/show_bug.cgi?id=1045810
2015-01-05nspawn: mount most of the cgroup tree read-only in nspawn containers except ↵Lennart Poettering
for the container's own subtree in the name=systemd hierarchy More specifically mount all other hierarchies in their entirety and the name=systemd above the container's subtree read-only.
2015-01-01missing: add __NR_renameat2Zbigniew Jędrzejewski-Szmek
2014-12-30tree-wide: spelling fixesVeres Lajos
https://github.com/vlajos/misspell_fixer https://github.com/torstehu/systemd/commit/b6fdeb618cf2f3ce1645b3315f15f482710c7ffa Thanks to Torstein Husebo <torstein@huseboe.net>.
2014-12-30macro: add DIV_ROUND_UP()David Herrmann
This macro calculates A / B but rounds up instead of down. We explicitly do *NOT* use: (A + B - 1) / A as it suffers from an integer overflow, even though the passed values are properly tested against overflow. Our test-cases show this behavior. Instead, we use: A / B + !!(A % B) Note that on "Real CPUs" this does *NOT* result in two divisions. Instead, instructions like idivl@x86 provide both, the quotient and the remainder. Therefore, both algorithms should perform equally well (I didn't verify this, though).
2014-12-29capability: use /proc/sys/kernel/cap_last_capDavid Herrmann
This file was introduced with linux-3.2, use it instead of probing for it via prctl(PR_CAPBSET_READ). For now, keep the old code for backwards compat. We can drop it once 3.2 is our lowest requirement. The test-cap-list code is extended to verify cap_last_cap() is the same as we'd get via prctl probing and /proc.
2014-12-28util: treat -1 as special size in format_bytes()Lennart Poettering
2014-12-28machined: add support for reporting image size via btrfs quotaLennart Poettering