Age | Commit message (Collapse) | Author |
|
btrfs' COW logic results in heavily fragment journal files, which is
detrimental for perfomance. Hence, turn off COW for journal files as we
create them.
Turning off COW comes at the cost of data integrity guarantees, but this
should be acceptable, given that we do our own checksumming, and
generally have a pretty conservative write pattern.
Also see discussion on linux-btrfs:
http://www.spinics.net/lists/linux-btrfs/msg41001.html
|
|
connected terminal
So far, if we had no knowledge about the correct $TERM we defaulted to
v102, as a safe, conservative choice. However, the terminfo data for
vt102 is not aware of pageup/pagedown, which makes "less" much harder
work with than necessary. Setting vt220 allows them to work correctly.
"vt220" should be a sufficiently safe choice too, given that xterm,
gnome-terminal and the linux console all strive to implement vt220 as
baseline, already to pass pageup/pagedown correctly to apps.
Effectively, with this change "journalctl -e" run inside a
"systemd-nspawn" terminal will now run a pager where pageup/pagedown
works, which is quite an improvement of usability for containers.
|
|
In particular, don't patch the error number to EINVAL if 0, and don't
negate it.
(Also, add do {} while (false) around multi-line macro)
|
|
|
|
Let's introduce some syntactic sugar with iteration macros, and add
correct key increment calls.
|
|
machine it is connected to dies
|
|
|
|
from an obstructed mounted
|
|
Our write pattern is quite awful for CoW file systems (btrfs...), as we
keep updating file parts in the beginning of the file. This results in
fragmented journal files. Hence: when rotating files, defragment them,
since at that point we know that no further write accesses will be made.
|
|
LOG_DEBUG is already a log level, there is no need to use LOG_PRI which
is for filtering out the facility.
|
|
With this change it is possible to send file descriptors to PID 1, via
sd_pid_notify_with_fds() which PID 1 will store individually for each
service, and pass via the usual fd passing logic on next invocation.
This is useful for enable daemon reload schemes where daemons serialize
their state to /run, push their fds into PID 1 and terminate, restoring
their state on next start from the data in /run and passed in from PID
1.
The fds are kept by PID 1 as long as no POLLHUP or POLLERR is seen on
them, and the service they belong to are either not dead or failed, or
have a job queued.
|
|
shared/install.c and use it
|
|
When setting up a namespace, mount flags like noexec, nosuid and
nodev are cleared, so the mounts always have exec, suid and dev
flags enabled.
Copy source directory mount flags to target mount when remounting
the bind mounts.
|
|
Regression introduced by ed757c0cb03eef50e8d9aeb4682401c3e9486f0b
Mirror the implementation of columns(), since the fd_columns()
functions returns a negative integer for errors.
Also fix columns() to return the unsigned variable instead of the
signed intermediary (they're the same, but better to be explicit).
|
|
Even though we use fallocate() it appears that file systems like btrfs
will trigger SIGBUS on certain low-disk-space situation. We should
handle that, hence catch the signal, add it to a list of invalidated
pages, and replace the page with an empty memory area. After each write
check if SIGBUS was triggered, and consider the write invalid if it was.
This should make journald a lot more robust with file systems where
fallocate() is not reliable, for example all CoW file systems
(btrfs...), where changing written data can fail with disk full errors.
https://bugzilla.redhat.com/show_bug.cgi?id=1045810
|
|
for the container's own subtree in the name=systemd hierarchy
More specifically mount all other hierarchies in their entirety and the
name=systemd above the container's subtree read-only.
|
|
|
|
https://github.com/vlajos/misspell_fixer
https://github.com/torstehu/systemd/commit/b6fdeb618cf2f3ce1645b3315f15f482710c7ffa
Thanks to Torstein Husebo <torstein@huseboe.net>.
|
|
This macro calculates A / B but rounds up instead of down. We explicitly
do *NOT* use:
(A + B - 1) / A
as it suffers from an integer overflow, even though the passed values are
properly tested against overflow. Our test-cases show this behavior.
Instead, we use:
A / B + !!(A % B)
Note that on "Real CPUs" this does *NOT* result in two divisions. Instead,
instructions like idivl@x86 provide both, the quotient and the remainder.
Therefore, both algorithms should perform equally well (I didn't verify
this, though).
|
|
This file was introduced with linux-3.2, use it instead of probing for it
via prctl(PR_CAPBSET_READ).
For now, keep the old code for backwards compat. We can drop it once 3.2
is our lowest requirement.
The test-cap-list code is extended to verify cap_last_cap() is the same as
we'd get via prctl probing and /proc.
|
|
|
|
|
|
|
|
machine images
|
|
|
|
use of it from nspawn
|
|
|
|
|
|
The new test-cap-list introduced in commit 2822da4fb7f891 uses the included
table of capabilities. However, it uses cap_last_cap() which probes the kernel
for the last available capability. On an older kernel (e.g. 3.10 from RHEL 7)
that causes the test to fail with the following message:
Assertion '!capability_to_name(cap_last_cap()+1)' failed at src/test/test-cap-list.c:30, function main(). Aborting.
Fix it by exporting the size of the static table and using it in the test
instead of the dynamic one from the current kernel.
Tested by successfully running ./test-cap-list and the whole `make check` test
suite with this patch on a RHEL 7 host.
|
|
subvolumes
We make use of the btrfs subvol crtime for this, and for gpt images of a
manually managed xattr, if we can.
|
|
|
|
There is alot of cleanup that will have to happen to turn on
-fstrict-aliasing, but I think our code should be "correct" to the rule.
|
|
After all, pretty much all our tools include it, and it should hence be
shared.
Also move sysfs-show.h from core/ to login/, since it has no point to
exist in core.
|
|
on a pty and returns the pty master fd to the client
This is a one-stop solution for "machinectl login", and should simplify
getting logins in containers.
|
|
|
|
|
|
|
|
|
|
For that, ask machined for a container PTY and use that.
|
|
|
|
We originally only supported escaping ucs2 encoded characters (as \uxxxx). This
only covers the BMP. Support escaping also utf16 surrogate pairs (on the form
\uxxxx\uyyyy) to cover all of unicode.
|
|
We originally only supported the BMP (i.e., we treated UTF-16 as UCS-2).
|
|
Originally we only supported ucs2, so move the ucs4 version from libsystemd-terminal to shared
and use that everywhere.
|
|
|
|
|
|
hidden_file() is a bit more precise, since dot files usually shouldn't
be ignored, but certainly be considered hidden.
|
|
the verb in it
That way the dispatcher calls know how they got called.
|
|
This adds a new bus call to machined that enumerates /var/lib/container
and returns all trees stored in it, distuingishing three types:
- GPT disk images, which are files suffixed with ".gpt"
- directory trees
- btrfs subvolumes
|
|
extra "#" to the name
That way, we have a simple, somewhat reliable way to detect such
temporary files, by simply checking if they start with ".#".
|
|
containers and install them locally
This adds a simply but powerful tool for downloading container images
from the most popular container solution used today. Use it like
this:
# systemd-import pull-dck mattdm/fedora
# systemd-nspawn -M fedora
This will donwload the layers for "mattdm/fedora", and make them
available locally as /var/lib/container/fedora.
The tool is pretty complete, as long as it's only about pulling down
images, or updating them. Pushing or searching is not supported yet.
|