Age | Commit message (Collapse) | Author |
|
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.
A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).
It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.
The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.
This patch also removes cg_delete() which is unused now.
On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.
This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.
This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.
The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.
To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.
This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.
When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
|
|
In all cases where the function (or cg_is_empty_recursive()) ignoring
the calling process is actually wrong, as a process keeps a cgroup busy
regardless if its the current one or another. Hence, let's simplify
things and drop the "ignore_self" parameter.
|
|
|
|
This is specifically useful for appending the mDNS ".local" suffix to a
single-label hostname in the most correct way. (used in later commit)
|
|
This modifies the strv in-place, replacing strings with their escaped
version. It's mostly just a convenience function for when you need to
join a strv together because it's passed as a string to something, and
the separator needs escaping.
|
|
This is for shell-style \ escaping rather than quoting, which while it
has the same effect in produced shell commands, is not exclusively
useful for shell commands.
shell_escape would be useful for producing sed commands, as you would be
able to \ escape the normal special characters, plus whichever argument
separator was chosen; or it could be used to escape arguments passed to
the overlayfs mount command.
|
|
strv_split_extract is to strv_split_quotes as extract_first_word was to
unquote_first_word.
Now there's extract_first_word for extracting a single argument,
extract_many_words for extracting a bounded number of arguments,
and strv_split_extract for extracting an arbitrary number of arguments.
|
|
If EXTRACT_DONT_COALESCE_SEPARATORS is passed, then leading separators,
trailing separators and spans of multiple separators aren't skipped, and
empty arguments from before, after or between separators may be extracted.
|
|
This adds an EXTRACT_QUOTES option to allow the previous behaviour, of
not interpreting any character inside ' or " quotes as separators.
|
|
It now takes a separators argument, which defaults to WHITESPACE if NULL
is passed.
|
|
This is so that, when called in a loop, unquote_first_word can
distinguish between reaching the end of a string because it has consumed
all the input before the end, and consuming all the input.
This is important because we later add a flag that allows
char *in = "";
char *out;
unquote_first_word(&in, &out, flags);
To put "" in out, and set in = NULL, so the trailing empty string of the
input can be consumed, and mark that the input has been consumed.
|
|
Manual merge of https://github.com/systemd/systemd/pull/751.
|
|
All users are now setting lowercase=false.
|
|
Tests are modified to check behaviour with relax and without relax.
New tests are added for hostname_cleanup().
Tests are moved a new file (test-hostname-util) because there's
now a bunch of them.
New parameter is not used anywhere, except in tests, so there should
be no observable change.
|
|
|
|
Similar in function to LIST_INSERT_AFTER, this will insert a new element
into the list before the specified position. If the specified position
is NULL, the element is added as the tail of the list.
|
|
Add some more tests
|
|
tree-wide: introduce mfree()
|
|
|
|
Add tests for safe_ato[iu]16() and some more unbase32hexmem() torture.
|
|
Test af-list and arphdr-list.
|
|
Pretty trivial helper which wraps free() but returns NULL, so we can
simplify this:
free(foobar);
foobar = NULL;
to this:
foobar = mfree(foobar);
|
|
busctl: Misc cleanups and a fix (v2)
|
|
In member_compare_func(), it compares interface, type and name of
members. But as it can contain NULL pointer, it needs to check them
before calling strcmp(). So make it as a separate strcmp_ptr
function (named after streq_ptr) so that it can be used by others.
Also let streq_ptr() to use it in order to make the code simpler.
|
|
Given two bitmaps and the following code:
Bitmap *a = bitmap_new(), *b = bitmap_new();
bitmap_set(a, 1);
bitmap_clear(a);
bitmap_set(a, 0);
bitmap_set(b, 0);
These two bitmaps should now have the same bits set and they should be
equal but bitmap_equal() will return false in this case because while
bitmap_clear() resets the number of elements in the array it does not
clear the array and bitmap_set() expects the array to be cleared.
GREEDY_REALLOC0 looks at the allocated size and not the actual size so
it does not clear any memory.
Fix this by freeing the allocated memory and resetting the whole Bitmap
to an initial state in bitmap_clear().
This also adds test code for this issue.
|
|
Given two bitmaps and the following code:
Bitmap *a = bitmap_new(), *b = bitmap_new();
bitmap_set(a, 0);
bitmap_unset(a, 0);
These two bitmaps should now have the same bits set and they should be
equal but bitmap_equal() will return false in this case because the
bitmaps array in a is larger because of the bit which was previously
set.
Fix this by comparing only the bits which exists in both bitmaps and
then check that the rest of the bits (if any) is all zero.
This also adds test code for this issue.
|
|
We already refuse to resolve "localhost", hence we should also refuse
resolving "127.0.0.1" and friends.
|
|
Given three DNS names this function indicates if the second argument lies
strictly between the first and the third according to the canonical DNS
name order. Note that the order is circular, so the last name is
considered to be before the first.
|
|
Intended to be called repeatedly, and returns then successive unescaped labels
from the most to the least significant (left to right).
This is slightly inefficient as it scans the string three times (two would be
sufficient): once to find the end of the string, once to find the beginning
of each label and lastly once to do the actual unescaping. The latter two
could be done in one go, but that seemed unnecessarily convoluted.
|
|
unquote_first_word: parse ` '' ` as an empty argument instead of no arg
|
|
|
|
The bug found by David existed in several places, fix them all. Also
extend the tests to cover these cases.
|
|
Reuse the Iterator object from hashmap.h and expose a similar API.
This allows us to do
{
Iterator i;
unsigned n;
BITMAP_FOREACH(n, b, i) {
Iterator j;
unsigned m;
BITMAP_FOREACH(m, b, j) {
...
}
}
}
without getting confused. Requested by David.
|
|
This implements more of RFC4648.
|
|
For when a Hashmap is overkill.
|
|
This implements RFC4648 for a slightly more compact representation of
binary data compared to hex (6 bits per character rather than 4).
|
|
We were ignoring failures from unhexchar, which meant that invalid
hex characters were being turned into garbage rather than the string
rejected.
Fix this by making unhexmem return an error code, also change the API
slightly, to return the size of the returned memory, reflecting the
fact that the memory is a binary blob,and not a string.
For convenience, still append a trailing NULL byte to the returned
memory (not included in the returned size), allowing callers to
treat it as a string without doing a second copy.
|
|
fileio: consolidate write_string_file*()
|
|
|
|
Merge write_string_file(), write_string_file_no_create() and
write_string_file_atomic() into write_string_file() and provide a flags mask
that allows combinations of atomic writing, newline appending and automatic
file creation. Change all users accordingly.
|
|
Add a flag to control whether write_string_stream() should always enforce a
trailing newline character in the file.
|
|
|
|
a tiny hashmap cleanup
|
|
The test-barrier binary uses real-time alarms and timeouts to test for
races in the thread-barrier implementation. Hence, if your system is under
high load and your scheduler decides to not run test-barrier for
>BASE_TIME, then the tests are likely to fail.
Two options:
1) Increase BASE_TIME. This will make the test take significantly longer
for no real good. Furthermore, it is still not guaranteed that the
task is scheduled.
2) Don't rely on real-time timers, but use explicit synchronization. This
would basically test one barrier implementation with another.. kinda
ironic.. but maybe something worth looking into.
3) Disable test-barrier by default.
This patch chooses option 3) and makes sure test-barrier only runs if you
pass any argument.
Side note:
test-barrier is written in a way that if it fails under load, but
does not on idle systems, then it is very _unlikely_ that the
barrier implementation is the culprit. Hence, it makes little
sense to run it under load, anyway. It will not improve the test
coverage of barrier.c, but rather the coverage of the test itself.
|
|
This is consistent with how an empty string works in an ExecStart=
statement. We should not differentiate between an empty string and
whitespace only (since they look the same.)
Update the test case with whitespace only to reflect that the list is
reset.
Tested that `test-unit-file` passes and other test cases are not
affected. Installed the patched systemd binaries on a machine, booted
it, looked for out of the usual behavior but did not find any.
|
|
Convert config_parse_exec() from using FOREACH_WORD_QUOTED into a loop
of unquote_first_word.
Loop through the arguments only once (the FOREACH_WORD_QUOTED
implementation did it twice, once to count them and another time to
process and store them.)
Use newly introduced flag UNQUOTE_UNESCAPE_RELAX to preserve
unrecognized escape sequences such as regexps matches such as "\w",
"\d", etc. (Valid escape sequences such as "\s" or "\b" still need an
extra backslash if literals are desired for regexps.)
Differences in behavior:
- Handle ; (command separator) in special, so that only ; on its own is
valid for that purpose, an quoted semicolon ";" or ';' will now behave
as a literal semicolon. This is probably what was initially intended.
- Handle \; (to introduce a literal semicolon) in special, so that only \;
is turned into a semicolon but not \\; or "\\;" or "\;" which are kept
as a literal \; in the output. This is probably what was initially
intended.
Known issues:
- Using an empty string (for example, ExecStartPre=<empty>) will empty
the list and remove the existing commands, but using whitespace only
(for example, ExecStartPre=<spaces>) will not. This is a pre-existing
issue and will be dealt with in a follow up commit.
Tested:
- Unit tests passing. Also `make distcheck` still works as expected.
- Installed it on a local machine and booted with it, checked console
output, systemctl and journalctl output, did not notice any issues
running the patched systemd binaries.
Relevant bug: https://bugs.freedesktop.org/show_bug.cgi?id=90794
|
|
These tests will be useful to check the cases regarding quoted and
escaped semicolon when we switch to using unquote_first_word.
Additionally, convert some of the tests that have semicolons so that the
argument after the semicolon looks like a path (starts with /) so that
we can see the change of behavior when making config_parse_exec more
strict about what it accepts as a command separator.
|
|
It will try to unquot_first_word, but if it runs into escaping problems
it will retry it adding UNQUOTE_CUNESCAPE_RELAX to the flags. If it
succeeds on the second try, it will log a warning about it. If it fails
both times, it will log an error.
Add test cases to confirm it behaves as expected.
|
|
The new flag UNQUOTE_UNESCAPE_RELAX preserves unrecognized escape
sequences verbatim in unquote_first_word, either when it's a trailing
backslash (similar to UNQUOTE_RELAX, but in this case keep the extra
backslash in the output) or in the middle of a sequence string.
Add unit test cases to ensure the new flag works as expected and to
prevent regressions from being introduced.
Tested with a follow up commit converting config_parse_exec() to start
using unquote_first_word, in which case this flags makes it possible to
preserve unrecognized escape sequences.
Relevant bug: https://bugs.freedesktop.org/show_bug.cgi?id=90794
|
|
There is no reason to require key to be non-NULL.
Change test_ordered_hashmap_next() to use trivial_hash_ops in order to
test NULL key too.
|