Age | Commit message (Collapse) | Author |
|
core: add cgroup CPU controller support on the unified hierarchy
(zj: merging not squashing to make it clear against which upstream this patch was developed.)
|
|
The kernel treats values below a certain threshold (minfmt->min_coredump
which is initialized do ELF_EXEC_PAGESIZE, which varies between architectures,
but is usually the same as PAGE_SIZE) as disabling coredumps [1].
Any core image below ELF_EXEC_PAGESIZE will yield an invalid backtrace anyway [2],
so follow the kernel and not try to parse or store such images.
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c#n660
[2] systemd-coredump[16260]: Process 16258 (sleep) of user 1002 dumped core.
Stack trace of thread 16258:
#0 0x00007f1d8b3d3810 n/a (n/a)
https://bugzilla.redhat.com/show_bug.cgi?id=1309172#c19
|
|
Under NixOS, the config_path /etc/systemd/system is a symlink to
/etc/static/systemd/system. Commands such as `systemctl list-unit-files`
and `systemctl is-enabled` did not work as the symlink was not followed.
This does not affect how symlinks are treated within the config_path
directory.
|
|
the /) (#3934)
Fixes #3927.
|
|
|
|
|
|
Unfortunately, due to the disagreements in the kernel development community,
CPU controller cgroup v2 support has not been merged and enabling it requires
applying two small out-of-tree kernel patches. The situation is explained in
the following documentation.
https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu
While it isn't clear what will happen with CPU controller cgroup v2 support,
there are critical features which are possible only on cgroup v2 such as
buffered write control making cgroup v2 essential for a lot of workloads. This
commit implements systemd CPU controller support on the unified hierarchy so
that users who choose to deploy CPU controller cgroup v2 support can easily
take advantage of it.
On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max"
replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us". [Startup]CPUWeight config
options are added with the usual compat translation. CPU quota settings remain
unchanged and apply to both legacy and unified hierarchies.
v2: - Error in man page corrected.
- CPU config application in cgroup_context_apply() refactored.
- CPU accounting now works on unified hierarchy.
|
|
since link_dirty itself calls manager_dirty no need to
call it separately .
|
|
|
|
When client requests to get logs with `follow` and `KEY=match` that
doesn't match any log entry, journal-gatewayd segfaulted.
Make request_reader_entries to return zero in such case to wait for
matching entries.
This fixes https://github.com/systemd/systemd/issues/3873.
|
|
Serve journals in the specified directory instead of default journals.
|
|
Fixed (master) versions of libtool pass -fsanitize=address correctly
into CFLAGS and LDFLAGS allowing ASAN to be used without any special
configure tricks..however ASAN triggers in lookup3.c for the same
reasons valgrind does. take the alternative codepath if
__SANITIZE_ADDRESS__ is defined as well.
|
|
It was meant to write to q instead of t
FAIL: test-id128
================
=================================================================
==125770==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd4615bd31 at pc 0x7a2f41b1bf33 bp 0x7ffd4615b750 sp 0x7ffd4615b748
WRITE of size 1 at 0x7ffd4615bd31 thread T0
#0 0x7a2f41b1bf32 in id128_to_uuid_string src/libsystemd/sd-id128/id128-util.c:42
#1 0x401f73 in main src/test/test-id128.c:147
#2 0x7a2f41336341 in __libc_start_main (/lib64/libc.so.6+0x20341)
#3 0x401129 in _start (/home/crrodriguez/scm/systemd/.libs/test-id128+0x401129)
Address 0x7ffd4615bd31 is located in stack of thread T0 at offset 1409 in frame
#0 0x401205 in main src/test/test-id128.c:37
This frame has 23 object(s):
[32, 40) 'b'
[96, 112) 'id'
[160, 176) 'id2'
[224, 240) 'a'
[288, 304) 'b'
[352, 368) 'a'
[416, 432) 'b'
[480, 496) 'a'
[544, 560) 'b'
[608, 624) 'a'
[672, 688) 'b'
[736, 752) 'a'
[800, 816) 'b'
[864, 880) 'a'
[928, 944) 'b'
[992, 1008) 'a'
[1056, 1072) 'b'
[1120, 1136) 'a'
[1184, 1200) 'b'
[1248, 1264) 'a'
[1312, 1328) 'b'
[1376, 1409) 't' <== Memory access at offset 1409 overflows this variable
[1472, 1509) 'q'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow src/libsystemd/sd-id128/id128-util.c:42 in id128_to_uuid_string
Shadow bytes around the buggy address:
0x100028c23750: f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f2 f2
0x100028c23760: f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f2 f2
0x100028c23770: f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f2 f2
0x100028c23780: f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f2 f2
0x100028c23790: f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f2 f2
=>0x100028c237a0: f2 f2 00 00 00 00[01]f4 f4 f4 f2 f2 f2 f2 00 00
0x100028c237b0: 00 00 05 f4 f4 f4 00 00 00 00 00 00 00 00 00 00
0x100028c237c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100028c237d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100028c237e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100028c237f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==125770==ABORTING
FAIL test-id128 (exit status: 1)
|
|
ASAN is unable to handle it.
|
|
beef up /var/tmp and /tmp handling; set $SERVICE_RESULT/$EXIT_CODE/$EXIT_STATUS on ExecStop= and make sure root/nobody are always resolvable
|
|
fixes #3881
|
|
Fix 3607
|
|
|
|
Without the address the message is not very useful.
Aug 04 23:52:21 rawhide systemd[1]: testlimit.socket: Too many incoming connections (4) from source ::1, dropping connection.
|
|
|
|
This fixes an issue during reexec — the count of connections would be lost:
[zbyszek@fedora-rawhide ~]$ systemctl status testlimit.socket | grep Connected
Accepted: 1; Connected: 1
[zbyszek@fedora-rawhide ~]$ sudo systemctl daemon-reexec
[zbyszek@fedora-rawhide ~]$ systemctl status testlimit.socket | grep Connected
Accepted: 1; Connected: 0
With the patch, Connected count is preserved.
Also add "Accept Socket" to the dump output for services.
|
|
Make functions and definitions that don't need to be shared local to
socket.c.
|
|
This adds parse_nice() that parses a nice level and ensures it is in the right
range, via a new nice_is_valid() helper. It then ports over a number of users
to this.
No functional changes.
|
|
Fixes #3890.
|
|
The intention is to clamp the value to READ_FULL_BYTES_MAX, which
would be the minimum of the two.
|
|
journal-(gatewayd,remote).c don't actually utilize libgnutls even when
HAVE_GNUTLS is defined.
|
|
|
|
|
|
|
|
Previously, the result value of a unit was overriden with each failure that
took place, so that the result always reported the last failure that took
place.
With this commit this is changed, so that the first failure taking place is
stored instead. This should normally not matter much as multiple failures are
sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT
to a service due to a watchdog timeout, then this currently would be reported
as "coredump" failure, rather than the "watchodg" failure it really is. Hence,
in order to report information about the type of the failure, and not about
the effect of it, let's change this from all unit type to store the first, not
the last failure.
This addresses the issue pointed out here:
https://github.com/systemd/systemd/pull/3818#discussion_r73433520
|
|
Let's extend nss-systemd to also synthesize user/group entries for the
UIDs/GIDs 0 and 65534 which have special kernel meaning. Given that nss-systemd
is listed in /etc/nsswitch.conf only very late any explicit listing in
/etc/passwd or /etc/group takes precedence.
This functionality is useful in minimal container-like setups that lack
/etc/passwd files (or only have incompletely populated ones).
|
|
ExecStop=/ExecStopPost= commands
This should simplify monitoring tools for services, by passing the most basic
information about service result/exit information via environment variables,
thus making it unnecessary to retrieve them explicitly via the bus.
|
|
|
|
read_full_stream() _always_ allocated twice the memory needed, due to
only breaking the realloc() && fread() loop when fread() returned 0,
requiring another iteration and exponentially enlarged buffer just to
discover the EOF condition.
This also caused file sizes >2MiB && <= 4MiB to erroneously be treated
as E2BIG, due to the inappropriately doubled buffer size exceeding
4*1024*1024.
Also made the 4*1024*1024 magic number a READ_FULL_BYTES_MAX constant.
|
|
bridge vlan configuration was applied even if it wasn't configured.
fixes #3876
|
|
|
|
Let's fix up the flags fields in service_spawn() rather than its callers, in
order to simplify things a bit.
|
|
The ExecParameters structure contains a number of bit-flags, that were so far
exposed as bool:1, change this to a proper, single binary bit flag field. This
makes things a bit more expressive, and is helpful as we add more flags, since
these booleans are passed around in various callers, for example
service_spawn(), whose signature can be made much shorter now.
Not all bit booleans from ExecParameters are moved into the flags field for
now, but this can be added later.
|
|
Beef up the existing var_tmp() call, rename it to var_tmp_dir() and add a
matching tmp_dir() call (the former looks for the place for /var/tmp, the
latter for /tmp).
Both calls check $TMPDIR, $TEMP, $TMP, following the algorithm Python3 uses.
All dirs are validated before use. secure_getenv() is used in order to limite
exposure in suid binaries.
This also ports a couple of users over to these new APIs.
The var_tmp() return parameter is changed from an allocated buffer the caller
will own to a const string either pointing into environ[], or into a static
const buffer. Given that environ[] is mostly considered constant (and this is
exposed in the very well-known getenv() call), this should be OK behaviour and
allows us to avoid memory allocations in most cases.
Note that $TMPDIR and friends override both /var/tmp and /tmp usage if set.
|
|
allow transient mounts and automounts
|
|
Update help for "short-full" and shorten to 80 columns
|
|
https://lists.freedesktop.org/archives/systemd-devel/2016-August/037268.html
|
|
Fixes #3669.
|
|
|
|
make dist-check-help FTW!
|
|
|
|
This permits CPUQuota to accept greater values as documented.
|
|
nspawn resolv.conf handling improvements, and inherit $TERM all the way through nspawn → console login
|
|
This new output mode formats all timestamps using the usual format_timestamp()
call we use pretty much everywhere else. Timestamps formatted this way are some
ways more useful than traditional syslog timestamps as they include weekday,
month and timezone information, while not being much longer. They are also not
locale-dependent. The primary advantage however is that they may be passed
directly to journalctl's --since= and --until= switches as soon as #3869 is
merged.
While we are at it, let's also add "short-unix" to shell completion.
|
|
This patch improves parsing and generation of timestamps and calendar
specifications in two ways:
- The week day is now always printed in the abbreviated English form, instead
of the locale's setting. This makes sure we can always parse the week day
again, even if the locale is changed. Given that we don't follow locale
settings for printing timestamps in any other way either (for example, we
always use 24h syntax in order to make uniform parsing possible), it only
makes sense to also stick to a generic, non-localized form for the timestamp,
too.
- When parsing a timestamp, the local timezone (in its DST or non-DST name)
may be specified, in addition to "UTC". Other timezones are still not
supported however (not because we wouldn't want to, but mostly because libc
offers no nice API for that). In itself this brings no new features, however
it ensures that any locally formatted timestamp's timezone is also parsable
again.
These two changes ensure that the output of format_timestamp() may always be
passed to parse_timestamp() and results in the original input. The related
flavours for usec/UTC also work accordingly. Calendar specifications are
extended in a similar way.
The man page is updated accordingly, in particular this removes the claim that
timestamps systemd prints wouldn't be parsable by systemd. They are now.
The man page previously showed invalid timestamps as examples. This has been
removed, as the man page shouldn't be a unit test, where such negative examples
would be useful. The man page also no longer mentions the names of internal
functions, such as format_timestamp_us() or UNIX error codes such as EINVAL.
|