| Age | Commit message (Collapse) | Author | 
|---|
|  | (NOTE: Cgroup namespaces work with legacy and unified hierarchies: "This is
completely backward compatible and will be completely invisible to any existing
cgroup users (except for those running inside a cgroup namespace and looking at
/proc/pid/cgroup of tasks outside their namespace.)"
(https://lists.linuxfoundation.org/pipermail/containers/2016-January/036582.html)
So there is no need to special case unified.)
If cgroup namespaces are supported we skip mount_cgroups() in the
outer_child(). Instead, we unshare(CLONE_NEWCGROUP) in the inner_child() and
only then do we call mount_cgroups().
The clean way to handle cgroup namespaces would be to delegate mounting of
cgroups completely to the init system in the container. However, this would
likely break backward compatibility with the UNIFIED_CGROUP_HIERARCHY flag of
systemd-nspawn. Also no cgroupfs would be mounted whenever the user simply
requests a shell and no init is available to mount cgroups. Hence, we introduce
mount_legacy_cgns_supported(). After calling unshare(CLONE_NEWCGROUP) it parses
/proc/self/cgroup to find the mounted controllers and mounts them inside the
new cgroup namespace. This should preserve backward compatibility with the
UNIFIED_CGROUP_HIERARCHY flag and mount a cgroupfs when no init in the
container is running. | 
|  | - define CLONE_NEWCGROUP
- add fun to detect whether cgroup namespaces are supported | 
|  | This is basically the same change as ea68351. | 
|  | Follow up on #3503 (pass service env vars to PAM sessions) | 
|  | Prevent free from being called on (a part of) the call-by-reference
variable env when setup_pam fails. | 
|  | The unit load queue can be processed in the middle of setting the
unit's properties, so its load_state would no longer be UNIT_STUB
for the check in bus_unit_set_properties(), which would cause it to
incorrectly return an error. | 
|  | Commit da4d897e ("core: add cgroup memory controller support on the unified
hierarchy (#3315)") changed the code in src/core/cgroup.c to always write
the real numeric value from the cgroup parameters to the
"memory.limit_in_bytes" attribute file.
For parameters set to CGROUP_LIMIT_MAX, this results in the string
"18446744073709551615" being written into that file, which is UINT64_MAX.
Before that commit, CGROUP_LIMIT_MAX was special-cased to the string "-1".
This causes a regression on CentOS 7, which is based on kernel 3.10, as the
value is interpreted as *signed* 64 bit, and clamped to 0:
[root@n54 ~]# echo 18446744073709551615 >/sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes
[root@n54 ~]# cat /sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes
0
[root@n54 ~]# echo -1 >/sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes
[root@n54 ~]# cat /sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes
9223372036854775807
Hence, all units that are subject to the limits enforced by the memory
controller will crash immediately, even though they have no actual limit
set. This happens to for the user.slice, for instance:
[  453.577153] Hardware name: SeaMicro SM15000-64-CC-AA-1Ox1/AMD Server CRB, BIOS Estoc.3.72.19.0018 08/19/2014
[  453.587024]  ffff880810c56780 00000000aae9501f ffff880813d7fcd0 ffffffff816360fc
[  453.594544]  ffff880813d7fd60 ffffffff8163109c ffff88080ffc5000 ffff880813d7fd28
[  453.602120]  ffffffff00000202 fffeefff00000000 0000000000000001 ffff880810c56c03
[  453.609680] Call Trace:
[  453.612156]  [<ffffffff816360fc>] dump_stack+0x19/0x1b
[  453.617324]  [<ffffffff8163109c>] dump_header+0x8e/0x214
[  453.622671]  [<ffffffff8116d20e>] oom_kill_process+0x24e/0x3b0
[  453.628559]  [<ffffffff81088dae>] ? has_capability_noaudit+0x1e/0x30
[  453.634969]  [<ffffffff811d4155>] mem_cgroup_oom_synchronize+0x575/0x5a0
[  453.641721]  [<ffffffff811d3520>] ? mem_cgroup_charge_common+0xc0/0xc0
[  453.648299]  [<ffffffff8116da84>] pagefault_out_of_memory+0x14/0x90
[  453.654621]  [<ffffffff8162f4cc>] mm_fault_error+0x68/0x12b
[  453.660233]  [<ffffffff81642012>] __do_page_fault+0x3e2/0x450
[  453.666017]  [<ffffffff816420a3>] do_page_fault+0x23/0x80
[  453.671467]  [<ffffffff8163e308>] page_fault+0x28/0x30
[  453.676656] Task in /user.slice/user-0.slice/user@0.service killed as a result of limit of /user.slice/user-0.slice/user@0.service
[  453.688477] memory: usage 0kB, limit 0kB, failcnt 7
[  453.693391] memory+swap: usage 0kB, limit 9007199254740991kB, failcnt 0
[  453.700039] kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
[  453.706076] Memory cgroup stats for /user.slice/user-0.slice/user@0.service: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[  453.725702] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[  453.733614] [ 2837]     0  2837    11950      899      23        0             0 (systemd)
[  453.741919] Memory cgroup out of memory: Kill process 2837 ((systemd)) score 1 or sacrifice child
[  453.750831] Killed process 2837 ((systemd)) total-vm:47800kB, anon-rss:3188kB, file-rss:408kB
Fix this issue by special-casing the UINT64_MAX case again. | 
|  | By cleaning up before setting up PAM we maintain control of overriding
behavior in setting variables. Otherwise, pam_putenv is in control.
This also makes sure we use a cleaned up environment in replacing
variables in argv. | 
|  | Commit d054f0a4 ("tree-wide: use xsprintf() where applicable") used a
semantic patch approach to change a number of locations from
  snprintf(buf, sizeof(buf), FMT, ...)
to
  xsprintf(buf, FMT, ...)
The problem is that xsprintf() wraps the snprintf() in an
assert_message_se(), so if snprintf() reports an overflow of the
destination buffer, the binary will now terminate.
This hit a user running a version of systemd that was built from a
deeply nested system path.
Fix this by
a) Switching back to snprintf() for this particular case. We should really
rather truncate the location string than crash in such situations.
b) Increasing the size of that static string buffer, to make the event more
unlikely. | 
|  | systemd-run --help says:
  -E --setenv=NAME=VALUE          Set environment | 
|  |  | 
|  | ==1447== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==1447==    at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==1447==    by 0x5350F19: strdup (in /usr/lib64/libc-2.23.so)
==1447==    by 0x4E9D435: strv_new_ap (strv.c:166)
==1447==    by 0x4E9D5FA: strv_new (strv.c:199)
==1447==    by 0x10E665: test_strv_fnmatch (test-strv.c:693)
==1447==    by 0x10EAD5: main (test-strv.c:763)
==1447== | 
|  | basic/fd-util: introduce stdio_unset_cloexec() function | 
|  |  | 
|  | There are some places in the systemd which are use the same pattern:
    fd_cloexec(STDIN_FILENO, false);
    fd_cloexec(STDOUT_FILENO, false);
    fd_cloexec(STDERR_FILENO, false);
to unset CLOEXEC for standard file descriptors. This patch introduces
the stdio_unset_cloexec() function to hide this and make code cleaner. | 
|  | Allow date and time ranges in OnCalendar | 
|  |  | 
|  | For backwards compatibility, both the new format (Mon..Wed) and
the old format (Mon-Wed) are supported. | 
|  | Resolves #3042 | 
|  | A 'llu' formatting statement was used in a debugging printf statement
instead of a 'PRIu64'. Correcting that mistake here. | 
|  | manager: Only invoke a single sigchld per unit within a cleanup cycle | 
|  | * networkd: condition_test() can return a negative error, handle that
If a condition check fails with an error we should not consider the check
successful. Fix that.
We should probably also improve logging in this case, but for now, let's just
unbreak this breakage.
Fixes: #3236
* condition: handle unrecognized architectures nicer
When we encounter a check for an architecture we don't know we should not
let the condition check fail with an error code, but instead simply return
false. After all the architecture might just be newer than the ones we know, in
which case it's certainly not our local one.
Fixes: #3236 | 
|  | make "machinectl clean" asynchronous, and open it up via PolicyKit | 
|  | sd_event_get_iteration() (#3631)
This extends the existing event loop iteration counter to 64bit, and exposes it
via a new function sd_event_get_iteration(). This is helpful for cases like
issue #3612. After all, since we maintain the counter anyway, we might as well
expose it.
(This also fixes an unrelated issue in the man page for sd_event_wait() where
micro and milliseconds got mixed up) | 
|  | By default, each iteration of manager_dispatch_sigchld() results in a unit level
sigchld event being invoked. For scope units, this results in a scope_sigchld_event()
which can seemingly stall for workloads that have a large number of PIDs within the
scope. The stall exhibits itself as a SIG_0 being initiated for each u->pids entry
as a result of pid_is_unwaited().
v2:
This patch resolves this condition by only paying to cost of a sigchld in the underlying
scope unit once per sigchld iteration. A new "sigchldgen" member resides within the
Unit struct. The Manager is incremented via the sd event loop, accessed via
sd_event_get_iteration, and the Unit member is set to the same value as the manager each
time that a sigchld event is invoked. If the Manager iteration value and Unit member
match, the sigchld event is not invoked for that iteration. | 
|  | sd-device: handle the 'drivers' pseudo-subsystem correctly | 
|  | journalctl: Use env variable TMPDIR to save temporary files | 
|  | This extends the existing event loop iteration counter to 64bit, and exposes it
via a new function sd_event_get_iteration(). This is helpful for cases like
issue #3612. After all, since we maintain the counter anyway, we might as well
expose it.
(This also fixes an unrelated issue in the man page for sd_event_wait() where
micro and milliseconds got mixed up) | 
|  | build-sys: Convert libshared into a private shared library | 
|  | Make `journalctl --directory=... --boot 0` work | 
|  | The loop on bus_match_run should break and return immediately if
bus->match_callbacks_modified is true. Otherwise the loop may access
free'd data. | 
|  | fixes https://bugzilla.redhat.com/show_bug.cgi?id=842060 | 
|  | --boot=0 magically meant "this boot", but when used with --file/--directory it
should simply refer to the last boot found in the specified journal. This way,
--boot and --list-boots are consistent.
Fixes #3603. | 
|  | Those are just local variables and ref_boot_offset is especially
obnoxious. | 
|  | Before --this-boot was deprecated in a331b5e6d47243, it did not take
any arguments. | 
|  | It works mostly fine, and can be quite useful to examine data from another
system.
OTOH, a single boot id doesn't make sense with --merge, so mixing with --merge
is still not allowed. | 
|  | “systemctl show” added an extra blank line after the dump of the
EnvironmentFile property of the unit. | 
|  | With commit 6f7da49d00 route-only domains do not get put into resolv.conf's
"search" list any more. Add a comment about the tri-state, to clarify its
semantics and why we are passing a bool parameter into an int type. Also add a
test case for it. | 
|  | to hide casting of '-1' strings and make code cleaner. | 
|  | Fixes:
```
$ systemctl list-unit-files 'hey\*'
0 unit files listed.
$ systemctl list-unit-files | grep hey
hey\x7eho.service                          static
``` | 
|  | We support writing out tags and db files in case a real subsystem called
'drivers' exists, so there is no reason to refuse parsing it. | 
|  | The 'drivers' pseudo-subsystem needs special treatment. These pseudo-devices are
found under /sys/bus/drivers/, so needs the real subsystem encoded
in the device_id in order to be resolved.
The reader side already assumed this to be the case. | 
|  | Collect the errors and return to the caller, but continue enumerating all devices. | 
|  | machinectl: interpret options placed between "shell" verb and machine name | 
|  | Let's make sure we catch early when a machine doesn't exist that is attempted
to be started or enabled as system service. | 
|  |  | 
|  | Fstab generator fixes | 
|  |  | 
|  | An incorrectly set if/else chain caused aus to apply the access mode of a
symlink to the directory it is located in. Yuck.
Fixes: #3547 | 
|  | Link as many binaries as possible with it, to save storage space.
Preserve the static libshared and libbasic for use in libraries, nss
modules and udev.
Libraries need to be static in order to avoid polluting the symbol
namespace.
Udev needs to be static so downstream can avoid strict version dependencies
with the systemd package, and this can complicate upgrade scenarios. |