Age | Commit message (Collapse) | Author |
|
This commit adds the possibility to leave /sys, and /proc/sys read-write.
It introduces a new (undocumented) env var SYSTEMD_NSPAWN_API_VFS_WRITABLE
to enable this feature.
If set to "yes", /sys, and /proc/sys will be read-write.
If set to "no", /sys, and /proc/sys will be read-only.
If set to "network" /proc/sys/net will be read-write. This is useful in
use-cases, where systemd-nspawn is used in an external network
namespace.
This adds the possibility to start privileged containers which need more
control over settings in the /proc, and /sys filesystem.
This is also a follow-up on the discussion from
https://github.com/systemd/systemd/pull/4018#r76971862 where an
introduction of a simple env var to enable R/W support for those
directories was already discussed.
|
|
Too many things don't get along with the unified hierarchy yet:
* https://github.com/opencontainers/runc/issues/1175
* https://github.com/docker/docker/issues/28109
* https://github.com/lxc/lxc/issues/1280
So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
|
|
The file /usr/lib/systemd/resolv.conf can be stale, it does not tell us
whether or not systemd-resolved is running or not.
So check for /run/systemd/resolve/resolv.conf as well, which is created
at runtime and hence is a better indication.
|
|
(#4596)
This adds a variable that is always set to false to make sure that
protect paths inside sandbox are always enforced and not ignored. The only
case when it is set to true is on DynamicUser=no and RootDirectory=/chroot
is set. This allows users to use more our sandbox features inside RootDirectory=
The only exception is ProtectSystem=full|strict and when DynamicUser=yes
is implied. Currently RootDirectory= is not fully compatible with these
due to two reasons:
* /chroot/usr|etc has to be present on ProtectSystem=full
* /chroot// has to be a mount point on ProtectSystem=strict.
|
|
core: add new RestrictNamespaces= unit file setting
Merging, not rebasing, because this touches many files and there were tree-wide cleanups in the mean time.
|
|
Format string tweaks (and a small fix on 32bit)
|
|
Remove FOREACH_WORD_QUOTED
|
|
The .so symlinks got moved to rootlibdir in 082210c7.
|
|
For normal arches this doesn't matter, but on arm32 arg_journal_size_max was smaller
than the other *SizeMax variables. This doesn't seem useful.
This is anothet part of the fix in 5206a724a0.
|
|
Commit b006762 inverted the initial exit code which is relevant for --help and
--version without a particular reason. For these special options, parse_argv()
returns 0 so that our main() immediately skips to the end without adjusting
"ret". Otherwise, if an actual container is being started, ret is set on error
in run(), which still provides the "non-zero exit on error" behaviour.
Fixes #4605.
|
|
According to comments in <asm/types.h>, __u64 is always defined as unsigned
long long. Those casts should be superfluous.
|
|
|
|
core: make RootDirectory= and ProtectKernelModules= work
|
|
In file included from ./src/basic/macro.h:415:0,
from ./src/shared/acl-util.h:28,
from src/coredump/coredump.c:36:
src/coredump/coredump.c: In function ‘submit_coredump’:
src/coredump/coredump.c:711:26: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 7 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:183:28: note: in expansion of macro ‘log_full’
#define log_info(...) log_full(LOG_INFO, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:711:17: note: in expansion of macro ‘log_info’
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^~~~~~~~
src/coredump/coredump.c:711:26: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 8 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:183:28: note: in expansion of macro ‘log_full’
#define log_info(...) log_full(LOG_INFO, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:711:17: note: in expansion of macro ‘log_info’
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^~~~~~~~
src/coredump/coredump.c:741:27: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 7 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:182:28: note: in expansion of macro ‘log_full’
#define log_debug(...) log_full(LOG_DEBUG, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:741:17: note: in expansion of macro ‘log_debug’
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^~~~~~~~~
src/coredump/coredump.c:741:27: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 8 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:182:28: note: in expansion of macro ‘log_full’
#define log_debug(...) log_full(LOG_DEBUG, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:741:17: note: in expansion of macro ‘log_debug’
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^~~~~~~~~
src/coredump/coredump.c:768:34: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 7 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:183:28: note: in expansion of macro ‘log_full’
#define log_info(...) log_full(LOG_INFO, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:768:25: note: in expansion of macro ‘log_info’
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^~~~~~~~
|
|
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
|
|
|
|
|
|
non-fatal mount errors shouldn't be logged as warnings.
|
|
Instead of having two fields inside BindMount struct where one is stack
based and the other one is heap, use one field to store the full path
and updated it when we chase symlinks. This way we avoid dealing with
both at the same time.
This makes RootDirectory= work with ProtectHome= and ProtectKernelModules=yes
Fixes: https://github.com/systemd/systemd/issues/4567
|
|
|
|
If systemd is built with --enable-split-usr, but the system is indeed a
merged-usr system, then systemd-delta gets all confused and reports
that all units and configuration files have been overridden.
Skip any prefix paths that are symlinks in this case.
Fixes: #4573
|
|
|
|
|
|
and over
|
|
|
|
It's the default, and NULL is shorter.
|
|
journalctl: fix memleak
|
|
This new setting permits restricting whether namespaces may be created and
managed by processes started by a unit. It installs a seccomp filter blocking
certain invocations of unshare(), clone() and setns().
RestrictNamespaces=no is the default, and does not restrict namespaces in any
way. RestrictNamespaces=yes takes away the ability to create or manage any kind
of namspace. "RestrictNamespaces=mnt ipc" restricts the creation of namespaces
so that only mount and IPC namespaces may be created/managed, but no other
kind of namespaces.
This setting should be improve security quite a bit as in particular user
namespacing was a major source of CVEs in the kernel in the past, and is
accessible to unprivileged processes. With this setting the entire attack
surface may be removed for system services that do not make use of namespaces.
|
|
/bin/kernel-install: line 143: return: can only `return' from a function or sourced script
https://bugzilla.redhat.com/show_bug.cgi?id=1391829
|
|
systemd-analyze syscall-filter
|
|
Fixes:
$ ./libtool --mode execute valgrind --leak-check=full ./journalctl >/dev/null
==22309== Memcheck, a memory error detector
==22309== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==22309== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==22309== Command: /home/vagrant/systemd/.libs/lt-journalctl
==22309==
Hint: You are currently not seeing messages from other users and the system.
Users in groups 'adm', 'systemd-journal', 'wheel' can see all messages.
Pass -q to turn off this notice.
==22309==
==22309== HEAP SUMMARY:
==22309== in use at exit: 8,680 bytes in 4 blocks
==22309== total heap usage: 5,543 allocs, 5,539 frees, 9,045,618 bytes allocated
==22309==
==22309== 488 (56 direct, 432 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 4
==22309== at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==22309== by 0x6F37A0A: __new_var_obj_p (__libobj.c:36)
==22309== by 0x6F362F7: __acl_init_obj (acl_init.c:28)
==22309== by 0x6F37731: __acl_from_xattr (__acl_from_xattr.c:54)
==22309== by 0x6F36087: acl_get_file (acl_get_file.c:69)
==22309== by 0x4F15752: acl_search_groups (acl-util.c:172)
==22309== by 0x113A1E: access_check_var_log_journal (journalctl.c:1836)
==22309== by 0x113D8D: access_check (journalctl.c:1889)
==22309== by 0x115681: main (journalctl.c:2236)
==22309==
==22309== LEAK SUMMARY:
==22309== definitely lost: 56 bytes in 1 blocks
==22309== indirectly lost: 432 bytes in 1 blocks
==22309== possibly lost: 0 bytes in 0 blocks
==22309== still reachable: 8,192 bytes in 2 blocks
==22309== suppressed: 0 bytes in 0 blocks
|
|
bash-4.3# journalctl --no-hostname >/dev/null
=================================================================
==288==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48492 byte(s) in 2694 object(s) allocated from:
#0 0x7fb4aba13e60 in malloc (/lib64/libasan.so.3+0xc6e60)
#1 0x7fb4ab5b2cc4 in malloc_multiply src/basic/alloc-util.h:70
#2 0x7fb4ab5b3194 in parse_field src/shared/logs-show.c:98
#3 0x7fb4ab5b4918 in output_short src/shared/logs-show.c:347
#4 0x7fb4ab5b7cb7 in output_journal src/shared/logs-show.c:977
#5 0x5650e29cd83d in main src/journal/journalctl.c:2581
#6 0x7fb4aabdb730 in __libc_start_main (/lib64/libc.so.6+0x20730)
SUMMARY: AddressSanitizer: 48492 byte(s) leaked in 2694 allocation(s).
Closes: #4568
|
|
|
|
Tree wide cleanups
|
|
This reverts commit 75ead2b753cb9586f3f208326446081baab70da1.
Follow up for #4546:
> @@ -848,8 +848,7 @@ static int bus_kernel_make_message(sd_bus *bus, struct kdbus_msg *k) {
if (k->src_id == KDBUS_SRC_ID_KERNEL)
bus_message_set_sender_driver(bus, m);
else {
- xsprintf(m->sender_buffer, ":1.%llu",
- (unsigned long long)k->src_id);
+ xsprintf(m->sender_buffer, ":1.%"PRIu64, k->src_id);
This produces:
src/libsystemd/sd-bus/bus-kernel.c: In function ‘bus_kernel_make_message’:
src/libsystemd/sd-bus/bus-kernel.c:851:44: warning: format ‘%lu’ expects argument of type ‘long
unsigned int’, but argument 4 has type ‘__u64 {aka long long unsigned int}’ [-Wformat=]
xsprintf(m->sender_buffer, ":1.%"PRIu64, k->src_id);
^
|
|
Just to make the whole thing easier for users.
|
|
Now that the list is user-visible, @default should be first.
|
|
This should make it easier for users to understand what each filter
means as the list of syscalls is updated in subsequent systemd versions.
|
|
|
|
|
|
is set
Make sure that when DynamicUser= is set that we intialize the user
supplementary groups and that we also support SupplementaryGroups=
Fixes: https://github.com/systemd/systemd/issues/4539
Thanks Evgeny Vereshchagin (@evverx)
|
|
Two testsuite tweaks
|
|
|
|
If we encounter the (unlikely) situation where the combined path to the
new root and a path to a mount to be moved together exceed maximum path length,
we shouldn't crash, but fail this path instead.
|
|
|
|
This reverts some changes introduced in d054f0a4d4.
xsprintf should be used in cases where we calculated the right buffer
size by hand (using DECIMAL_STRING_MAX and such), and never in cases where
we are printing externally specified strings of arbitrary length.
Fixes #4534.
|
|
Add "perpetual" unit concept, sysctl fixes, networkd fixes, systemctl color fixes, nspawn discard.
|
|
|
|
overly long description strings
This essentially reverts one part of d054f0a4d451120c26494263fc4dc175bfd405b1.
(We might also choose to use proper ellipsation here, but I wasn't sure the
memory allocation this requires wouöld be a good idea here...)
Fixes: #4534
|
|
Preserve stored fds over service restart
|