Age | Commit message (Collapse) | Author |
|
cgroup IO controller supports maximum limits for both bandwidth and IOPS but
systemd resource control currently only supports bandwidth limits. This patch
adds support for IOReadIOPSMax and IOWriteIOPSMax when unified cgroup hierarchy
is in use.
It isn't difficult to also add BlockIOReadIOPS and BlockIOWriteIOPS for legacy
hierarchies but IO control on legacy hierarchies is half-broken anyway, so
let's leave it alone for now.
|
|
Currently, there are two cgroup IO limits, bandwidth max for read and write,
and they are hard-coded in various places. This is fine for two limits but IO
is expected to grow more limits - low, high and max limits for bandwidth and
IOPS - and hard-coding each limit won't make sense.
This patch replaces hard-coded limits with an array indexed by
CGroupIOLimitType and accompanying string and default value tables so that new
limits can be added trivially.
|
|
core: add io controller support on the unified hierarchy
|
|
|
|
|
|
Add a synchronization point so that custom initramfs units can run
after the root device becomes available, before it is fsck'd and
mounted.
This is useful for custom initramfs units that may modify the
root disk partition table, where the root device is not known in
advance (it's dynamically selected by the generators).
|
|
Fix "preset-all" with dangling symlinks and install-section hint emitted too eagerly
|
|
_const_ means that the caller can assume that the function will return the same
result every time (and will not modify global memory). special_glyph() meets
this: even though it depends on global memory, that part of global memory is
not expected to change. This allows the calls to special_glyph() to be
optimized, even if -flto is not used.
|
|
That function doesn't draw anything on it's own, just returns a string, which
sometimes is more than one character. Also remove "DRAW_" prefix from character
names, TREE_* and ARROW and BLACK_CIRCLE are unambigous on their own, don't
draw anything, and are always used as an argument to special_glyph().
Rename "DASH" to "MDASH", as there's more than one type of dash.
|
|
Make use of this in nspawn at a couple of places. A later commit should port
more code over to this, including networkd.
|
|
don't reopen socket fds when reloading the daemon
|
|
Previously, we'd simply close and reopen the socket file descriptors. This is
problematic however, as we won't transition through the SOCKET_CHOWN state
then, and thus the file ownership won't be correct for the sockets.
Rework the flushing logic, and actually read any queued data from the sockets
for flushing, and accept any queued messages and disconnect them.
|
|
On the unified hierarchy, blkio controller is renamed to io and the interface
is changed significantly.
* blkio.weight and blkio.weight_device are consolidated into io.weight which
uses the standardized weight range [1, 10000] with 100 as the default value.
* blkio.throttle.{read|write}_{bps|iops}_device are consolidated into io.max.
Expansion of throttling features is being worked on to support
work-conserving absolute limits (io.low and io.high).
* All stats are consolidated into io.stats.
This patchset adds support for the new interface. As the interface has been
revamped and new features are expected to be added, it seems best to treat it
as a separate controller rather than trying to expand the blkio settings
although we might add automatic translation if only blkio settings are
specified.
* io.weight handling is mostly identical to blkio.weight[_device] handling
except that the weight range is different.
* Both read and write bandwidth settings are consolidated into
CGroupIODeviceLimit which describes all limits applicable to the device.
This makes it less painful to add new limits.
* "max" can be used to specify the maximum limit which is equivalent to no
config for max limits and treated as such. If a given CGroupIODeviceLimit
doesn't contain any non-default configs, the config struct is discarded once
the no limit config is applied to cgroup.
* lookup_blkio_device() is renamed to lookup_block_device().
Signed-off-by: Tejun Heo <htejun@fb.com>
|
|
The macro determines the right length of a AF_UNIX "struct sockaddr_un" to pass to
connect() or bind(). It automatically figures out if the socket refers to an
abstract namespace socket, or a socket in the file system, and properly handles
the full length of the path field.
This macro is not only safer, but also simpler to use, than the usual
offsetof() + strlen() logic.
|
|
|
|
make virtualization detection quieter, rework unit start limit logic, detect unit file drop-in changes correctly, fix autofs state propagation
|
|
Commit 82501b3fc added an early break when a terminal node is found to
incorrect place -- before setting c. This caused trie to be built that
does not correspond to what it points to in buffer, causing incorrect
deduplications:
# cat /etc/udev/rules.d/99-bug.rules
ENV{FOO}=="0"
ENV{xx0}=="BAR"
ENV{BAZ}=="00"
# udevadm test
* RULE /etc/udev/rules.d/99-bug.rules:1, token: 0, count: 2, label: ''
M ENV match 'FOO' '0'(plain)
* RULE /etc/udev/rules.d/99-bug.rules:2, token: 2, count: 2, label: ''
M ENV match 'xx0' 'BAR'(plain)
* RULE /etc/udev/rules.d/99-bug.rules:3, token: 4, count: 2, label: ''
M ENV match 'BAZ' 'x0'(plain)
* END
The addition of "xx0" following "0" will cause a trie like this to be
created:
c=\0
c=0 "0"
c=0 "xx0" <-- note the c is incorrect here, causing "00" to be
c=O "FOO" deduplicated to it
c=R "BAR"
This in effect caused the usb_modeswitch rule for Huawei modems to never
match and this never be switched to serial mode from mass storage.
|
|
machined: make "clone" asynchronous, and support copy-based fall-back
|
|
This is hardly useful, it's trivial for developers to get that info by running
cat /proc/cpuinfo.
Fixes #3155
|
|
Fall back to a normal copy operation when the backing file system isn't btrfs,
and hence doesn't support cheap snapshotting. Of course, this will be slow, but
given that the execution is asynchronous now, this should be OK.
Fixes: #1308
|
|
When recursively copying a directory tree, fix up the file times after having
created all contents in it, so that our changes don't end up altering any of
the directory times.
|
|
|
|
Let's make sigkill_wait() take a normal pid_t, and add sigkill_waitp() that
takes a pointer (which is useful for usage in _cleanup_), following the usual
logic we have for this.
|
|
Refuse aliases to non-aliasable units in more places
Fixes #2730.
|
|
Add nios2 architecture support. The nios2 is a softcore by Altera.
|
|
Assorted fixes #3149 + one commit tacked on top
|
|
Hashing should be quicker than allocating, hence let's first check if the
string already exists and only then allocate a new copy for it.
|
|
|
|
In 4.2 kernel headers, some netlink defines are missing that we need. missing.h
already can add them in, but currently makes this dependent on a definition
that these kernels already have. Change the check hence to check for the newest
definition in the table, so that the whole bunch of definitions as added in on
all kernels lacking this.
|
|
~ suffix works fine, but looks to much like it the file is supposed to be
automatically cleaned up. For new versions of configuration files installers
might want to using something that looks more permanent like foobar.new.
So let's add treat ".old" and ".new" as special.
Update test to match.
|
|
We previously would fail with EOPNOTSUPP when encountering an AF_UNIX socket in
the directory tree to copy. Fix that, and copy them too (even if they are dead
in the result).
Fixes: #2914
|
|
hidden_or_backup_file()
And let's add ".bak" as a generic suffix for backups, that people can use
without having to register their stuff in our list.
|
|
created in too
Fixes: #2831
|
|
On s390 size_t is an unsigned long, nor an unsigned int. They both are
of the same size and can be cast to each other safely, but the compiler
still seems unhappy about incompatible pointers.
Fixes: 7c2da2ca8
|
|
Various small cleanups in shared code
|
|
Added to kernel 4.6.
|
|
In standard linux parlance, "hidden" usually means that the file name starts
with ".", and nothing else. Rename the function to convey what the function does
better to casual readers.
Stop exposing hidden_file_allow_backup which is rather ugly and rewrite
hidden_file to extract the suffix first. Note that hidden_file_allow_backup
excluded files with "~" at the end, which is quite confusing. Let's get
rid of it before it gets used in the wrong place.
|
|
dirent_is_file_with_suffix
If the file name is supposed to end in a suffix, there's not need to check the
name against a list of "special" file names, which is slow. Instead, just check
that the name doens't start with a period.
|
|
ucf is a standard Debian helper for managing configuration file upgrades which
need more interaction or elaborate merging than conffiles managed by dpkg.
Ignore its temporary and backup files similarly to the *.dpkg-* ones to avoid
creating units for them in generators.
https://bugs.debian.org/775903
|
|
nspawn automatic user namespaces
|
|
This should allow tools like rkt to pre-mount read-only subtrees in the OS
tree, without breaking the patching code.
Note that the code will still fail, if the top-level directory is already
read-only.
|
|
|
|
With this change -U will turn on user namespacing only if the kernel actually
supports it and otherwise gracefully degrade to non-userns mode.
|
|
In nspawn we invoke copy_bytes() on a TTY fd. copy_file_range() returns EBADF
on a TTY and this error is considered fatal by copy_bytes() so far. Correct
that, so that nspawn's copy_bytes() operation works again.
This is a follow-up for a44202e98b638024c45e50ad404c7069c7835c04.
|
|
This is slightly nicer, since we actually watch the directories we opened and
enumerate. However, primarily this is preparation for adding support for
opening journal files by fd without specifying any path, to be added in a later
commit.
|
|
This moves the O_TMPFILE handling from the coredumping code into common library
code, and generalizes it as open_tmpfile_linkable() + link_tmpfile(). The
existing open_tmpfile() function (which creates an unlinked temporary file that
cannot be linked into the fs) is renamed to open_tmpfile_unlinkable(), to make
the distinction clear. Thus, code may now choose between:
a) open_tmpfile_linkable() + link_tmpfile()
b) open_tmpfile_unlinkable()
Depending on whether they want a file that may be linked back into the fs later
on or not.
In a later commit we should probably convert fopen_temporary() to make use of
open_tmpfile_linkable().
Followup for: #3065
|
|
Before we invoke now(CLOCK_BOOTTIME), let's make sure we actually have that
clock, since now() will otherwise hit an assert.
Specifically, let's refuse CLOCK_BOOTTIME early in sd-event if the kernel
doesn't actually support it.
This is a follow-up for #3037, and specifically:
https://github.com/systemd/systemd/pull/3037#issuecomment-210199167
|
|
This adds a new GetProcesses() bus call to the Unit object which returns an
array consisting of all PIDs, their process names, as well as their full cgroup
paths. This is then used by "systemctl status" to show the per-unit process
tree.
This has the benefit that the client-side no longer needs to access the
cgroupfs directly to show the process tree of a unit. Instead, it now uses this
new API, which means it also works if -H or -M are used correctly, as the
information from the specific host is used, and not the one from the local
system.
Fixes: #2945
|
|
IPv6 protocol requires a minimum MTU of 1280 bytes on the interface.
This fixes #3046.
Introduce helper link_ipv6_enabled() to figure out whether IPV6 is enabled.
Introduce network_has_static_ipv6_addresses() to find out if any static
ipv6 address configured.
If IPv6 is not configured on any interface that is SLAAC, DHCPv6 and static
IPv6 addresses not configured, then IPv6 will be automatically disabled for that
interface, that is we write "1" to /proc/sys/net/ipv6/conf//disable_ipv6.
|
|
After all it's something that we query over and over.
For example, systemctl calls colors_enabled() four times for each failing
service. The compiler is unable to optimize those calls away because they
(potentially) accesses external and global state through on_tty() and
getenv().
|