Age | Commit message (Collapse) | Author |
|
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
|
|
Also, enable TasksAccounting= for all services by default, too.
See:
http://lists.freedesktop.org/archives/systemd-devel/2015-November/035006.html
|
|
This allows initializing the TasksMax= setting of all units by default
to some fixed value, instead of leaving it at infinity as before.
|
|
Remove support for RequiresOverridable= and RequisiteOverridable=
|
|
|
|
This allows us to shorten our code a bit.
|
|
Now that we don't have RequiresOverridable= and RequisiteOverridable=
dependencies anymore, we can get rid of tracking the "override" boolean
for jobs in the job engine, as it serves no purpose anymore.
While we are at it, fix some error messages we print when invoking
functions that take the override parameter.
|
|
We can't handle errors of thisc all sanely anyway, and we never actually
return any errors from the unit type that implements the call. Hence,
let's make this void, in order to simplify things.
|
|
We cannot handle enumeration failures in a sensible way, hence let's try
hard to continue without making such failures fatal, and log about it
with precise error messages.
|
|
Don't simply continue after sleeping, it potentially puts us in a state
of spinning doing nothing slowly, if the ratelimit_test() keeps
detecting the need for limiting.
Observed in vms after the host had been suspended for a while, on resume
systemd entered a loop of making zero progress spamming the console
with:
[431942.850090] systemd[1]: Looping too fast. Throttling execution a
little.
I see no reason to have a continue here, the intention should be to
throttle execution, not circumvent it altogether.
|
|
Let's make sure we don't start blocking on sd_notify() earlier than
necessary, let's bump the socket buffer sizes to 8M.
We already do something similar for our logging socket buffers, hence
apply a similar bump here.
|
|
Let's make sure we don't even try to create the audit socket
|
|
|
|
|
|
Currently, we dispatch NOTIFY messages in a tight loop. Regardless how
much data is incoming, we always dispatch everything that is queued.
This, however, completely breaks priority event-handling of sd-event.
When dispatching one NOTIFY event, another completely different event
might fire, or might be queued by the NOTIFY handling. However, this
event will not get dispatched until all other further NOTIFY messages are
handled. Those might even arrive _after_ the other event fired, and as
such completely break priority ordering of sd-event (which several code
paths rely on).
Break this by never dispatching multiple messages. Just return after each
message that was read and let sd-event handle everything else.
(The patch looks scarier that it is. It basically just drops the for(;;)
loop and re-indents the loop-content.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
There are more than enough to deserve their own .c file, hence move them
over.
|
|
string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
|
|
This really deserves its own file, given how much code this is now.
|
|
|
|
|
|
This adds support for naming file descriptors passed using socket
activation. The names are passed in a new $LISTEN_FDNAMES= environment
variable, that matches the existign $LISTEN_FDS= one and contains a
colon-separated list of names.
This also adds support for naming fds submitted to the per-service fd
store using FDNAME= in the sd_notify() message.
This also adds a new FileDescriptorName= setting for socket unit files
to set the name for fds created by socket units.
This also adds a new call sd_listen_fds_with_names(), that is similar to
sd_listen_fds(), but also returns the names of the fds.
systemd-activate gained the new --fdname= switch to specify a name for
testing socket activation.
This is based on #1247 by Maciej Wereski.
Fixes #1247.
|
|
|
|
A variety of mostly unrelated fixes
|
|
Let's underline the header line of the table shown by cgtop, how it is
customary for tables. In order to do this, let's introduce new ANSI
underline macros, and clean up the existing ones as side effect.
|
|
|
|
mount: use libmount to monitor mountinfo & utab
|
|
Add a new config directive called NetClass= to CGroup enabled units.
Allowed values are positive numbers for fix assignments and "auto" for
picking a free value automatically, for which we need to keep track of
dynamically assigned net class IDs of units. Introduce a hash table for
this, and also record the last ID that was given out, so the allocator
can start its search for the next 'hole' from there. This could
eventually be optimized with something like an irb.
The class IDs up to 65536 are considered reserved and won't be
assigned automatically by systemd. This barrier can be made a config
directive in the future.
Values set in unit files are stored in the CGroupContext of the
unit and considered read-only. The actually assigned number (which
may have been chosen dynamically) is stored in the unit itself and
is guaranteed to remain stable as long as the unit is active.
In the CGroup controller, set the configured CGroup net class to
net_cls.classid. Multiple unit may share the same net class ID,
and those which do are linked together.
|
|
The current implementation directly monitor /proc/self/mountinfo and
/run/mount/utab files. It's really not optimal because utab file is
private libmount stuff without any official guaranteed semantic.
The libmount since v2.26 provides API to monitor mount kernel &
userspace changes and since v2.27 the monitor is usable for
non-root users too.
This patch replaces the current implementation with libmount based
solution.
Signed-off-by: Karel Zak <kzak@redhat.com>
|
|
Let's make sure that we follow the same codepaths when adjusting a
cgroup property via the dbus SetProperty() call, and when we execute the
StartupCPUShares= effect.
|
|
In all occasions when this function is called we do so anyway, so let's
move this inside, to make things easier.
|
|
There's a good chance we never needs these sets, hence allocate them
only when needed.
|
|
Patch via coccinelle.
|
|
This replaces this:
free(p);
p = NULL;
by this:
p = mfree(p);
Change generated using coccinelle. Semantic patch is added to the
sources.
|
|
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
|
|
Let's move the actual cgroup part of it into a new separate function
manager_get_unit_by_pid_cgroup(), and then make
manager_get_unit_by_pid() just a wrapper that also checks the two pid
hashmaps.
Then, let's make sure the various calls that want to deliver events to
the owners of a PID check both hashmaps and the cgroup and deliver the
event to *each* of them. OTOH make sure bus calls like GetUnitByPID()
continue to check the PID hashmaps first and the cgroup only as
fallback.
|
|
This adds a new PID_TO_PTR() macro, plus PTR_TO_PID() and makes use of
it wherever we maintain processes in a hash table. Previously we
sometimes used LONG_TO_PTR() and other times ULONG_TO_PTR() for that,
hence let's make this more explicit and clean up things.
|
|
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.
A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).
It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.
The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.
This patch also removes cg_delete() which is unused now.
On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.
This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.
This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.
The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.
To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.
This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.
When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
|
|
Instead, remember that we have already written it.
|
|
Otherwise we might attempt to remove a non-existing fd from epoll.
|
|
Currently, PID1 installs an unfiltered NameOwnerChanged signal match, and
dispatches the signals itself. This does not scale, as right now, PID1
wakes up every time a bus client connects.
To fix this, install individual matches once they are requested by
unit_watch_bus_name(), and remove the watches again through their slot in
unit_unwatch_bus_name().
If the bus is not available during unit_watch_bus_name(), just store
name in the 'watch_bus' hashmap, and let bus_setup_api() do the installing
later.
|
|
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().
This also unifies the general error paths of fflush()+rename() file
writers.
|
|
./configure --enable/disable-kdbus can be used to set the default
behavior regarding kdbus.
If no kdbus kernel support is available, dbus-dameon will be used.
With --enable-kdbus, the kernel command line option "kdbus=0" can
be used to disable kdbus.
With --disable-kdbus, the kernel command line option "kdbus=1" is
required to enable kdbus support.
|