Age | Commit message (Collapse) | Author |
|
|
|
Also add a bit of debugging output to help diagnose problems,
add missing units, and simplify cppflags.
Move test-engine to normal tests from manual tests, it should now
work without destroying the system.
|
|
Only accept cpu quota values in percentages, get rid of period
definition.
It's not clear whether the CFS period controllable per-cgroup even has a
future in the kernel, hence let's simplify all this, hardcode the period
to 100ms and only accept percentage based quota values.
|
|
This is the behaviour the kernel cgroup rework exposes for all
controllers, hence let's do this already now for all cases.
|
|
Introduce a (unsigned long) -1 as "unset" state for cpu shares/block io
weights, and keep the startup unit set around all the time.
|
|
Similar to CPUShares= and BlockIOWeight= respectively. However only
assign the specified weight during startup. Each control group
attribute is re-assigned as weight by CPUShares=weight and
BlockIOWeight=weight after startup. If not CPUShares= or
BlockIOWeight= be specified, then the attribute is re-assigned to each
default attribute value. (default cpu.shares=1024, blkio.weight=1000)
If only CPUShares=weight or BlockIOWeight=weight be specified, then
that implies StartupCPUShares=weight and StartupBlockIOWeight=weight.
|
|
|
|
We should no longer pretend that we can run in any sensible way
without the kernel supporting us with cgroups functionality.
|
|
|
|
if PrivateDevices=yes is used we need to make sure we can still
create /dev/null and so on.
|
|
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:
fd = safe_close(fd);
Which will close an fd if it is open, and reset the fd variable
correctly.
By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
|
|
|
|
hence don't bother
|
|
particular devices nodes
|
|
|
|
Resolve spotted issues related to missing or extraneous commas, dashes.
|
|
enabled when enabling/disabling cgroup controllers for units
|
|
Previously a cgroup setting down tree would result in cgroup membership
additions being propagated up the tree and to the siblings, however a
unit could never lose cgroup memberships again. With this change we'll
make sure that both cgroup additions and removals propagate properly.
|
|
|
|
If the unit already was in the hashmap, path would be leaked.
|
|
|
|
This way cleaning up the cgroup tree on shutdown is a lot easier since
we are in the root dir. Also PID 1 was previously artificially placed in
system.slice, even though our rule actually was not to have processes in
slices. The root slice otoh is magic anyway, so having PID 1 in there
sounds less surprising.
Of course, this means that PID is scheduled against the three top-level
slices.
|
|
each invocation
We can determine the list entry type via the typeof() gcc construct, and
so we should to make the macros much shorter to use.
|
|
controllers
Previously we did operations like attach, trim or migrate only on the
controllers that were enabled for a specific unit. With this changes we
will now do them for all supproted controllers, and fall back to all
possible prefix paths if the specified paths do not exist.
This fixes issues if a controller is being disabled for a unit where it
was previously enabled, and makes sure that all processes stay as "far
down" the tree as groups exist.
|
|
hierarchy
The non-hierarchial mode contradicts the whole idea of a cgroup tree so
let's not support this. In the future the kernel will only support the
hierarchial logic anyway.
|
|
The cgroup attribute memory.soft_limit_in_bytes is unlikely to stay
around in the kernel for good, so let's not expose it for now. We can
readd something like it later when the kernel guys decided on a final
API for this.
|
|
|
|
|
|
If the memory_limit of unit is -1, we should write "-1"
to the file memory.limit_in_bytes. not the (unit64_t) -1.
otherwise the memory.limit_in_bytes will be set to zero.
|
|
it should be memory.soft_limit_in_bytes.
|
|
set the value of variable "r" to the return value
of cg_set_attribute.
|
|
This prevents corruption of the hashmap, because we would free() the
keys in the hashmap, if the unit is already in there, with the same
cgroup path.
|
|
This reverts commit 1f11a0cdfe397cc404d61ee679fc12f58c0a885b.
|
|
do not recurse further, if unit_realize_cgroup_now() failed
|
|
This way we can nicely map the configuration directive to properties and
back, without requiring two different signatures for the same property.
|
|
The root slice is after all the root cgroup, so don't attempt to delete
it.
|
|
|
|
Some units set KillMode=none to survive the initrd→rootfs transition. We
cannot remove their cgroups, but that shouldn't really be considered an
issue, so let's downgrade the error message.
|
|
|
|
|
|
|
|
Replace the very generic cgroup hookup with a much simpler one. With
this change only the high-level cgroup settings remain, the ability to
set arbitrary cgroup attributes is removed, so is support for adding
units to arbitrary cgroup controllers or setting arbitrary paths for
them (especially paths that are different for the various controllers).
This also introduces a new -.slice root slice, that is the parent of
system.slice and friends. This enables easy admin configuration of
root-level cgrouo properties.
This replaces DeviceDeny= by DevicePolicy=, and implicitly adds in
/dev/null, /dev/zero and friends if DeviceAllow= is used (unless this is
turned off by DevicePolicy=).
|
|
- This changes all logind cgroup objects to use slice objects rather
than fixed croup locations.
- logind can now collect minimal information about running
VMs/containers. As fixed cgroup locations can no longer be used we
need an entity that keeps track of machine cgroups in whatever slice
they might be located. Since logind already keeps track of users,
sessions and seats this is a trivial addition.
- nspawn will now register with logind and pass various bits of metadata
along. A new option "--slice=" has been added to place the container
in a specific slice.
- loginctl gained commands to list, introspect and terminate machines.
- user.slice and machine.slice will now be pulled in by logind.service,
since only logind.service requires this slice.
|
|
In order to prepare for the kernel cgroup rework, let's introduce a new
unit type to systemd, the "slice". Slices can be arranged in a tree and
are useful to partition resources freely and hierarchally by the user.
Each service unit can now be assigned to one of these slices, and later
on login users and machines may too.
Slices translate pretty directly to the cgroup hierarchy, and the
various objects can be assigned to any of the slices in the tree.
|
|
containers there
Containers will now carry a label (normally derived from the root
directory name, but configurable by the user), and the container's root
cgroup is /machine/<label>. This label is called "machine name", and can
cover both containers and VMs (as soon as libvirt also makes use of
/machine/).
libsystemd-login can be used to query the machine name from a process.
This patch also includes numerous clean-ups for the cgroup code.
|
|
This allows clients to put inotify watches on these trees to watch for
state changes, without having to wait until these dirs are created.
This introduces the new top-level /machine cgroup dir as canonical
location where OS containers and VMs shall be located (as discussed with
the libvirt folks).
|
|
cgroup directories in sync
|
|
|
|
during runtime
|
|
Note: I did s/MANAGER/SYSTEMD/ everywhere, even though it makes the
patch quite verbose. Nevertheless, keeping MANAGER prefix in some
places, and SYSTEMD prefix in others would just lead to confusion down
the road. Better to rip off the band-aid now.
|