~lukeshu/systemd - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2017-02-20	manager: run environment generators	Zbigniew Jędrzejewski-Szmek
	Environment file generators are a lot like unit file generators, but not exactly: 1. environment file generators are run for each manager instance, and their output is (or at least can be) individualized. The generators themselves are system-wide, the same for all users. 2. environment file generators are run sequentially, in priority order. Thus, the lifetime of those files is tied to lifecycle of the manager instance. Because generators are run sequentially, later generators can use or modify the output of earlier generators. Each generator is run with no arguments, and the whole state is stored in the environment variables. The generator can echo a set of variable assignments to standard output: VAR_A=something VAR_B=something else This output is parsed, and the next and subsequent generators run with those updated variables in the environment. After the last generator is done, the environment that the manager itself exports is updated. Each generator must return 0, otherwise the output is ignored. The generators in /user-env-generator are for the user session managers, including root, and the ones in /system-env-generator are for pid1.
2017-02-20	core/manager: move environment serialization out to basic/env-util.c	Zbigniew Jędrzejewski-Szmek
	This protocol is generally useful, we might just as well reuse it for the env. generators. The implementation is changed a bit: instead of making a new strv and freeing the old one, just mutate the original. This is much faster with larger arrays, while in fact atomicity is preserved, since we only either insert the new entry or not, without being in inconsistent state. v2: - fix confusion with return value
2017-02-20	core/manager: fix grammar in comment	Zbigniew Jędrzejewski-Szmek

2017-02-20	basic/exec-util: add support for synchronous (ordered) execution	Zbigniew Jędrzejewski-Szmek
	The output of processes can be gathered, and passed back to the callee. (This commit just implements the basic functionality and tests.) After the preparation in previous commits, the change in functionality is relatively simple. For coding convenience, alarm is prepared before any children are executed, and not before. This shouldn't matter usually, since just forking of the children should be pretty quick. One could also argue that this is more correct, because we will also catch the case when (for whatever reason), forking itself is slow. Three callback functions and three levels of serialization are used: - from individual generator processes to the generator forker - from the forker back to the main process - deserialization in the main process v2: - replace an structure with an indexed array of callbacks
2017-02-20	core/manager: split out creation of serialization fd out to a helper	Zbigniew Jędrzejewski-Szmek
	There is a slight change in behaviour: the user manager for root will create a temporary file in /run/systemd, not /tmp. I don't think this matters, but simplifies implementation.
2017-02-11	basic/util: move execute_directory() to separate file	Zbigniew Jędrzejewski-Szmek
	It's a fairly specialized function. Let's make new files for it and the tests.
2017-02-06	core: use a memfd for serialization	Lennart Poettering
	If we can, use a memfd for serializing state during a daemon reload or reexec. Fall back to a file in /run/systemd or /tmp only if memfds are not available. See: #5016
2017-02-06	manager: refuse reloading/reexecing when /run is overly full	Lennart Poettering
	Let's add an extra safety check: before entering a reload/reexec, let's verify that there's enough room in /run for it. Fixes: #5016
2016-12-18	core: downgrade "Time has been changed" to debug (#4906)	Zbigniew Jędrzejewski-Szmek
	That message is emitted by every systemd instance on every resume: Dec 06 08:03:38 laptop systemd[1]: Time has been changed Dec 06 08:03:38 laptop systemd[823]: Time has been changed Dec 06 08:03:38 laptop systemd[916]: Time has been changed Dec 07 08:00:32 laptop systemd[1]: Time has been changed Dec 07 08:00:32 laptop systemd[823]: Time has been changed Dec 07 08:00:32 laptop systemd[916]: Time has been changed -- Reboot -- Dec 07 08:02:46 laptop systemd[836]: Time has been changed Dec 07 08:02:46 laptop systemd[1]: Time has been changed Dec 07 08:02:46 laptop systemd[926]: Time has been changed Dec 07 19:48:12 laptop systemd[1]: Time has been changed Dec 07 19:48:12 laptop systemd[836]: Time has been changed Dec 07 19:48:12 laptop systemd[926]: Time has been changed ... Fixes #4896.
2016-12-11	pid1,catalog: use a different MESSAGE_ID for user manager startup	Zbigniew Jędrzejewski-Szmek
	This add a new message id for the end of user instance startup. User manager startup is a different beast then the system startup. Their descriptions are completely different too. Let's just separate them. Partially fixes #3351. Also remove "successful" from the description, since we don't know if the startup was successful or not.
2016-12-09	tree-wide: replace all readdir cycles with FOREACH_DIRENT{,_ALL} (#4853)	Reverend Homer

2016-11-18	Merge pull request #4538 from fbuihuu/confirm-spawn-fixes	Lennart Poettering
	Confirm spawn fixes/enhancements
2016-11-17	core: add 'c' in confirmation_spawn to resume the boot process	Franck Bui

2016-11-17	core: allow to redirect confirmation messages to a different console	Franck Bui
	It's rather hard to parse the confirmation messages (enabled with systemd.confirm_spawn=true) amongst the status messages and the kernel ones (if enabled). This patch gives the possibility to the user to redirect the confirmation message to a different virtual console, either by giving its name or its path, so those messages are separated from the other ones and easier to read.
2016-11-17	core: prevent the cylon when confirmation_spawn=yes (#2194)	Franck Bui
	When booting with systemd.confirm_spawn=true, the eye of cylon animation kicks in pretty quickly so user doesn't have any chance to answer the questions which services to start before the confirmation message is screwed by the cylon. This basically breaks the confirm_spawn functionality completely. This patch prevents the cylon animation to kick in when confirmation_spawn=yes. Fixes: #2194
2016-11-16	core: GC redundant device jobs from the run queue	Lennart Poettering
	In contrast to all other unit types device units when queued just track external state, they cannot effect state changes on their own. Hence unless a client or other job waits for them there's no reason to keep them in the job queue. This adds a concept of GC'ing jobs of this type as soon as no client or other job waits for them anymore. To ensure this works correctly we need to track which clients actually reference a job (i.e. which ones enqueued it). Unfortunately that's pretty nasty to do for direct connections, as sd_bus_track doesn't work for them. For now, work around this, by simply remembering in a boolean that a job was requested by a direct connection, and reset it when we notice the direct connection is gone. This means the GC logic works fine, except that jobs are not immediately removed when direct connections disconnect. In the longer term, a rework of the bus logic should fix this properly. For now this should be good enough, as GC works for fine all cases except this one, and thus is a clear improvement over the previous behaviour. Fixes: #1921
2016-11-16	core: drop n_in_gc_queue field of Manager structure	Lennart Poettering
	We count the units in the GC queue with this, but actually never make use of it, hence drop it.
2016-11-03	Merge pull request #4510 from keszybz/tree-wide-cleanups	Lennart Poettering
	Tree wide cleanups
2016-10-23	tree-wide: drop NULL sentinel from strjoin	Zbigniew Jędrzejewski-Szmek
	This makes strjoin and strjoina more similar and avoids the useless final argument. spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/.c) git grep -e '\bstrjoin\b.NULL' -l\|xargs sed -i -r 's/strjoin$(.*), NULL$/strjoin(\1)/' This might have missed a few cases (spatch has a really hard time dealing with _cleanup_ macros), but that's no big issue, they can always be fixed later.
2016-10-22	tree-wide: use startswith return value to avoid hardcoded offset	Zbigniew Jędrzejewski-Szmek
	I think it's an antipattern to have to count the number of bytes in the prefix by hand. We should do this automatically to avoid wasting programmer time, and possible errors. I didn't any offsets that were wrong, so this change is mostly to make future development easier.
2016-10-21	core: use emergency_action for ctr+alt+del burst	Lukas Nykryn
	Fixes #4306
2016-10-19	pid1: downgrade some rlimit warnings	Zbigniew Jędrzejewski-Szmek
	Since we ignore the result anyway, downgrade errors to warning. log_oom() will still emit an error, but that's mostly theoretical, so it is not worth complicating the code to avoid the small inconsistency
2016-10-16	tree-wide: use mfree more	Zbigniew Jędrzejewski-Szmek

2016-10-07	core: add "invocation ID" concept to service manager	Lennart Poettering
	This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is generated each time a unit moves from and inactive to an activating or active state. The primary usecase for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended. The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system. The invocation ID is passed to the activated processes as environment variable. It is additionally stored as extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each log logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence is the better choice for the journal. Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is. This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot(). PID1's own logging is updated to always include the invocation ID when it logs information about a unit. A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID. The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycleof it. Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation race-freely from the messages.
2016-10-07	core: only warn on short reads on signal fd	Zbigniew Jędrzejewski-Szmek

2016-10-07	manager: tighten incoming notification message checks	Lennart Poettering
	Let's not accept datagrams with embedded NUL bytes. Previously we'd simply ignore everything after the first NUL byte. But given that sending us that is pretty ugly let's instead complain and refuse. With this change we'll only accept messages that have exactly zero or one NUL bytes at the very end of the datagram.
2016-10-07	manager: be stricter with incomining notifications, warn properly about too ↵	Lennart Poettering
	large ones Let's make the kernel let us know the full, original datagram size of the incoming message. If it's larger than the buffer space provided by us, drop the whole message with a warning. Before this change the kernel would truncate the message for us to the buffer space provided, and we'd not complain about this, and simply process the incomplete message as far as it made sense.
2016-10-07	manager: don't ever busy loop when we get a notification message we can't ↵	Lennart Poettering
	process If the kernel doesn't permit us to dequeue/process an incoming notification datagram message it's still better to stop processing the notification messages altogether than to enter a busy loop where we keep getting notified but can't do a thing about it. With this change, manager_dispatch_notify_fd() behaviour is changed like this: - if an error indicating a spurious wake-up is seen on recvmsg(), ignore it (EAGAIN/EINTR) - if any other error is seen on recvmsg() propagate it, thus disabling processing of further wakeups - if any error is seen on later code in the function, warn about it but do not propagate it, as in this cas we're not going to busy loop as the offending message is already dequeued.
2016-10-06	core: add possibility to set action for ctrl-alt-del burst (#4105)	Lukáš Nykrýn
	For some certification, it should not be possible to reboot the machine through ctrl-alt-delete. Currently we suggest our customers to mask the ctrl-alt-delete target, but that is obviously not enough. Patching the keymaps to disable that is really not a way to go for them, because the settings need to be easily checked by some SCAP tools.
2016-10-01	core: do not try to create /run/systemd/transient in test mode	Zbigniew Jędrzejewski-Szmek
	This prevented systemd-analyze from unprivileged operation on older systemd installations, which should be possible. Also, we shouldn't touch the file system in test mode even if we can.
2016-10-01	core: update warning message	Zbigniew Jędrzejewski-Szmek
	"closing all" might suggest that _all_ fds received with the notification message will be closed. Reword the message to clarify that only the "unused" ones will be closed.
2016-10-01	core: get rid of unneeded state variable	Zbigniew Jędrzejewski-Szmek
	No functional change.
2016-09-29	pid1: more informative error message for ignored notifications	Zbigniew Jędrzejewski-Szmek
	It's probably easier to diagnose a bad notification message if the contents are printed. But still, do anything only if debugging is on.
2016-09-29	pid1: process zero-length notification messages again	Zbigniew Jędrzejewski-Szmek
	This undoes 531ac2b234. I acked that patch without looking at the code carefully enough. There are two problems: - we want to process the fds anyway - in principle empty notification messages are valid, and we should process them as usual, including logging using log_unit_debug().
2016-09-29	pid1: don't return any error in manager_dispatch_notify_fd() (#4240)	Franck Bui
	If manager_dispatch_notify_fd() fails and returns an error then the handling of service notifications will be disabled entirely leading to a compromised system. For example pid1 won't be able to receive the WATCHDOG messages anymore and will kill all services supposed to send such messages.
2016-09-29	If the notification message length is 0, ignore the message (#4237)	Jorge Niedbalski
	Fixes #4234. Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
2016-09-09	pid1: drop kdbus_fd and all associated logic	Zbigniew Jędrzejewski-Szmek

2016-08-22	core: add Ref()/Unref() bus calls for units	Lennart Poettering
	This adds two (privileged) bus calls Ref() and Unref() to the Unit interface. The two calls may be used by clients to pin a unit into memory, so that various runtime properties aren't flushed out by the automatic GC. This is necessary to permit clients to race-freely acquire runtime results (such as process exit status/code or accumulated CPU time) on successful service termination. Ref() and Unref() are fully recursive, hence act like the usual reference counting concept in C. Taking a reference is a privileged operation, as this allows pinning units into memory which consumes resources. Transient units may also gain a reference at the time of creation, via the new AddRef property (that is only defined for transient units at the time of creation).
2016-08-19	Merge pull request #3965 from htejun/systemd-controller-on-unified	Zbigniew Jędrzejewski-Szmek

2016-08-19	core: add RemoveIPC= setting	Lennart Poettering
	This adds the boolean RemoveIPC= setting to service, socket, mount and swap units (i.e. all unit types that may invoke processes). if turned on, and the unit's user/group is not root, all IPC objects of the user/group are removed when the service is shut down. The life-cycle of the IPC objects is hence bound to the unit life-cycle. This is particularly relevant for units with dynamic users, as it is essential that no objects owned by the dynamic users survive the service exiting. In fact, this patch adds code to imply RemoveIPC= if DynamicUser= is set. In order to communicate the UID/GID of an executed process back to PID 1 this adds a new "user lookup" socket pair, that is inherited into the forked processes, and closed before the exec(). This is needed since we cannot do NSS from PID 1 due to deadlock risks, However need to know the used UID/GID in order to clean up IPC owned by it if the unit shuts down.
2016-08-17	core: use the unified hierarchy for the systemd cgroup controller hierarchy	Tejun Heo
	Currently, systemd uses either the legacy hierarchies or the unified hierarchy. When the legacy hierarchies are used, systemd uses a named legacy hierarchy mounted on /sys/fs/cgroup/systemd without any kernel controllers for process management. Due to the shortcomings in the legacy hierarchy, this involves a lot of workarounds and complexities. Because the unified hierarchy can be mounted and used in parallel to legacy hierarchies, there's no reason for systemd to use a legacy hierarchy for management even if the kernel resource controllers need to be mounted on legacy hierarchies. It can simply mount the unified hierarchy under /sys/fs/cgroup/systemd and use it without affecting other legacy hierarchies. This disables a significant amount of fragile workaround logics and would allow using features which depend on the unified hierarchy membership such bpf cgroup v2 membership test. In time, this would also allow deleting the said complexities. This patch updates systemd so that it prefers the unified hierarchy for the systemd cgroup controller hierarchy when legacy hierarchies are used for kernel resource controllers. * cg_unified(@controller) is introduced which tests whether the specific controller in on unified hierarchy and used to choose the unified hierarchy code path for process and service management when available. Kernel controller specific operations remain gated by cg_all_unified(). * "systemd.legacy_systemd_cgroup_controller" kernel argument can be used to force the use of legacy hierarchy for systemd cgroup controller. * nspawn: By default nspawn uses the same hierarchies as the host. If UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all. If 0, legacy for all. * nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of three options - legacy, only systemd controller on unified, and unified. The value is passed into mount setup functions and controls cgroup configuration. * nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount option is moved to mount_legacy_cgroup_hierarchy() so that it can take an appropriate action depending on the configuration of the host. v2: - CGroupUnified enum replaces open coded integer values to indicate the cgroup operation mode. - Various style updates. v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2. v4: Restored legacy container on unified host support and fixed another bug in detect_unified_cgroup_hierarchy().
2016-08-15	core: rename cg_unified() to cg_all_unified()	Tejun Heo
	A following patch will update cgroup handling so that the systemd controller (/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel resource controllers are on the legacy hierarchies. This would require distinguishing whether all controllers are on cgroup v2 or only the systemd controller is. In preparation, this patch renames cg_unified() to cg_all_unified(). This patch doesn't cause any functional changes.
2016-08-03	core: drop spurious newline	Lennart Poettering

2016-07-25	Merge pull request #3728 from poettering/dynamic-users	Zbigniew Jędrzejewski-Szmek

2016-07-22	core: add a concept of "dynamic" user ids, that are allocated as long as a ↵	Lennart Poettering
	service is running This adds a new boolean setting DynamicUser= to service files. If set, a new user will be allocated dynamically when the unit is started, and released when it is stopped. The user ID is allocated from the range 61184..65519. The user will not be added to /etc/passwd (but an NSS module to be added later should make it show up in getent passwd). For now, care should be taken that the service writes no files to disk, since this might result in files owned by UIDs that might get assigned dynamically to a different service later on. Later patches will tighten sandboxing in order to ensure that this cannot happen, except for a few selected directories. A simple way to test this is: systemd-run -p DynamicUser=1 /bin/sleep 99999
2016-07-22	core: change TasksMax= default for system services to 15%	Lennart Poettering
	As it turns out 512 is max number of tasks per service is hit by too many applications, hence let's bump it a bit, and make it relative to the system's maximum number of PIDs. With this change the new default is 15%. At the kernel's default pids_max value of 32768 this translates to 4915. At machined's default TasksMax= setting of 16384 this translates to 2457. Why 15%? Because it sounds like a round number and is close enough to 4096 which I was going for, i.e. an eight-fold increase over the old 512 Summary: \| on the host \| in a container old default \| 512 \| 512 new default \| 4915 \| 2457
2016-07-21	core: remove duplicate includes (#3771)	Thomas H. P. Andersen

2016-07-16	manager: don't skip sigchld handler for main and control pid for services ↵	Lukáš Nykrýn
	(#3738) During stop when service has one "regular" pid one main pid and one control pid and the sighld for the regular one is processed first the unit_tidy_watch_pids will skip the main and control pid and does not remove them from u->pids(). But then we skip the sigchld event because we already did one in the iteration and there are two pids in u->pids. v2: Use general unit_main_pid() and unit_control_pid() instead of reaching directly to service structure.
2016-07-01	manager: Fixing a debug printf formatting mistake (#3640)	Kyle Walker
	A 'llu' formatting statement was used in a debugging printf statement instead of a 'PRIu64'. Correcting that mistake here.
2016-06-30	manager: Only invoke a single sigchld per unit within a cleanup cycle	Kyle Walker
	By default, each iteration of manager_dispatch_sigchld() results in a unit level sigchld event being invoked. For scope units, this results in a scope_sigchld_event() which can seemingly stall for workloads that have a large number of PIDs within the scope. The stall exhibits itself as a SIG_0 being initiated for each u->pids entry as a result of pid_is_unwaited(). v2: This patch resolves this condition by only paying to cost of a sigchld in the underlying scope unit once per sigchld iteration. A new "sigchldgen" member resides within the Unit struct. The Manager is incremented via the sd event loop, accessed via sd_event_get_iteration, and the Unit member is set to the same value as the manager each time that a sigchld event is invoked. If the Manager iteration value and Unit member match, the sigchld event is not invoked for that iteration.