path: root/src
Age Commit message Author
2016-10-07core: add "invocation ID" concept to service managerLennart Poettering
This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128-bit ID is generated each time a unit moves from an inactive to an activating or active state.
The primary use case for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended.
The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system.
The invocation ID is passed to the activated processes as an environment variable. It is additionally stored as an extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence the latter is the better choice for the journal.
Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is.
This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot().
PID1's own logging is updated to always include the invocation ID when it logs information about a unit.
A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID. The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycle of it.
Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation ID race-freely from the messages.
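As a rough illustration of the environment-variable side of this, a service could parse its own invocation ID without linking against libsystemd. The sketch below assumes the variable is named INVOCATION_ID and holds 32 hexadecimal characters (as described in the systemd.exec documentation, not in this commit message); parse_id128() and invocation_id_from_env() are hypothetical helper names, and real code would more likely call sd_id128_get_invocation().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not one. */
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse a 32-character hex string into 16 bytes.
 * Returns 0 on success, -1 if the string is missing or malformed. */
int parse_id128(const char *s, uint8_t out[16]) {
    if (!s || strlen(s) != 32)
        return -1;
    for (size_t i = 0; i < 16; i++) {
        int hi = hex_value(s[2 * i]);
        int lo = hex_value(s[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (uint8_t) ((hi << 4) | lo);
    }
    return 0;
}

/* Assumption: the service manager exports the ID as $INVOCATION_ID.
 * Since a service may alter its own environment, treat the result as a hint. */
int invocation_id_from_env(uint8_t out[16]) {
    return parse_id128(getenv("INVOCATION_ID"), out);
}
```

Because the variable can be changed by the service itself, this is only suitable for the service's own use; journald relies on the extended attribute instead, as explained above.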
2016-10-07util: use SPECIAL_ROOT_SLICE macro where appropriateLennart Poettering
2016-10-07log: minor fixesLennart Poettering
Most important is a fix to negate the error number if necessary, before we first access it.
2016-10-07journal: fix format string used for usec_tLennart Poettering
2016-10-07journal: complete slice info in journal metadataLennart Poettering
We are already attaching the system slice information to log messages, now add the user slice info too, as well as the object slice info.
2016-10-07bus-util: generalize helper for ID128 propertiesLennart Poettering
This way, we can make use of this in other code, too.
2016-10-07strv: fix STRV_FOREACH_BACKWARDS() to be a single statement onlyLennart Poettering
Let's make sure people invoking STRV_FOREACH_BACKWARDS() as a single statement of an if statement don't fall into a trap, and find the tail for the list via strv_length().
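The trap mentioned here can be sketched as follows. This is not the systemd macro itself: FOREACH_BACKWARDS_SKETCH, strv_count() and join_backwards() are hypothetical names, and the sketch only shows one way to make a backwards iterator expand to exactly one for statement, so that it can serve as the unbraced body of an if with a following else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the entries of a NULL-terminated string vector; NULL counts as empty. */
size_t strv_count(char **l) {
    size_t n = 0;
    if (l)
        while (l[n])
            n++;
    return n;
}

/* Hypothetical macro: expands to a single for statement. The tail is found
 * via the entry count, and the index is declared in the for-init clause, so
 * no extra statement precedes the loop. */
#define FOREACH_BACKWARDS_SKETCH(s, l) \
    for (size_t _n = strv_count(l); _n > 0 && ((s) = (l) + _n - 1, 1); _n--)

/* Concatenate the entries of l into buf in reverse order, or write a marker
 * when disabled. The macro is used as the sole statement of the if branch. */
void join_backwards(char **l, int enabled, char *buf, size_t size) {
    char **s;

    assert(size > 0);
    buf[0] = '\0';

    if (enabled)
        FOREACH_BACKWARDS_SKETCH(s, l)
            strncat(buf, *s, size - strlen(buf) - 1);
    else
        strncat(buf, "(skipped)", size - strlen(buf) - 1);
}
```

Had the macro expanded to two statements (say, an assignment followed by a loop), only the first would belong to the if, and the else would no longer compile or, worse, would bind to something unintended.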
2016-10-07Merge pull request #4304 from poettering/notify-nul-checkLennart Poettering
3 minor improvements for notification message handling
2016-10-07core: only warn on short reads on signal fdZbigniew Jędrzejewski-Szmek
2016-10-07networkd: remote checksum offload for vxlan (#4110)Susant Sahani
This patch adds support for remote checksum offload to VXLAN. It adds the RemoteCheckSumTx and RemoteCheckSumRx vxlan configuration options to enable remote checksum offload for transmit and receive on the VXLAN tunnel.
2016-10-07architecture: Add support for the RISC-V architecture. (#4305)rwmjones
RISC-V is an open source ISA in development since 2010 at UCB. For more information, see https://riscv.org/ I am adding RISC-V support to Fedora: https://fedoraproject.org/wiki/Architectures/RISC-V There are three major variants of the architecture (32-, 64- and 128-bit). The 128-bit variant is a paper exercise, but the other two really exist in silicon. RISC-V is always little endian. On Linux, the default kernel uname(2) can return "riscv" for all variants. However a patch was added recently which makes the kernel return one of "riscv32" or "riscv64" (or in future "riscv128"). So systemd should be prepared to handle any of "riscv", "riscv32" or "riscv64" (in future, "riscv128" but that is not included in the current patch). If the kernel returns "riscv" then you need to use the pointer size in order to know the real variant. The Fedora/RISC-V kernel only ever returns "riscv64" since we're only doing Fedora for 64 bit at the moment, and we've patched the kernel so it doesn't return "riscv". As well as the major bitsize variants, there are also architecture extensions. However I'm trying to ensure that uname(2) does *not* return any other information about those in utsname.machine, so that we don't end up with "riscv64abcde" nonsense. Instead those extensions will be exposed in /proc/cpuinfo similar to how flags work in x86.
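The mapping described above, from the uname(2) machine string to the concrete variant, might look roughly like this. riscv_variant() is a hypothetical function name used for illustration only, and the sketch ignores "riscv128", which the message says is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map a utsname.machine string to a RISC-V variant name.
 * "riscv32" and "riscv64" are taken at face value; a bare "riscv" is
 * disambiguated by the pointer size of the running code.
 * Returns NULL if the string is not a recognized RISC-V machine name. */
const char *riscv_variant(const char *machine, size_t pointer_size) {
    if (strcmp(machine, "riscv32") == 0)
        return "riscv32";
    if (strcmp(machine, "riscv64") == 0)
        return "riscv64";
    if (strcmp(machine, "riscv") == 0) {
        if (pointer_size == 4)
            return "riscv32";
        if (pointer_size == 8)
            return "riscv64";
    }
    return NULL;
}
```

A caller would typically pass the machine field filled in by uname(2) together with sizeof(void *).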
2016-10-07manager: tighten incoming notification message checksLennart Poettering
Let's not accept datagrams with embedded NUL bytes. Previously we'd simply ignore everything after the first NUL byte. But given that sending us that is pretty ugly let's instead complain and refuse. With this change we'll only accept messages that have exactly zero or one NUL bytes at the very end of the datagram.
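The acceptance rule stated here (no NUL byte at all, or a single one as the very last byte) can be sketched as a small predicate. notify_payload_acceptable() is a hypothetical name; the real check in the manager may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the n-byte datagram contains no NUL byte, or exactly one
 * NUL byte that is the final byte of the datagram. */
bool notify_payload_acceptable(const char *buf, size_t n) {
    const char *nul;

    if (n == 0)
        return true; /* empty datagram: zero NUL bytes */

    /* Locate the first NUL; it is acceptable only in the last position,
     * which also guarantees that there is no second one. */
    nul = memchr(buf, 0, n);
    return !nul || nul == buf + n - 1;
}
```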
2016-10-07manager: be stricter with incoming notifications, warn properly about too large onesLennart Poettering
Let's make the kernel let us know the full, original datagram size of the incoming message. If it's larger than the buffer space provided by us, drop the whole message with a warning. Before this change the kernel would truncate the message for us to the buffer space provided, and we'd not complain about this, and simply process the incomplete message as far as it made sense.
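On Linux, one way to learn the original datagram size is to pass MSG_TRUNC as an input flag to recvmsg(), which makes the call return the real length even when the buffer was too small. The sketch below assumes that mechanism; recv_whole_datagram() is a hypothetical helper, and returning -EMSGSIZE for oversized messages is a choice made for this example rather than what the manager necessarily does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Receive one datagram into buf.
 * Returns the datagram length on success, -EMSGSIZE if the datagram did not
 * fit (it is dequeued and dropped), or another negative errno on failure. */
ssize_t recv_whole_datagram(int fd, void *buf, size_t size) {
    struct iovec iov = { .iov_base = buf, .iov_len = size };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n;

    /* MSG_TRUNC as input flag: report the full datagram size (Linux). */
    n = recvmsg(fd, &msg, MSG_TRUNC);
    if (n < 0)
        return -errno;

    /* Either signal means the kernel cut the message short for us. */
    if ((size_t) n > size || (msg.msg_flags & MSG_TRUNC))
        return -EMSGSIZE;

    return n;
}
```

A caller would log a warning on -EMSGSIZE and move on to the next message instead of parsing a partial payload.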
2016-10-07manager: don't ever busy loop when we get a notification message we can't processLennart Poettering
If the kernel doesn't permit us to dequeue/process an incoming notification datagram message it's still better to stop processing the notification messages altogether than to enter a busy loop where we keep getting notified but can't do a thing about it. With this change, manager_dispatch_notify_fd() behaviour is changed like this:
- if an error indicating a spurious wake-up is seen on recvmsg(), ignore it (EAGAIN/EINTR)
- if any other error is seen on recvmsg(), propagate it, thus disabling processing of further wakeups
- if any error is seen on later code in the function, warn about it but do not propagate it, as in this case we're not going to busy loop as the offending message is already dequeued.
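The first two rules of this policy can be expressed as a small decision function. This is only a sketch: notify_recv_disposition() and the NOTIFY_* constants are hypothetical names, and the third rule (warn but do not propagate errors after the message is dequeued) lives in the caller.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

enum {
    NOTIFY_SKIP = 0,    /* spurious wake-up, nothing was dequeued */
    NOTIFY_PROCESS = 1  /* a datagram was dequeued, go on and parse it */
};

/* Decide how to proceed after recvmsg() returned n, with errno saved in
 * saved_errno. A negative return value is an errno-style error that the
 * event handler should propagate, which disables further wake-ups and so
 * avoids a busy loop on a message that can never be dequeued. */
int notify_recv_disposition(ssize_t n, int saved_errno) {
    if (n >= 0)
        return NOTIFY_PROCESS;
    if (saved_errno == EAGAIN || saved_errno == EINTR)
        return NOTIFY_SKIP;
    return -saved_errno;
}
```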
2016-10-06core: add possibility to set action for ctrl-alt-del burst (#4105)Lukáš Nykrýn
For some certification, it should not be possible to reboot the machine through ctrl-alt-delete. Currently we suggest our customers to mask the ctrl-alt-delete target, but that is obviously not enough. Patching the keymaps to disable that is really not a way to go for them, because the settings need to be easily checked by some SCAP tools.
2016-10-06user-util: rework maybe_setgroups() a bitLennart Poettering
Let's drop the caching of the setgroups /proc field for now. While there's a strict regime in place when it changes states, let's better not cache it since we cannot really be sure we follow that regime correctly. More importantly however, this is not in performance sensitive code, and there's no indication the cache is really beneficial, hence let's drop the caching and make things a bit simpler. Also, while we are at it, rework the error handling a bit, and always return negative errno-style error codes, following our usual coding style. This has the benefit that we can sensibly handle read_one_line_file() errors, without having to update errno explicitly.
2016-10-06tree-wide: drop some misleading compiler warningsLennart Poettering
gcc at some optimization levels thinks these variables were used without initialization. It's wrong, but let's make the message go away anyway.
2016-10-06core: leave PAM stub process around with GIDs updatedLennart Poettering
In the process execution code of PID 1, before 096424d1230e0a0339735c51b43949809e972430 the GID settings were changed before invoking PAM, and the UID settings after. After the change both changes are made after the PAM session hooks are run. When invoking PAM we fork once, and leave a stub process around which will invoke the PAM session end hooks when the session goes away. This code previously was dropping the remaining privs (which were precisely the UID). Fix this code to do this correctly again, by really dropping all of them (i.e. the GID as well). While we are at it, also fix error logging of this code. Fixes: #4238
2016-10-06sd-bus: add DNS errors to the errno translation tableLennart Poettering
We generate these, hence we should also add errno translations for them.
2016-10-06resolved: properly handle BADCOOKIE DNS errorLennart Poettering
Add this new error code (documented in RFC7873) to our list of known errors.
2016-10-06sd-bus: add a few missing entries to the error translation tablesLennart Poettering
These were forgotten, let's add some useful mappings for all errors we define.
2016-10-06sd-device/networkd: unify code to get a socket for issuing netdev ioctls onLennart Poettering
As suggested here: https://github.com/systemd/systemd/pull/4296#issuecomment-251911349 Let's try AF_INET first as socket, but let's fall back to AF_NETLINK, so that we can use a protocol-independent socket here if possible. This has the benefit that our code will still work even if AF_INET/AF_INET6 is made unavailable (for example via seccomp), at least on current kernels.
2016-10-06Merge pull request #4280 from giuseppe/unprivileged-userLennart Poettering
[RFC] run systemd in an unprivileged container
2016-10-06Merge pull request #4199 from dvdhrm/hwdb-orderLennart Poettering
hwdb: return conflicts in a well-defined order
2016-10-06core: do not fail in a container if we can't use setgroupsGiuseppe Scrivano
It might be blocked through /proc/PID/setgroups
2016-10-06audit: disable if cannot create NETLINK_AUDIT socketGiuseppe Scrivano
2016-10-06networkd: fix coding style (#4294)Susant Sahani
2016-10-06journald, ratelimit: fix inaccurate message suppression in journal_rate_limit_test() (#4291)Yuki Inoguchi
Currently, the ratelimit does not handle the number of suppressed messages accurately. Even when the number of messages reaches the limit, it still allows one extra message to be added to the journal. This patch fixes the problem.
2016-10-05Fix typoGiuseppe Scrivano
2016-10-05networkd: use BridgeFDB as well on bridge ports (#4253)Tobias Jungel
[BridgeFDB] did not apply to bridge ports so far. This patch adds the proper handling. In case of a bridge interface the correct flag NTF_MASTER is now set in the netlink call. FDB MAC addresses are now applied in link_enter_set_addresses to make sure the link is set up.
2016-10-05seccomp: add support for the s390 architecture (#4287)hbrueckner
Add seccomp support for the s390 architecture (31-bit and 64-bit) to systemd. This requires libseccomp >= 2.3.1.
2016-10-05nspawn: add log message to let users know that nspawn needs an empty /dev directory (#4226)Djalal Harouni
Fixes https://github.com/systemd/systemd/issues/3695 At the same time it adds a protection against userns chown of inodes of a shared mount point.
2016-10-04tree-wide: remove consecutive duplicate words in commentsStefan Schweter
2016-10-04list: LIST_INSERT_BEFORE: update head if necessary (#4261)Michael Olbrich
If the new item is inserted before the first item in the list, then the head must be updated as well. Add a test to the list unit test to check for this.
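The head update described here can be illustrated with a plain doubly linked list. This is a minimal sketch, not the LIST_INSERT_BEFORE macro itself: struct item and list_insert_before() are hypothetical, and the function assumes that a is already a member of the list.

```c
#include <assert.h>
#include <stddef.h>

struct item {
    int value;
    struct item *prev, *next;
};

/* Insert b immediately before a, where a is already in the list *head.
 * If a was the first item, the head pointer must move to b as well;
 * forgetting that branch is the kind of bug the commit describes. */
void list_insert_before(struct item **head, struct item *a, struct item *b) {
    assert(head && a && b);

    b->next = a;
    b->prev = a->prev;

    if (a->prev)
        a->prev->next = b;
    else
        *head = b; /* a was the first item: b becomes the new head */

    a->prev = b;
}
```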
2016-10-04automount: make sure the expire event is restarted after a daemon-reload (#4265)Michael Olbrich
If the corresponding mount unit is deserialized after the automount unit then the expire event is set up in automount_trigger_notify(). However, if the mount unit is deserialized first then the automount unit is still in state AUTOMOUNT_DEAD and automount_trigger_notify() aborts without setting up the expire event. Explicitly call automount_start_expire() during coldplug to make sure that the expire event is set up as necessary. Fixes #4249.
2016-10-03nspawn: set shared propagation mode for the containerAlban Crequy
2016-10-01core: do not try to create /run/systemd/transient in test modeZbigniew Jędrzejewski-Szmek
This prevented systemd-analyze from unprivileged operation on older systemd installations, which should be possible. Also, we shouldn't touch the file system in test mode even if we can.
2016-10-01analyze-verify: honour $SYSTEMD_UNIT_PATH, allow system paths to be ignoredZbigniew Jędrzejewski-Szmek
SYSTEMD_UNIT_PATH=foobar: systemd-analyze verify barbar/unit.service
will load units from barbar/, foobar/, /etc/systemd/system/, etc.
SYSTEMD_UNIT_PATH= systemd-analyze verify barbar/unit.service
will load units only from barbar/, which is useful e.g. when testing systemd's own units on a system with an older version of systemd installed.
2016-10-01core: complain if Before= dep on .device is declaredZbigniew Jędrzejewski-Szmek
[Unit]
Before=foobar.device
[Service]
ExecStart=/bin/true
Type=oneshot
$ systemd-analyze verify before-device.service
before-device.service: Dependency Before=foobar.device ignored (.device units cannot be delayed)
2016-10-01systemctl: Add --wait option to wait until started units terminate againMartin Pitt
Fixes #3830
2016-10-01nss-resolve: return NOTFOUND instead of UNAVAIL on resolution errorsMartin Pitt
It needs to be possible to tell apart "the nss-resolve module does not exist" (which can happen when running foreign-architecture programs) from "the queried DNS name failed DNSSEC validation" or other errors. So return NOTFOUND for these cases too, and only keep UNAVAIL for the cases where we cannot handle the given address family. This makes it possible to configure a fallback to "dns" without breaking DNSSEC, with "resolve [!UNAVAIL=return] dns". Add this to the manpage. This does not change behaviour if resolved is not running, as that already falls back to the "dns" glibc module. Fixes #4157
2016-10-01nss-resolve: simplify error handlingMartin Pitt
Handle general errors from the resolved call in _nss_resolve_gethostbyaddr2_r() the same way as in the other variants: Just "goto fail" as that does exactly the same.
2016-10-01core: update warning messageZbigniew Jędrzejewski-Szmek
"closing all" might suggest that _all_ fds received with the notification message will be closed. Reword the message to clarify that only the "unused" ones will be closed.
2016-10-01core: get rid of unneeded state variableZbigniew Jędrzejewski-Szmek
No functional change.
2016-09-30networkd: fix "parametres" typo (#4244)Elias Probst
2016-09-30Merge pull request #4225 from keszybz/coredumpMartin Pitt
coredump: remove Storage=both support, various fixes for sd-coredump and coredumpctl
2016-09-30resolved: don't query domain-limited DNS servers for other domains (#3621)Martin Pitt
DNS servers which have route-only domains should only be used for the specified domains. Routing queries about other domains there is a privacy violation, prone to fail (as that DNS server was not meant to be used for other domains), and puts unnecessary load onto that server. Introduce a new helper function dns_server_limited_domains() that checks if the DNS server should only be used for some selected domains, i. e. has some route-only domains without "~.". Use that when determining whether to query it in the scope, and when writing resolv.conf. Extend the test_route_only_dns() case to ensure that the DNS server limited to ~company does not appear in resolv.conf. Add test_route_only_dns_all_domains() to ensure that a server that also has ~. does appear in resolv.conf as global name server. These reproduce #3420. Add a new test_resolved_domain_restricted_dns() test case that verifies that domain-limited DNS servers are only being used for those domains. This reproduces #3421. Clarify what a "routing domain" is in the manpage. Fixes #3420 Fixes #3421
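The check performed by the new helper can be sketched like this. The types and the function name server_limited_domains() are hypothetical stand-ins for resolved's own data structures, and the sketch assumes that the "~." routing domain is stored as a route-only entry with the name ".".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a configured domain: "~company" would be
 * { "company", true }, a plain search domain would have route_only false. */
struct search_domain {
    const char *name;
    bool route_only;
};

/* Return true if the server should be used only for selected domains, i.e.
 * it has at least one route-only domain and none of them is the root. */
bool server_limited_domains(const struct search_domain *d, size_t n) {
    bool restricted = false;

    for (size_t i = 0; i < n; i++) {
        if (!d[i].route_only)
            continue;
        /* "~." routes every domain here, so the server is global. */
        if (strcmp(d[i].name, ".") == 0)
            return false;
        restricted = true;
    }

    return restricted;
}
```

A server for which this returns true would be skipped both when choosing where to send queries for unrelated domains and when writing resolv.conf.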
2016-09-29pid1: more informative error message for ignored notificationsZbigniew Jędrzejewski-Szmek
It's probably easier to diagnose a bad notification message if the contents are printed. But still, do anything only if debugging is on.
2016-09-29pid1: process zero-length notification messages againZbigniew Jędrzejewski-Szmek
This undoes 531ac2b234. I acked that patch without looking at the code carefully enough. There are two problems:
- we want to process the fds anyway
- in principle empty notification messages are valid, and we should process them as usual, including logging using log_unit_debug().
2016-09-29pid1: don't return any error in manager_dispatch_notify_fd() (#4240)Franck Bui
If manager_dispatch_notify_fd() fails and returns an error then the handling of service notifications will be disabled entirely leading to a compromised system. For example pid1 won't be able to receive the WATCHDOG messages anymore and will kill all services supposed to send such messages.