Age | Commit message (Collapse) | Author |
|
Before, journald would remove journal files until both MaxUse= and
KeepFree= settings would be satisfied. The first one depends (if set
automatically) on the size of the file system and is constant. But
the second one depends on current use of the file system, and a spike
in disk usage would cause journald to delete journal files, trying to
reach usage which would leave 15% of the disk free. This behaviour is
surprising for the user who doesn't expect his logs to be purged when
disk usage goes above 85%, which on a large disk could be some
gigabytes from being full. In addition attempting to keep 15% free
provides an attack vector where filling the disk sufficiently disposes
of almost all logs.
Instead, obey KeepFree= only as a limit on adding additional files.
When replacing old files with new, ignore KeepFree=. This means that
if journal disk usage reached some high point that at some later point
start to violate the KeepFree= constraint, journald will not add files
to go above this point, but it will stay (slightly) below it. When
journald is restarted, it forgets the previous maximum usage value,
and sets the limit based on the current usage, so if disk remains to
be filled, journald might use one journal-file-size less on each
restart, if restarts happen just after rotation. This seems like a
reasonable compromise between implementation complexity and robustness.
|
|
The available_space function now returns 0 if reading the directory
fails. Previously, such errors were silently ignored.
|
|
With this change a failing event source handler will not cause the
entire event loop to fail. Instead, we just disable the specific event
source, log a message at debug level and go on.
This also introduces a new concept of "exit code" which can be stored in
the event loop and is returned by sd_event_loop(). We also rename "quit"
to "exit" everywhere else.
Altogether this should make things more robus and keep errors local
while still providing a way to return event loop errors in a clear way.
|
|
log message
|
|
generating them fresh for each log entry
|
|
|
|
In the time it takes to process incoming log messages, the process we
are logging details for may exit. This means the cgroup data is no
longer available from '/proc'. Unfortunately, the way the code was
structured before, we never log _SYSTEMD_UNIT if we don't have this
cgroup information.
Add an else if case that allows the passed in unit_id to be logged even
if we couldn't capture cgroup information. This ensures a command like
`journalctl -u run-XXX` will return all log messages from a oneshot
process.
|
|
|
|
Instead of individually checking for containers in each user do this
once in a new call proc_cmdline() that read the file only if we are not
in a container.
|
|
|
|
|
|
Always cache the results, and bypass low-level security calls when the
respective subsystem is not enabled.
|
|
Before, when the user journal file was rotated, journal_file_rotate
could close the old file and fail to open the new file. In that
case, we would leave the old (deallocated) file in the hashmap.
On subsequent accesses, we could retrieve this stale entry, leading
to a segfault.
When journal_file_rotate fails with the file pointer set to 0,
old file is certainly gone, and cannot be used anymore.
https://bugzilla.redhat.com/show_bug.cgi?id=890463
|
|
|
|
Also print out unexpected epoll events explictly.
|
|
In order to avoid a deadlock between journald looking up the
"systemd-journal" group name, and nscd (or anyother NSS backing daemon)
logging something back to the journal avoid all NSS in journald the same
way as we avoid it from PID 1.
With this change we rely on the kernel file system logic to adjust the
group of created journal files via the SETGID bit on the journal
directory. To ensure that it is always set, even after the user created
it with a simply "mkdir" on the shell we fix it up via tmpfiles on boot.
|
|
|
|
Can help since the journal requires /etc/machine-id to exists in order to start,
and will simply silently exit when it does not.
|
|
Vacuuming behaviour is a bit confusing, and/or we have some bugs,
so those additional messages should help to find out what's going
on. Also, rotation of journal files shouldn't be happening too
often, so the level of the messages is bumped to info, so that
they'll be logged under normal operation.
|
|
|
|
Since the journal can handle multiple lines just well natively,
and rsyslog can be configured to handle them as well, there is no need
to truncate messages from syslog() after the first newline.
Reproducer:
1. Add following four lines to /etc/rsyslog.conf
----------
$EscapeControlCharactersOnReceive off
$ActionFileDefaultTemplate RSYSLOG_SysklogdFileFormat
$SpaceLFOnReceive on
$DropTrailingLFOnReception off
----------
3. Restart rsyslog
# service rsyslog restart
4. Compile and run the following program
----------
#include <stdio.h>
#include <syslog.h>
int main()
{
syslog(LOG_INFO, "aaa%caaa", '\n');
return 0;
}
----------
Actual results:
Below message appears in /var/log/messages.
----------
Sep 7 19:19:39 localhost test2: aaa
----------
Expected results:
Below message, which worked prior to systemd-journald
appears in /var/log/messages.
----------
Sep 7 19:19:39 localhost test2: aaa aaa
https://bugzilla.redhat.com/show_bug.cgi?id=855313
|
|
message
|
|
units at the same time
|
|
When using Storage=none there is no point in collecting all the
information just to throw them away. After this change journald
consumes a lot less CPU time when only forwarding messages.
|
|
I think this is the most important of the capabilities bitmasks to log.
|
|
|
|
Reporting of the free space was bogus, since the remaining space
was compared with the maximum allowed, instead of the current
use being compared with the maximum allowed. Simplify and fix
by reporting limits directly at the point where they are calculated.
Also, assign a UUID to the message.
|
|
Too many people kept hitting them, so let's increase the limits a bit.
https://bugzilla.redhat.com/show_bug.cgi?id=965803
|
|
When journald encounters a message with OBJECT_PID= set
coming from a priviledged process (UID==0), additional fields
will be added to the message:
OBJECT_UID=,
OBJECT_GID=,
OBJECT_COMM=,
OBJECT_EXE=,
OBJECT_CMDLINE=,
OBJECT_AUDIT_SESSION=,
OBJECT_AUDIT_LOGINUID=,
OBJECT_SYSTEMD_CGROUP=,
OBJECT_SYSTEMD_SESSION=,
OBJECT_SYSTEMD_OWNER_UID=,
OBJECT_SYSTEMD_UNIT= or OBJECT_SYSTEMD_USER_UNIT=.
This is for other logging daemons, like setroubleshoot, to be able to
augment their logs with data about the process.
https://bugzilla.redhat.com/show_bug.cgi?id=951627
|
|
Since the system journal wasn't open yet, available_space() returned 0.
Before:
systemd-journal[22170]: Allowing system journal files to grow to 4.0G.
systemd-journal[22170]: Journal size currently limited to 0B due to SystemKeepFree.
After:
systemd-journal[22178]: Allowing system journal files to grow to 4.0G.
systemd-journal[22178]: Journal size currently limited to 3.0G due to SystemKeepFree.
Also, when failing to write a message, show how much space was needed:
"Failed to write entry (26 items, 260123456 bytes) despite vacuuming, ignoring: ...".
|
|
In the following scenario:
server creates system.journal
server creates user-1000.journal
both journals share the same seqnum_id.
Then
server writes to user-1000.journal first,
and server writes to system.journal a bit later,
and everything is fine.
The server then terminates (crash, reboot, rsyslog testing,
whatever), and user-1000.journal has entries which end with
a lower seqnum than system.journal. Now
server is restarted
server opens user-1000.journal and writes entries to it...
BAM! duplicate seqnums for the same seqnum_id.
Now, we usually don't see that happen, because system.journal
is closed last, and opened first. Since usually at least one
message is written during boot and lands in the system.journal,
the seqnum is initialized from it, and is set to a number higher
than than anything found in user journals. Nevertheless, if
system.journal is corrupted and is rotated, it can happen that
an entry is written to the user journal with a seqnum that is
a duplicate with an entry found in the corrupted system.journal~.
When browsing the journal, journalctl can fall into a loop
where it tries to follow the seqnums, and tries to go the
next location by seqnum, and is transported back in time to
to the older duplicate seqnum. There is not way to find
out the maximum seqnum used in a multiple files, without
actually looking at all of them. But we don't want to do
that because it would be slow, and actually it isn't really
possible, because a file might e.g. be temporarily unaccessible.
Fix the problem by using different seqnum series for user
journals. Using the same seqnum series for rotated journals
is still fine, because we know that nothing will write
to the rotated journal anymore.
Likely related:
https://bugs.freedesktop.org/show_bug.cgi?id=64566
https://bugs.freedesktop.org/show_bug.cgi?id=59856
https://bugs.freedesktop.org/show_bug.cgi?id=64296
https://bugs.archlinux.org/task/35581
https://bugzilla.novell.com/show_bug.cgi?id=817778
Possibly related:
https://bugs.freedesktop.org/show_bug.cgi?id=64293
|
|
|
|
When reporting the maximum journal size add a hint if it's limited
by KeepFree.
|
|
Since 11ec7ce, journald isn't setting the ACLs properly anymore if
the files had no ACLs to begin with: acl_set_fd fails with EINVAL.
An ACL with ACL_USER or ACL_GROUP entries but no ACL_MASK entry is
invalid, so make sure a mask exists before trying to set the ACL.
|
|
Use timespec_store instead of (incorrectly) doing it inline.
|
|
Otherwise we might end up with executable files if some default ACL is
set for the journal directory.
|
|
and the disk is close to being full
Bump the minimal size of the journal so that we can be sure creating the
journal file will always succeed. Previously the minimum size was
smaller than a empty jounral file...
|
|
Disallow recursive .include, and make it unavailable in anything but
unit files.
|
|
A small patch to remove a build warnining when SELinux is disabled.
|
|
Session objects will now get the .session suffix, user objects the .user
suffix, nspawn containers the .nspawn suffix.
This also changes the user cgroups to be named after the numeric UID
rather than the username, since this allows us the parse these paths
standalone without requiring access to the cgroup file system.
This also changes the mapping of instanced units to cgroups. Instead of
mapping foo@bar.service to the cgroup path /user/foo@.service/bar we
will now map it to /user/foo@.service/foo@bar.service, in order to
ensure that all our objects are properly suffixed in the tree.
|
|
http://lists.freedesktop.org/archives/systemd-devel/2013-April/010510.html
|
|
The information about the unit for which files are being parsed
is passed all the way down. This way messages land in the journal
with proper UNIT=... or USER_UNIT=... attribution.
'systemctl status' and 'journalctl -u' not displaying those messages
has been a source of confusion for users, since the journal entry for
a misspelt setting was often logged quite a bit earlier than the
failure to start a unit.
Based-on-a-patch-by: Oleksii Shevchuk <alxchk@gmail.com>
|
|
containers there
Containers will now carry a label (normally derived from the root
directory name, but configurable by the user), and the container's root
cgroup is /machine/<label>. This label is called "machine name", and can
cover both containers and VMs (as soon as libvirt also makes use of
/machine/).
libsystemd-login can be used to query the machine name from a process.
This patch also includes numerous clean-ups for the cgroup code.
|
|
|
|
|
|
|
|
|
|
Avoid the dynamic allocation for the _UID, _GID, and _PID strings.
The maximum size of the string can be determined at compile time.
The code has only been compile tested.
|
|
When systemd was compiled without audit support, do not collect the
audit session and loginuid in the journal. This is saving a couple of
syscalls and memory allocations per log message.
|
|
Before, we would initialize many fields twice: first
by filling the structure with zeros, and then a second
time with the real values. We can let the compiler do
the job for us, avoiding one copy.
A downside of this patch is that text gets slightly
bigger. This is because all zero() calls are effectively
inlined:
$ size build/.libs/systemd
text data bss dec hex filename
before 897737 107300 2560 1007597 f5fed build/.libs/systemd
after 897873 107300 2560 1007733 f6075 build/.libs/systemd
… actually less than 1‰.
A few asserts that the parameter is not null had to be removed. I
don't think this changes much, because first, it is quite unlikely
for the assert to fail, and second, an immediate SEGV is almost as
good as an assert.
|