Age | Commit message (Collapse) | Author |
|
If the kernel has no CONFIG_SCHED_DEBUG option set, systemd-bootchart produces
empty .svg file. The reason for this is very fragile file descriptor logic in
log_sample() and main() (/* do some cleanup, close fd's */ block). There are
many places where file descriptors are closed on failure (missing SCHED_DEBUG
provokes it), but there are several problems with it:
- following iterations in the loop see that the descriptor is non zero and do
not open the corresponding file again;
- "some cleanup" code closes already closed files and the descriptors are reused
already, in particular for resulting .svg file;
- static "vmstat" and "schedstat" variables in log_sample() made the situation
even worse.
These are the strace fragments:
[...]
close(7) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
pread(7, 0xbea60a2c, 4095, 0) = -1 EBADF (Bad file descriptor)
close(7) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
pread(7, 0xbea60a2c, 4095, 0) = -1 EBADF (Bad file descriptor)
close(7) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
getdents64(4, /* 0 entries */, 32768) = 0
clock_gettime(CLOCK_MONOTONIC, {24, 783843501}) = 0
nanosleep({0, 5221792}, NULL) = 0
clock_gettime(CLOCK_MONOTONIC, {24, 789726835}) = 0
lseek(4, 0, SEEK_SET) = 0
pread(5, "nr_free_pages 52309\nnr_alloc_bat"..., 4095, 0) = 685
pread(6, "version 15\ntimestamp 4294939775\n"..., 4095, 0) = 86
getdents64(4, /* 99 entries */, 32768) = 2680
pread(7, 0xbea60a2c, 4095, 0) = -1 EBADF (Bad file descriptor)
close(7) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
pread(8, 0xbea60a2c, 4095, 0) = -1 EBADF (Bad file descriptor)
close(8) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
pread(9, 0xbea60a2c, 4095, 0) = -1 EBADF (Bad file descriptor)
close(9) = -1 EBADF (Bad file descriptor)
[...]
where it obviously tries to close same and reused decriptors many times, also
passing return code "-1" instead of descriptor...
[...]
close(7) = -1 EBADF (Bad file descriptor)
close(-1) = -1 EBADF (Bad file descriptor)
pipe2([7, 8], O_CLOEXEC) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb6fd0068) = 192
close(8) = 0
fcntl64(7, F_SETFD, 0) = 0
fstat64(7, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6fd2000
read(7, "[ 0.074507] calling vfp_init"..., 4096) = 4096
[...]
read(7, "s)\n[ 6.228910] UBIFS: reserve"..., 4096) = 4096
read(7, "trary Executable File Formats Fi"..., 4096) = 1616
read(7, "", 4096) = 0
close(7) = 0
wait4(192, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 192
|
|
In Debian and rawhide Fedora, which have CONFIG_SCHEDSTATS=n,
bootchart creates empty files in /run/log before printing an error.
Stop doing that.
Moreover this duplicated part of the code doesn't even have error checking
so there is no error avoided by doing this early.
Reported-by: tfirg_ on IRC
|
|
|
|
|
|
Commit 6e1bf7ab99 used the wrong directory; we need rootlibexecdir, not
rootlibdir, as the latter is something like /lib/x86_64-linux-gnu/ on
multi-arch systems.
https://launchpad.net/bugs/1423867
|
|
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
|
|
|
|
When booting with systemd-bootchart, default to call the systemd binary
rather than the init binary on disk, which might be another init system.
Collecting data only works with booting systemd.
|
|
|
|
If the format string contains %m, clearly errno must have a meaningful
value, so we might as well use log_*_errno to have ERRNO= logged.
Using:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/'
Plus some whitespace, linewrap, and indent adjustments.
|
|
Basically:
find . -name '*.[ch]' | while read f; do perl -i.mmm -e \
'local $/;
local $_=<>;
s/log_(debug|info|notice|warning|error|emergency)\("([^"]*)%s"([^;]*),\s*strerror\(-?([->a-zA-Z_]+)\)\);/log_\1_errno(\4, "\2%m"\3);/gms;print;' \
$f; done
Plus manual indentation fixups.
|
|
It corrrectly handles both positive and negative errno values.
|
|
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
|
|
Spotted with coverity. If parsing both /etc/os-release and
/usr/lib/os-release fails then null would be passed on. The calls
to parse the two files are allowed to fail. A empty /etc may not
have had the /etc/os-release symlink restored yet and we just
try again in the loop. If for whatever reason that does not happen
then we now pass on 'n/a' instead of null.
|
|
Found by coverity. Fixes: CID#996314 and #996312
|
|
|
|
getopt is usually good at printing out a nice error message when
commandline options are invalid. It distinguishes between an unknown
option and a known option with a missing arg. It is better to let it
do its job and not use opterr=0 unless we actually want to suppress
messages. So remove opterr=0 in the few places where it wasn't really
useful.
When an error in options is encountered, we should not print a lengthy
help() and overwhelm the user, when we know precisely what is wrong
with the commandline. In addition, since help() prints to stdout, it
should not be used except when requested with -h or --help.
Also, simplify things here and there.
|
|
We always read system uptime before log start time. So the uptime
should be always smaller number, except it includes system suspend
time. It seems better to ask for --rel and exit() than try to be
smart and try to recovery from this situation or generate huge
messy graphs.
|
|
* systemd-bootchart always parses /proc/uptime, although the
information is unnecessary when --rel specified
* use /proc/uptime is overkill, since Linux 2.6.39 we have
clock_gettime(CLOCK_BOOTTIME, ...). The backend on kernel side is
get_monotonic_boottime() in both cases.
* main() uses "if (graph_start <= 0.0)" to detect that /proc is
available.
This is fragile solution as graph_start is always smaller than zero
on all systems after suspend/resume (e.g. laptops), because in this
case the system uptime includes suspend time and uptime is always
greater number than monotonic time. For example right now difference
between uptime and monotonic time is 37 hours on my laptop.
Note that main() calls log_uptime() (to parse /proc/uptime) for each
sample when it believes that /proc is not available. So on my laptop
systemd-boochars spends all live with /proc/uptime parsing +
nanosleep(), try
strace /usr/lib/systemd/systemd-bootchart
to see the never ending loop.
This patch uses access("/proc/vmstat", F_OK) to detect procfs.
|
|
In practice this shouldn't make much difference, but
sometimes our headers might be newer, and we want to
test them.
|
|
|
|
Special care is needed so that we get an error message if the
file failed to parse, but not when it is missing. To avoid duplicating
the same error check in every caller, add an additional 'warn' boolean
to tell config_parse whether a message should be issued.
This makes things both shorter and more robust wrt. to error reporting.
|
|
|
|
The file should have been in /usr/lib/ in the first place, since it
describes the OS container in /usr (and not the configuration in /etc),
hence, let's support os-release files in /usr/lib as fallback if no
version in /etc exists, following the usual override logic.
A prior commit already enabled tmpfiles to create /etc/os-release as a
symlink to /usr/lib/os-release should it be missing, thus providing nice
compatibility with applications only checking in /etc.
While it's probably a good idea if all apps check both locations via a
fallback logic, it is only necessary in the early boot process, as long
as the /etc/os-release symlink has not been restored, in case we boot
with an empty /etc.
|
|
|
|
|
|
|
|
As pointed-out by clang -Wunreachable-code.
No behaviour changes.
|
|
|
|
|
|
including it in the log strings
|
|
- Add space between if/for and the opening parentheses
- Place the opening brace on same line as the function (not for udev)
From the CODING_STYLE
Try to use this:
void foo() {
}
instead of this:
void foo()
{
}
|
|
each invocation
We can determine the list entry type via the typeof() gcc construct, and
so we should to make the macros much shorter to use.
|
|
Since the invention of read-only memory, write-only memory has been
considered deprecated. Where appropriate, either make use of the
value, or avoid writing it, to make it clear that it is not used.
|
|
"Corporation" was misspelled as "Coproration"
|
|
Instead of storing bootchart sample data in arrays, this patch moves
storage to linked lists so that there is no more limit on samples.
This patch also fixes parsing of /proc/<pid>/smaps in kernels > 3.7.
|
|
Disallow recursive .include, and make it unavailable in anything but
unit files.
|
|
http://lists.freedesktop.org/archives/systemd-devel/2013-April/010510.html
|
|
|
|
The information about the unit for which files are being parsed
is passed all the way down. This way messages land in the journal
with proper UNIT=... or USER_UNIT=... attribution.
'systemctl status' and 'journalctl -u' not displaying those messages
has been a source of confusion for users, since the journal entry for
a misspelt setting was often logged quite a bit earlier than the
failure to start a unit.
Based-on-a-patch-by: Oleksii Shevchuk <alxchk@gmail.com>
|
|
This bit of code is mostly stolen from coredump.c. We construct
a simple journal message and append the bootchart file in the
journal automatically.
You can extract the latest bootchart from the current boot with
something like:
$ journalctl -b MESSAGE_ID=9f26aa562cf440c2b16c773d0479b518 --field=BOOTCHART
which prints it to stdout.
None of the other logic is touched. The journal entry is created
even if bootchart was run manually, which is probably wrong.
|
|
|
|
If the configured number of samples was close to MAXSAMPLES,
the samples buffer could be overrun:
- by 1, because of off-by-one in the condition (samples > arg_samples_len),
and
- by many in case of an overrun, because the number of samples to
capture was increased, instead of being decreased.
Simplify things by converting to a normal for-loop.
In store.c: change buffer size from 4095 to 4096. 4095 is a strange
number.
|
|
|
|
system inherits the kernel default
|
|
Let's update bootchar to share the coding style a bit more with the rest
of the package.
- Some tabs/spaces fixes
- add #pragma to header
- split up header so that we have a 1:1 relation between .c and .h files
like everywhere else
- Prefix user command line arguments/configuration settings with "arg_".
- other coding style fixes
|
|
This reverts commit 0ea9530d401827e299c6e04a433e69a7a2a89d80.
attribute(cleanup) can only be used inside functions (*of, sysfd
are leaked).
Cleanup functions are only called once when exiting scope (*f
is leaked twice).
|
|
use _cleanup_{close_,fclose_} to close streams and file descriptors
|
|
|
|
|