diff options
| author | Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> | 2017-02-26 16:46:23 -0500 | 
|---|---|---|
| committer | Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> | 2017-02-28 21:33:52 -0500 | 
| commit | 92e92d71faea0f107312f296b7756cc04281ba99 (patch) | |
| tree | de4c7d4e9d8baf7f07830b00c30a07e759324b47 /src/shared/machine-pool.h | |
| parent | cc4419ed929b5c7c99eaed020b015b1b2e8c7a66 (diff) | |
coredump: process special crashes in an (almost) normal way
We would only log a terse message when pid1 or systemd-journald crashed.
It seems better to reuse the normal code paths as much as possible,
with the following differences:
- if pid1 crashes, we cannot launch the helper, so we don't analyze the
 coredump, just write it to file directly from the helper invoked by the
 kernel;
- if journald crashes, we can produce the backtrace, but we don't log full
  structured messages.
With comparison to previous code, advantages are:
- we go through most of the steps, so for example vacuuming is performed,
- we gather and log more data. In particular for journald and pid1 crashes we
  generate a backtrace, and for pid1 crashes we record the metadata (fdinfo,
  maps, etc.),
- coredumpctl shows pid1 crashes.
A disavantage (inefficiency) is that we gather metadata for journald crashes
which is then ignored because _TRANSPORT=kernel does not support structued
messages.
Messages for the systemd-journald "crash" have _TRANSPORT=kernel, and
_TRANSPORT=journal for the pid1 "crash".
Feb 26 16:27:55 systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=11/SEGV
Feb 26 16:27:55 systemd[1]: systemd-journald.service: Unit entered failed state.
Feb 26 16:37:54 systemd-coredump[18801]: Process 18729 (systemd-journal) of user 0 dumped core.
Feb 26 16:37:54 systemd-coredump[18801]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.36c14bf3c6ce4c38914f441038990979.18729.1488145074000000.lz4
Feb 26 16:37:54 systemd-coredump[18801]: Stack trace of thread 18729:
Feb 26 16:37:54 systemd-coredump[18801]: #0  0x00007f46d6a06b8d fsync (libpthread.so.0)
Feb 26 16:37:54 systemd-coredump[18801]: #1  0x00007f46d71bfc47 journal_file_set_online (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #2  0x00007f46d71c1c31 journal_file_append_object (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #3  0x00007f46d71c3405 journal_file_append_data (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #4  0x00007f46d71c4b7c journal_file_append_entry (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #5  0x00005577688cf056 write_to_journal (systemd-journald)
Feb 26 16:37:54 systemd-coredump[18801]: #6  0x00005577688d2e98 dispatch_message_real (systemd-journald)
Feb 26 16:37:54 kernel: systemd-coredum: 9 output lines suppressed due to ratelimiting
Feb 26 16:37:54 systemd-journald[18810]: Journal started
Feb 26 16:50:59 systemd-coredump[19229]: Due to PID 1 having crashed coredump collection will now be turned off.
Feb 26 16:51:00 systemd[1]: Caught <SEGV>, dumped core as pid 19228.
Feb 26 16:51:00 systemd[1]: Freezing execution.
Feb 26 16:51:00 systemd-coredump[19229]: Process 19228 (systemd) of user 0 dumped core.
                                         Stack trace of thread 19228:
                                         #0  0x00007fab82075c47 kill (libc.so.6)
                                         #1  0x000055fdf7c38b6b crash (systemd)
                                         #2  0x00007fab824175c0 __restore_rt (libpthread.so.0)
                                         #3  0x00007fab82148573 epoll_wait (libc.so.6)
                                         #4  0x00007fab8366f84a sd_event_wait (libsystemd-shared-233.so)
                                         #5  0x00007fab836701de sd_event_run (libsystemd-shared-233.so)
                                         #6  0x000055fdf7c4a380 manager_loop (systemd)
                                         #7  0x000055fdf7c402c2 main (systemd)
                                         #8  0x00007fab82060401 __libc_start_main (libc.so.6)
                                         #9  0x000055fdf7c3818a _start (systemd)
Poor machine ;)
Diffstat (limited to 'src/shared/machine-pool.h')
0 files changed, 0 insertions, 0 deletions
