Age | Commit message (Collapse) | Author |
|
We check these a number of times, hence let's unify these checks here.
This also allows us to make the PID 1 check more elaborate as we can
check both the PID and the cgroup. Checking the PID has the benefit that
we'll also cover cases where PID 1 might still be in the root cgroup, and
the cgroup check has the benefit that we also cover crashes in forked
off crasher processes (the way we actually do it in systemd)
|
|
Given that this is a field primarily processed by computers, and not so
much by humans, assign "1" instead of "yes". Also, use parse_boolean()
as we usually do for parsing it again.
This makes things more alike udev options (as one example), such as
SYSTEMD_READY where we also spit out "1" and "0", and parse with
parse_boolean().
|
|
Fixes #5477.
|
|
[guest@fedora ~]$ coredumpctl
No coredumps found.
[guest@fedora ~]$ ./coredumpctl
Hint: You are currently not seeing messages from other users and the system.
Users in groups 'adm', 'systemd-journal', 'wheel' can see all messages.
Pass -q to turn off this notice.
No coredumps found.
Fixes #1733.
|
|
This is just a hint, so we shouldn't wait too long. A short timeout
helps for the case where pid1 of dbus have crashed.
|
|
We would only log a terse message when pid1 or systemd-journald crashed.
It seems better to reuse the normal code paths as much as possible,
with the following differences:
- if pid1 crashes, we cannot launch the helper, so we don't analyze the
coredump, just write it to file directly from the helper invoked by the
kernel;
- if journald crashes, we can produce the backtrace, but we don't log full
structured messages.
With comparison to previous code, advantages are:
- we go through most of the steps, so for example vacuuming is performed,
- we gather and log more data. In particular for journald and pid1 crashes we
generate a backtrace, and for pid1 crashes we record the metadata (fdinfo,
maps, etc.),
- coredumpctl shows pid1 crashes.
A disavantage (inefficiency) is that we gather metadata for journald crashes
which is then ignored because _TRANSPORT=kernel does not support structued
messages.
Messages for the systemd-journald "crash" have _TRANSPORT=kernel, and
_TRANSPORT=journal for the pid1 "crash".
Feb 26 16:27:55 systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=11/SEGV
Feb 26 16:27:55 systemd[1]: systemd-journald.service: Unit entered failed state.
Feb 26 16:37:54 systemd-coredump[18801]: Process 18729 (systemd-journal) of user 0 dumped core.
Feb 26 16:37:54 systemd-coredump[18801]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.36c14bf3c6ce4c38914f441038990979.18729.1488145074000000.lz4
Feb 26 16:37:54 systemd-coredump[18801]: Stack trace of thread 18729:
Feb 26 16:37:54 systemd-coredump[18801]: #0 0x00007f46d6a06b8d fsync (libpthread.so.0)
Feb 26 16:37:54 systemd-coredump[18801]: #1 0x00007f46d71bfc47 journal_file_set_online (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #2 0x00007f46d71c1c31 journal_file_append_object (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #3 0x00007f46d71c3405 journal_file_append_data (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #4 0x00007f46d71c4b7c journal_file_append_entry (libsystemd-shared-233.so)
Feb 26 16:37:54 systemd-coredump[18801]: #5 0x00005577688cf056 write_to_journal (systemd-journald)
Feb 26 16:37:54 systemd-coredump[18801]: #6 0x00005577688d2e98 dispatch_message_real (systemd-journald)
Feb 26 16:37:54 kernel: systemd-coredum: 9 output lines suppressed due to ratelimiting
Feb 26 16:37:54 systemd-journald[18810]: Journal started
Feb 26 16:50:59 systemd-coredump[19229]: Due to PID 1 having crashed coredump collection will now be turned off.
Feb 26 16:51:00 systemd[1]: Caught <SEGV>, dumped core as pid 19228.
Feb 26 16:51:00 systemd[1]: Freezing execution.
Feb 26 16:51:00 systemd-coredump[19229]: Process 19228 (systemd) of user 0 dumped core.
Stack trace of thread 19228:
#0 0x00007fab82075c47 kill (libc.so.6)
#1 0x000055fdf7c38b6b crash (systemd)
#2 0x00007fab824175c0 __restore_rt (libpthread.so.0)
#3 0x00007fab82148573 epoll_wait (libc.so.6)
#4 0x00007fab8366f84a sd_event_wait (libsystemd-shared-233.so)
#5 0x00007fab836701de sd_event_run (libsystemd-shared-233.so)
#6 0x000055fdf7c4a380 manager_loop (systemd)
#7 0x000055fdf7c402c2 main (systemd)
#8 0x00007fab82060401 __libc_start_main (libc.so.6)
#9 0x000055fdf7c3818a _start (systemd)
Poor machine ;)
|
|
Unit systemd-coredump@1-3854-0.service is failed/failed, not counting it.
TIME PID UID GID SIG COREFILE EXE
Fri 2017-02-24 11:11:00 EST 10002 1000 1000 6 none /home/zbyszek/src/systemd-work/.libs/lt-Sat 2017-02-25 00:49:32 EST 26921 0 0 11 error /usr/libexec/fprintd
Sat 2017-02-25 11:56:30 EST 30703 1000 1000 - - /usr/bin/python3.5
Sat 2017-02-25 13:16:54 EST 3275 1000 1000 11 present /usr/bin/bash
Sat 2017-02-25 17:25:40 EST 4049 1000 1000 11 truncated /usr/bin/bash
For info and gdb output, the filename is marked in red and "(truncated)" is
appended. (Red is necessary because the annotation is hard to see when running
under a pager.)
Fixed #3883.
|
|
A few times I have seen the hint unexpectedly. Add this so debug info
so it's easier to see what's happening.
...
Unit systemd-coredump@0-3119-0.service is failed/failed, not counting it.
Unit systemd-coredump@1-3854-0.service is activating/start-pre, counting it.
...
-- Notice: 1 systemd-coredump@.service unit is running, output may be incomplete.
|
|
We logged about this, but did not attach information directly to the log
entry. It *would* be nice to log the full untruncated size, but afaict, to do
this, we would have to read the full data from the kernel. Doing this just to
log that information seems a bit excessive, in particular when the limit could
be set quite low. So for now let's just add a boolean field.
|
|
Most of the fields in the context array come from the kernel (passed
through argv), but two are special: comm and exe. We allocate them
ourselves. We forgot to initialize context[CONTEXT_COMM] with the value
we allocated (introduced in 9aa820231414baa28e6bf02a033932cb69ff6b8b).
To simplify things, just set context[CONTEXT_COMM] and context[CONTEXT_EXE],
and free those two fields at the end.
Fixes #5442.
|
|
|
|
Implement --since/--until (-S/-U) in the same fashion as journalctl.
This lets the user filter the results a bit so it would be easier to
find relevant info in case there were many core dumps.
|
|
From: #5393
|
|
Fixes #4685.
|
|
various coredump fixes
|
|
We didn't include the resource limit field, add it.
|
|
trailing zeroes
Our coredump handler operates on a "context" supplied by the kernel via
the core_pattern arguments. When we pass off a coredump for processing
to coredumpd we pass along enough information for this context to be
reconstructed. This information is passed in the usual journal fields,
and that means we extended the 1s granularity timestamp to 1µs
granularity by appending 6 zeroes. We need to chop them off again when
reconstructing the original kernel context.
Fixes: #4779
|
|
(Note that we only do this for the journal metadata, not for the xattrs,
as the xattrs are only supposed to store the original 1:1 info we
acquired from the kernel.)
|
|
When we encounter a "special" crash we should not continue processing it
the usual way.
|
|
This adds a unified "copy_flags" parameter to all copy_xyz() function
calls, replacing the various boolean flags so far used. This should make
many invocations more readable as it is clear what behaviour is
precisely requested. This also prepares ground for adding support for
more modes later on.
|
|
|
|
$ ./coredumpctl --no-pager -1
TIME PID UID GID SIG COREFILE EXE
Sun 2016-11-06 10:10:51 EST 29514 1002 1002 - - /usr/bin/python3.5
$ ./coredumpctl info 29514
PID: 29514 (python3)
UID: 1002 (zbyszek)
GID: 1002 (zbyszek)
Reason: ZeroDivisionError
Timestamp: Sun 2016-11-06 10:10:51 EST (3h 22min ago)
Command Line: python3 systemd_coredump_exception_handler.py
Executable: /usr/bin/python3.5
Control Group: /user.slice/user-1002.slice/user@1002.service/gnome-terminal-server.service
Unit: user@1002.service
User Unit: gnome-terminal-server.service
Slice: user-1002.slice
Owner UID: 1002 (zbyszek)
Boot ID: 1531fd22ec84429e85ae888b12fadb91
Machine ID: 519a16632fbd4c71966ce9305b360c9c
Hostname: laptop
Storage: none
Message: Process 29514 (systemd_coredump_exception_handler.py) of user zbyszek failed with ZeroDivisionError: division by
Traceback (most recent call last):
File "systemd_coredump_exception_handler.py", line 134, in <module>
g()
File "systemd_coredump_exception_handler.py", line 133, in g
f()
File "systemd_coredump_exception_handler.py", line 131, in f
div0 = 1 / 0
ZeroDivisionError: division by zero
Local variables in innermost frame:
a=3
h=<function f at 0x7efdc14b6ea0>
|
|
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
"conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.
Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:
$ size .libs/systemd{,-logind,-journald}{.old,}
text data bss dec hex filename
1265204 149564 4808 1419576 15a938 .libs/systemd.old
1260268 149564 4808 1414640 1595f0 .libs/systemd
246805 13852 209 260866 3fb02 .libs/systemd-logind.old
240973 13852 209 255034 3e43a .libs/systemd-logind
146839 4984 34 151857 25131 .libs/systemd-journald.old
146391 4984 34 151409 24f71 .libs/systemd-journald
It is also much easier to check if a certain binary uses a certain MESSAGE_ID:
$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
|
|
No functional change, and we don't lose match order.
|
|
The entry must be a single entry in the journal export format, including the
terminating double newline. The MESSAGE field is now generated on the sender
side.
The advantage is that the reporter can easily pass additional metadata.
Continuing with the example of the python excepthook:
COREDUMP_PYTHON_EXECUTABLE=/usr/bin/python3
COREDUMP_PYTHON_VERSION=3.5.2 (default, Sep 14 2016, 11:28:32)
[GCC 6.2.1 20160901 (Red Hat 6.2.1-1)]
COREDUMP_PYTHON_THREAD_INFO=sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.24')
COREDUMP_PYTHON_EXCEPTION_TYPE=ZeroDivisionError
COREDUMP_PYTHON_EXCEPTION_VALUE=division by zero
MESSAGE=Process 29514 (systemd_coredump_exception_handler.py) of user zbyszek failed with ZeroDivisionError: division by zero
Traceback (most recent call last):
File "systemd_coredump_exception_handler.py", line 134, in <module>
g()
File "systemd_coredump_exception_handler.py", line 133, in g
f()
File "systemd_coredump_exception_handler.py", line 131, in f
div0 = 1 / 0
ZeroDivisionError: division by zero
Local variables in innermost frame:
a=3
h=<function f at 0x7efdc14b6ea0>
One consideration is whether to use the Journal Export Format, or send packets
over a UNIX socket instead. The advantage of current solution is that although
parsing is more complicated on the receiver side, it is much easier to use on the
sender side. I hope this can be used by various languages for which writing
binary structures to a UNIX socket is harder and more likely to be done wrong
than piping of a simple textyish format.
|
|
No functional change.
|
|
This is useful for example for Python progams. By installing a python
sys.execepthook we can store the backtrace in the journal. We gather the
backtrace in the python process, and call systemd-coredump to attach additional
fields (COREDUMP_COMM, COREDUMP_EXE, COREDUMP_UNIT, COREDUMP_USER_UNIT,
COREDUMP_OWNER_UID, COREDUMP_SLICE, COREDUMP_CMDLINE, COREDUMP_CGROUP,
COREDUMP_OPEN_FDS, COREDUMP_PROC_STATUS, COREDUMP_PROC_MAPS,
COREDUMP_PROC_LIMITS, COREDUMP_PROC_MOUNTINFO, COREDUMP_CWD, COREDUMP_ROOT,
COREDUMP_ENVIRON, COREDUMP_CONTAINER_CMDLINE). This could also be done in the
python process, but doing this in systemd-coredump saves quite a bit of
duplicate work and unifies the handling of various tricky fields like
COREDUMP_CONTAINER_CMDLINE in one place.
(Of course this applies to any other language which does not dump cores
but wants to log a traceback, e.g. ruby.)
journal entry:
_TRANSPORT=journal
_UID=1002
_GID=1002
_CAP_EFFECTIVE=0
_AUDIT_LOGINUID=1002
_SYSTEMD_OWNER_UID=1002
_SYSTEMD_SLICE=user-1002.slice
_SYSTEMD_USER_SLICE=-.slice
_SELINUX_CONTEXT=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
_BOOT_ID=1531fd22ec84429e85ae888b12fadb91
_MACHINE_ID=519a16632fbd4c71966ce9305b360c9c
_HOSTNAME=laptop
_AUDIT_SESSION=1
_SYSTEMD_UNIT=user@1002.service
_SYSTEMD_INVOCATION_ID=3c4238d790a44aca9576ecdb2c7576d3
COREDUMP_UNIT=user@1002.service
COREDUMP_USER_UNIT=gnome-terminal-server.service
COREDUMP_UID=1002
COREDUMP_GID=1002
COREDUMP_OWNER_UID=1002
COREDUMP_SLICE=user-1002.slice
COREDUMP_CGROUP=/user.slice/user-1002.slice/user@1002.service/gnome-terminal-server.service
COREDUMP_PROC_LIMITS=Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size unlimited unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 15413 15413 processes
Max open files 4096 4096 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 15413 15413 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
COREDUMP_PROC_CGROUP=1:name=systemd:/
0::/user.slice/user-1002.slice/user@1002.service/gnome-terminal-server.service
COREDUMP_PROC_MOUNTINFO=17 39 0:17 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
18 39 0:4 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
19 39 0:6 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=1972980k,nr_inodes=493245,mode=755
20 17 0:18 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw
21 19 0:19 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw,seclabel
22 19 0:20 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
23 39 0:21 / /run rw,nosuid,nodev shared:12 - tmpfs tmpfs rw,seclabel,mode=755
24 17 0:22 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime shared:8 - cgroup2 cgroup rw
25 17 0:23 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:9 - pstore pstore rw,seclabel
36 17 0:24 / /sys/kernel/config rw,relatime shared:10 - configfs configfs rw
39 0 0:26 /root / rw,relatime shared:1 - btrfs /dev/mapper/fedora-root2 rw,seclabel,ssd,space_cache,subvolid=257,subvol=/root
26 17 0:16 / /sys/fs/selinux rw,relatime shared:11 - selinuxfs selinuxfs rw
27 19 0:15 / /dev/mqueue rw,relatime shared:13 - mqueue mqueue rw,seclabel
28 18 0:30 / /proc/sys/fs/binfmt_misc rw,relatime shared:14 - autofs systemd-1 rw,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13663
29 17 0:7 / /sys/kernel/debug rw,relatime shared:15 - debugfs debugfs rw,seclabel
30 19 0:31 / /dev/hugepages rw,relatime shared:16 - hugetlbfs hugetlbfs rw,seclabel
31 18 0:32 / /proc/fs/nfsd rw,relatime shared:17 - nfsd nfsd rw
32 28 0:33 / /proc/sys/fs/binfmt_misc rw,relatime shared:18 - binfmt_misc binfmt_misc rw
57 39 0:34 / /tmp rw,relatime shared:19 - tmpfs none rw,seclabel
61 57 0:35 / /tmp/test rw,relatime shared:20 - autofs systemd-1 rw,fd=48,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=18251
59 39 8:1 / /boot rw,relatime shared:21 - ext4 /dev/sda1 rw,seclabel,data=ordered
60 39 253:2 / /home rw,relatime shared:22 - ext4 /dev/mapper/fedora-home rw,seclabel,data=ordered
65 39 0:37 / /var/lib/nfs/rpc_pipefs rw,relatime shared:23 - rpc_pipefs sunrpc rw
136 23 0:39 / /run/user/1002 rw,nosuid,nodev,relatime shared:91 - tmpfs tmpfs rw,seclabel,size=397432k,mode=700,uid=1002,gid=1002
211 23 0:41 / /run/user/42 rw,nosuid,nodev,relatime shared:163 - tmpfs tmpfs rw,seclabel,size=397432k,mode=700,uid=42,gid=42
329 136 0:44 / /run/user/1002/gvfs rw,nosuid,nodev,relatime shared:277 - fuse.gvfsd-fuse gvfsd-fuse rw,user_id=1002,group_id=1002
287 61 253:3 / /tmp/test rw,relatime shared:236 - ext4 /dev/mapper/fedora-test rw,seclabel,data=ordered
217 23 0:42 / /run/user/1000 rw,nosuid,nodev,relatime shared:168 - tmpfs tmpfs rw,seclabel,size=397432k,mode=700,uid=1000,gid=1000
225 217 0:43 / /run/user/1000/gvfs rw,nosuid,nodev,relatime shared:175 - fuse.gvfsd-fuse gvfsd-fuse rw,user_id=1000,group_id=1000
COREDUMP_ROOT=/
PRIORITY=2
CODE_FILE=src/coredump/coredump.c
SYSLOG_IDENTIFIER=lt-systemd-coredump
_COMM=lt-systemd-core
_SYSTEMD_CGROUP=/user.slice/user-1002.slice/user@1002.service/gnome-terminal-server.service
_SYSTEMD_USER_UNIT=gnome-terminal-server.service
MESSAGE_ID=1f4e0a44a88649939aaea34fc6da8c95
CODE_FUNC=process_traceback
COREDUMP_COMM=python3
COREDUMP_EXE=/usr/bin/python3.5
COREDUMP_CMDLINE=python3 systemd_coredump_exception_handler.py
COREDUMP_CWD=/home/zbyszek/src/systemd-coredump-python
COREDUMP_RLIMIT=-1
COREDUMP_OPEN_FDS=0:/dev/pts/1
pos: 0
flags: 0102002
mnt_id: 22
1:/dev/pts/1
pos: 0
flags: 0102002
mnt_id: 22
2:/dev/pts/1
pos: 0
flags: 0102002
mnt_id: 22
CODE_LINE=1284
COREDUMP_SIGNAL=ZeroDivisionError: division by zero
COREDUMP_ENVIRON=LANG=en_US.utf8
DISPLAY=:0
...
MANWIDTH=90
LC_MESSAGES=en_US.utf8
PYTHONPATH=.
_=/usr/bin/python3
COREDUMP_PID=14498
COREDUMP_PROC_STATUS=Name: python3
Umask: 0002
State: S (sleeping)
Tgid: 14498
Ngid: 0
Pid: 14498
PPid: 16245
TracerPid: 0
Uid: 1002 1002 1002 1002
Gid: 1002 1002 1002 1002
FDSize: 64
Groups:
NStgid: 14498
NSpid: 14498
NSpgid: 14498
NSsid: 16245
VmPeak: 34840 kB
VmSize: 34792 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 9332 kB
VmRSS: 9332 kB
RssAnon: 4872 kB
RssFile: 4460 kB
RssShmem: 0 kB
VmData: 5012 kB
VmStk: 136 kB
VmExe: 4 kB
VmLib: 5452 kB
VmPTE: 84 kB
VmPMD: 12 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
Threads: 1
SigQ: 0/15413
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000001001000
SigCgt: 0000000180000002
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp: 0
Cpus_allowed: f
Cpus_allowed_list: 0-3
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 2
nonvoluntary_ctxt_switches: 47
COREDUMP_PROC_MAPS=55cb7b7fe000-55cb7b7ff000 r-xp 00000000 00:1a 5289186 /usr/bin/python3.5
55cb7b9ff000-55cb7ba00000 r--p 00001000 00:1a 5289186 /usr/bin/python3.5
55cb7ba00000-55cb7ba01000 rw-p 00002000 00:1a 5289186 /usr/bin/python3.5
55cb7c007000-55cb7c189000 rw-p 00000000 00:00 0 [heap]
7f4da2d51000-7f4da2d54000 r-xp 00000000 00:1a 5279150 /usr/lib64/python3.5/lib-dynload/resource.cpython-35m-x86_64-linux-gnu.so
7f4da2d54000-7f4da2f53000 ---p 00003000 00:1a 5279150 /usr/lib64/python3.5/lib-dynload/resource.cpython-35m-x86_64-linux-gnu.so
7f4da2f53000-7f4da2f54000 r--p 00002000 00:1a 5279150 /usr/lib64/python3.5/lib-dynload/resource.cpython-35m-x86_64-linux-gnu.so
7f4da2f54000-7f4da2f55000 rw-p 00003000 00:1a 5279150 /usr/lib64/python3.5/lib-dynload/resource.cpython-35m-x86_64-linux-gnu.so
7f4da2f55000-7f4da2f5d000 r-xp 00000000 00:1a 5279143 /usr/lib64/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so
7f4da2f5d000-7f4da315c000 ---p 00008000 00:1a 5279143 /usr/lib64/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so
7f4da315c000-7f4da315d000 r--p 00007000 00:1a 5279143 /usr/lib64/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so
7f4da315d000-7f4da315f000 rw-p 00008000 00:1a 5279143 /usr/lib64/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so
7f4da315f000-7f4da319f000 rw-p 00000000 00:00 0
7f4da319f000-7f4da31a4000 r-xp 00000000 00:1a 5279151 /usr/lib64/python3.5/lib-dynload/select.cpython-35m-x86_64-linux-gnu.so
7f4da31a4000-7f4da33a3000 ---p 00005000 00:1a 5279151 /usr/lib64/python3.5/lib-dynload/select.cpython-35m-x86_64-linux-gnu.so
7f4da33a3000-7f4da33a4000 r--p 00004000 00:1a 5279151 /usr/lib64/python3.5/lib-dynload/select.cpython-35m-x86_64-linux-gnu.so
7f4da33a4000-7f4da33a6000 rw-p 00005000 00:1a 5279151 /usr/lib64/python3.5/lib-dynload/select.cpython-35m-x86_64-linux-gnu.so
7f4da33a6000-7f4da33a9000 r-xp 00000000 00:1a 5279130 /usr/lib64/python3.5/lib-dynload/_posixsubprocess.cpython-35m-x86_64-linux-gnu.so
7f4da33a9000-7f4da35a8000 ---p 00003000 00:1a 5279130 /usr/lib64/python3.5/lib-dynload/_posixsubprocess.cpython-35m-x86_64-linux-gnu.so
7f4da35a8000-7f4da35a9000 r--p 00002000 00:1a 5279130 /usr/lib64/python3.5/lib-dynload/_posixsubprocess.cpython-35m-x86_64-linux-gnu.so
7f4da35a9000-7f4da35aa000 rw-p 00003000 00:1a 5279130 /usr/lib64/python3.5/lib-dynload/_posixsubprocess.cpython-35m-x86_64-linux-gnu.so
7f4da35aa000-7f4da362a000 rw-p 00000000 00:00 0
7f4da362a000-7f4da362c000 r-xp 00000000 00:1a 5279122 /usr/lib64/python3.5/lib-dynload/_heapq.cpython-35m-x86_64-linux-gnu.so
7f4da362c000-7f4da382b000 ---p 00002000 00:1a 5279122 /usr/lib64/python3.5/lib-dynload/_heapq.cpython-35m-x86_64-linux-gnu.so
7f4da382b000-7f4da382c000 r--p 00001000 00:1a 5279122 /usr/lib64/python3.5/lib-dynload/_heapq.cpython-35m-x86_64-linux-gnu.so
7f4da382c000-7f4da382e000 rw-p 00002000 00:1a 5279122 /usr/lib64/python3.5/lib-dynload/_heapq.cpython-35m-x86_64-linux-gnu.so
7f4da382e000-7f4da39ee000 rw-p 00000000 00:00 0
7f4da39ee000-7f4da3bab000 r-xp 00000000 00:1a 4844904 /usr/lib64/libc-2.24.so
7f4da3bab000-7f4da3daa000 ---p 001bd000 00:1a 4844904 /usr/lib64/libc-2.24.so
7f4da3daa000-7f4da3dae000 r--p 001bc000 00:1a 4844904 /usr/lib64/libc-2.24.so
7f4da3dae000-7f4da3db0000 rw-p 001c0000 00:1a 4844904 /usr/lib64/libc-2.24.so
7f4da3db0000-7f4da3db4000 rw-p 00000000 00:00 0
7f4da3db4000-7f4da3ebc000 r-xp 00000000 00:1a 4844910 /usr/lib64/libm-2.24.so
7f4da3ebc000-7f4da40bb000 ---p 00108000 00:1a 4844910 /usr/lib64/libm-2.24.so
7f4da40bb000-7f4da40bc000 r--p 00107000 00:1a 4844910 /usr/lib64/libm-2.24.so
7f4da40bc000-7f4da40bd000 rw-p 00108000 00:1a 4844910 /usr/lib64/libm-2.24.so
7f4da40bd000-7f4da40bf000 r-xp 00000000 00:1a 4844928 /usr/lib64/libutil-2.24.so
7f4da40bf000-7f4da42be000 ---p 00002000 00:1a 4844928 /usr/lib64/libutil-2.24.so
7f4da42be000-7f4da42bf000 r--p 00001000 00:1a 4844928 /usr/lib64/libutil-2.24.so
7f4da42bf000-7f4da42c0000 rw-p 00002000 00:1a 4844928 /usr/lib64/libutil-2.24.so
7f4da42c0000-7f4da42c3000 r-xp 00000000 00:1a 4844908 /usr/lib64/libdl-2.24.so
7f4da42c3000-7f4da44c2000 ---p 00003000 00:1a 4844908 /usr/lib64/libdl-2.24.so
7f4da44c2000-7f4da44c3000 r--p 00002000 00:1a 4844908 /usr/lib64/libdl-2.24.so
7f4da44c3000-7f4da44c4000 rw-p 00003000 00:1a 4844908 /usr/lib64/libdl-2.24.so
7f4da44c4000-7f4da44dc000 r-xp 00000000 00:1a 4844920 /usr/lib64/libpthread-2.24.so
7f4da44dc000-7f4da46dc000 ---p 00018000 00:1a 4844920 /usr/lib64/libpthread-2.24.so
7f4da46dc000-7f4da46dd000 r--p 00018000 00:1a 4844920 /usr/lib64/libpthread-2.24.so
7f4da46dd000-7f4da46de000 rw-p 00019000 00:1a 4844920 /usr/lib64/libpthread-2.24.so
7f4da46de000-7f4da46e2000 rw-p 00000000 00:00 0
7f4da46e2000-7f4da4917000 r-xp 00000000 00:1a 5277535 /usr/lib64/libpython3.5m.so.1.0
7f4da4917000-7f4da4b17000 ---p 00235000 00:1a 5277535 /usr/lib64/libpython3.5m.so.1.0
7f4da4b17000-7f4da4b1c000 r--p 00235000 00:1a 5277535 /usr/lib64/libpython3.5m.so.1.0
7f4da4b1c000-7f4da4b7f000 rw-p 0023a000 00:1a 5277535 /usr/lib64/libpython3.5m.so.1.0
7f4da4b7f000-7f4da4baf000 rw-p 00000000 00:00 0
7f4da4baf000-7f4da4bd4000 r-xp 00000000 00:1a 4844897 /usr/lib64/ld-2.24.so
7f4da4bdf000-7f4da4c10000 rw-p 00000000 00:00 0
7f4da4c10000-7f4da4c61000 r--p 00000000 00:1a 5225117 /usr/lib/locale/pl_PL.utf8/LC_CTYPE
7f4da4c61000-7f4da4d91000 r--p 00000000 00:1a 4844827 /usr/lib/locale/en_US.utf8/LC_COLLATE
7f4da4d91000-7f4da4d95000 rw-p 00000000 00:00 0
7f4da4dc1000-7f4da4dc2000 r--p 00000000 00:1a 4844832 /usr/lib/locale/en_US.utf8/LC_NUMERIC
7f4da4dc2000-7f4da4dc3000 r--p 00000000 00:1a 4844795 /usr/lib/locale/en_US.utf8/LC_TIME
7f4da4dc3000-7f4da4dc4000 r--p 00000000 00:1a 4844793 /usr/lib/locale/en_US.utf8/LC_MONETARY
7f4da4dc4000-7f4da4dc5000 r--p 00000000 00:1a 4844830 /usr/lib/locale/en_US.utf8/LC_MESSAGES/SYS_LC_MESSAGES
7f4da4dc5000-7f4da4dc6000 r--p 00000000 00:1a 4844847 /usr/lib/locale/en_US.utf8/LC_PAPER
7f4da4dc6000-7f4da4dc7000 r--p 00000000 00:1a 4844831 /usr/lib/locale/en_US.utf8/LC_NAME
7f4da4dc7000-7f4da4dc8000 r--p 00000000 00:1a 4844790 /usr/lib/locale/en_US.utf8/LC_ADDRESS
7f4da4dc8000-7f4da4dc9000 r--p 00000000 00:1a 4844794 /usr/lib/locale/en_US.utf8/LC_TELEPHONE
7f4da4dc9000-7f4da4dca000 r--p 00000000 00:1a 4844792 /usr/lib/locale/en_US.utf8/LC_MEASUREMENT
7f4da4dca000-7f4da4dd1000 r--s 00000000 00:1a 4845203 /usr/lib64/gconv/gconv-modules.cache
7f4da4dd1000-7f4da4dd2000 r--p 00000000 00:1a 4844791 /usr/lib/locale/en_US.utf8/LC_IDENTIFICATION
7f4da4dd2000-7f4da4dd4000 rw-p 00000000 00:00 0
7f4da4dd4000-7f4da4dd5000 r--p 00025000 00:1a 4844897 /usr/lib64/ld-2.24.so
7f4da4dd5000-7f4da4dd6000 rw-p 00026000 00:1a 4844897 /usr/lib64/ld-2.24.so
7f4da4dd6000-7f4da4dd7000 rw-p 00000000 00:00 0
7ffd24da1000-7ffd24dc2000 rw-p 00000000 00:00 0 [stack]
7ffd24de8000-7ffd24dea000 r--p 00000000 00:00 0 [vvar]
7ffd24dea000-7ffd24dec000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
COREDUMP_TIMESTAMP=1477877460000000
MESSAGE=Process 14498 (python3) of user 1002 failed with ZeroDivisionError: division by zero:
Traceback (most recent call last):
File "systemd_coredump_exception_handler.py", line 89, in <module>
g()
File "systemd_coredump_exception_handler.py", line 88, in g
f()
File "systemd_coredump_exception_handler.py", line 86, in f
div0 = 1 / 0 # pylint: disable=W0612
ZeroDivisionError: division by zero
Local variables in innermost frame:
h=<function f at 0x7f4da3606e18>
a=3
_PID=14499
_SOURCE_REALTIME_TIMESTAMP=1477877460025975
|
|
In preparation for subsequenct changes...
Various stack allocations are changed to use the heap. This might be minimally
slower, but probably doesn't matter. The upside is that we will now properly
free all memory that is allocated.
|
|
Like journalctl, users sometimes want to see coredump list in reverse
order.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
|
|
It seems the -o opiton and -D option can be printed together.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
|
|
Fixes:
```
root# systemd-nspawn -D ./cont/ --register=no /bin/sh -c '/bin/sh -c "kill -ABRT \$\$"'
...
Container cont failed with error code 134.
root# journalctl MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1 -o verbose | grep -i container_cmdline
...prints nothing...
...should be COREDUMP_CONTAINER_CMDLINE=systemd-nspawn -D ./cont/ --register=no /bin/sh -c /bin/sh -c "kill -ABRT \$\$"
```
Also, fixes CID #1368263
```
==352== 130 bytes in 1 blocks are definitely lost in loss record 1 of 2
==352== at 0x4C2ED5F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==352== by 0x4ED8581: greedy_realloc (alloc-util.c:57)
==352== by 0x4ECAAD5: get_process_cmdline (process-util.c:147)
==352== by 0x10E385: get_process_container_parent_cmdline (coredump.c:645)
==352== by 0x112949: process_kernel (coredump.c:1240)
==352== by 0x113003: main (coredump.c:1297)
==352==
```
|
|
Even if pressing Ctrl-c after spawning gdb with "coredumpctl gdb" is not really
useful, we should let gdb handle the signal entirely otherwise the user can be
suprised to see a different behavior when gdb is started by coredumpctl vs when
it's started directly.
Indeed in the former case, gdb exits due to coredumpctl being killed by the
signal.
So this patch makes coredumpctl ignore SIGINT as long as gdb is running.
|
|
For normal arches this doesn't matter, but on arm32 arg_journal_size_max was smaller
than the other *SizeMax variables. This doesn't seem useful.
This is anothet part of the fix in 5206a724a0.
|
|
In file included from ./src/basic/macro.h:415:0,
from ./src/shared/acl-util.h:28,
from src/coredump/coredump.c:36:
src/coredump/coredump.c: In function ‘submit_coredump’:
src/coredump/coredump.c:711:26: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 7 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:183:28: note: in expansion of macro ‘log_full’
#define log_info(...) log_full(LOG_INFO, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:711:17: note: in expansion of macro ‘log_info’
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^~~~~~~~
src/coredump/coredump.c:711:26: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 8 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:183:28: note: in expansion of macro ‘log_full’
#define log_info(...) log_full(LOG_INFO, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:711:17: note: in expansion of macro ‘log_info’
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^~~~~~~~
src/coredump/coredump.c:741:27: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 7 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:182:28: note: in expansion of macro ‘log_full’
#define log_debug(...) log_full(LOG_DEBUG, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:741:17: note: in expansion of macro ‘log_debug’
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^~~~~~~~~
src/coredump/coredump.c:741:27: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 8 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:182:28: note: in expansion of macro ‘log_full’
#define log_debug(...) log_full(LOG_DEBUG, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:741:17: note: in expansion of macro ‘log_debug’
log_debug("Not generating stack trace: core size %zu is greater than %zu (the configured maximum)",
^~~~~~~~~
src/coredump/coredump.c:768:34: warning: format ‘%zu’ expects argument of type ‘size_t’, but argument 7 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat=]
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^
./src/basic/log.h:175:82: note: in definition of macro ‘log_full_errno’
? log_internal(_level, _e, __FILE__, __LINE__, __func__, __VA_ARGS__) \
^~~~~~~~~~~
./src/basic/log.h:183:28: note: in expansion of macro ‘log_full’
#define log_info(...) log_full(LOG_INFO, __VA_ARGS__)
^~~~~~~~
src/coredump/coredump.c:768:25: note: in expansion of macro ‘log_info’
log_info("The core will not be stored: size %zu is greater than %zu (the configured maximum)",
^~~~~~~~
|
|
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
|
|
This makes strjoin and strjoina more similar and avoids the useless final
argument.
spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)
git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'
This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
|
|
|
|
coredump had code to check if copy_bytes() hit the max_bytes limit,
and refuse further processing in that case.
But in 84ee0960443, the return convention for copy_bytes() was changed
from -EFBIG to 1 for the case when the limit is hit, so the condition
check in coredump couldn't ever trigger.
But it seems that *do* want to process such truncated cores [1].
So change the code to detect truncation properly, but instead of
returning an error, give a nice log entry.
[1] https://github.com/systemd/systemd/issues/3883#issuecomment-239106337
Should fix (or at least alleviate) #3883.
|
|
Another fix for #4161.
|
|
For the user, if the core file is missing or inaccessible, it is
more interesting that the fact that they forgot to pipe to a file.
So delay the failure from the check until after we have verified
that the file or the COREDUMP field are present.
Partially fixes #4161.
Also, error reporting on failure was duplicated. save_core() now
always prints an error message (because it knows the paths involved,
so can the most useful message), and the callers don't have to.
|
|
Propagate errors properly, so that if we hit oom or an error in the
journal, the whole command will fail. This is important when using
the output in scripts.
Support the output of multiple values for the same field with -F.
The journal supports that, and our official commands should too, as
far as it makes sense. -F can be used to print user-defined fields
(e.g. somebody could use a TAG field with multiple occurences), so
we should support that too. That seems better than silently printing
the last value found as was done before.
We would iterate trying to match the same field with all possible
field names. Once we find something, cut the loop short, since we
know that nothing else can match.
|
|
The column for "present" was easy to miss, especially if somebody had no
coredumps present at all, in which case the column of spaces of width one
wasn't visually distinguished from the neighbouring columns. Replace this
with an explicit text, one of: "missing", "journal", "present", "error".
$ coredumpctl
TIME PID UID GID SIG COREFILE EXE
Mon 2016-09-26 22:46:31 CEST 8623 0 0 11 missing /usr/bin/bash
Mon 2016-09-26 22:46:35 CEST 8639 1001 1001 11 missing /usr/bin/bash
Tue 2016-09-27 01:10:46 CEST 16110 1001 1001 11 journal /usr/bin/bash
Tue 2016-09-27 01:13:20 CEST 16290 1001 1001 11 journal /usr/bin/bash
Tue 2016-09-27 01:33:48 CEST 17867 1001 1001 11 present /usr/bin/bash
Tue 2016-09-27 01:37:55 CEST 18549 0 0 11 error /usr/bin/bash
Also, use access(…, R_OK), so that we can report a present but inaccessible
file different than a missing one.
|
|
In 'list', show present also for coredumps stored in the journal.
In 'status', replace "File" with "Storage" line that is always present.
Possible values:
Storage: none
Storage: journal
Storage: /path/to/file (inacessible)
Storage: /path/to/file
Previously the File field be only present if the file was accessible, so users
had to manually extract the file name precisely in the cases where it was
needed, i.e. when coredumpctl couldn't access the file. It's much more friendly
to always show something. This output is designed for human consumption, so
it's better to be a bit verbose.
The call to sd_j_set_data_threshold is moved, so that status is always printed
with the default of 64k, list uses 4k, and coredump retrieval is done with the
limit unset. This should make checking for the presence of the COREDUMP field
not too costly.
|
|
|
|
sd_journal_previous() returns 0 if it didn't do any move, so the
warning was stupidly always printed.
|
|
Added in 9fe13294a9 (by me :[```), and later obfuscated in d0c8806d4ab, if an
uncompressed external file or an internally stored coredump was supposed to be
written to a file descriptor, nothing would be written.
|
|
Back when external storage was initially added in 34c10968cb, this mode of
storage was added. This could have made some sense back when XZ compression was
used, and an uncompressed core on disk could be used as short-lived cache file
which does require costly decompression. But now fast LZ4 compression is used
(by default) both internally and externally, so we have duplicated storage,
using the same compression and same default maximum core size in both cases,
but with different expiration lifetimes. Even the uncompressed-external,
compressed-internal mode is not very useful: for small files, decompression
with LZ4 is fast enough not to matter, and for large files, decompression is
still relatively fast, but the disk-usage penalty is very big.
An additional problem with the two modes of storage is that it complicates
the code and makes it much harder to return a useful error message to the user
if we cannot find the core file, since if we cannot find the file we have to
check the internal storage first.
This patch drops "both" storage mode. Effectively this means that if somebody
configured coredump this way, they will get a warning about an unsupported
value for Storage, and the default of "external" will be used.
I'm pretty sure that this mode is very rarely used anyway.
|
|
If ulimit is smaller than page_size(), function save_external_coredump()
returns -EBADSLT and this causes skipping whole core dumping part in
submit_coredump(). Initializing coredump_size to UINT64_MAX prevents
evaluating a condition with uninitialized varialbe which leads to
calling allocate_journal_field() with coredump_fd = -1 which causes
aborting.
Signed-off-by: Matej Habrnal <mhabrnal@redhat.com>
|
|
Network file dropins
|
|
In preparation for adding a version which takes a strv.
|