=== Interpreting kernel output at process crash ===

When a user-mode process crashes, the Linux kernel usually prints a few lines to the kernel log. They can be viewed with `dmesg` or `journalctl`. This is an example:

{{{
foobar[20400]: segfault at 0 ip 00000000f79ffb5b sp 00000000fffd3828 error 4 in libc-2.29.so[f78d2000+145000]
Code: 66 0f 6f 25 37 7c 03 00 66 0f 6f 2d 3f 7c 03 00 66 0f 6f 35 47 7c 03 00 83 f9 30 0f 87 8e 00 00 00 83 f8 30 0f 87 85 00 00 00 <f3> 0f 6f 0f f3 0f 6f 16 66 0f 6f f9 66 44 0f 6f c5 66 44 0f 6f ca
}}}

The message is printed [[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/mm/fault.c?h=v5.5#n845|here]].

=== The first line contains the following information: ===

 * `foobar` is the process name
 * `20400` is the PID
 * `segfault` gives the kind of exception
 * `0` (after `at`) is the memory address whose access caused the exception
 * the address after `ip` is the instruction pointer
 * the address after `sp` is the stack pointer
 * `error 4` describes what kind of access caused the exception
 * after `in` usually follows the executable or shared library that contains the instruction causing the exception

The `error 4` value can be decoded by writing it in binary notation (`0b00000100`) and comparing the bits with `enum x86_pf_error_code` in [[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/traps.h?h=v5.5#n160|traps.h]].

{{{
 *   bit 0 ==    0: no page found       1: protection fault
 *   bit 1 ==    0: read access         1: write access
 *   bit 2 ==    0: kernel-mode access  1: user-mode access
 *   bit 3 ==                           1: use of reserved bit detected
 *   bit 4 ==                           1: fault was an instruction fetch
 *   bit 5 ==                           1: protection keys block access
}}}

In this case bits 0 and 1 are clear and bit 2 is set, so we have `no page found`, `read access` and `user-mode access`.

=== The second line ===

The second line shows the bytes surrounding the instruction that caused the exception; the first byte of that instruction is marked with `<..>`. This byte sequence can be used to find the function, and therefore the source line, that caused the exception.
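
The position of the marked byte inside the `Code:` line tells how far the faulting instruction lies behind the start of the printed sequence. The following is only a minimal sketch: `code_offset` is a hypothetical helper name, and the `Code:` bytes are assumed to be passed as one space-separated string with the first byte of the faulting instruction wrapped in `<..>`.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): print how many bytes precede
# the byte marked with <..> in the byte list of a kernel "Code:" line.
code_offset() {
    i=0
    for byte in $1; do
        case "$byte" in
            '<'*'>')
                # Found the marked byte: i bytes come before it.
                echo "$i"
                return 0
                ;;
        esac
        i=$((i + 1))
    done
    echo "no marked byte found" >&2
    return 1
}

# Byte list from the example above, with the marker on the faulting byte.
code="66 0f 6f 25 37 7c 03 00 66 0f 6f 2d 3f 7c 03 00 66 0f 6f 35 47 7c 03 00 83 f9 30 0f 87 8e 00 00 00 83 f8 30 0f 87 85 00 00 00 <f3> 0f 6f 0f f3 0f 6f 16 66 0f 6f f9 66 44 0f 6f c5 66 44 0f 6f ca"

code_offset "$code"   # prints 42
```

The printed number is the offset to add to the address at which the byte sequence is later found in memory in order to get the address of the faulting instruction.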
If the dmesg output was captured on a different system, make sure that exactly the same versions of the executable and shared libraries are installed on the system used for the examination. Additionally, install the matching dbgsym packages as described in [[HowToGetABacktrace#Installing_the_debugging_symbols|Installing the debugging symbols]].

The following command translates the byte sequence into a `find` command that can be used inside gdb:

{{{
echo -n "find /b ..., ..., 0x" && \
echo "66 0f 6f 25 37 7c 03 00 66 0f 6f 2d 3f 7c 03 00 66 0f 6f 35 47 7c 03 00 83 f9 30 0f 87 8e 00 00 00 83 f8 30 0f 87 85 00 00 00 <f3> 0f 6f 0f f3 0f 6f 16 66 0f 6f f9 66 44 0f 6f c5 66 44 0f 6f ca" \
 | sed 's/[<>]//g' | sed 's/ /, 0x/g'
}}}

This is the result; only the start and end addresses are still missing.

{{{
find /b ..., ..., 0x66, 0x0f, 0x6f, 0x25, 0x37, 0x7c, 0x03, 0x00, 0x66, 0x0f, 0x6f, 0x2d, 0x3f, 0x7c, 0x03, 0x00, 0x66, 0x0f, 0x6f, 0x35, 0x47, 0x7c, 0x03, 0x00, 0x83, 0xf9, 0x30, 0x0f, 0x87, 0x8e, 0x00, 0x00, 0x00, 0x83, 0xf8, 0x30, 0x0f, 0x87, 0x85, 0x00, 0x00, 0x00, 0xf3, 0x0f, 0x6f, 0x0f, 0xf3, 0x0f, 0x6f, 0x16, 0x66, 0x0f, 0x6f, 0xf9, 0x66, 0x44, 0x0f, 0x6f, 0xc5, 0x66, 0x44, 0x0f, 0x6f, 0xca
}}}

Then start a gdb session as shown below. Fill in the start and end addresses of the `find` command, taken either from the unnamed `.text` entry if the crash happened in the executable, or from the `.text` entry of the shared library. The byte marked with `<..>` is preceded by 42 bytes, so the faulting instruction is located 42 bytes after the address reported by `find`:

{{{
$ gdb -q
(gdb) set width 0
(gdb) set pagination off
(gdb) file /sbin/foobar
(gdb) tb main
(gdb) run
...
(gdb) pipe info target | grep "\.text"
...
        0x5655c400 - 0x56579381 is .text
...
        0xf7967320 - 0xf7aaa4db is .text in /lib/x86_64-linux-gnux32/libc.so.6
...
(gdb) find /b 0xf7967320, 0xf7aaa4db, 0x66, 0x0f, 0x6f, 0x25, 0x37, 0x7c, 0x03, 0x00, 0x66, 0x0f, 0x6f, 0x2d, 0x3f, 0x7c, 0x03, 0x00, 0x66, 0x0f, 0x6f, 0x35, 0x47, 0x7c, 0x03, 0x00, 0x83, 0xf9, 0x30, 0x0f, 0x87, 0x8e, 0x00, 0x00, 0x00, 0x83, 0xf8, 0x30, 0x0f, 0x87, 0x85, 0x00, 0x00, 0x00, 0xf3, 0x0f, 0x6f, 0x0f, 0xf3, 0x0f, 0x6f, 0x16, 0x66, 0x0f, 0x6f, 0xf9, 0x66, 0x44, 0x0f, 0x6f, 0xc5, 0x66, 0x44, 0x0f, 0x6f, 0xca
0xf7a94b31 <__strncasecmp_l_sse42+50>
1 pattern found.
(gdb) b * (0xf7a94b31 + 42)
Breakpoint 2 at 0xf7a94b5b: file ../sysdeps/x86_64/multiarch/strcmp-sse42.S, line 199.
(gdb) info b
Num     Type           Disp Enb Address    What
2       breakpoint     keep y   0xf7a94b5b ../sysdeps/x86_64/multiarch/strcmp-sse42.S:199
(gdb) disassemble /r 0xf7a94b31, 0xf7a94b31 + 62
Dump of assembler code from 0xf7a94b31 to 0xf7a94b6f:
   0xf7a94b31 <__strncasecmp_l_sse42+50>:    66 0f 6f 25 37 7c 03 00    movdqa 0x37c37(%rip),%xmm4        # 0xf7acc770
   0xf7a94b39 <__strncasecmp_l_sse42+58>:    66 0f 6f 2d 3f 7c 03 00    movdqa 0x37c3f(%rip),%xmm5        # 0xf7acc780
   0xf7a94b41 <__strncasecmp_l_sse42+66>:    66 0f 6f 35 47 7c 03 00    movdqa 0x37c47(%rip),%xmm6        # 0xf7acc790
   0xf7a94b49 <__strncasecmp_l_sse42+74>:    83 f9 30                   cmp    $0x30,%ecx
   0xf7a94b4c <__strncasecmp_l_sse42+77>:    0f 87 8e 00 00 00          ja     0xf7a94be0 <__strncasecmp_l_sse42+225>
   0xf7a94b52 <__strncasecmp_l_sse42+83>:    83 f8 30                   cmp    $0x30,%eax
   0xf7a94b55 <__strncasecmp_l_sse42+86>:    0f 87 85 00 00 00          ja     0xf7a94be0 <__strncasecmp_l_sse42+225>
   0xf7a94b5b <__strncasecmp_l_sse42+92>:    f3 0f 6f 0f                movdqu (%rdi),%xmm1
   0xf7a94b5f <__strncasecmp_l_sse42+96>:    f3 0f 6f 16                movdqu (%rsi),%xmm2
   0xf7a94b63 <__strncasecmp_l_sse42+100>:   66 0f 6f f9                movdqa %xmm1,%xmm7
   0xf7a94b67 <__strncasecmp_l_sse42+104>:   66 44 0f 6f c5             movdqa %xmm5,%xmm8
   0xf7a94b6c <__strncasecmp_l_sse42+109>:   66 44 0f 6f ca             movdqa %xmm2,%xmm9
End of assembler dump.
}}}

In this case the crash happened in `__strncasecmp_l_sse42`.
Unfortunately, functions in glibc are used by most applications and libraries, so this might not be a good example to start with. This kind of analysis makes more sense if the crash happens inside the executable or a rarely used shared library.

=== addr2line ===

Before ASLR was enabled by default, the `ip` value could simply be passed to addr2line:

{{{
$ addr2line --addresses --functions --pretty-print --exe=foobar 0000000000401020
0x00401020: main at /home/benutzer/crash.c:2
}}}

This does not work with an address from a process that runs with ASLR enabled. ASLR can still be controlled system-wide, e.g. through `/proc/sys/kernel/randomize_va_space`. For a single process it can be controlled with `set disable-randomization off` in gdb or `export LLDB_LAUNCH_FLAG_DISABLE_ASLR=1` for lldb.

----
CategoryDebugging