Mike Kohn!

Core File Analysis

(When Missing Libraries)

Posted: May 12, 2019
Updated: August 27, 2022

Introduction

There are times when I have had to deal with a core file from a crash in a library where I don't have access to the other libraries or the original executable that called the library. This can make it difficult to examine the core file. The best example may be a crash in a Java program calling a C library through JNI. Java will catch the segfault and dump an hs_err_pid<pid>.log file with a ton of info, but loading the core in gdb can be problematic if the exact original Java executable is not available.

To make this easier to deal with, I added code to magic_elf so that it can modify registers in a core file, so that the RSP and RIP registers point to the actual point of the crash and not into the missing Java binary.

The same thing can be done with the output of libraries such as libSegFault.so.

As a side note, magic_elf has a newer feature, -extract_java, that will search for Java class files in a core dump (looking for the 0xcafebabe signature) and copy them into .class files so they can be analyzed (or even run) outside of the core file.
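
To illustrate the idea behind the signature search (this is not how magic_elf implements it), here is a minimal sketch that lists the byte offsets where 0xcafebabe appears in a file. It assumes bash and GNU grep, and the function name find_class_magic is made up for this example. A hit only marks a candidate, since the same four bytes can show up in unrelated data.

```shell
#!/bin/bash
# Hypothetical helper: print the decimal byte offset of every 0xcafebabe
# signature found in a file, one offset per line.
find_class_magic() {
  # LC_ALL=C makes grep work on raw bytes. -a treats binary input as text,
  # -F matches a fixed string, and -o with -b prints the offset of each match.
  LC_ALL=C grep -obaF $'\xca\xfe\xba\xbe' "$1" | LC_ALL=C cut -d: -f1
}
```

It could be run as: find_class_magic core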

Analyzing a Java Core

The first step is to download magic_elf and build it:


git clone https://github.com/mikeakohn/magic_elf.git
cd magic_elf
make

In the samples directory is a Java program called Test.java that calls a static method forceSegfault() defined in Crash.java. The method forceSegfault() is a native method that calls into libcrash.so, where a function dereferences a NULL pointer to force a crash.

To reproduce the crash, type the following in the samples directory:


ulimit -c unlimited
make java
export LD_LIBRARY_PATH=.
java Test

Unfortunately, on newer Ubuntu releases and maybe other Linux distributions, core files are caught by Apport and base64 encoded inside text files. In this case something like this will need to be done too:


mkdir crash
apport-unpack /var/crash/_usr_lib_jvm_java-11-openjdk-amd64_bin_java.1000.crash crash
cp crash/CoreDump core.94617

Two things should now be created: a core file and an hs_err_pid<pid>.log file. In this example now type:


gdb -c core
bt

Which gives a backtrace that looks like this:


(gdb) bt
#0  0x00007fc5c216ee97 in ?? ()
#1  0xfffffffe7ffbfa07 in ?? ()
#2  0x00007fc5c2964d90 in ?? ()
#3  0x00007fc5c1adbb80 in ?? ()
#4  0x00007fc5c129f886 in ?? ()
#5  0x0000000000000000 in ?? ()

Not very useful. Since this was compiled on my system I could type:


set solib-search-path .

This would cause the library to be loaded along with java and every other supporting library. But to simulate not having access to the original Java executable, I will look into the hs_err file to find the shared library load address of libcrash.so and load it manually:


Dynamic libraries:

7fc58cbc8000-7fc58cbc9000 r-xp 00000000 08:11 6554941            libcrash.so
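
Looking up that address can be scripted. This is a minimal sketch, assuming the hs_err file lists mappings in the format shown above and that the first mapping listed for the library is its load address. The function name lib_load_address is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the start address of the first mapping whose
# file name matches the given library.
# Assumes lines such as:
#   7fc58cbc8000-7fc58cbc9000 r-xp 00000000 08:11 6554941  libcrash.so
lib_load_address() {
  local log="$1" lib="$2"
  awk -v lib="$lib" '
    # Only look at lines that begin with a start-end address range.
    $1 ~ /^[0-9a-f]+-[0-9a-f]+$/ {
      name = $NF
      sub(/.*\//, "", name)      # drop any directory part of the path
      if (name == lib) {
        split($1, range, "-")
        print "0x" range[1]
        exit
      }
    }' "$log"
}
```

It could be run as: lib_load_address hs_err_pid<pid>.log libcrash.so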

Since libcrash.so was compiled with the -g option and not stripped, gdb can load its symbols so that bt can show variable names, function names, file names, etc. by doing:


(gdb) add-symbol-file libcrash.so 0x7fc58cbc8000

Unfortunately, because the "java" executable has the top entries on the stack and that executable hasn't been loaded into gdb, gdb still can't figure out how to give a readable stack trace into libcrash.so. At this point, magic_elf can be used to modify the RSP stack register and RIP instruction pointer register in the core file to point to the top of the stack where libcrash.so is being called. Before this can be done, two pieces of information are needed: the thread id of the crash and the RSP and RIP values. The thread id can be retrieved from gdb, and the RSP and RIP values are in the hs_err file:


(gdb) info thread
  Id   Target Id         Frame 
* 1    LWP 30755         0x00007fc5c216ee97 in ?? ()
  2    LWP 30757         0x00007fc5c1af59f3 in ?? ()
  3    LWP 30768         0x00007fc5c1af59f3 in ?? ()
  4    LWP 30760         0x00007fc5c1af5ed9 in ?? ()
  5    LWP 30758         0x00007fc5c1af86d6 in ?? ()
  6    LWP 30754         0x00007fc5c1af0d2d in ?? ()
...

So the thread id is 30755. The thread id is also reported in the hs_err file. The hs_err file also shows:


Registers:
RAX=0x0000000000000000, RBX=0x00007fc58dc005b8, RCX=0x0000000000000b40, RDX=0x0000000000000000
RSP=0x00007fc5c2965900, RBP=0x00007fc5c2965920, RSI=0x00007fc5c251d8c0, RDI=0x00007fc5c251c760
R8 =0x00007fc5c251d8c0, R9 =0x00007fc5c2966700, R10=0x000000000000000b, R11=0x00007fc5c21ae7e0
R12=0x0000000000000000, R13=0x00007fc58dc005b8, R14=0x00007fc5c2965998, R15=0x00007fc5b8012000
RIP=0x00007fc58cbc86d4, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000006
  TRAPNO=0x000000000000000e

A word of warning: magic_elf will overwrite information in the core file, so it is a good idea to only do this to a copy of the core. So I will type:


cp core core.modified
../magic_elf -modify_core 30755 rsp 0x7fc5c2965900 core.modified
../magic_elf -modify_core 30755 rip 0x7fc58cbc86d4 core.modified

Now the new core file can be loaded with:


gdb -c core.modified
add-symbol-file libcrash.so 0x7fc58cbc8000

Oddly, this wasn't working either. Even weirder, typing "info shared" into gdb showed a different address:


0x00007fc58cbc85c0  0x00007fc58cbc86da  No          libcrash.so

Running magic_elf on libcrash.so shows the address of the .text section. Adding it to the library load address listed in the hs_err file gives:


Section Header 12 (offset=0x64f0)
---------------------------------------------
     sh_name: 135 (.text)
     sh_type: 1 (SHT_PROGBITS)
    sh_flags: 0x6 (SHF_ALLOC SHF_EXECINSTR )
     sh_addr: 0x5c0
   sh_offset: 0x5c0
     sh_size: 282
     sh_link: 0
     sh_info: 0
sh_addralign: 16
  sh_entsize: 0

0x7fc58cbc8000 + 0x5c0 = 0x00007fc58cbc85c0
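
The same arithmetic can be checked from the shell. This is a small sketch using bash arithmetic, with the made-up function name section_range. It also adds sh_size to get the end of the section, so both numbers can be compared with the info shared output.

```shell
#!/bin/bash
# Hypothetical helper: print the start and end address of a section once
# the library is mapped at BASE. Arguments: BASE SH_ADDR SH_SIZE.
section_range() {
  local start=$(( $1 + $2 ))
  printf '0x%016x  0x%016x\n' "$start" $(( start + $3 ))
}

# Values from the hs_err file and the .text section header above.
section_range 0x7fc58cbc8000 0x5c0 282
```

For these values it prints 0x00007fc58cbc85c0 and 0x00007fc58cbc86da, the same range that info shared reported for libcrash.so.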

So instead, this should be typed into gdb:


add-symbol-file libcrash.so 0x7fc58cbc85c0

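At first this seemed like a bug in gdb, but add-symbol-file expects the address where the .text section is loaded rather than the address where the library itself was mapped, and info shared reports the range of the .text section too. Either way, this solves it: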


(gdb) bt
#0  0x00007fc58cbc86d4 in Java_Crash_forceSegfault (env=0x7fc5c128d249, 
    obj=0x7fc5c2964fb0) at crash.c:15
#1  0x00007fc5c14b4cca in ?? ()
#2  0x00000008008e6c36 in ?? ()
#3  0x00007fc5c1570ea3 in ?? ()
#4  0x00007fc5b8012000 in ?? ()
#5  0x00007fc5c2965130 in ?? ()

This core file can now be analyzed with gdb. It would probably be a good idea to modify all the other registers in the core file too, in case gdb looks at a variable's value from registers instead of memory. In the last core file I analyzed, I did this using a pretty simple bash script that invoked magic_elf for every register listed in the log file.
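
That script isn't shown here, but a rough sketch of the idea might look like the following. It is not the original script. It assumes the "Registers:" block of the hs_err file ends at the first blank line, and that magic_elf accepts lowercase names such as rax or r8 the same way it accepts rsp and rip. The function name print_modify_commands is made up. To stay on the safe side it only prints the commands, and it skips EFLAGS, CSGSFS, ERR and TRAPNO.

```shell
#!/bin/bash
# Hypothetical helper: print one magic_elf -modify_core command for each
# general-purpose register (RAX..R15, RSP, RBP, RIP) in an hs_err log.
# Arguments: LOG THREAD_ID CORE_FILE. Nothing is modified by this function.
print_modify_commands() {
  local log="$1" tid="$2" core="$3" name value
  # Keep only the "Registers:" block (assumed to end at a blank line),
  # then pull out tokens such as "RAX=0x..." or "R8 =0x...".
  sed -n '/^Registers:/,/^$/p' "$log" |
    grep -oE 'R([A-Z]{2}|[0-9]{1,2}) *=0x[0-9a-fA-F]+' |
    tr -d ' ' |
    while IFS='=' read -r name value; do
      # Assumption: lowercase register names, as with rsp and rip above.
      name=$(printf '%s' "$name" | tr 'A-Z' 'a-z')
      printf '../magic_elf -modify_core %s %s %s %s\n' \
        "$tid" "$name" "$value" "$core"
    done
}
```

For example, print_modify_commands hs_err_pid<pid>.log 30755 core.modified prints the commands so they can be reviewed before being run against a copy of the core.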