Hi Denys,
On Thu, 03 Sep 2009 17:21:40 +0200, Denys Vlasenko wrote:
On Wed, 2009-08-26 at 10:39 +0200, Jan Kratochvil wrote:
> If you type just "core-file COREFILE" (without "file BINARY") it will find
> the binary according to its build-id.
This may be wrong in the rare case when binary name is somehow
misdetected, or the binary was replaced. But such cases are not typical,
so I do not want to worry about it just yet.
I find it very common: 30% of the processes on my system already have their
binary/library files deleted due to `yum update's:
# ls -l /proc/*/maps|wc -l
282
# for i in /proc/*/maps;do egrep '(/lib|/bin).*deleted' $i|grep -vq prelink \
    && echo $i;done|wc -l
86
-> ~30%
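
The same count can be sketched in Python. This only mirrors the shell loop above, assuming the usual /proc/PID/maps line layout; the names has_deleted_mapping and count_affected are hypothetical helpers, not part of any existing tool.

```python
import glob
import re

# Same pattern as the egrep above: a /lib or /bin path marked "deleted".
DELETED_RE = re.compile(r"(/lib|/bin).*deleted")


def has_deleted_mapping(maps_text):
    """Return True if a deleted /lib or /bin mapping exists that is not a prelink file."""
    for line in maps_text.splitlines():
        if DELETED_RE.search(line) and "prelink" not in line:
            return True
    return False


def count_affected(pattern="/proc/*/maps"):
    """Return (affected, total) over all readable maps files matching pattern."""
    total = affected = 0
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                text = f.read()
        except OSError:
            continue  # process exited meanwhile, or permission denied
        total += 1
        affected += has_deleted_mapping(text)
    return affected, total


if __name__ == "__main__":
    affected, total = count_affected()
    print("%d of %d processes use deleted files" % (affected, total))
```

Unlike the `ls | wc -l' count, this sketch only counts maps files it can actually read, so the two totals can differ when it is not run as root.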
There is another problem: one usually no longer has the debuginfo files for
such a deleted binary installed. Solving that requires two unrelated action
items:
* Installation of multiple debuginfo rpms simultaneously.
  Currently not possible, planned by Roland McGrath, hacked it before:
  http://people.redhat.com/jkratoch/multidebug/
* Distribution of debuginfo rpms for releases:
  * full release (as is)
  * every released update (not just the last update as currently is)
  (distribution of debuginfo rpms for rawhide)
  * probably about last two weeks of built rpms or so.
Not following build-ids would add another problem to solve. Currently, after
a crash that left no valid backtrace, one has to restart the daemon in its
current on-disk version and hope it crashes again before the next `yum update'.
> Libraries are always found preferred to their build-id.
This is the part I am interested in. How can we extract libraries'
build-ids? By ldd'ing the binary and then extracting libraries' build-ids?
What about dlopen'ed libs, how to find their debuginfos?
Right, the DT_NEEDED (=ldd) way would not catch those.
Basically, we need to answer the question "do we need to install
debuginfo packages, and which ones?". For that, we need to know
"what debuginfo FILES (not packages) gdb would need?".
One way to achieve it is to obtain the list of all build-ids
of all binaries/libraries loaded in crashed process' memory.
Then it is trivial to check existence of
/usr/lib/debug/.build-id/XX/XXX files.
Can we do it somehow?
The core file contains the build-id of every ELF file loaded in memory
(if CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is enabled).
You can extract this build-id list as I wrote before:
On Wed, 26 Aug 2009 10:39:23 +0200, Jan Kratochvil wrote:
# ABRT should
# eu-unstrip -n --core=/tmp/core.20546
# and for each line it produces, like
# 0x3979600000+0x36e000 ec8dd400904ddfcac8b1c343263a790f977159dc@0x3979600280 /lib64/libc-2.10.1.so /usr/lib/debug/lib64/libc-2.10.1.so.debug libc.so.6
# use
# yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/ec/8dd400904ddfcac8b1c343263a790f977159dc.debug
# (or some other yum/gpk command, whatever their maintainers recommend)
For GDB it is enough to provide all these files, like the one listed above:
/usr/lib/debug/.build-id/ec/8dd400904ddfcac8b1c343263a790f977159dc.debug
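
A minimal sketch of that mapping in Python, assuming the `eu-unstrip -n' line layout shown above (address range first, then BUILDID@ADDRESS). The function names are hypothetical, and lines whose second field does not start with a hex build-id are simply skipped.

```python
import re

HEX_RE = re.compile(r"[0-9a-f]+$")


def debug_path_for_build_id(build_id, root="/usr/lib/debug/.build-id"):
    """Map a hex build-id to its .debug path: first byte is the directory name."""
    return "%s/%s/%s.debug" % (root, build_id[:2], build_id[2:])


def debug_paths_from_unstrip(output):
    """Return one .debug path per module line of `eu-unstrip -n` output."""
    paths = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Second field looks like BUILDID@ADDRESS; keep only the build-id.
        build_id = fields[1].split("@", 1)[0]
        if len(build_id) < 3 or not HEX_RE.match(build_id):
            continue  # no usable build-id recorded for this module
        paths.append(debug_path_for_build_id(build_id))
    return paths
```

Each returned path can then be checked for existence, or passed to the package manager command quoted above.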
Point #2:
------------------------------------------------------------------------------
This solution is imperfect, as some program (for example a linker) may have
mmap(2)ed some library which it does not execute and does not need for
a backtrace. GDB will not even search for such a debug info file.
(One could improve such heuristics by checking the 'x' (executable) flag of
page ranges of such mmap(2)ed data in /proc/PID/maps, but the Linux kernel
currently does not save /proc/PID/maps into a core file, although there were
some intentions (or even kernel patches?) to do so. Still, it would be just
a heuristic.)
3979200000-397921f000 r-xp 00000000 fd:00 5236735 /lib64/ld-2.10.1.so
                    --> ^ <--
397941e000-397941f000 r--p 0001e000 fd:00 5236735 /lib64/ld-2.10.1.so
397941f000-3979420000 rw-p 0001f000 fd:00 5236735 /lib64/ld-2.10.1.so
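
The 'x' flag check could be sketched like this in Python. It assumes a copy of the /proc/PID/maps text is available, which a core file alone does not provide; the function name executable_files is hypothetical.

```python
def executable_files(maps_text):
    """Return the set of file paths having at least one executable mapping."""
    result = set()
    for line in maps_text.splitlines():
        # Fields: address range, permissions, offset, device, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        perms, path = fields[1], fields[5].strip()
        if "x" in perms and path.startswith("/"):
            result.add(path)
    return result
```

A file that is only mapped as data (no 'x' in any of its ranges) is left out, so its .debug file would not be requested.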
The right solution to never download unneeded .debug files would be to:
* Find the AUXV note in the core file.
    eu-readelf -n corefile
    [...]
    CORE 288 AUXV
* Find the executable binary VMA (address-in-memory) in it:
    [...]
    PHDR: 0x400040
* Find build-id of the executable in that page.
* Load matching executable file from disk according to that build-id as the
  next looked up structures may be in readonly pages omitted in the core file.
  [ Here it is similar to GDB elf_locate_base()->scan_dyntag(DT_DEBUG). ]
* Find DYNAMIC segment address in that PHDR.
    eu-readelf -l executable-file
    Program Headers:
      Type     Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      [...]
      DYNAMIC  0x0cb5e8 0x00000000006cb5e8 0x00000000006cb5e8 0x0001b0 0x0001b0 RW  0x8
* Find DT_DEBUG tag in that DYNAMIC segment:
    eu-readelf -d executable-file
    DEBUG [ tag value is 0x0 in the on-disk file, read it from the core file ]
* The DT_DEBUG tag value is the address of:
    extern struct r_debug _r_debug;
* _r_debug.r_map contains the linkmap of loaded shared libraries to traverse.
Code for this traversal from a core file would probably be best written as
a new program based on elfutils.
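
To illustrate the first and the last step of the list above, here is a rough Python sketch. It does not use elfutils; it only shows the data layout, assuming a 64-bit little-endian target with the glibc layout of `struct r_debug' and `struct link_map'. The read(addr, size) callback is a hypothetical memory reader that would have to be backed by the core file and the on-disk ELF files; all function names are illustrative.

```python
import struct

AT_NULL, AT_PHDR = 0, 3  # auxiliary vector entry types from <elf.h>


def find_at_phdr(auxv):
    """Return the AT_PHDR value from raw AUXV note data, or None if absent."""
    for off in range(0, len(auxv) - 15, 16):
        a_type, a_val = struct.unpack_from("<QQ", auxv, off)
        if a_type == AT_NULL:
            break
        if a_type == AT_PHDR:
            return a_val
    return None


def read_cstring(read, addr, limit=4096):
    """Read a NUL-terminated string from memory through the read callback."""
    out = bytearray()
    while len(out) < limit:
        byte = read(addr + len(out), 1)
        if byte in (b"", b"\0"):
            break
        out += byte
    return out.decode("utf-8", "replace")


def walk_link_map(read, r_debug_addr):
    """Yield (l_addr, l_name) for every entry reachable from _r_debug.r_map."""
    # struct r_debug { int r_version; (4 bytes padding) struct link_map *r_map; ... }
    (entry,) = struct.unpack("<Q", read(r_debug_addr + 8, 8))
    seen = set()
    while entry and entry not in seen:  # guard against corrupted, cyclic lists
        seen.add(entry)
        # struct link_map starts with: l_addr, l_name, l_ld, l_next (8 bytes each)
        l_addr, l_name, _l_ld, l_next = struct.unpack("<4Q", read(entry, 32))
        yield l_addr, read_cstring(read, l_name)
        entry = l_next
```

The names yielded by walk_link_map are the files whose build-ids are actually worth resolving; the main executable typically shows up with an empty name.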
------------------------------------------------------------------------------
I imagine the last resort way to do it
is to read gdb source and extract the code which does that,
but maybe there is a simpler way?
I think eu-unstrip is currently good enough, as in real-world cases there
will never be needless, excessive .debug files being downloaded.
Thanks,
Jan