On Fri, Mar 16, 2012 at 1:57 AM, Jan Kratochvil jan.kratochvil@redhat.com wrote:
And both machines pass rpm -Va just fine. So the binaries should, um, be the same.
It is a core from yesterday,
There can be a difference if one of the machines has the files prelink-ed while the other one does not. prelink runs nightly (/etc/cron.daily/prelink). But it
Thanks!
Prelink is not involved -- I double-checked. In OLPC builds, we currently don't prelink due to http://dev.laptop.org/ticket/10898 ; we just don't install prelink and don't run it during OS image creation. Even back when we did, we disabled the cronjob :-)
should be already fixed in your GDB version gdb-7.2-52.fc14,
You got that one right :-)
If it helps please contact me off-list, with your disk image. It assumes the system generating the core file was not prelinked.
Uploading at http://dev.laptop.org/~martin/os5rw-brokenimg/Sandisk_1200908562DEN.img
Bear in mind - that'll contain 2 partitions. The 2nd partition is / but our initrd mounts it, and then chroots into a subdirectory. So when you mount it, you'll want to look into /versions/run/5/
(WTF is this? Root FS "snapshots" via hardlinked trees. Until we have btrfs running on these puppies, it's the best update fail-proof mechanism we have.)
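For readers unfamiliar with the idea, here is a minimal sketch of a hardlinked-tree snapshot. It is an illustration only, not the actual OLPC updater: the function name `snapshot_tree` and the directory names in the usage note are hypothetical. It relies on the fact that a hard link is a second name for the same inode, so a snapshot of an unchanged file costs no extra data blocks.

```python
import os


def snapshot_tree(src, dst):
    """Mirror the directory layout of src under dst, hard-linking each file.

    Hypothetical helper for illustration. Symlinked directories are not
    followed and are not recreated in the snapshot.
    """
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            if os.path.islink(source):
                # Recreate symlinks as symlinks instead of linking their targets.
                os.symlink(os.readlink(source), target)
            else:
                # Same inode, new name: both trees share the file's data.
                os.link(source, target)
```

Something like `snapshot_tree("/versions/pristine/5", "/versions/run/5")` would be the shape of a call, with both paths being assumptions. Both trees must live on the same filesystem, and an update has to replace files (write new, then rename) rather than modify them in place, otherwise the change shows up in every snapshot that shares the inode.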
That missing file:
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/63/420e48a2edbae61166c708ebd2ff1a5aed1054
is probably for kernel vDSO (as its name is empty), therefore kernel rpm.
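As an aside on how to read that path: the build ID is split into a two-character directory and a 38-character file name, so the ID GDB is asking about can be recovered by joining the two. The sketch below assumes a 40-hex-digit (SHA-1 style) build ID, as in the message above; the helper names are hypothetical.

```python
import re

# Assumed layout: <root>/.build-id/<first 2 hex digits>/<remaining 38 hex digits>,
# optionally followed by ".debug" for the separate debuginfo file.
_BUILD_ID_PATH = re.compile(
    r"/\.build-id/([0-9a-f]{2})/([0-9a-f]{38})(?:\.debug)?$"
)


def build_id_from_path(path):
    """Return the 40-digit build ID encoded in a .build-id path."""
    match = _BUILD_ID_PATH.search(path)
    if match is None:
        raise ValueError("not a .build-id path: %r" % path)
    return match.group(1) + match.group(2)


def path_from_build_id(build_id, root="/usr/lib/debug"):
    """Return the .build-id path for a 40-digit build ID under root."""
    if not re.fullmatch(r"[0-9a-f]{40}", build_id):
        raise ValueError("expected 40 lowercase hex digits: %r" % build_id)
    return "%s/.build-id/%s/%s" % (root, build_id[:2], build_id[2:])


if __name__ == "__main__":
    wanted = "/usr/lib/debug/.build-id/63/420e48a2edbae61166c708ebd2ff1a5aed1054"
    print(build_id_from_path(wanted))
```

The resulting ID can then be compared with the build ID note of a candidate binary, for example as printed by `readelf -n`, to check which object the core file is referring to.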
Argh, that could be. But our kernel is a custom built rpm, and we don't build -debuginfo. Here, have a fistful of my freshly-torn-out hair.
Now, at the time of this segfault, the dmesg reports a segfault in python2.7, inside calls to glib... (1) why are we then in the kernel and (2) why isn't gdb telling us anything about the python/glib part of the callstack?
still confused -
martin

PS: On a different investigation track we think there may be some subtle/odd disk corruption that _passes_ rpm -Va and our own olpc-contents-verify, yet strikes at runtime. Could a subtly corrupt binary (e.g. vmlinuz) lead here?
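One way to probe the corruption theory, sketched under assumptions: mount the suspect image and a known-good image of the same build side by side and compare file contents directly. This is not a replacement for rpm -Va or olpc-contents-verify; the function names and mount points are hypothetical.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each regular file under root (relative path) to its digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks, sockets, FIFOs and device nodes.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            result[os.path.relpath(full, root)] = hash_file(full)
    return result


def diff_trees(good_root, suspect_root):
    """Return relative paths present in both trees whose contents differ."""
    good = hash_tree(good_root)
    suspect = hash_tree(suspect_root)
    common = good.keys() & suspect.keys()
    return sorted(rel for rel in common if good[rel] != suspect[rel])
```

A call such as `diff_trees("/mnt/good/versions/run/5", "/mnt/bad/versions/run/5")` (both mount points are assumptions) would list files whose bytes differ between the two images. If the corruption only shows up intermittently at read time, a single pass like this may still come back clean, so repeating it after dropping caches could be worth a try.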