[fedora-arm] [AArch64] Stage2 updates for 3 Jan 2013

Mark Salter msalter at redhat.com
Mon Jan 7 16:42:19 UTC 2013


On Fri, 2013-01-04 at 11:07 -0500, Mark Salter wrote:
> On Thu, 2013-01-03 at 19:05 -0700, Al Stone wrote:
> > On 01/03/2013 06:09 PM, Mark Salter wrote:
> > > On Thu, 2013-01-03 at 14:02 -0700, Al Stone wrote:
> > >> The redhat-rpm-config and rpm packages build, but Mark Salter and
> > >> Jon Masters will need to put their heads together to figure out
> > >> what needs changing so that they'll work properly for aarch64.
> > >
> > > I'm trying the following for rpm (along with updated config.guess/sub).
> > > It builds but install step hangs in installplatform when it calls the
> > > build dir rpm. I'm trying to sort that out now.
> > >
> > 
> > Cool.  Yeah, I ran into that, too.  Everything seems to have
> > built properly, and then it just hangs.  If there's something
> > I can help with to debug this, just holler.
> 
> RPM is making a call to NSS_NoDB_Init() (libnss3.so) which never
> returns. Using LD_DEBUG=all it looks like it may actually get stuck
> in libfreebl3.so init, but I'm not sure. GDB would help, but it
> segfaults in my branch. I may take a step back and rebase on the
> master branch now that it has all of the package builds I did on
> my branch.

Thought I'd post some status on this.

I made a simple program with just the NSS_NoDB_Init() call in it and it
hangs as well. I *really* needed gdb to help debug this but gdb would
segfault immediately while starting up. So I got sidetracked looking at
that problem.

The immediate gdb segfault turned out to be another of the mysterious
make/shell problems seen while building earlier packages. These problems
were usually in non-trivial make recipes used during install. In the gdb
case, there is a make rule used to generate an init.c file which has a
single initialization function which calls out to init functions found
in various other source files. The init.c generated during the gdb build
only had a couple of default (always present) init calls in it. This left
most of gdb uninitialized and the segfault was caused by an uninitialized
pointer dereference. I ran the script to generate init.c outside of make
and it created a reasonable looking file. I recompiled it and relinked
gdb. This got me further, but it hit an internal error while still
starting up: "_initialize_gdb_osabi: gdb_osabi_names[] is inconsistent".
This turned out to be a problem in one of the aarch64 patches which
added a "Newlib" entry to gdb_osabi_names[] but didn't update the enum
of array indexes. GDB noticed the inconsistency and issued the internal
error because of it. Removing the Newlib entry allowed gdb to at least
finish initializing, read my NSS_NoDB_Init program and set a breakpoint
at main. But when I tried to run it:

    (gdb) b main
    Breakpoint 1 at 0x400858: file nss_nodb_init.c, line 6.
    (gdb) run
    Starting program: /stage2/nss_nodb_init 
    Failed to read a valid object file image from memory.
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/libthread_db.so.1".
    Cannot find user-level thread for LWP 983: capability not available

That last error looks like a problem in libthread_db (glibc) or maybe
the kernel. I didn't have the glibc sources handy, so I abandoned gdb
for printfs in the nss library.
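
For readers unfamiliar with the init.c step described above, here is a
minimal sketch of that kind of generator. It is an illustration only, not
the actual recipe from the gdb Makefile: the gen_init_c function, the sed
pattern, and the demo source files are assumptions, and it relies on the
function name starting in column 0 of its definition line.

```shell
#!/bin/sh
# Sketch of an init.c generator (hypothetical, not gdb's real recipe).
# Scans the given source files for definitions whose line starts with
# "_initialize_<name> (" and emits one function that calls each of them.
gen_init_c() {
    # Refuse to run without files, otherwise sed would wait on stdin.
    [ $# -gt 0 ] || return 1
    names=$(sed -n 's/^\(_initialize_[A-Za-z0-9_]*\) *(.*/\1/p' "$@" | sort -u)
    echo '/* Generated file: do not edit. */'
    for n in $names; do
        echo "extern void $n (void);"
    done
    echo 'void initialize_all_files (void)'
    echo '{'
    for n in $names; do
        echo "  $n ();"
    done
    echo '}'
}

# Demo with two throwaway source files.
tmp=$(mktemp -d)
printf 'void\n_initialize_foo (void)\n{\n}\n' > "$tmp/foo.c"
printf 'void\n_initialize_bar (void)\n{\n}\n' > "$tmp/bar.c"
gen_init_c "$tmp"/*.c
rm -rf "$tmp"
```

If a pipeline like this is cut short inside make, the generated file would
still compile but would contain only a few of the init calls, which matches
the symptom described above.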

To make a long story short (and it would have been even shorter with a
working gdb), the NSS_NoDB_Init() hang is in RNG_FileUpdate() in
libfreebl3. This is called repeatedly for a number of files listed in
an internal array and data from those files is used to update the random
number generator. The list of files is:

    static const char * const files[] = {
	"/etc/passwd",
	"/etc/utmp",
	"/tmp",
	"/var/tmp",
	"/usr/tmp",
	0
    };

RNG_FileUpdate() uses fread() in a loop until a certain amount of data
is read or EOF is reached. With the code instrumented with fprintfs, I see data
from /etc/passwd being read, /etc/utmp being skipped because it doesn't
exist and then fread never returns for /tmp. I played around with the
ordering of the list but that didn't matter. I would see fread hang for
any file which was a directory. So maybe a libc or kernel problem. For
grins, I wrote a test program using fread on /tmp and that did succeed.
So I'm not sure what is going on. I wanted to give rpm a try (which was
where I started) so I just commented out the directories in the files
list and rebuilt libfreebl3 and installed it. Yay, that fixed the rpm
hang.
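
A quick way to see how each entry in the files list behaves when read is
sketched below. This is an assumption-laden illustration, not part of NSS:
the probe function is hypothetical, and it uses dd rather than fread(), so
it exercises read(2) directly instead of stdio. On a working Linux system
a read from a directory is expected to fail with an error (EISDIR) rather
than block.

```shell
#!/bin/sh
# Classify a path and try to read one small block from it.
# probe is a hypothetical helper; the paths mirror the files[] list.
probe() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "$path: missing"
    elif [ -d "$path" ]; then
        # Reading a directory should return an error, not hang.
        if dd if="$path" of=/dev/null bs=64 count=1 2>/dev/null; then
            echo "$path: directory, read succeeded"
        else
            echo "$path: directory, read failed"
        fi
    else
        if dd if="$path" of=/dev/null bs=64 count=1 2>/dev/null; then
            echo "$path: regular file, read ok"
        else
            echo "$path: regular file, read failed"
        fi
    fi
}

for p in /etc/passwd /etc/utmp /tmp /var/tmp /usr/tmp; do
    probe "$p"
done
```

If the dd on a directory never returns, that would point below stdio, at
libc's read wrapper or the kernel; if it fails promptly, the fread() path
is the more likely place to look.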

Feeling good about that level of success, I tried "rpm --initdb" but
that failed with a "file not found" error. Turns out that a number of
rpm binaries including /usr/bin/rpmdb didn't get installed. This turned
out to be another make/shell problem in the install-binPROGRAMS rule.
That rule takes a list of binaries and installs them, but only the
first in the list is getting installed. The basic flow is:

  list='x y z' ; for f in $list ; do echo $f $f ; done | \
  while read foo bar ; do echo $foo $bar ; done

What I see is that the for loop runs through all of the elements of list
but the while loop stops after reading the first one. So a handful of
rpm binaries didn't get installed. I made a test Makefile with one rule
which used the same script as the rpm Makefile. It worked. Weird.
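
A standalone version of that flow, runnable outside make, might look like
the sketch below. The emit_pairs and consume_pairs function names and the
'x y z' list are placeholders for illustration, not taken from the rpm
Makefile.

```shell
#!/bin/sh
# Simplified stand-in for the install-binPROGRAMS flow: the for loop
# writes one "name name" pair per line into a pipe, and the while loop
# reads the pairs back one line at a time.
emit_pairs() {
    list=$1
    for f in $list; do
        echo "$f $f"
    done
}

consume_pairs() {
    while read -r src dst; do
        echo "$src $dst"
    done
}

emit_pairs 'x y z' | consume_pairs
```

On a healthy system this prints three lines, one per list element. Seeing
only the first line would reproduce the symptom without make involved,
which would help separate a shell problem from a make problem.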

Anyway, I manually installed the missing binaries and was able to init
the db, install a source rpm, query a binary rpm and other such simple
things. So that's where I am right now. Still no patches to work around
the make/shell issues in gdb or rpm. I think the gdb/libthread_db
problem needs fixing the most. Time spent on it will make debugging
other problems way easier.

--Mark







