I have wanted to do this for a long time, but only now do I have the time and
a desktop beefy enough to try it. Basically I replaced /usr/bin/gnome-session
with a shell script:
#!/bin/sh
# Run the real gnome-session under valgrind, following all child processes.
/usr/bin/valgrind --trace-children=yes --log-file=/tmp/valgrind \
    /usr/bin/gnome-session.orig "$@"
Then I logged in through gdm and checked what happened from an ssh connection
to the box.
The good news:
- the login went through, but it took a few minutes
- everything looked functional, though extremely slow
- there weren't many logs reported by valgrind
The bad news:
- I had to stop the session shortly after the login fully completed:
the VM was full (1G of RAM + 500M of swap)
- reports from the logs are a pain to analyze.
- one python process (the rhn applet, I suspect) generated a huge log;
python-2.3 doesn't seem to be valgrindable.
I then eliminated all the empty /tmp/valgrind.pid* files and was left with
reports from only 25 processes.
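The cleanup of empty logs mentioned above can be sketched as a small shell function. This is only an illustration: the function name prune_empty_logs is made up here, and the default prefix assumes the logs are named /tmp/valgrind.pid* as described.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original run): delete empty
# valgrind logs and print the names of the ones that remain.
# $1 = log file prefix, assumed to default to /tmp/valgrind.pid
prune_empty_logs() {
    prefix=${1:-/tmp/valgrind.pid}
    for f in "$prefix"*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        if [ -s "$f" ]; then
            printf '%s\n' "$f"   # non-empty: keep it and report it
        else
            rm -f -- "$f"        # empty: that process reported nothing
        fi
    done
}
```

Calling prune_empty_logs with no argument and piping the result through wc -l gives the number of processes that actually produced a report.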
First a word of warning: I used the normal optimized code as shipped as
part of Fedora devel (a fully up-to-date box as of today's version). Some
of the optimizations sometimes defeat valgrind, so there may be false positives.
I have tried to sort all the reports and group together what was frequently
reported because all apps went through the same code path. For example,
there is an error reported when opening the gdk display which shows up about
30 times across various apps. So what I saw most:
- gdk_display_open leading to write(buf) contains uninitialised or
unaddressable byte in __write_nocancel through _X11TransWrite.
It is hard to tell without a debugging lib whether the error is a false
positive or a lack of initialization in gdk_display_open() or within X.
The strange thing is that valgrind reports the block as being alloc'ed
with calloc(); the offending address is 128 bytes inside a block of
size 16384.
- giop_send_buffer_write in libORBit-2 leading to
Syscall param writev(vector[...]) contains uninitialised or unaddressable byte(s)
this time the uninitialized data is 10 bytes inside a block of size 2048
allocated within ORBit itself.
- pango read_line raises a strange pthread mutex error:
pthread_mutex_lock/trylock: mutex has invalid owner
in pthread_mutex_lock called by pango_read_line from pango_find_map
Apparently the GStreamer code detects it's running under valgrind and
manages to shut it up :-)
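For anyone wanting to redo the grouping of frequent reports, here is a rough sketch of one way to count identical messages across the logs. It is not the procedure actually used for the enclosed data. The function name summarize_reports is hypothetical, and it assumes the usual valgrind layout where every line starts with ==PID== and stack frames are indented; header lines from valgrind get counted too, so expect some noise at the top.

```shell
#!/bin/sh
# Hypothetical helper: count identical valgrind messages across logs.
# Arguments: one or more valgrind log files.
summarize_reports() {
    # Strip the "==PID== " prefix, drop indented lines (stack frames)
    # and blank lines, then count identical messages, most frequent first.
    sed -e 's/^==[0-9]*== \{0,1\}//' "$@" |
        grep -v -e '^[[:space:]]' -e '^$' |
        sort | uniq -c | sort -rn
}
```

For example, summarize_reports /tmp/valgrind.pid* | head would list the most repeated messages first.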
Apart from those 3, which are repeated all over the place and make up the
bulk of the reports, I have seen errors in:
- /usr/bin/gnome-session: invalid file descriptors,
pango_attr_list_get_iterator uninitialized value.
- /usr/bin/pam-panel-icon: 2 invalid file descriptors, seemingly the same
as for gnome-session, with value 828 too.
- /usr/lib/libwnck: uninitialized values in _wnck_read_icons
- /usr/libexec/gconfd-2: repeated g_strdup of uninitialized values
from gconf_set_daemon_ior, gconf_get_lock, gconf_object_to_string,
gconf_quote_string, and an fprintf
- /usr/libexec/bonobo-activation-server: uninitialized values in
CORBA_ORB_object_to_string, fprintf, giop_send_buffer_write
- gam_server : I got one too :-)
- metacity: uninitialized values in gdk_window_new, gdk_window_resize,
gdk_region_rectangle, gdk_region_subtract, a couple of strange
g_int_equal bugs, meta_display_begin_grab_op, meta_display_end_grab_op
- gnome-terminal: terminal_profile_update and _vte_pty_open
The best way to double check is to do the same trick as I did for
gnome-session: move the original somewhere else and replace it with a script
calling valgrind, but without tracing children, on a local copy of the
program built in debugging mode.
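As a rough sketch of that per-program wrapper, the function below prints a script similar to the gnome-session one but without --trace-children=yes, so only the one program is checked (valgrind does not follow children by default). The name print_wrapper and both arguments are made up for illustration, and the paths are assumed to contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: print a valgrind wrapper script for one program.
# $1 = path of the moved original (assumed to be a debug build)
# $2 = log file prefix for valgrind
print_wrapper() {
    cat <<EOF
#!/bin/sh
# Run only this program under valgrind; children are not traced.
exec /usr/bin/valgrind --log-file=$2 $1 "\$@"
EOF
}
```

After moving the original aside, something like print_wrapper /usr/bin/metacity.orig /tmp/valgrind-metacity > /usr/bin/metacity followed by chmod +x on the result would install it; metacity is just an example picked from the list above.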
Enclosed is the data, sorted and grouped, for more information.
happy valgrinding,
Daniel
--
Daniel Veillard | Red Hat Desktop team
http://redhat.com/
veillard(a)redhat.com | libxml GNOME XML XSLT toolkit
http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine
http://rpmfind.net/