Debug Python stacks revisited - experimental build in Rawhide, targeting Fedora 14

David Malcolm dmalcolm at redhat.com
Thu May 20 19:37:17 UTC 2010


(This is a scaled-back version of a proposal I sent to this list a
couple of months ago [1])

There are various configuration flags that can be used when building
Python.

Currently we have a configuration aimed at the typical use-case: as much
optimization as reasonable.

However, upstream Python supports a number of useful debug options which
use more RAM and CPU cycles, but make it easier to track down bugs [2].
Typically these are of use to people working on Python C extensions, for
example, for tracking down awkward reference-counting mistakes.  I've
had at least three developers whose opinion I value very highly ask me
for these (for example John Palmieri is currently working on the PyGI
stack, and is running into difficult reference-counting issues).
Indeed, Debian and Ubuntu have had these alternate builds available for
a couple of years now [3].

I've looked through Debian's patch [4], and come up with a somewhat
modified version that does mostly the same thing but (I think) fits our
build process somewhat better.

The python.spec now configures, builds and installs the python sources
twice: once with the regular optimized settings, and again with debug
settings.  (In most cases the files are identical between the two
installs; files that differ get separate paths.)

I've been testing this on my machine and it works fine; I've also been
able to successfully use distutils to build extension modules.

So I've decided to try this in Rawhide for F-14; the latest build is
here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=174357

The relevant CVS commit is here:
http://cvs.fedoraproject.org/viewvc/rpms/python/devel/python.spec?r1=1.184&r2=1.185
http://cvs.fedoraproject.org/viewvc/rpms/python/devel/python-2.6.5-debug-build.patch?revision=1.1&view=markup

and the specfile comment contains more detailed implementation notes.

The builds are set up so that they can share the same .py and .pyc files
- they have the same bytecode format.

However, they are incompatible at the machine-code level: the extra
debug-checking options change the layout of Python objects in memory, so
the configurations have different shared library ABIs.  A compiled C
extension built for one will not work with the other.

The key to keeping the different module ABIs separate is that a module
which is "foo.so" in the standard optimized build is instead "foo_d.so"
in the debug build, i.e. it gains a "_d" suffix in the filename, and
this is what the debug build's "import" routine will look for.  This
convention is from the Debian patch, and ultimately comes from the way
the Windows build is set up in the upstream build process.
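
As a rough illustration of the naming convention (a sketch only: the
helper names below are made up for this example and are not part of the
patch, and it ignores any other suffixes the import machinery may try):

```python
def extension_filename(module_name, debug_build):
    """Filename of a compiled extension under the "_d" convention.

    Hypothetical helper: "foo" -> "foo.so" (optimized) or "foo_d.so" (debug).
    """
    return module_name + ("_d.so" if debug_build else ".so")


def matches_build(filename, debug_build):
    """True if a compiled extension filename belongs to the given build.

    Simplification: any name ending in "_d.so" is treated as a debug
    module, so a module genuinely named "foo_d" would be misclassified.
    """
    if not filename.endswith(".so"):
        return False
    return filename.endswith("_d.so") == debug_build
```

So a directory holding both tracer.so and tracer_d.so can serve both
interpreters, with each one picking up only the file built for its ABI.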

Similarly, the optimized libpython2.6.so.1.0 now has a
libpython2.6_d.so.1.0 cousin for the debug build: all of the extension
modules are linked against the appropriate libpython, and there's
a /usr/include/python2.6-debug directory, parallel with
the /usr/include/python2.6 directory.  There's a new "sys.pydebug"
boolean to distinguish the two configurations, and the distutils module
uses this to supply the appropriate header paths and linker flags when
building C extension modules.
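
To make the idea concrete, here is a minimal sketch of how a build
script might pick paths based on the configuration.  The function names
are hypothetical and this is not the actual distutils change; it assumes
"sys.pydebug" is only present on a patched interpreter, and falls back
to sys.gettotalrefcount, which upstream only provides in debug builds:

```python
import sys


def is_debug_build():
    """Best-effort check for a debug interpreter."""
    # sys.pydebug: the boolean described above (absent on unpatched builds).
    # sys.gettotalrefcount: only exists in upstream debug builds.
    return bool(getattr(sys, "pydebug", False)) or hasattr(sys, "gettotalrefcount")


def include_dir(prefix="/usr", version="2.6", debug=None):
    """Header directory for the chosen configuration (illustrative paths)."""
    if debug is None:
        debug = is_debug_build()
    return "%s/include/python%s%s" % (prefix, version, "-debug" if debug else "")


def libpython_name(version="2.6", debug=None):
    """Library name to pass to the linker, e.g. -lpython2.6_d for debug."""
    if debug is None:
        debug = is_debug_build()
    return "python%s%s" % (version, "_d" if debug else "")
```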

Finally, the debug build's python binary is /usr/bin/python2.6-debug,
hardlinked as /usr/bin/python-debug (as opposed to /usr/bin/python2.6
and /usr/bin/python).

It's easy to spot the debug build: the interactive mode tells you the
total reference count of all live Python objects after each command:

[david at surprise devel]$ python-debug
Python 2.6.5 (r265:79063, May 19 2010, 18:20:14) 
[GCC 4.4.3 20100422 (Red Hat 4.4.3-18)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print "hello world"
hello world
[28748 refs]
>>> 
[28748 refs]
[15041 refs]
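
The same total is available from code via sys.gettotalrefcount(), which
only exists in debug builds.  A rough sketch of using it to look for a
reference leak follows; the helper names are mine, and the numbers are
only approximate, since the interpreter's own bookkeeping moves the
total around a little:

```python
import sys


def total_refs():
    """Interpreter-wide reference total, or None on a non-debug build."""
    getter = getattr(sys, "gettotalrefcount", None)
    return getter() if getter is not None else None


def refs_gained(func, repeats=5):
    """Call func repeatedly and report how much the total refcount grew.

    Returns None when running on a build without sys.gettotalrefcount.
    A value that keeps growing with `repeats` hints at a leak in func.
    """
    if total_refs() is None:
        return None
    func()  # warm-up call, so one-off caches are not counted
    before = total_refs()
    for _ in range(repeats):
        func()
    return total_refs() - before
```

Running refs_gained(some_extension_call) under python-debug and seeing a
figure proportional to "repeats" is a good sign that the extension is
missing a Py_DECREF somewhere.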

So the debug build shares _most_ of the files with the regular build
(.py/.pyc/.pyo files; directories; support data; documentation); the
only differences are the ELF files (binaries/shared libraries) and the
infrastructure relating to configuration (include files, Makefile,
python-config => python-debug-config, etc).

I've tested building the "coverage" module against both runtimes, and it
works; it installs shared .py/.pyc files and a pair of
tracer.so/tracer_d.so files.

I tried a few different ways of packaging the debug configuration.  I
considered:
(a) adding it to the python-devel subpackage,
(b) adding it to the python-debuginfo subpackage (Debian adds it to
their python-dbg packages, which are kind of the equivalent of our
-debuginfo rpms), or
(c) building out a "debug" subpackage for each of the subpackages within
the python specfile, doubling the number of subpackages.

The approach I favor (option (d), I guess), is to have a single
"python-debug" subpackage, holding everything to do with the debug
configuration: equivalent to all of the subpackages from the regular
configuration, and requiring them all (since they leverage the
shared .py files, for instance).  My reasoning here is that this feature
is aimed at advanced Python developers, and if you want some of it you
probably want all of it - so just one subpackage for simplicity - but
you don't need it for regular builds or debugging, so it seems better
to keep it separate from the -devel and -debuginfo subpackages.

This is a scaled-back version of my earlier proposal (in which I
proposed entirely parallel stacks, and varying the unicode settings).
This is far simpler.  In particular, the optimized build should be
unaffected: all of the paths and the ELF metadata for the standard build
should be unchanged compared to how they were before adding the debug
configuration.

I would like to build out some of our compiled extension modules so that
we can add -debug subpackages, in an analogous way to the core python
package, but I think it should purely be a voluntary thing: I don't want
to burden people packaging Python modules with additional work.  Having
said that, if you do find yourself debugging a nasty reference counting
issue inside an extension module, you'll need a debug build of every C
extension module that your reproducer script uses, so the more the
better.  For reference, Ubuntu do this for all of the Python code in a
typical GNOME desktop [3].  We should figure out sane RPM conventions
for packaging these (sorry: yes, I want to change the python packaging
guidelines again, though hopefully less invasively than for the Python 3
change).

I'm tracking all of this work here:
https://fedoraproject.org/wiki/DaveMalcolm/DebugPythonStacks

I hope for it to be a Fedora 14 feature.  It's debatable whether it
should be a feature: this is an area where we're somewhat behind other
distributions, so not so good from a marketing perspective - but a good
thing to get fixed.

I plan to work next on doing the same for our python3 src.rpm.  I need
to try to get this upstream in some form as well.

Hope this seems sane - thoughts? (thanks for reading this far; I know
this email is too long)

Dave

[1]
http://lists.fedoraproject.org/pipermail/python-devel/2010-March/000213.html
[2] http://svn.python.org/projects/python/trunk/Misc/SpecialBuilds.txt
[3] https://wiki.ubuntu.com/PyDbgBuilds
[4]
http://patch-tracker.debian.org/patch/series/view/python2.6/2.6.5-2/debug-build.dpatch and http://patch-tracker.debian.org/patch/series/view/python2.6/2.6.5-2/pydebug-path.dpatch



