More python 2.7 fun: deprecation of PyCObject API

Toshio Kuratomi a.badger at gmail.com
Fri Aug 13 23:38:37 UTC 2010


On Fri, Aug 13, 2010 at 02:20:51PM -0400, David Malcolm wrote:
> (Sorry about the length of this email)
> 
> Python 2.7 deprecated the PyCObject API in favor of a new "capsule" API.
>   http://docs.python.org/dev/whatsnew/2.7.html#capsules
> 
> The deprecations are set to "ignore" by default, so in theory the API
> still works: every time an extension uses the API, a deprecation warning
> is emitted, but then "swallowed" by the filters, and the call succeeds.
> 
> However, if someone overrides the process-wide warnings settings, then
> the API can fail altogether, raising a PendingDeprecationWarning
> exception (which in CPython terms means setting a thread-specific error
> state and returning NULL).
> 
Do I understand correctly that when raising this in python code, it's just
a warning but when raising this in CPython, it's the same as an exception
and thus causes problems?
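
For the pure-Python side, here is a minimal sketch of what I think is going
on.  It uses a made-up stand-in function rather than the real C API
(call_deprecated_api and outcome are hypothetical names).  With the filter
on 'ignore' the call returns normally; with 'error' the very same warning
comes out as an exception, which is presumably what the C callers see as a
NULL return:

```python
import warnings


def call_deprecated_api():
    # Hypothetical stand-in for an extension function that issues the
    # PyCObject deprecation warning before doing its real work.
    warnings.warn("PyCObject API is deprecated", PendingDeprecationWarning)
    return "ok"


def outcome(action):
    """Call the stand-in under the given filter action and report the result."""
    with warnings.catch_warnings():  # restore the filters afterwards
        warnings.simplefilter(action)
        try:
            return call_deprecated_api()
        except PendingDeprecationWarning:
            return "raised"


if __name__ == "__main__":
    for action in ("ignore", "error"):
        print(action, "->", outcome(action))
```

The difference on the C side would then be only that the caller has to
check for the failure itself instead of getting a traceback for free.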

> There are at least 15 extension modules that use this API in Fedora 14,
> and most of the C code I've seen that uses this API doesn't expect it to
> return NULL.  This can lead to hard failures in which /usr/bin/python
> aborts with an assertion failure (or even a segfault).
> 
> This has caused at least one app to fail (virt-manager, see bug 620216,
> due to it modifying the warning settings: fixed), so I've been
> doublechecking the scope of usage of the PyCObject API, and I've filed
> bugs against components that are possibly affected:
> https://bugzilla.redhat.com/showdependencytree.cgi?id=620842&hide_resolved=1
> 
> This was on a test machine, so there may be some I've missed.
> 
> Unfortunately, the list of affected modules includes pygtk2 and
> pygobject, numpy, selinux, and SWIG.
> 
> To add to the "fun", the pygtk2/pygobject modules expose the API via
> macros in a public header file, so that to fix this we'd have to both
> rebuild those modules, and rebuild users of the header file.
> 
> 
> You can typically trigger a hard failure of the API via:
> >>> import warnings
> >>> warnings.filterwarnings('error')
> and then try to import one of the affected modules.  I've tried to give
> exact reproducers where I have them in each of the bugs.
> 
> But if nothing touches the warning settings, you'll never see a problem.
> 
What about warnings.filterwarnings('default') (or 'always', 'module', or
'once')?

Does this also reliably trigger the hard failure?
  PYTHONWARNINGS='error' python PROGRAM
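
One way to check each setting from the outside would be something like the
sketch below.  It is only an approximation: the child process issues a
stand-in PendingDeprecationWarning from Python code rather than importing
one of the affected modules, so a real extension could abort or segfault
where this just exits with a traceback.  CHILD and exit_code are made-up
names, and it is written in Python 3 syntax:

```python
import os
import subprocess
import sys

# Stand-in program: issues the same warning category as the PyCObject API.
CHILD = (
    "import warnings\n"
    "warnings.warn('stand-in deprecation', PendingDeprecationWarning)\n"
)


def exit_code(action):
    """Run the stand-in program with PYTHONWARNINGS=<action>."""
    env = dict(os.environ, PYTHONWARNINGS=action)
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


if __name__ == "__main__":
    for action in ("ignore", "default", "always", "module", "once", "error"):
        print(action, "-> exit code", exit_code(action))
```

Replacing CHILD with an import of one of the affected modules would turn
this into a reproducer for the real thing.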

> Possible ways forward:
>   (a) don't fix this; treat enabling the warning in the "Doctor, it
> hurts when I do this!  So don't do that!" category, and add this to the
> release notes.  Patch Python code that enables the warning so that it
> doesn't.
>   (b) try to fix the ones that are self-contained; send fixes upstream
>   (c) try to fix them all; send fixes upstream
>   (d) hack the python rpm to remove this warning; this would be a
> significant change from upstream, given that it's already disabled.
> 
Taking the next bit out of order:

> Personally, I'm leaning towards option (a) above (the "don't override
> warnings" option): closing the various bugs as WONTFIX, and adding a section
> to the release notes, whilst working towards fixing this in Fedora 15.
> Affected applications should be patched in Fedora 14 to avoid touching
> the relevant warning setting, and we'll fix the root cause in Fedora 15.
> 
Is it overriding the warnings option at all that causes a problem, or is it
*only* setting the warnings filter to 'error'?  I think that setting the
warning level to always, default, module, or once should be supported.
Setting a "warning" to "error" could be seen as "buyer beware", though.
i.e., if it's only 'error' that's affected, then (a) seems okay.  If the
others also cause issues, then I think (a) is the wrong fix.
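
If (a) does turn out to be acceptable, the per-application patch might not
need to drop 'error' entirely.  A sketch, assuming the deprecation really
is issued as a PendingDeprecationWarning as described above
(enable_strict_warnings and demo are made-up names):

```python
import warnings


def enable_strict_warnings():
    # Turn every warning into an exception...
    warnings.filterwarnings("error")
    # ...except the category used by the PyCObject deprecation.  Filters
    # added later are checked first, so this entry takes precedence.
    warnings.filterwarnings("ignore", category=PendingDeprecationWarning)


def demo():
    with warnings.catch_warnings():  # keep the demo from leaking filters
        enable_strict_warnings()
        # Swallowed by the 'ignore' entry.
        warnings.warn("stand-in for the PyCObject warning",
                      PendingDeprecationWarning)
        try:
            warnings.warn("some other problem", UserWarning)
        except UserWarning:
            return "only UserWarning raised"
    return "nothing raised"


if __name__ == "__main__":
    print(demo())
```

This keeps the strictness the application asked for while leaving the one
category alone that the C callers can't cope with.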

> One issue here is that this API expresses a binary interface between
> different Python modules, and that we don't yet have a way to express
> this at the rpm metadata level.  I think we should, to make it easier to
> track these issues in the future.  I don't think it's possible to track
> these automatically, but we could do it manually.
> 
Tracking this manually is no good unless you can explain to people how to
detect it.  Once you can explain how to manually detect it, it might be
possible to automatically detect it....
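
For example, one crude detection method would be to scan the C sources for
identifiers that start with PyCObject_.  A sketch, assuming the sources are
available as text (find_pycobject_uses is a made-up helper, and the sample
input is invented):

```python
import re

# Any identifier starting with PyCObject_, e.g. PyCObject_FromVoidPtr.
PYCOBJECT_NAME = re.compile(r"\bPyCObject_\w+")


def find_pycobject_uses(source_text):
    """Return the distinct PyCObject_* identifiers found in source_text."""
    return sorted(set(PYCOBJECT_NAME.findall(source_text)))


if __name__ == "__main__":
    sample = """
        api = PyCObject_FromVoidPtr(&functions, NULL);
        ptr = PyCObject_AsVoidPtr(cobj);
    """
    print(find_pycobject_uses(sample))
```

That would find the module that provides the API and any header that wraps
it, but not a consumer that only reaches it through a macro from someone
else's header, so it can't be the whole answer.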

> Suggested way to express this:
> Modules that provide a capsule named
>    "full.dotted.path.to.module.CAPSULENAME"
> as passed to PyCapsule_Import [1] should have this in their specfile
>    Provides: PyCapsule(full.dotted.path.to.module.CAPSULENAME)
> for the subpackage holding the module in its %files manifest.
> 
> 
> Modules that import a capsule should have a corresponding:
>    Requires: PyCapsule(full.dotted.path.to.module.CAPSULENAME)
> in the appropriate subpackage.
> 
> 
> So, for example, if we apply the patch I've proposed upstream for pygtk,
> then pygtk2 should have a:
>    Provides: PyCapsule(gtk._gtk._PyGtk_API)
> 
What does the dotted path here represent?  Is that a file on the filesystem?
Is it a function in a pygtk .c file?  Is it a string that's exported by
pygtk via some PyCapsule function call?
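
My guess, from reading the capsule docs, is the last of these: the name is
just a string stored inside the capsule, conventionally spelled as module
path plus attribute name so that PyCapsule_Import can find it.  Corrections
welcome.  A rough illustration, using the standard library's datetime
capsule as a stand-in since the pygtk one only exists with the proposed
patch (find_capsule is a made-up helper that only approximates the lookup;
Python 3 syntax):

```python
import importlib


def find_capsule(dotted_name):
    """Approximate a capsule lookup: import the module, fetch the attribute."""
    module_name, _, attr_name = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


if __name__ == "__main__":
    capsule = find_capsule("datetime.datetime_CAPI")
    print(type(capsule).__name__)
    print(repr(capsule))
```

If that reading is right, the string in the Provides: would be exactly the
string the C code passes to PyCapsule_Import.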

> and anything using it needs a:
>    Requires: PyCapsule(gtk._gtk._PyGtk_API)
> 
What is meant by "using it"?  Python code that does import gtk?  Or only
C code that calls some PyCapsule function to ask for that to be returned?

> This wouldn't solve all the problems: we'd also need the legacy users of
> the macro to be rebuilt, and upgrading pygtk2 without upgrading them
> would lead to a runtime failure when the PyCObject_Ptr call fails (we
> could potentially supply both hooks, though this would fail if ignoring
> deprecation warnings was disabled).
> 
> So upon switching from PyCObject to the capsule API and adding the:
> Provides above, pygtk2 would also need to add a Conflicts on each of the
> known users of the API, <= the last version (known to use the broken
> API); these would then need to be rebuilt, and rpm/yum would have enough
> information to enforce valid combinations.
> 
This seems wrong, but I could just be too used to thinking of things in terms
of SQL.  You have a many-to-one relationship here and you're putting the
data in the record on the "one" side instead of on the "many" side.

Maybe someone else can offer a better model for doing this.

Also, it seems like this is something that we're fixing on the boundary
between releases where we traditionally don't treat the Provides/Requires
situation wrt the already installed packages quite the same.

> (None of this seems to address the issue of ABI changes between
> different supposedly-compatible versions of the API; perhaps we need to
> treat these capsule names like SONAMEs, and have a numbering within
> them?)
> 
I'd love to see you get SONAMEs with API versions into upstream python for
extensions and pure python code :-)  Not sure if something less than
using only upstream data is a good idea though.

-Toshio