I'll likely be helping to guide updates to the Python packaging format standards over the coming months. While they won't hit the standard library until 3.4, there will likely be third party tool support in earlier versions (since the whole point of the exercise is to eliminate the current implementation coupling to distutils and setuptools in favour of better defined metadata standards for communication between multiple tools).
The first step will be reviewing the status quo and then creating a plausible road map (as well as describing current efforts for various aspects). I've started on that here: http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/core_packagin...
One thing I would *love* to be able to enable is adding support for automatic mapping of PyPI distribution names (similar to what already exists for Perl and CPAN) where (for example), a developer could just write "Requires: python(south)" instead of having to figure out manually the name of the appropriate RPM package in Fedora.
I believe that the new metadata fields defined in PEP 345 and PEP 426 should be enough to support that when generating a SPEC file from the PyPI metadata.
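To make that concrete, here is a minimal sketch of the generation step. It is an illustration only: the "python(...)" spelling follows the example above, while the function names, the lowercasing of names, and the decision to ignore version constraints are all assumptions rather than an agreed convention.

```python
import re


def virtual_name(dist_name):
    """Return a hypothetical RPM virtual name such as 'python(south)'.

    Lowercasing is an assumption made for this sketch.
    """
    return "python({})".format(dist_name.strip().lower())


def spec_dependency_lines(name, requires_dist):
    """Build Provides/Requires lines for a SPEC file from PyPI-style metadata.

    `requires_dist` is a list of PEP 345 style Requires-Dist strings.
    Version constraints and environment markers are deliberately ignored.
    """
    lines = ["Provides: " + virtual_name(name)]
    for requirement in requires_dist:
        # Keep only the leading distribution name, e.g. "South" from
        # "South (>=0.7)".
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
        if match:
            lines.append("Requires: " + virtual_name(match.group(1)))
    return lines
```

Calling `spec_dependency_lines("MyApp", ["South (>=0.7)"])` would yield a "Provides: python(myapp)" line followed by "Requires: python(south)"; "MyApp" is a made-up distribution name.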
Cheers, Nick.
On Wed, Sep 12, 2012 at 01:41:22PM +1000, Nick Coghlan wrote:
> I'll likely be helping to guide updates to the Python packaging format standards over the coming months. While they won't hit the standard library until 3.4, there will likely be third party tool support in earlier versions (since the whole point of the exercise is to eliminate the current implementation coupling to distutils and setuptools in favour of better defined metadata standards for communication between multiple tools).
> The first step will be reviewing the status quo and then creating a plausible road map (as well as describing current efforts for various aspects). I've started on that here: http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/core_packagin...
> One thing I would *love* to be able to enable is adding support for automatic mapping of PyPI distribution names (similar to what already exists for Perl and CPAN) where (for example), a developer could just write "Requires: python(south)" instead of having to figure out manually the name of the appropriate RPM package in Fedora.
> I believe that the new metadata fields defined in PEP 345 and PEP 426 should be enough to support that when generating a SPEC file from the PyPI metadata.
Hey Nick! Thanks for working on this whole thing. I haven't been looking at things in the past year but was active on it a few years ago, so let me know if you want to bounce any ideas around, get some idea of what's been discussed in the past and by whom, or talk about what Fedora specifically is currently doing or has found troublesome in the past.
Mapping of pypi distribution names was something that I looked into a small bit with some Canonical people at PyCon two years ago. IIRC, it was part of trying to map distribution packages with each other and with pypi in order to figure out the state of python3 porting. Barry Warsaw still works there and might know more about what happened to it -- Allison Randal is who I was working with but I got the impression that she isn't working on that anymore so I don't know if she'll still have code around or not.
There are numerous caveats to trying to do this, none of them insurmountable. I believe we were trying to compare versions as well as package names which was even tougher. Some things I can remember off the top of my head:
* Multiple names for a package
  * pypi usually has the same name as the setup.py name field which is also encoded in the egg file name and metadata
  * The name of the module that is imported
  * One upstream package having multiple downstream names -- for instance, Debian has setuptools and pkg_resources as two separate binary packages. However, they're provided by the setuptools pypi entry.
  * A module that is provided by multiple upstreams: For instance, setuptools is provided by setuptools and distribute. pexpect is provided by pexpect and pexpect-u.
* Some packages aren't present on pypi. Many library bindings provided by the libraries are this way, for instance libselinux-python.
* Some libraries have conflicting names upstream (mock was one example in the past. ming still is one: http://www.libming.org/ and http://merciless.sourceforge.net/)
* The naming for packages in the distribution isn't always simple. In Fedora, for instance, we have several styles and not all of them are mutually exclusive:
  * python-foo is the common case for libraries (python-docutils)
  * foo-python are the majority (but not all) of libraries which are bindings to C libraries (selinux-python)
  * Fedora allows modules that have py as a prefix to not add python- to their name. (pygtk2)
  * Applications may provide libraries that other packages need but since they're "primarily applications" they may simply bear the name of the application. (bzr)
* Case sensitivity can get you. There are some maintainers that prefer to lowercase the package names and others who prefer to do exactly what upstream did.
* dashes, periods, and other punctuation can also get you. Sometimes those are translated into dashes, other times they're left alone, and other times they're omitted.
* Even today, not all packages provide egg-info. For instance, a useful python module might consist of a single .py file so upstream might not provide a setup.py for it. We install it by copying the .py file to site-packages.
* We'll need to differentiate between things provided for python2 and python3 (and python26 in EPEL)
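The naming, case, and punctuation caveats above can be sketched in a few lines of Python. This is only a guesser: the function name is hypothetical, and the output is a list of names to check against the distribution's package database, not an answer. It covers the python-foo, foo-python, and bare-name styles and does nothing about the python2/python3 split.

```python
import re


def candidate_rpm_names(pypi_name):
    """Return plausible Fedora package names for a PyPI distribution name.

    This only guesses; it cannot know which style a maintainer chose.
    """
    spellings = []
    # Maintainers may keep upstream's case or lowercase the name.
    for base in (pypi_name, pypi_name.lower()):
        # Punctuation may be kept, turned into dashes, or dropped.
        variants = (
            base,
            re.sub(r"[._]+", "-", base),
            re.sub(r"[._-]+", "", base),
        )
        for variant in variants:
            if variant not in spellings:
                spellings.append(variant)

    candidates = []
    for spelling in spellings:
        # python-foo, foo-python, and the bare name (py-prefixed modules
        # and applications may use the latter).
        for style in ("python-" + spelling, spelling + "-python", spelling):
            if style not in candidates:
                candidates.append(style)
    return candidates
```

Even a short name produces several candidates, which is a fair illustration of why guessing alone is not enough.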
All that said, you can probably sidestep some of these issues by having python packages contain explicit virtual Provides. These might be manually added or automatically generated by a tool like pypi2rpm, with the maintainer editing them afterwards to make sure they didn't hit any of the above corner cases.
There's some prior work done by other people:
* http://www.rpm.org/ticket/154
* http://lists.fedoraproject.org/pipermail/packaging/2008-June/004715.html (Despite my having written that email, the hard parts were all dmalcolm :-)
-Toshio
On 09/13/2012 08:14 AM, Toshio Kuratomi wrote:
> All that said, you can probably sidestep some of these issues by having python packages contain explicit virtual Provides. These might be manually added or automatically generated by a tool like pypi2rpm, with the maintainer editing them afterwards to make sure they didn't hit any of the above corner cases.
Yeah, explicit virtual provides is definitely the path I was thinking of heading down. The idea is to automate the simple cases (where "python(pypi-dist-name)" does the right thing when there's a one-to-one mapping from the PyPI distribution to the Fedora RPM, even if the RPM name is different), permit manual workarounds for the "two RPMs" case (e.g. by setting things up so that depending on "python(pypi-dist-name)" installs both of them), and allow explicit conflicts to be declared for the non-PyPI items.
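A rough sketch of that split between automatic and manual entries follows. It models only the lookup table, not how RPM itself would resolve the dependency; the function names are hypothetical and the RPM names in the usage note are placeholders.

```python
def provides_key(pypi_name):
    """Spell the virtual provide for a PyPI name (lowercasing is an assumption)."""
    return "python({})".format(pypi_name.lower())


def build_provides_index(automatic, overrides):
    """Map each virtual provide to the list of RPMs that should satisfy it.

    `automatic` maps a PyPI name to the single RPM built from it.
    `overrides` maps a PyPI name to a hand-maintained list of RPMs and
    takes precedence, covering the "two RPMs" case.
    """
    index = {provides_key(name): [rpm] for name, rpm in automatic.items()}
    for name, rpms in overrides.items():
        index[provides_key(name)] = list(rpms)
    return index


def resolve(index, requirement):
    """Return the RPMs for a requirement, or an empty list if unknown."""
    return index.get(requirement, [])
```

For example, with `automatic={"docutils": "python-docutils"}` and `overrides={"setuptools": ["python-setuptools", "python-pkg-resources"]}`, resolving "python(setuptools)" returns both RPM names while "python(docutils)" returns just the one.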
> There's some prior work done by other people:
> - http://www.rpm.org/ticket/154
> - http://lists.fedoraproject.org/pipermail/packaging/2008-June/004715.html (Despite my having written that email, the hard parts were all dmalcolm :-)
Thanks for the references.
Cheers, Nick.
On Wed, Sep 12, 2012 at 3:14 PM, Toshio Kuratomi <a.badger@gmail.com> wrote:
> Mapping of pypi distribution names was something that I looked into a small bit with some Canonical people at PyCon two years ago. IIRC, it was part of trying to map distribution packages with each other and with pypi in order to figure out the state of python3 porting. Barry Warsaw still works there and might know more about what happened to it -- Allison Randal is who I was working with but I got the impression that she isn't working on that anymore so I don't know if she'll still have code around or not.
Grepping through my programming work for something else, I happened upon the work I did. Allison has it up on bitbucket (although it doesn't look like that has seen any work since my commits).
http://bitbucket.org/allison/py3kdeps
Once I saw the code, I remembered how we worked around many of the issues with mapping package names to pypi names and names in other distros. We used the tarball. For Fedora this is especially easy as we don't modify the tarball at all, so the tarball name is almost always going to be the same as what's in pypi. For Debian derivatives, the tarball gets renamed, but it's still possible to do this.
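For illustration, the tarball trick might look roughly like this. It is not the py3kdeps code; the pattern and function name are assumptions, and it only handles the plain name-version.ext layout of an unmodified upstream tarball. Renamed tarballs would need their own pattern.

```python
import re

# name-version.ext, where the version starts with a digit and contains no
# dash; names may themselves contain dashes (e.g. pexpect-u).
_TARBALL = re.compile(
    r"^(?P<name>.+)-(?P<version>\d[^-]*?)\.(?:tar\.(?:gz|bz2|xz)|tgz|zip)$"
)


def split_tarball(filename):
    """Split an upstream tarball file name into (name, version).

    Returns None when the file name does not follow the expected layout.
    """
    match = _TARBALL.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version")
```

The name half of the result can then be compared against pypi, and the version half is available for the version comparison mentioned earlier.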
-Toshio
python-devel@lists.fedoraproject.org