On Fri, 10 Apr 2015 14:41:22 +0200
Adrian Reber <adrian(a)lisas.de> wrote:
> While trying to recreate the mm2_crawler crash without the
> MirrorManager database as backend, I discovered that the crawler
> mainly uses Python's httplib to do all the HEAD requests. For
> repomd.xml files, which are actually downloaded, the crawler switches
> to urlgrabber, which seems to be problematic in threaded
> applications. Or in combination with httplib. Or something.
Ah. Great detective work!
> The easiest solution seems to be to rewrite the single
> urlgrabber.urlread() call to use one of the other available methods.
> So a question to the Python experts: which implementation is the
> "best" to download a single repomd.xml via either HTTP or FTP?
> I would replace it with urllib2. Is that the correct replacement?
I would think either that or python-requests, but I'm not sure...
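
For what it's worth, here is a rough sketch of what the single-file download could look like with the standard library. It is written against urllib.request, the Python 3 successor of urllib2, and the function name fetch_repomd and the 30 second timeout are made up for illustration; none of this is taken from the crawler code. Whether it actually avoids the crash in the threaded crawler is untested.

```python
import urllib.request


def fetch_repomd(url, timeout=30):
    """Return the raw bytes found at url.

    Hypothetical helper, not part of mm2_crawler. urlopen() picks its
    handler from the URL scheme, so the same call covers http://,
    https:// and ftp:// mirrors.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

One point in favour of this over python-requests: requests only speaks HTTP(S) out of the box, so FTP mirrors would still need a second code path.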