MM2 crawler crash solved

Kevin Fenzi kevin at scrye.com
Fri Apr 10 16:05:27 UTC 2015


On Fri, 10 Apr 2015 14:41:22 +0200
Adrian Reber <adrian at lisas.de> wrote:

> While trying to recreate the mm2_crawler crash without the
> MirrorManager database as backend I discovered that the crawler
> mainly uses Python's httplib to do all the HEAD requests. For
> repomd.xml files, which are actually downloaded, the crawler switches
> to urlgrabber, which seems to be problematic in threaded
> applications. Or in combination with httplib. Or something.

Ah. Great detective work! 

> The easiest solution seems to be to rewrite the single
> urlgrabber.urlread() call to use one of the other available methods.
> 
> So a question to the python experts. Which implementation is the
> "best" to download a single repomd.xml via either http or ftp?
> 
> I would replace it with urllib2. Is that the correct replacement?

I would think either that or python-requests? Not sure...
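
A minimal sketch of the urllib2 route, for illustration only. It is
written for Python 3, where urllib2's urlopen() lives in urllib.request;
on Python 2 the equivalent call is urllib2.urlopen(). The function name
fetch_repomd and the 30 second default timeout are assumptions, not
anything taken from the crawler, and whether this avoids the crash
inside the threaded crawler is untested here.

```python
import urllib.request


def fetch_repomd(url, timeout=30):
    """Download a single repomd.xml and return its raw bytes.

    Hypothetical helper: the name and the default timeout are
    illustrative and do not come from mm2_crawler.
    """
    # urlopen() selects its handler from the URL scheme, so the same
    # call works for http://, https:// and ftp:// mirror URLs.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

The caller passes the full URL of the mirror's repodata/repomd.xml and
gets the file contents back as bytes. Network failures surface as
urllib.error.URLError, which the caller would need to catch per mirror.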

kevin