MM2 crawler crash solved

Friday, 10 April 2015

While trying to recreate the mm2_crawler crash without the MirrorManager
database as backend, I discovered that the crawler mainly uses Python's
httplib to do all the HEAD requests. For repomd.xml files, which are
actually downloaded, the crawler switches to urlgrabber, which seems to
be problematic in threaded applications. Or in combination with httplib.
Or something.

The easiest solution seems to be to rewrite the single
urlgrabber.urlread() to use one of the other available methods.

So a question to the Python experts: which implementation is the
"best" to download a single repomd.xml via either http or ftp?

I would replace it with urllib2. Is that the correct replacement?
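
A minimal sketch of what such a urllib2-based replacement could look
like. The function name fetch_repomd and the 30 second timeout are
assumptions made for illustration, not the crawler's actual code, and
the import fallback is only there so the same sketch also runs on
Python 3, where urllib2 became urllib.request. Whether this avoids the
crash in the threaded crawler is untested here.

```python
import contextlib

try:
    from urllib2 import urlopen  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3 name for the same function


def fetch_repomd(url, timeout=30):
    """Return the raw content of the file at url (http:// or ftp://).

    Hypothetical helper; the name and the default timeout are
    assumptions, not part of mm2_crawler.
    """
    # contextlib.closing() is used because the response object returned
    # by urlopen() on Python 2 is not a context manager by itself.
    with contextlib.closing(urlopen(url, timeout=timeout)) as response:
        return response.read()
```

A call would then look like
fetch_repomd("http://mirror.example.org/repodata/repomd.xml"), with a
placeholder URL; an ftp:// URL is passed the same way, because urlopen()
picks the handler based on the URL scheme.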

		Adrian
