MM related changes
adrian at lisas.de
Fri Jun 12 16:05:11 UTC 2015
On Fri, Jun 12, 2015 at 09:47:34AM -0600, Kevin Fenzi wrote:
> These changes look ok to me. ;)
Thanks for the reviews. I will commit them, and somebody needs to run
the corresponding playbooks.
> Related however, what is our plan for crawlers currently?
> We are only going to use mm-crawler01?
No, both crawlers are used. The crawler command line calls a script
which retrieves the highest mirror ID and divides it by 2, so that
every crawler gets half of the mirrors to crawl:
/usr/bin/mm2_crawler --timeout-minutes 180 --threads 35 `/usr/local/bin/run_crawler.sh 2`
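
As a rough sketch of the split described above (this is not the actual
run_crawler.sh; the function name, the meaning of the argument "2" as
the number of crawlers, and IDs starting at 1 are all assumptions):

```python
def crawler_id_range(max_mirror_id, crawler_index, num_crawlers=2):
    """Return the inclusive (start, stop) mirror ID range for one crawler.

    Hypothetical helper: crawler_index is 1-based, and mirror IDs are
    assumed to run from 1 to max_mirror_id.
    """
    if not 1 <= crawler_index <= num_crawlers:
        raise ValueError("crawler_index must be between 1 and num_crawlers")
    # Ceiling division, so the last crawler never gets the larger share.
    chunk = -(-max_mirror_id // num_crawlers)
    start = (crawler_index - 1) * chunk + 1
    stop = min(crawler_index * chunk, max_mirror_id)
    return start, stop


if __name__ == "__main__":
    for index in (1, 2):
        print(index, crawler_id_range(2001, index))
```

Splitting by ID range only gives each crawler about half of the mirrors
if the IDs are spread evenly; gaps from deleted mirrors would skew it.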
> Should we nuke 02?
No, as we are using it.
> Should we give 01 more memory?
No, we need to crawl better and not throw more resources at it.
> I still have seen them hit swap alerts in nagios, which we should avoid
> if at all possible.
Okay, currently we have 35 parallel crawler threads running. We can
decrease that to 32. The problem is that it is hard to predict which
mirrors are crawled at the same time, and thus it is hard to predict
the required memory.
The current setup, however, is not optimal. We have two crawlers with
64GB of memory in total, which is only used for maybe 10 hours per day;
for the rest of the day the memory sits idle and is wasted.
So we could crawl all mirrors on one crawler: first the first half and
then the second half. The reason this is not yet implemented is that we
do not have a way to gracefully shut down the crawler if it takes too
long. Sometimes the crawler hangs on a single mirror for hours and
it is not clear why. Even with all the timeout checks all over the place
it just hangs.
I am currently working on some code to gracefully shut down the crawler
if it takes too long. With this and with canary and repodata mode we can
use only one crawler and make sure its resources are in use most of the
time. Once this feature to gracefully shut down the crawler is finished I
would like to join one of the next infrastructure meetings to discuss in
a larger group what the best combination of full, repodata and canary
crawls would be.
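
To illustrate what a graceful shutdown could look like, here is a
minimal sketch. It is not the code I am working on and not part of
mm2_crawler; crawl_all, run_with_deadline and crawl_one are hypothetical
names. A timer sets a stop flag after a deadline, and the crawl loop
checks the flag between mirrors:

```python
import threading


def crawl_all(mirrors, crawl_one, stop_event):
    """Crawl mirrors in order until done or until stop_event is set."""
    finished = []
    for mirror in mirrors:
        if stop_event.is_set():
            # Deadline reached: stop cleanly before starting a new mirror.
            break
        crawl_one(mirror)
        finished.append(mirror)
    return finished


def run_with_deadline(mirrors, crawl_one, max_seconds):
    """Run crawl_all, asking it to stop once max_seconds have passed."""
    stop_event = threading.Event()
    timer = threading.Timer(max_seconds, stop_event.set)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    try:
        return crawl_all(mirrors, crawl_one, stop_event)
    finally:
        timer.cancel()
```

This only stops between mirrors. A crawl that hangs inside one mirror
would still block, so the per-mirror timeouts are needed as well.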