On Tue, 2014-12-09 at 09:42 +0100, Adrian Reber wrote:
On Tue, Nov 25, 2014 at 07:24:32AM -0700, Kevin Fenzi wrote:
> On Tue, 25 Nov 2014 10:27:19 +0100
> Adrian Reber <adrian(a)lisas.de> wrote:
> > The OOM killer on bapp02 has terminated a few mirrormanager crawler
> > processes. It seems it needs more memory or the number of parallel
> > crawlers has to be further limited.
> Well, it's got 16GB now... I can bump it to 24 without too much
> trouble. Will of course need a freeze break...
> We are currently doing 60 threads. We could cut it down, but I guess
> I'd say let's try more memory first.
As dmesg on bapp02 has no timestamps, it is hard to tell when the last
crawler was terminated because of OOM. Looking at different log files it
seems, however, that some crawler processes are terminated without
finishing correctly. Especially mirrors which take a long time to crawl
are not examined completely. So maybe it would be a good thing to
decrease the number of parallel crawls to avoid OOM situations.
I have just managed to reproduce it while I was running the umdl
script. Maybe it didn't like having both running at the same time?
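
For what it's worth, the "decrease the number of parallel crawls" idea
from earlier in the thread could look roughly like the sketch below. This
is not the actual mirrormanager crawler code: crawl_mirror(), crawl_all()
and the max_parallel default of 20 are made-up names and values, used
only to illustrate capping the number of concurrent crawls with a
fixed-size thread pool.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def crawl_mirror(mirror):
    """Hypothetical stand-in for crawling a single mirror."""
    return (mirror, "ok")


def crawl_all(mirrors, max_parallel=20):
    """Crawl every mirror with at most max_parallel crawls running at once.

    max_parallel=20 is an arbitrary assumption, not a tested value.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Queue all crawls; the pool only runs max_parallel at a time.
        futures = {pool.submit(crawl_mirror, m): m for m in mirrors}
        for fut in as_completed(futures):
            mirror = futures[fut]
            try:
                results[mirror] = fut.result()[1]
            except Exception as exc:
                # Record the failure instead of losing the crawl silently.
                results[mirror] = "failed: %s" % exc
    return results
```

Lowering max_parallel trades total crawl time for a smaller peak memory
footprint, so long-running mirrors would be less likely to be cut off by
the OOM killer.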