On Fri, Mar 20, 2015 at 04:38:24PM +0100, Adrian Reber wrote:
The biggest MM2 problem which currently exists is that the crawler
segfaults when running with more than 10 or 12 threads. The current
configuration runs daily with 75 threads and crashes regularly:
[740512.481002] mm2_crawler[18149]: segfault at 30 ip 00007ffdd8201557 sp
00007ffd787d5250 error 4 in libcurl.so.4.3.0[7ffdd81d8000+63000]
[783445.620762] mm2_crawler[20500]: segfault at 30 ip 00007f87477ff557 sp
00007f86e7fd4250 error 4 in libcurl.so.4.3.0[7f87477d6000+63000]
[826619.130431] mm2_crawler[24376]: segfault at 30 ip 00007f7cee7ac557 sp
00007f7c8cfde250 error 4 in libcurl.so.4.3.0[7f7cee783000+63000]
[869846.873962] mm2_crawler[27771]: segfault at 30 ip 00007ffd3bc07557 sp
00007ffd11ff8250 error 4 in libcurl.so.4.3.0[7ffd3bbde000+63000]
Preloading libcurl from F21 on the command line seems to make
the segfault go away. So somewhere between curl 7.29 (RHEL 7.1)
and curl 7.37 (F21) something was fixed that would be needed on
RHEL 7.1 before switching to MM2.
This is discussed in
https://bugzilla.redhat.com/show_bug.cgi?id=1204825
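The preload workaround above can be sketched roughly as follows; the library path and the crawler invocation are assumptions for illustration, not the exact command used:

```shell
# Copy libcurl.so.4.3.0 from an F21 system to a local directory, then
# preload it so the crawler links against it instead of the RHEL 7.1
# build (curl 7.29) that crashes with many threads.
LD_PRELOAD=/usr/local/lib/libcurl.so.4.3.0 mm2_crawler
```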
Additionally, the 4GB of RAM on mm-crawler01 are not enough to
crawl all the mirrors in a reasonable time. Even when started with
only 20 crawler threads instead of 75, the 4GB are not enough.
The memory has been increased to 32GB (thanks) and I did a few test runs
of the crawler over the weekend with libcurl from F21:
All runs for 435 mirrors take at least 6 hours:
50 threads:
http://lisas.de/~adrian/crawler-resources/2015-03-21-19-51-44-crawler-res...
50 threads with explicit garbage collection:
http://lisas.de/~adrian/crawler-resources/2015-03-22-06-18-30-crawler-res...
75 threads:
http://lisas.de/~adrian/crawler-resources/2015-03-22-13-02-37-crawler-res...
75 threads with explicitly setting variables to None at the end:
http://lisas.de/~adrian/crawler-resources/2015-03-23-07-46-19-crawler-res...
Manually triggering the garbage collector makes almost no difference (if
any at all). The crawler uses a huge amount of memory and takes a really
long time.
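For reference, the "explicit garbage collection" variant tested above amounts to something like the following sketch; do_crawl is a hypothetical stand-in for the real per-mirror crawl logic:

```python
import gc

def do_crawl(mirror):
    # Stand-in for the real crawl: returns a large list of file entries.
    return ["%s/file-%d" % (mirror, i) for i in range(100000)]

def crawl_mirror(mirror):
    file_list = do_crawl(mirror)
    # ... the real crawler would diff file_list against the database here ...
    # Drop the big structure and force a full collection pass before the
    # thread picks up the next mirror, so cyclic garbage is reclaimed early.
    file_list = None
    return gc.collect()

freed = crawl_mirror("http://example-mirror.org/fedora")
```

As the numbers above show, forcing collection like this barely changed the memory footprint, which points at references being held somewhere rather than at collector laziness.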
As much as I like the new threaded design, I am not 100% convinced it is
the best solution when looking at the memory requirements. Memory must be
leaking somewhere.
The next change I will make is to sort the mirrors in descending order by
crawl duration, to make sure the longest-running crawls are started as
early as possible (this was already implemented in MM1). I will then try
to start with 100 threads to see how long a run takes and how much memory
is required.
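The scheduling change is just a sort before handing mirrors to the thread pool; a minimal sketch, assuming hypothetical mirror records that carry the duration of their last crawl in seconds:

```python
# Hypothetical mirror records; field names are illustrative only.
mirrors = [
    {"name": "mirror-a", "last_crawl_duration": 120},
    {"name": "mirror-b", "last_crawl_duration": 21600},
    {"name": "mirror-c", "last_crawl_duration": 900},
]

# Longest-running crawls first, so the slowest mirrors start as early as
# possible and the total wall-clock time is not dominated by a slow
# mirror that happened to be queued last.
mirrors.sort(key=lambda m: m["last_crawl_duration"], reverse=True)

print([m["name"] for m in mirrors])  # mirror-b first
```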
Adrian