[change req] Allow fedorahosted robots.txt to only crawl /wiki/*

Stephen John Smoogen smooge at gmail.com
Mon Dec 31 02:36:33 UTC 2012


On 30 December 2012 18:12, Kevin Fenzi <kevin at scrye.com> wrote:
> On Sun, 30 Dec 2012 20:07:37 -0500
> Ricky Elrod <codeblock at elrod.me> wrote:
>
>> We've been seeing load spikes on hostedXX, following
>> df7e8578432b224d9576dc8359f0729763861526. This semi-reverts that
>> commit and only allows /wiki/* to be crawled.
>>
>> diff --git a/configs/web/fedorahosted.org/fedorahosted-robots.txt b/configs/web/fedorahosted.org/fedorahosted-robots.txt
>> index cd572f8..7782677 100644
>> --- a/configs/web/fedorahosted.org/fedorahosted-robots.txt
>> +++ b/configs/web/fedorahosted.org/fedorahosted-robots.txt
>> @@ -1,5 +1,5 @@
>>  User-agent: *
>> -Disallow: /*/browser
>> -Disallow: /*/search
>> +Allow: /wiki/*
>> +Disallow: /
>>  user-agent: AhrefsBot
>>  disallow: /

+1 Also.
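
For anyone who wants to sanity-check the new rules, here is a rough sketch of how a wildcard-aware crawler might evaluate them. It assumes longest-match precedence with "*" as a wildcard (the semantics later written down in RFC 9309); not every crawler works this way, and Python's own urllib.robotparser does not expand "*" at all. The names RULES, pattern_to_regex and is_allowed are made up for illustration, as are the sample paths.

```python
import re

# Rules from the "User-agent: *" group in the patch above.
RULES = [
    ("allow", "/wiki/*"),
    ("disallow", "/"),
]

def pattern_to_regex(pattern):
    # Assumption: '*' matches any run of characters and a trailing '$'
    # anchors the end of the path; everything else is literal.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules=RULES):
    # Longest matching pattern wins; on a tie, "allow" wins.
    # A path that matches no rule is allowed.
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

if __name__ == "__main__":
    # Hypothetical sample paths, not taken from the server logs.
    for path in ("/wiki/Start", "/browser/trunk", "/someproject/wiki/Start"):
        print(path, is_allowed(path))
```

Under these assumptions only paths that begin with /wiki/ are crawlable, so a path with a leading project segment (the third sample) falls through to "Disallow: /".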
-- 
Stephen J Smoogen.
"Don't derail a useful feature for the 99% because you're not in it."
Linus Torvalds
"Years ago my mother used to say to me,... Elwood, you must be oh
so smart or oh so pleasant. Well, for years I was smart. I
recommend pleasant. You may quote me."  —James Stewart as Elwood P. Dowd

