[change req] Allow fedorahosted robots.txt to only crawl /wiki/*
Kevin Fenzi
kevin at scrye.com
Mon Dec 31 01:12:49 UTC 2012
On Sun, 30 Dec 2012 20:07:37 -0500
Ricky Elrod <codeblock at elrod.me> wrote:
> We've been seeing load spikes on hostedXX, following
> df7e8578432b224d9576dc8359f0729763861526. This semi-reverts that
> commit and only allows /wiki/* to be crawled.
>
> diff --git a/configs/web/fedorahosted.org/fedorahosted-robots.txt
> b/configs/web/fedorahosted.org/fedorahosted-robots.txt
> index cd572f8..7782677 100644
> --- a/configs/web/fedorahosted.org/fedorahosted-robots.txt
> +++ b/configs/web/fedorahosted.org/fedorahosted-robots.txt
> @@ -1,5 +1,5 @@
> User-agent: *
> -Disallow: /*/browser
> -Disallow: /*/search
> +Allow: /wiki/*
> +Disallow: /
> user-agent: AhrefsBot
> disallow: /
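For reference, a minimal sketch (not part of the patch, and not Fedora tooling) of how crawlers that honor Google-style wildcard rules would interpret the new file: the longest matching pattern wins, so /wiki/* stays crawlable while everything else falls under Disallow: /.

```python
# Sketch only: evaluate a path against the new fedorahosted robots.txt
# rules, assuming Google-style longest-match semantics where '*' matches
# any sequence of characters. Ties between Allow and Disallow go to Allow.
import re

RULES = [
    ("allow", "/wiki/*"),
    ("disallow", "/"),
]

def _matches(pattern, path):
    # Translate robots.txt wildcards into a regex: '*' -> '.*'.
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.match(regex, path) is not None

def can_crawl(path):
    # Track the most specific (longest) matching rule seen so far.
    best_kind, best_pattern = "allow", ""   # crawlable by default
    for kind, pattern in RULES:
        if _matches(pattern, path) and len(pattern) >= len(best_pattern):
            # Strictly longer wins; on equal length, Allow beats Disallow.
            if len(pattern) > len(best_pattern) or kind == "allow":
                best_kind, best_pattern = kind, pattern
    return best_kind == "allow"

print(can_crawl("/wiki/Main_Page"))   # True  -> wiki stays crawlable
print(can_crawl("/trac/browser"))     # False -> repo browser blocked
```

Note that robots.txt wildcard handling varies between crawlers; this sketch assumes the Googlebot-style interpretation, which is the behavior most relevant to the load being discussed.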
It seems like several things are contributing to the load, but they're
hard to isolate (in particular timeline, changeset, and log, since they
all hit the repo browser).
I'm +1 to applying just this one for now; we can adjust it further
after the freeze is over.
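If the blanket Disallow proves too broad once the freeze is over, one possible middle ground (an untested sketch, not proposed in this thread) would be to re-open the hosted sites while keeping the expensive Trac views Kevin mentions blocked:

```
User-agent: *
Disallow: /*/browser
Disallow: /*/search
Disallow: /*/timeline
Disallow: /*/changeset
Disallow: /*/log
```

This extends the two Disallow lines the patch removes with the timeline, changeset, and log paths identified above as hard to isolate.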
kevin