On Thu, 2 Sep 2010, Pascal Minnerup wrote:
Dear Fedora team,
We on the Google Code Search project (www.google.com/codesearch) want to improve the
quality of our index and, as part of that, would like to systematically crawl the Fedora
git repositories on fedoraproject.org, which we consider one of the major hosts of open
source. Our crawlers use bandwidth throttling, which should ensure that we don't
overstress your web servers.
1. Is it okay for you if we systematically crawl your git repositories for new source
code?
2. How would you recommend we get the repository directories? Our current approach would
be to get the git repositories of recently updated packages from this page:
http://pkgs.fedoraproject.org/gitweb/?o=age.
3. Are there any particular times or actions we should _avoid_?
4. Is there any particular person we should talk to in the future?
An answer to these questions would be very helpful in improving the presence of Fedora
code files in Code Search. We look forward to hearing from you.
Thanks for contacting us. We really don't know how everything would react
to that, but I'm OK with it, provided we can contact you to change things
later if things do go south.
-Mike