Brian W. Fitzpatrick wrote:
> What would happen if a robot crawled a big repository with a whole
> lotta tags and branches?
>
> Shouldn't we have a big bold warning box in the book telling people to
> create a robots.txt file in their DOCUMENT_ROOT that contains:
>
> User-agent: *
> Disallow: /
>
> We've had this on svn.collab.net for ages, and I'm thinking we should
> really let people know about it.
Depending on the number of directories, it might be desirable to allow
crawling of the directory listings but not of the file contents. For
example, I have often searched Google for "somefile.c webcvs" in order to
have a look at that source file. But can that be done with /robots.txt?
I believe not :-(
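As far as I know, the original robots.txt rules are nothing more than URL
prefix matches, so there is no way to express "index the directory
listings, but skip the file contents underneath them" when both live under
the same path. A sketch of the problem (the paths are made up):

    # Hypothetical repository browser URLs:
    #   /repos/trunk/        <- directory listing we'd like indexed
    #   /repos/trunk/foo.c   <- file contents we'd like skipped
    #
    # Disallow only matches prefixes, so any rule broad enough to block
    # the file also blocks the listing above it:
    User-agent: *
    Disallow: /repos/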
> And while I'm at it, how about advising people who use Subversion
> working copies as websites to put something like this in their
> httpd.conf files:
>
> # Disallow browsing of Subversion working copy administrative
> # directories.
> <DirectoryMatch "^/.*/\.svn/">
> Order deny,allow
> Deny from all
> </DirectoryMatch>
>
> Thoughts?
Sounds like a good idea. IIRC .svn can contain things that should often
be kept private in cases like this?
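From memory (so the details may be off), a 1.0-era .svn/ directory holds,
among other things:

    .svn/entries                    -- repository URLs, revisions, committer names
    .svn/text-base/foo.php.svn-base -- pristine copy of every versioned file
                                       (foo.php is just an example name here)

The text-base copies are the part I would worry about: a pristine
foo.php.svn-base gets served as plain text, so a visitor can read
server-side code that the live foo.php would never show them.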
Of course, not everyone who serves their web documents out of a working
copy will be running Apache, but (hopefully) the idea translates easily to
other servers.
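For instance, I'd guess the lighttpd equivalent (untested, going from its
mod_access documentation) is something like:

    # deny any URL that contains a .svn path component
    $HTTP["url"] =~ "/\.svn/" {
        url.access-deny = ( "" )
    }

Whatever the server, it's worth confirming afterwards that a request for
<site>/.svn/entries comes back 403 (or 404) rather than the file itself.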