
Re: stopping webcrawlers using robots.txt

From: Bob Proulx <bob_at_proulx.com>
Date: 2006-07-09 20:42:58 CEST

Thomas Beale wrote:
> how to make /robots.txt visible in an apache virtual host config for
> a subversion server.

Put it in your document root. Let it be served normally by the web
server.

> How would I tell Apache to allow requests to read /robots.txt given
> the following configuration?
> ...
> <Location />
> DAV svn
> SVNParentPath /usr/local/var/svn

Oh, I see. You have configured your Subversion repositories to occupy
every path on your web server. In my opinion that is a poor choice of
repository location. By putting them directly at the root you have
prevented yourself from doing anything else with your web server. So
to answer your question: in your configuration you can't. I suppose
you could check robots.txt into the top level of a repository. But
then it would be part of your project, and with SVNParentPath it would
be served under the repository name rather than at /robots.txt.

I suggest you reconfigure your server to put the Subversion
repositories under an /svn path. That will free the rest of your web
server for other uses under your document root.

> <Location /svn>

That way you can put a robots.txt file into your document root and all
will behave normally.
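
As a rough sketch of what I mean, something along these lines should
work. The ServerName and DocumentRoot values here are placeholders I
made up for illustration; only the DAV and SVNParentPath lines come
from your posted configuration, so adjust the rest to your site.

```apache
# Hypothetical virtual host; ServerName and DocumentRoot are assumed values.
<VirtualHost *:80>
    ServerName svn.example.org

    # Ordinary files, including robots.txt, are served from here.
    DocumentRoot /var/www/html

    # Only URLs under /svn are handed to Subversion.
    <Location /svn>
        DAV svn
        SVNParentPath /usr/local/var/svn
    </Location>
</VirtualHost>
```

With that layout a request for /robots.txt is answered from the
document root, and the repositories appear at /svn/<repository name>.
A robots.txt containing "User-agent: *" followed by "Disallow: /svn/"
would then ask well-behaved crawlers to stay out of the repositories.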

Sorry to propose that you change your URL, but I think that is the
best course of action. If you don't do it for this problem you will
eventually need to do it for some other reason. The earlier you make
URL changes the better, because they only get more painful with time.

Bob

Received on Sun Jul 9 20:43:59 2006

This is an archived mail posted to the Subversion Users mailing list.