Re: stopping webcrawlers using robots.txt

From: Ryan Schmidt <subversion-2006c_at_ryandesign.com>
Date: 2006-07-16 04:31:37 CEST

I worked on this problem today and reached the following solution. If
you try this out and it works (or if you try it out and it doesn't
work) please let me know!

<VirtualHost *:80>
        ServerName svn.example.com
        
        RewriteEngine on
        RewriteCond %{REQUEST_METHOD} ^GET$
        RewriteRule ^/(favicon\.ico|robots\.txt|svnrsrc/.*)$ http://www.example.com/svnroot/$1 [P,L]
        
        <Location />
                DAV svn
                SVNParentPath /path/to/subversion/repositories
                SVNListParentPath on
                SVNIndexXSLT /svnrsrc/index.xslt
                AuthType Basic
                AuthName "Subversion Repositories"
                AuthUserFile /path/to/subversion/conf/users
                Require valid-user
        </Location>
</VirtualHost>

Basically, no matter what combination of Alias directives and the like
I tried, the DAV server in the Location directive always took
precedence. The only way I found to conditionally override the DAV
server was to proxy the request away to a separate vhost using
mod_rewrite. I only do this for GET requests, and only for
favicon.ico, robots.txt and a directory svnrsrc where you can put
XSLT files, CSS files, images and anything else you might need to
display your directory listings properly. All other requests are still
handled by mod_dav_svn.
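
One thing the config above takes for granted: the [P] flag on the
RewriteRule hands the request to mod_proxy, so mod_rewrite, mod_proxy
and the HTTP proxy backend all have to be loaded. A minimal sketch,
assuming a stock layout where the shared modules live under modules/
(the paths are an assumption and vary by distribution; some builds
compile these modules in statically, in which case no LoadModule lines
are needed):

```apache
# Needed for RewriteEngine / RewriteCond / RewriteRule
LoadModule rewrite_module    modules/mod_rewrite.so

# Needed for the [P] flag: mod_proxy plus its HTTP backend
LoadModule proxy_module      modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# The [P] rule works as a reverse proxy, so forward proxying can
# (and should) stay off to avoid running an open proxy.
ProxyRequests Off
```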

Here, I just made a directory svnroot in the document root of my
normal vhost (www.example.com) to contain the favicon.ico, robots.txt
and svnrsrc that will be used by svn.example.com. It shouldn't bother
anybody there. If you'd like, you should even be able to prevent
people from accessing it directly via http://www.example.com/svnroot/
and make it only available via the proxied connection, like this:

<VirtualHost *:80>
        ServerName www.example.com
        DocumentRoot /wherever/the/document/root/is
        ...
        <Location /svnroot>
                Order allow,deny
                Allow from 127.0.0.1
        </Location>
</VirtualHost>

Note: This is your normal vhost, not your Subversion vhost.
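
For reference, Order and Allow are the Apache 2.2-era access
directives. On Apache 2.4 and later they are only available through
mod_access_compat, and the native equivalent of the Location block
above would look roughly like the sketch below. Treat it as an
untested assumption rather than part of the setup described here.
Also note that, depending on how www.example.com resolves on the
server itself, the proxied request may arrive from the machine's own
public address rather than from 127.0.0.1, in which case that address
would need to be allowed as well.

```apache
# Goes inside the www.example.com vhost, in place of the
# Order/Allow version shown above (Apache 2.4+ syntax).
<Location /svnroot>
        # Only requests from the loopback address are permitted;
        # everything else is refused.
        Require ip 127.0.0.1
</Location>
```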

Received on Sun Jul 16 04:33:04 2006

This is an archived mail posted to the Subversion Users mailing list.
