
Re: stopping webcrawlers using robots.txt

From: Thomas Beale <thomas_at_deepthought.com.au>
Date: 2006-07-10 15:02:24 CEST

Ryan Schmidt wrote:
>
> I tried this suggestion from Todd which sounded promising:
>
> On Jul 10, 2006, at 01:44, Todd D. Esposito wrote:
>
>> Alias /robots.txt /some/non/svn/path/robots.txt
>> <Location /robots.txt>
>> SetHandler default-handler
>> </Location>
>
> But it doesn't seem to be working.

Not for me either. I added the following, but it still doesn't help:

         Alias /robots.txt /usr/local/var/svn/robots.txt
         <Location /robots.txt>
                 SetHandler default-handler
                 allow from any ## added this
         </Location>

I'm not an Apache specialist at all, so I don't really like messing
around in a trial-and-error fashion too much. I'm getting our sysadmin
to have a look at it.
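
For reference, here is a tidied form of the block above. It is a sketch
only and untested here. It assumes Apache 2.0/2.2-style access control,
where the keyword is "all" rather than "any", and it puts the comment on
its own line, since Apache does not allow a comment after a directive on
the same line. It also assumes the block sits after the Subversion
<Location> section in the vhost, because later <Location> sections
override earlier ones. Whether this is enough to get the request past
mod_dav_svn is exactly the open question.

         Alias /robots.txt /usr/local/var/svn/robots.txt
         <Location /robots.txt>
                 SetHandler default-handler
                 # let any client fetch robots.txt
                 Order allow,deny
                 Allow from all
         </Location>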

The other thing that occurred to me is that we are running wsvn, and web
bots and crawlers hit those URLs heavily as well (I just looked in the
logs, and they are there all right). So I should block /wsvn/ on our main
server as well.
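
Assuming wsvn really is served under /wsvn/ at the root of the main
server, a robots.txt entry along these lines should ask well-behaved
crawlers to stay out of it. This is a sketch only: the file has to be
reachable as /robots.txt on that host, and it does nothing against bots
that ignore robots.txt.

         User-agent: *
         Disallow: /wsvn/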

>
> This may not be a great help to you, but when I was unable to solve this
> within Apache, and since I was playing around with the lighttpd web
> server anyway, I arranged it so that web access to the repository
> occurred via lighttpd, which proxied all requests to Apache running on a
> different port -- all requests, that is, except for favicon.ico,
> robots.txt, and the CSS and XSLT stylesheets. Working copies themselves
> directly accessed the Apache port (since although lighttpd is supposed
> to support proxying to Apache / Subversion, it seems to be broken at the
> moment).

I can see that this would work, but I think I will persevere a bit more
with the current vhost config to see if we can't get Apache to do the
right thing (i.e., what I want it to do ;-)

Any more advice is welcome of course...

thanks,

- thomas beale

Received on Mon Jul 10 15:04:19 2006

This is an archived mail posted to the Subversion Users mailing list.