
Re: Subversion History question

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Fri, 8 Jan 2010 12:44:57 +0100

On Wed, Jan 6, 2010 at 9:07 PM, Ryan Schmidt
<subversion-2010a_at_ryandesign.com> wrote:
>
> On Jan 6, 2010, at 12:31, Andreas Hoegger wrote:
>
>> Yes I do. You can imagine that if 150 developers have been working for 5 years, nobody will use the 'blame'/'annotate' commands unless they feel infinitely bored, at least with version 1.4 (it would take hours). Is there really nobody having the same problems?
>
> You're saying you're using Subversion 1.4 and "svn blame" takes hours to run? That doesn't sound like it should be.
>
> I use "svn blame" probably daily in my work on the MacPorts project. It's not slow. We use Subversion 1.6 now, but I don't remember "blame" ever being slow; it returns in seconds. Our repository has over 62,000 revisions and is 7.5 years old. We have over 120 registered committers, but probably only a few dozen are particularly active at the moment. But "blame" is very helpful to me in trying to figure out why a file says what it says. Just a couple days ago I used "blame" to research the complete 2-year history of a particular line of code, to try to understand why it was there (and not because I was bored):
>

Just jumping in here to support Andreas' complaint about blame being
slow: yes it's definitely slow. See:
http://subversion.tigris.org/issues/show_bug.cgi?id=3074 - Improve
performance of svn annotate

However, the slowness only hits you when you blame a large file with a
lot of revisions, with "large file" meaning more than a couple hundred
KB, and "lot of revisions" meaning more than a thousand or so.
Apparently, this is not so common for "source" files (that you'd want
to blame), so not a lot of svn users actually experience this. But in
our repo we have a 2 MB XML file with 6000 revisions on which blame is
very useful. Blame on that file takes more than 4 hours, so we don't do
it anymore. Before SVN we were on CVS, and there it took 15 seconds.
So I can certainly feel Andreas' pain :(. I can only hope that more
and more users hit this problem, so that it gets some more attention
and the issue will be addressed some day...

Also, log is slow (though not as bad as blame): "svn log"-ing that
6000-revision file takes 4 minutes or so (back on CVS that was 5
seconds). Neither problem is network-related (everything is on a LAN
here), but they do have different causes:
- blame is mainly client-side I/O-bound (the client gets 6000 binary
deltas and computes the line-based blame on the client side).
- log is server-side I/O-bound (the server crawls 6000 FSFS rev files to
get those 6000 log messages).
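
To illustrate why the client-side blame cost grows with both file size
and revision count, here is a minimal sketch of a line-based blame in
Python. This is not Subversion's actual implementation: it assumes the
full text of every revision is already available (in reality the client
first has to reconstruct each one from a binary delta), and it uses
Python's difflib as a stand-in diff algorithm. The function name "blame"
and the input format are made up for the example.

```python
import difflib


def blame(revisions):
    """Annotate each line of the newest revision with the revision that
    last introduced it.

    revisions: list of (rev_id, lines) tuples, oldest first, where
    lines is a list of strings (hypothetical input format).
    Returns a list of (rev_id, line) for the newest revision.
    """
    annotated = []   # (rev_id, line) pairs for the previous revision
    prev_lines = []  # plain lines of the previous revision
    for rev_id, lines in revisions:
        # One full line-based diff per revision: this is the part whose
        # cost scales with file size, repeated once per revision.
        matcher = difflib.SequenceMatcher(None, prev_lines, lines,
                                          autojunk=False)
        new_annotated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep the revision they already had.
                new_annotated.extend(annotated[i1:i2])
            elif tag in ("replace", "insert"):
                # New or changed lines are attributed to this revision.
                new_annotated.extend((rev_id, line) for line in lines[j1:j2])
            # "delete": the lines are gone, nothing to carry forward.
        annotated = new_annotated
        prev_lines = lines
    return annotated


if __name__ == "__main__":
    history = [
        (1, ["a", "b", "c"]),
        (2, ["a", "B", "c"]),
        (3, ["a", "B", "c", "d"]),
    ]
    for rev_id, line in blame(history):
        print(rev_id, line)
```

With a 2 MB file and 6000 revisions, a loop like this runs 6000 diffs
over the whole file, which gives a feel for how the total time adds up
even before any delta reconstruction is counted.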

Our server is 1.5.4 on Solaris.
Access is via https (I tried http and svnserve, no significant difference).
The back end is FSFS on a NAS (over NFS).

Having the back end on a NAS over NFS might be a factor in the log
slowness (when we upgrade to 1.6.x, I hope that packing the repo might
help here). I've experimented with putting the repo on local disk, and
that about halved the log time (down to 2 minutes, woohoo :)). Also,
BDB might have better performance characteristics for this (I haven't
tested this, but it might be a reason why the svn devs have not hit
this themselves).

Anyway, I can understand Andreas' desire to clean out old history
that's no longer needed, and that's only slowing things down. But
sometimes, having that entire history is very useful, so we chose not
to do that. Instead, we learned to live with the current limitations
for now, and we hope they will be improved someday...

Regards,
Johan
Received on 2010-01-08 12:45:35 CET

This is an archived mail posted to the Subversion Users mailing list.