
RE: svn blame unworkably slow

From: Johan Corveleyn <johan.corveleyn_at_uz.kuleuven.ac.be>
Date: Wed, 6 May 2009 18:16:54 +0200

Some more info in case anyone is still interested :) ...

Using one of the "ignore-whitespace" options with svn blame (-x-b or -x-w) improved the performance a lot (though it's still too slow to be really useful in an "online" scenario). I guess that's because some large revisions that changed only the indentation of every line can now be filtered out.

For the collapsed history, "svn blame -x-b" (over https) gives me:
- Still ~8 minutes for file1.xml (2 MB large, 799 revisions)
- ~2.5 minutes for file2.xml (1.5 MB large, 471 revisions)

I guess file2.xml had some revisions among those 471 that changed the indentation of every line, and these are now effectively filtered out by -x-b.
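
To illustrate that guess, here is a minimal Python sketch. It is not how svn implements -x-b; it uses Python's difflib as a stand-in, and the changed_lines helper and the sample lines are made up for illustration. It shows that a spaces-to-tabs revision touches every line in a plain line-based diff, but no lines once runs of whitespace are treated as equivalent.

```python
import difflib
import re

def changed_lines(old, new, ignore_space_change=False):
    """Count lines of `new` that a line-based diff reports as replaced or inserted."""
    def key(line):
        # Rough stand-in for "ignore changes in the amount of whitespace":
        # collapse runs of spaces/tabs to one space and drop trailing whitespace.
        if ignore_space_change:
            return re.sub(r"[ \t]+", " ", line).rstrip()
        return line

    a = [key(line) for line in old]
    b = [key(line) for line in new]
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return sum(
        j2 - j1
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag in ("replace", "insert")
    )

# Hypothetical sample: the same three lines, re-indented from spaces to tabs.
old = ["    <a>", "        <b/>", "    </a>"]
new = [line.replace("    ", "\t") for line in old]

plain = changed_lines(old, new)                              # every line differs
ignoring = changed_lines(old, new, ignore_space_change=True)  # nothing differs
print(plain, ignoring)
```

With a plain comparison, blame would have to attribute every line to the re-indenting revision; with whitespace changes ignored, that revision contributes nothing and can be skipped over.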

For the full history, "svn blame -x-b" (over https) gives me:
- ~40 minutes for file1.xml (2 MB large, 5500 revisions)
- ~15 minutes for file2.xml (1.5 MB large, 2300 revisions)
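
As a rough sanity check on these numbers, the following Python sketch converts the four "svn blame -x-b" timings above into average seconds per revision. The timings are the approximate ("~") values reported in this mail, so the results are only indicative; the seconds_per_revision helper is just for illustration.

```python
# Approximate timings reported above for "svn blame -x-b" over https.
# Each entry: (label, minutes, revisions).
timings = [
    ("file1.xml collapsed", 8.0, 799),
    ("file2.xml collapsed", 2.5, 471),
    ("file1.xml full", 40.0, 5500),
    ("file2.xml full", 15.0, 2300),
]

def seconds_per_revision(minutes, revisions):
    """Average wall-clock seconds spent per revision."""
    return minutes * 60.0 / revisions

per_rev = {
    label: round(seconds_per_revision(minutes, revisions), 2)
    for label, minutes, revisions in timings
}
for label, value in per_rev.items():
    print(f"{label}: {value} s/revision")
```

All four cases land somewhere between roughly 0.3 and 0.6 seconds per revision, which would be consistent with the total time growing more or less with the number of revisions for files of this size.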

Still interested in any feedback, people with similar experiences/numbers, workarounds, ...

Maybe using another diff command on the client (--diff-cmd option), one that is really fast, would help? Does anyone have suggestions for a highly optimized diff for Windows XP? (I don't know if this would help blame at all...)

Regards,
Johan

> Anyone able to help with the issue below? Are these normal numbers, has
> anyone observed similar performance? Any suggestions on how I could
> improve the performance of svn blame?
>
> I have done some more refined and smaller tests by reducing the number
> of revisions on these large files, by collapsing a large part of
> history in cvs before migrating. I came to the following numbers (over
> https):
> - ~8 minutes for file1.xml (2 MB large, 799 revisions)
> - ~12 minutes for file2.xml (1.5 MB large, 471 revisions)
>
> This is already a lot better, but still quite unusable in a normal
> development scenario. Besides, collapsing our history is seen as a very
> undesirable workaround for this issue.
>
> Some more observations:
> 1) I think it's mainly client-side cpu bound. Most of the time my
> client cpu is 100% used by svn client (sometimes it gets a rest for a
> couple of seconds, during which time the server (httpd) is fully
> loading a cpu on the server).
> 2) I guess the size of the files plays a big role in this issue. A file
> with 500 revisions doesn't seem abnormal to me, so file2.xml in the
> test above should be fine, except that it's large ...
> 3) I've been trying to figure out why file2.xml was slower in this test
> than file1.xml, even though it's smaller and has fewer revisions. I
> think it's because there are a couple of changes in its lifetime that
> changed the entire file (all lines), because of changes in indentation
> (spaces->tabs and vice versa). So I guess that causes huge diffs at
> that point.
> 4) I've put my repo on a local disk (to the server) instead of on the
> NFS share. That improved log performance (more than 50% faster), but
> did nothing to improve blame. I guess that's to be expected because of
> point 1).
>
> Thanks for any suggestions or input about this problem.
> Regards,
> Johan
>
>
> > Hi all,
> >
> > We've been preparing/analyzing/testing a migration from CVS to SVN
> > for some time now. All went well until I started focusing on a
> > particular set of files in our repo: two large xml files that are
> > changed very often. I already reported problems with svn log on these
> > files (see
> > http://subversion.tigris.org/ds/viewMessage.do?dsMessageId=1980568&dsForumId=10650).
> > But that's still manageable/workaroundable (e.g. --limit 100 and the
> > like).
> >
> > However, running svn blame on one of these files is taking a very
> > long time:
> > - ~5 hours for file1.xml (2 MB large, 5500 changes) (locally on the
> > server with file:// protocol, since https remotely generates an error
> > after 50 minutes)
> > - ~35 minutes for file2.xml (1.5 MB large, 2300 changes) (remotely
> > over https)
> >
> > Note: to narrow down on the issue, I tried this on a SVN repo only
> > containing these two files (performed cvs2svn to a new repo, only
> > migrating the directory with these files).
> >
> > Obviously this is not usable (if only for the timeout/error when
> > trying this over https). For comparison, cvs annotate for file1.xml
> > took only 17 seconds.
> >
> > Some info on the setup:
> > SVN server 1.5.4 on Solaris 10 (package from sunfreeware.com)
> > FSFS backend via NFS on netapp
> > SVN client SlikSVN 1.5.5 on Windows XP (when testing remotely)
> >
> > After discussing with other developers, this is quite a big issue.
> > Some developers that have to work on those files perform an
> > "annotate" quite often, to investigate particular parts of the file
> > (who/what/why/when it changed).
> >
> > Is there anything I can do about this?
> >
> > Should I raise this on the dev list to discuss possible improvements
> > (e.g. caching line-based history info)? I have seen an open bug
> > report discussing this
> > (http://subversion.tigris.org/issues/show_bug.cgi?id=3074), but it
> > doesn't seem to have a high priority or likelihood of being addressed.
> >
> > Some more details of some tests:
> > --------
> > $ time svn blame https://svnserver/svn/trunk/file1.xml
> > svn: REPORT of '/svn/!svn/bc/96061/trunk/file1.xml': Could not read
> > chunk delimiter: Secure connection truncated (https://svnserver)
> >
> > real 49m39.066s
> > user 0m0.031s
> > sys 0m0.140s
> > --------
> >
> > When running it locally on the server with file://:
> > --------
> > $ time svn blame file:///path/to/repos/trunk/file1.xml
> >
> > [snip]
> > real 320m12.402s
> > user 305m40.880s
> > sys 11m54.947s
> > --------
> >
> > Any help is greatly appreciated.
> >
> > Regards,
> > Johan
> >
> > ------------------------------------------------------
> >
> > http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2069871
> >
> > To unsubscribe from this discussion, e-mail: [users-unsubscribe_at_subversion.tigris.org].
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2079688

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2082352

Received on 2009-05-06 18:18:01 CEST

This is an archived mail posted to the Subversion Users mailing list.
