
Re: Fwd: Effects of importing over 18000 items into repository as one commit transaction

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2006-01-11 03:12:44 CET

Pavel Repin wrote:
> This did not result in any response on the users list.
> Hopefully someone among svn hackers might know about the impacts of a
> largish commit.

It looks like there was no response from the developers' list either. Sorry
about that. Probably lots of people read it but they all thought "Hmm, I don't
know. It's rather vague." At least that was my reaction.

> ---------- Forwarded message ----------
> From: Pavel Repin <prepin@gmail.com>
> Date: Nov 29, 2005 8:18 AM

> We are rolling out a subversion install at work.
> One of the teams created a FSFS repository and svn-imported a snapshot
> of their entire source tree in one transaction. The resulting FSFS
> revision file "1" is 261 MB and contains 18834 items. Do you think that
> was a bad idea to import the entire tree in one shot?

That's not an unreasonable size of import, so it should be fine.

> I sort of suspect it was a bad idea because I am noticing that "svn log"
> takes considerable amount of time before it dumps anything to stdout on
> a repository that had only 35 checkins so far.

How slow is it? What exact command are you using? (Just "svn log" in an
up-to-date working copy?) What version of Subversion are you using (both
client and server)?

A whole "svn log" of a large repository does typically take a very long time,
but that's (I assume) because a large repository typically has a very large
number of revisions (e.g. tens of thousands). In old versions of Subversion it
used to take a long time before starting to print anything, but that was fixed.
It ought to be quick on only 35 revisions regardless of how big each revision is.

Does it work quickly if you avoid the first revision, e.g. with "svn log
-rHEAD:2" or "svn log --limit=34" ?

Is it slow if you just request the log of the big revision ("svn log -r1")?
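
If it helps to get comparable numbers, the three variants above could be timed
with something like the following. This is only a rough sketch: the helper
names and the "." target (an up-to-date working copy in the current directory)
are my assumptions, and it needs the "svn" client on your PATH.

```python
import subprocess
import time


def build_log_commands(target="."):
    """Return the three 'svn log' variants suggested above, as argv lists.

    'target' is an assumption: a working copy path or a repository URL.
    """
    return [
        ["svn", "log", "-r", "HEAD:2", target],   # every revision except r1
        ["svn", "log", "--limit", "34", target],  # only the newest 34 revisions
        ["svn", "log", "-r", "1", target],        # only the big import
    ]


def time_command(argv):
    """Run argv, discard its output, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a failing command is still timed, not raised
    )
    return time.perf_counter() - start
```

Calling time_command() on each entry of build_log_commands() and comparing the
results should show whether revision 1 alone accounts for the delay.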

If the large revision is causing a massive slow-down in "svn log", that's
certainly something we ought to investigate, but it might be a low priority if
it is only moderate and/or only occurs on revision 1.

> I've seen much better "svn log" performance on a repository with vastly
> larger number of checked in revisions, but that repository grew
> naturally (it started from nothing and it grew little by little with
> each checkin).
> Should we have imported that tree as a set of smaller checkins?

Well, if this problem is a real nuisance then that would probably be a way to
avoid it. (Specifically: yes, breaking it into even a small number of pieces
might well speed it up a lot, if the slowdown is due to quadratic time required
somewhere in the implementation.) If you can live with it for the short to
medium term, the inefficiency may eventually be fixed. If you or someone you
know can help fix it, of course, it could happen much sooner!
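
To illustrate why splitting could help so much in the quadratic case: if some
step did work proportional to the square of the number of items in a revision
(purely a hypothetical cost model, I have not checked that any such step
exists), then k equal pieces would cost k * (n/k)^2 = n^2 / k in total. A small
sketch of that arithmetic, with made-up function and parameter names:

```python
def relative_cost(items, pieces):
    """Total cost of 'pieces' equal revisions relative to one big revision.

    Assumes a hypothetical per-revision cost proportional to the square of
    the number of items in that revision.
    """
    per_revision = items / pieces
    split_cost = pieces * per_revision ** 2
    single_cost = items ** 2
    return split_cost / single_cost
```

Under that assumption, splitting the 18834 items into 4 pieces would leave
about a quarter of the original cost, and 10 pieces about a tenth.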

Finally, I note from the CHANGES file
<http://svn.collab.net/repos/svn/trunk/CHANGES> that there were "svn log"
performance regressions in v1.2.0, fixed in v1.2.1, and there is a further
improvement in v1.3.0, so try that when you can. I don't know the detailed
effects of these.

- Julian

Received on Wed Jan 11 03:13:37 2006

This is an archived mail posted to the Subversion Dev mailing list.
