Re: SVN scalability problem as number of tags grows

From: Greg Hudson <ghudson_at_mit.edu>
Date: Sat, 21 Feb 2009 12:46:31 -0500

On Sat, 2009-02-21 at 12:23 -0500, John Coiner wrote:
> Is this a known issue? Are there plans to make this more scalable? I
> searched the issues database and did not find anything that looked like
> a duplicate. Should I file a new issue?

It is a known issue that svn's back end storage of directories with many
entries isn't terribly efficient. All revisions of all directory lists
are stored in full, so a directory with many entries takes O(n) time to
modify and O(n) space to hold each new revision (O(n^2) space total, if
the number of changes is proportional to the number of entries).
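
To make the arithmetic concrete, here is a minimal sketch of the cost model described above. It is not Subversion code; the function name and the one-tag-per-revision assumption are only illustrative. It counts how many directory entries get written in total if every new revision of a flat directory stores the full listing again.

```python
def total_entries_stored(num_tags: int) -> int:
    """Count entries written when each revision stores the full directory list.

    Assumption for illustration: every revision adds exactly one tag to a
    single flat directory, and the whole listing is rewritten each time.
    """
    total = 0
    entries = 0
    for _ in range(num_tags):
        entries += 1      # one new tag is added to the directory
        total += entries  # the complete listing is stored again: O(n) per change
    return total


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The total grows as n * (n + 1) / 2, i.e. O(n^2).
        print(n, total_entries_stored(n))
```

Multiplying the number of tags by ten multiplies the total stored by roughly a hundred, which is the quadratic growth in question.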

Since we use directories to hold tags, this issue applies to large
numbers of tags if they are stored in a single flat directory, as the
usual convention suggests.

I don't know of any plans to make this more scalable. It would require
a significant rearchitecting of directory storage. One approach would
be to use a balanced tree with many roots to hold all revisions of a
directory--but to do that, we'd have to store all revisions of a
directory together (not necessarily in the same disk blocks, but in some
fashion designed to avoid excessive seeking). In FSFS, because of other
design constraints, that's simply not practical. In BDB it might be more
tractable.

> Do you have any recommendations for a work around?

Organizing the tags in a tree structure is probably the best workaround,
as you have already found.
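
One possible way to lay out such a tree is sketched below. It is only an example scheme, not the one John used or anything Subversion prescribes: the function name `nested_tag_path`, the `tags` root, and the hash-based bucketing are all assumptions. Grouping by project, release series, or date would be more readable where the tag names allow it.

```python
import hashlib


def nested_tag_path(tag: str, depth: int = 2, width: int = 2) -> str:
    """Map a tag name to a path under nested bucket directories.

    Hypothetical scheme: the buckets are taken from the hex SHA-1 digest of
    the tag name, so tags spread evenly and no single directory holds more
    than a small share of them.
    """
    digest = hashlib.sha1(tag.encode("utf-8")).hexdigest()
    buckets = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return "/".join(["tags", *buckets, tag])


if __name__ == "__main__":
    for tag in ("build-20090221-001", "build-20090221-002"):
        print(nested_tag_path(tag))
```

With two levels of two hex characters there are 65536 leaf buckets, so each directory stays small and creating a tag only rewrites short listings.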

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1204282
Received on 2009-02-21 18:46:52 CET

This is an archived mail posted to the Subversion Dev mailing list.