Re: tagging performance issues

From: John Szakmeister <john_at_szakmeister.net>
Date: 2005-05-20 20:51:15 CEST

Erik Huelsmann wrote:
> On 5/20/05, John Szakmeister <john@szakmeister.net> wrote:
>
>>On Thursday 19 May 2005 21:54, Ben Collins-Sussman wrote:
>>
>>>On May 19, 2005, at 6:33 PM, Sreekanth Puram wrote:
>>>
>>>>Any help would be appreciated.
>>>
>>>The time it takes to add a new entry to a directory is O(n). The
>>>repository compresses subsequent versions of file-nodes, but that's
>>>not true for directory-nodes. Every new version of a directory is
>>>written out in full: that is, the entire list of directory entries
>>>is written out every time you add a new child (and create a new
>>>revision).
>>
>>Does that then counter our claim that tagging is O(1) that we so
>>prominently advertise? From our web page:
>> * Branching and tagging are cheap (constant time) operations
>> There is no reason for these operations to be expensive, so they
>> aren't.
>
>
> No, as that refers to the number of files which are being tagged,
> which still is a constant factor.
>
>
>> Branches and tags are both implemented in terms of an underlying "copy"
>> operation. A copy takes up a small, constant amount of space. Any copy
>> is a tag; and if you start committing on a copy, then it's a branch as
>> well. (This does away with CVS's "branch-point tagging", by removing the
>> distinction that made branch-point tags necessary in the first place.)
>>
>>
>>>So, perhaps you shouldn't create 20,000 entries in /tags. Spread
>>>them out, create some sort of tree structure below /tags/.
>>
>>Or, perhaps, we should live up to our claim. I'm disappointed about this.
>>We don't generate that many tags or releases, so we would never suffer
>>from it. But a lot of corporate guys see this as a big boon... and it
>>isn't true. :-(
>
>
> It *is* true. CVS isn't constant in the number of files to be tagged
> or in the number of tags to be created. Subversion only exhibits the
> second behaviour.

It's not clear from the wording that that's the case. As an end user, I
don't distinguish between the two steps of copying the node and
actually inserting it into the tags folder. I can't create a tag without
inserting it into a tags directory of some sort. So the operation as a
whole isn't constant time, and the claim is a bit misleading. Don't get
me wrong, I'm not saying that the team hasn't attacked the largest part
of the problem. But if one part of the process is O(n), then the whole
thing is O(n). You're only as fast as the slowest part. :-)
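
To make the arithmetic concrete, here is a rough cost model in Python. It is
only a sketch of the behaviour Ben describes above (the full entry list of the
directory is rewritten on every add), not Subversion's actual code. The
function names and the unit of "cost" (entries written) are my own assumptions.

```python
def tag_cost(existing_tags: int) -> int:
    """Entries written when one more tag is added to a flat tags directory.

    Assumed model: the copy itself costs a constant 1, and the parent
    directory's full entry list (old entries plus the new one) is rewritten.
    """
    copy_cost = 1                      # the cheap copy: constant
    rewrite_cost = existing_tags + 1   # whole directory listing written again
    return copy_cost + rewrite_cost


def total_cost(num_tags: int) -> int:
    """Entries written in total while creating num_tags tags one at a time."""
    return sum(tag_cost(n) for n in range(num_tags))


if __name__ == "__main__":
    # Cost of the next tag with an empty directory versus a crowded one.
    print(tag_cost(0), tag_cost(20_000))
    # Cumulative cost grows quadratically under this model.
    print(total_cost(3))
```

Under this model the constant copy is swamped by the directory rewrite as soon
as the tags directory gets large, which is the point I'm making about the
operation as a whole.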

-John

Received on Fri May 20 20:53:46 2005

This is an archived mail posted to the Subversion Users mailing list.