Kean Johnston wrote:
>Ok I read through everyone's response to my request for
>enlightenment about the ever-increasing version number
>for an entire tree. My gut instinct still tells me this
>may lead to problems in the future, but for now I buy it.
>However, one thing I don't buy as being efficient is the
>way people suggest we do tags.
>Although it may be a "relatively cheap" operation to copy
>a directory, please consider the effect when subversion
>is asked to maintain very large trees. Let's say there are
>a quarter of a million files. That means at the very least,
>assuming a single 4-byte integer is used for each file
>as its "pointer", 1 megabyte (give or take a teeny bit)
>per tag. If you want to make weekly, intra-weekly or
>possibly even daily tags, this can get very expensive.
On the server, copies are O(1) in both time and space. That means that any
copy takes the same amount of time and space regardless of the size of the
tree.
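
To illustrate why the size of the tree doesn't matter, here is a minimal
sketch of a copy that shares the source subtree instead of duplicating it.
The Node class and cheap_copy function are hypothetical names made up for
this example; this is an in-memory toy, not Subversion's actual storage
format.

```python
# Hypothetical in-memory model of a versioned tree; names are illustrative
# only and do not reflect Subversion's real data structures.
class Node:
    """A tree node that is never modified after creation.

    A file carries `content`; a directory carries `entries` (name -> Node).
    """

    def __init__(self, content=None, entries=None):
        self.content = content
        self.entries = dict(entries or {})


def cheap_copy(root, src, dst):
    """Return a new root directory in which `dst` refers to the node at `src`.

    Only the root's own entry table is rebuilt. The subtree under `src` is
    shared by reference, so the cost depends on the number of entries in the
    root directory, not on how many files the copied subtree contains.
    """
    entries = dict(root.entries)
    entries[dst] = root.entries[src]  # one new reference, no recursion
    return Node(entries=entries)


# A "trunk" with 1000 files, then a tag made by a cheap copy.
trunk = Node(entries={f"file{i}.c": Node(content=b"...") for i in range(1000)})
rev1 = Node(entries={"trunk": trunk})
rev2 = cheap_copy(rev1, "trunk", "release-1.0")
```

In this sketch the tag costs one extra directory entry whether trunk holds
a thousand files or a quarter of a million, and the older root (rev1) is
left untouched.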
>How about this. Since there is always just a single version
>that the tree is at at any given time (let's say when I
>make the tag it's at version 3261). If I was to use the
>yet-to-be-written "svn tag my_release_tag_name", it
>could use a single database record in an SVN specific file
>at the root of the tree that simply records the current
>tree version. Thus if I ever check out my_release_tag_name
>it knows I really mean release 3261. This then limits the
>data required to store the tag to 4 bytes for the revision
>number and however many bytes the symbolic tag name is.
>This also *HAS* to be a quicker operation than directory
>copying, no matter how fast a directory copy is.
It's quicker only by a constant factor; the time complexity is the same
-- i.e., constant.
>My other concern is with the "hidden cached copies of
>every file" scheme. For something the size of Apache,
>and subversion, maybe even something meatier like X11,
>that may be OK, but when your source tree is over 3G
>in size, you now double that to 6G. That's a huge hit.
>Can we at least open up a discussion about possibly
>rethinking why the cached copy is needed? Is it THAT
>important that you can revert a file on an aeroplane?
>Wouldn't keeping a simple CRC or even MD5 hash of the
>file to be able to *detect* changes suffice? Or at
>least give the svn repository manager the option of
>setting up his repository that way. Of course the
>problem becomes bigger when someone in the military
>decides to use subversion one day (ain't that a pun?)
>to manage their 40G ADA repositories.
That, of course, is a different issue altogether. And it's a
client-side issue, not a server-side one. We're well aware of this problem,
although the solution for now is "disk is cheap" :-) The _real_ solution
we've been talking about on and off is almost exactly what you propose.
Welcome to the club! :-)
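
For the record, the detect-only idea could look roughly like the sketch
below: the client records a digest per file instead of keeping a full cached
copy, and compares against it later. The function names and the choice of
MD5 are assumptions for illustration, not what the Subversion client
currently does.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large files are never loaded into
    memory all at once.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def is_modified(path, recorded_digest):
    """Report whether the file differs from the digest recorded at checkout.

    `recorded_digest` is assumed to have been stored by the client when the
    file was last checked out or committed.
    """
    return file_digest(path) != recorded_digest
```

The trade-off is the one you already named: a digest lets the client detect
a change, but reverting or diffing the file then needs a trip to the server.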
>I hope this is food for thought. I don't mean to be
>a trouble maker :)
Not at all.
Brane Čibej <brane_at_xbc.nu> http://www.xbc.nu/brane/
Received on Wed Sep 25 03:13:40 2002