Please accept the standard apology if I'm asking a question that's already
been asked. I didn't see this in the issue trackers.
Also, please accept the usual spooge from a very happy user. Subversion
rocks. I got everybody here (about 25 programmers) to switch about a year
ago now, and we haven't had a single hiccup or problem. Rock solid. I
can't imagine anybody using any other product for version control.
(Especially considering the price.)
Now for the request:
Would it be difficult to add a mechanism to avoid storing the duplicate base
copy of every file in the administrative area? I understand the advantages
of this and for most situations I think it's really nice to be able to diff
and revert without contacting the server. Also I understand it facilitates
optimizing communications by sending deltas in both directions. However,
there are situations where these extra copies are quite wasteful. For
example, I'm in the video game industry, and we'd like to use Subversion to
manage our binary assets, which can easily be tens of GB. Storing two
copies of all these files is a pretty big waste of disk space. What's more,
many of the assets are compressed assets where a simple binary delta often
isn't very effective, so sending the entire file over the wire rather than a
delta for every commit/update wouldn't be a loss. (And the repository runs
over a LAN, not the general internet, so we've got some bandwidth to spare.)
In our situation, there may be 10,000 files, and for 95% of those files, the
only thing a user ever does is update to the next revision; he isn't
actually working on the file. And the virus scanner is going to be
scanning BOTH copies of every file all the time.
In other words, there are situations where you've got more bandwidth than
disk space, and it'd be nice if Subversion could work more efficiently in
those situations.
Here are some thoughts on how it could be implemented (coming from a user
who has never looked at the code):
- Defer fetching the base copy into the administrative area
until it is actually needed, for example on a revert or a diff. Whenever the
base revision is needed, Subversion would check whether the file is missing
and, if so, request it from the server. Everything else would then (mostly)
work the same. As I mentioned above, I know some operations depend on sending
deltas, so I don't know how complicated it would be to handle the case
where there is no base file from which to compute a delta and the entire file
has to be sent over the wire.
- When would these duplicates be deleted? Maybe during a cleanup,
or an update to a different revision.
- I'm not sure what the best mechanism would be to activate this
feature. Maybe a directory svn:xxx property (the ability to inherit
props would be really useful here). Or maybe it's 100% a client setting, and
the repository just needs to be capable of dealing with some clients sending
entire files rather than deltas?
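
To make the three points above concrete, here is a rough sketch in Python of
the behaviour I have in mind. It is not based on the Subversion code or its
API; the class LazyBaseStore, the fetch callback, and the example path are all
made up for illustration, and the "server" is just a dictionary.

```python
from typing import Callable, Dict


class LazyBaseStore:
    """Hypothetical client-side store of base (pristine) copies.

    Nothing here is a real Subversion interface; it only models the idea
    of fetching a base copy the first time an operation needs it.
    """

    def __init__(self, fetch_from_server: Callable[[str, int], bytes]) -> None:
        self._fetch = fetch_from_server       # stand-in for a server request
        self._cache: Dict[str, bytes] = {}    # base copies currently on disk
        self._revision: Dict[str, int] = {}   # base revision of each path
        self.fetch_count = 0                  # how often we hit the server

    def record_update(self, path: str, revision: int) -> None:
        # Updating to a different revision makes any stored base copy
        # stale, so drop it instead of downloading a new one right away.
        if self._revision.get(path) != revision:
            self._cache.pop(path, None)
        self._revision[path] = revision

    def get_base(self, path: str) -> bytes:
        # Fetch the base copy only when it is missing and actually needed.
        if path not in self._cache:
            self.fetch_count += 1
            self._cache[path] = self._fetch(path, self._revision[path])
        return self._cache[path]

    def is_modified(self, path: str, working_contents: bytes) -> bool:
        # A diff-style check: this is one of the operations that needs
        # the base copy and therefore may trigger a fetch.
        return working_contents != self.get_base(path)

    def revert(self, path: str) -> bytes:
        # Revert hands back the base contents, fetching them if necessary.
        return self.get_base(path)

    def cleanup(self) -> None:
        # A cleanup throws away every stored base copy to reclaim space.
        self._cache.clear()


# A fake repository keyed by (path, revision), purely for the demo.
FAKE_REPOSITORY = {
    ("textures/rock.dds", 7): b"rock at revision 7",
    ("textures/rock.dds", 8): b"rock at revision 8",
}


def fake_fetch(path: str, revision: int) -> bytes:
    return FAKE_REPOSITORY[(path, revision)]


store = LazyBaseStore(fake_fetch)
store.record_update("textures/rock.dds", 7)   # plain update: no base stored
print(store.fetch_count)                      # still 0 at this point
print(store.revert("textures/rock.dds"))      # first real need: one fetch
print(store.fetch_count)
```

For the 95% of files that are only ever updated, record_update is the only
call that happens, so no second copy is ever written. The cost shows up the
first time somebody diffs or reverts a file, which needs a round trip to the
server.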
Please let me know if this is feasible. There are two missing features that
are making us hesitate to use Subversion to manage our binary assets. One is
more robust props (repository-side control and inherited props), and this
one is the other.
- Fletcher Dunn
Received on Fri Jul 21 05:20:38 2006