On Tue, Aug 4, 2009 at 6:19 AM, Bolstridge,
> I was thinking about how binaries are stored in the WC. Currently, when you
> check out any WC, you get a copy of each file in the metadata directories,
> which is fine for things like source files and other text files, but when
> people come to store binaries in their repositories, this starts to be a
> Storing binaries, I hope, isn’t controversial. I do it for shipped
> releases, but also store Word documents, images and similar.
> Thinking about why a copy of each file is stored locally, it makes
> perfect sense for text, as you can perform certain operations really quickly
> – diff for example. However, I’m not sure these operations can be reasonably
> applied to binary files.
> So, would it be a good optimisation to store just the hash of a
> binary in the WC instead of the full contents? It could cut down the size of
> the WC considerably (its size is often cited as a problem with SVN) and might
> improve checkout times. I don’t think it would have any impact on the
> operation of the WC either. Revert is the only operation I can think of that
> would be an issue, but in this case the original would be retrieved from the
The base file is also used to produce and exchange deltas with the
server. When you commit a change to the binary, only the deltas are
sent to the server. Likewise on update. On some binaries, this work
is all a tremendous waste of CPU as the delta is meaningless, but on a
lot of binaries it does produce significant savings.
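To make the trade-off concrete, here is a rough Python sketch. It is not how Subversion implements anything: `sha1_of`, `is_modified`, `naive_delta` and `apply_delta` are hypothetical helpers, and the prefix/suffix "delta" is a toy stand-in for the real delta algorithm. It shows that a stored hash is enough to detect a local modification, while producing (or applying) a delta needs the full base contents.

```python
import hashlib
from typing import Tuple


def sha1_of(data: bytes) -> str:
    """Hex digest that a hash-only WC could keep instead of the base file."""
    return hashlib.sha1(data).hexdigest()


def is_modified(working: bytes, base_hash: str) -> bool:
    """Change detection needs only the stored hash, not the base contents."""
    return sha1_of(working) != base_hash


def naive_delta(base: bytes, working: bytes) -> Tuple[int, int, bytes]:
    """Toy delta: (common prefix length, common suffix length, new middle).

    Computing it requires the full base contents; a hash is not enough.
    """
    limit = min(len(base), len(working))
    prefix = 0
    while prefix < limit and base[prefix] == working[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and base[-1 - suffix] == working[-1 - suffix]:
        suffix += 1
    return prefix, suffix, working[prefix:len(working) - suffix]


def apply_delta(base: bytes, delta: Tuple[int, int, bytes]) -> bytes:
    """Rebuild the new contents from the base plus the toy delta."""
    prefix, suffix, middle = delta
    return base[:prefix] + middle + base[len(base) - suffix:]


base = b"A" * 1000 + b"old" + b"B" * 1000
working = b"A" * 1000 + b"newer" + b"B" * 1000

modified = is_modified(working, sha1_of(base))  # works with the hash alone
delta = naive_delta(base, working)              # needs the base itself
```

In this example the delta carries only the five changed bytes rather than the whole 2005-byte file, which is the kind of saving that would be lost without a local base.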
> Would this be something to consider for wc-ng?
I believe these are all things on the table for post wc-ng. Meaning
once the new design is in place, then we can start adding new
I'd like to see an svn: property for binaries that says not to store a
base copy and just send the full file when committing. If we wanted
to get fancy, this property could even control the delta algorithm
used, or perhaps provide some hints to the algorithm so that it could
handle the file better.
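Purely as an illustration of that idea, a sketch in Python follows. The property names `svn:hypothetical-no-text-base` and `svn:hypothetical-delta-hint` are made up for this example and do not exist in Subversion; `StoragePolicy` and `policy_for` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical property names, invented for this sketch only.
NO_BASE_PROP = "svn:hypothetical-no-text-base"
DELTA_HINT_PROP = "svn:hypothetical-delta-hint"


@dataclass(frozen=True)
class StoragePolicy:
    keep_text_base: bool  # store a pristine copy in the WC?
    transfer: str         # "fulltext", or a delta hint such as "default"


def policy_for(props: Mapping[str, str]) -> StoragePolicy:
    """Decide how a file would be stored and sent, based on its properties."""
    if NO_BASE_PROP in props:
        # No base copy means no delta can be computed: send the full file.
        return StoragePolicy(keep_text_base=False, transfer="fulltext")
    # Otherwise keep the base and let an optional hint pick the algorithm.
    return StoragePolicy(
        keep_text_base=True,
        transfer=props.get(DELTA_HINT_PROP, "default"),
    )


plain = policy_for({})
no_base = policy_for({NO_BASE_PROP: "*"})
hinted = policy_for({DELTA_HINT_PROP: "none"})
```

Keeping the decision per file, driven by a property, would leave the current behaviour as the default and let users opt out only for the binaries where a base copy does not pay off.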
Received on 2009-08-04 15:25:13 CEST