On 28.02.2013 08:04, Magnus Thor Torfason wrote:
> Hey all,
>
> I've been following the discussion about FSFS format7, and had a
> question: Is there any chance that the format would improve storage
> efficiency for documents that are stored as compressed (zipped)
> bundles of XML files and other resource files (read: MS Office
> documents, but OpenOffice is similar)?
>
> I'm finding that making very small changes in big documents (with
> embedded images) results in rapid growth of the repository, since the
> binary diff algorithm does not seem able to find efficient deltas
> for this type of document, even though analysis of the contents
> shows that they are almost unchanged.
>
> This may be outside the scope of format7, but I thought I'd ask the
> question nevertheless.

It is outside the scope: format7 is about physical storage layout and
does not affect the delta/compression layer, which is the one
responsible for the effect you're seeing.

We're aware of the issues regarding compressed files, and I expect we
will eventually come up with a solution. The problem just hasn't seemed
all that important compared to other things we're trying to solve.

That said, I'm sure we'd welcome any suggestions about how to handle
such files more efficiently. I can think of a few (e.g., decompress the
files before deltifying them), but it's always good to hear other points
of view.

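
To make the "decompress before deltifying" idea concrete, here is a rough
Python sketch. It is only an illustration under stated assumptions: the
payload is synthetic, zlib.compress stands in for one member of a zipped
document, and "delta" is approximated by deflating the new version with
the old version as a preset dictionary. That is not Subversion's delta
code, and make_document, make_delta and apply_delta are hypothetical
helper names.

```python
import random
import string
import zlib


def make_document(seed: int, lines: int = 300) -> bytes:
    """Build a synthetic XML-like payload (hypothetical stand-in for document content)."""
    rng = random.Random(seed)
    body = []
    for i in range(lines):
        text = "".join(rng.choice(string.ascii_lowercase) for _ in range(40))
        body.append(f'<p id="{i}">{text}</p>\n')
    return "".join(body).encode("ascii")


def make_delta(base: bytes, target: bytes) -> bytes:
    """Approximate a binary delta: deflate target with base as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=base)
    return comp.compress(target) + comp.flush()


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Reconstruct target from base plus the delta produced by make_delta."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(delta) + decomp.flush()


# Old revision, and a new revision with a handful of small insertions.
old_raw = make_document(seed=0)
new_raw = old_raw
for line_id, inserted in [(1, "Hello "), (100, "minor fix "),
                          (200, "2013 "), (290, "typo?? ")]:
    marker = f'<p id="{line_id}">'.encode("ascii")
    new_raw = new_raw.replace(marker, marker + inserted.encode("ascii"), 1)

# What the repository sees today: the compressed form of each revision.
old_packed = zlib.compress(old_raw)
new_packed = zlib.compress(new_raw)

# Delta computed on the compressed bytes vs. on the decompressed bytes.
packed_delta = make_delta(old_packed, new_packed)
raw_delta = make_delta(old_raw, new_raw)

print("compressed revision size:  ", len(new_packed))
print("delta on compressed bytes: ", len(packed_delta))
print("delta on decompressed bytes:", len(raw_delta))
```

In a sketch like this, the delta taken on the decompressed bytes should
come out far smaller, because a small edit tends to change most of the
compressed stream that follows it. A real implementation would also have
to reproduce the original container byte for byte on checkout, which
this sketch does not attempt.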
-- Brane
--
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com
Received on 2013-02-28 17:25:35 CET