Re: FSFS format7 and compressed XML bundles

From: Ben Reser <ben_at_reser.org>
Date: Thu, 28 Feb 2013 10:58:07 -0800

On Thu, Feb 28, 2013 at 8:37 AM, Ben Reser <ben_at_reser.org> wrote:
> I just don't see this happening unless someone has a very clever idea
> that I haven't thought of.

When I spoke with Julian here at ApacheCon, he mentioned that gzip has
an rsyncable option. Looking into this, it turns out that there is a
patch applied to Debian's gzip that provides this option. It resets the
compression algorithm every 1000 bytes and thus produces blocks that
can be reused between revisions of the file. gzip uses the same DEFLATE
algorithm that most zip files use, so the same idea could be applied
to them. If we want to deal with something like this in Subversion, I
think we'd deal with it via some sort of plugin for specific file
types that could convert them to an encoding that deltifies more
efficiently before committing. Unfortunately, we don't have any sort
of plugin infrastructure for this today.
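
To make the block-reset idea concrete, here is a minimal sketch in
Python using the standard zlib module. It is not the Debian patch: it
simply assumes a reset at fixed 1000-byte offsets (the figure quoted
above) and uses Z_FULL_FLUSH to discard the compressor's history at
each boundary. The sample data and the names compress_in_blocks,
old_rev and new_rev are made up for illustration.

```python
import zlib

BLOCK_SIZE = 1000  # assumed reset interval, taken from the figure quoted above


def compress_in_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Compress data as one raw DEFLATE stream, resetting state after each block.

    Returns the compressed bytes produced for each input block, plus a
    final piece that terminates the stream.
    """
    co = zlib.compressobj(level=6, wbits=-15)  # negative wbits: raw DEFLATE, no header
    pieces = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        # Z_FULL_FLUSH byte-aligns the output and forgets all earlier input,
        # so the bytes emitted for the next block depend only on that block.
        pieces.append(co.compress(chunk) + co.flush(zlib.Z_FULL_FLUSH))
    pieces.append(co.flush(zlib.Z_FINISH))
    return pieces


# Two invented "revisions" of a file that differ in a single byte.
old_rev = b"".join(b"paragraph %d of a sample document\n" % i for i in range(300))
new_rev = old_rev[:5000] + b"X" + old_rev[5001:]

old_pieces = compress_in_blocks(old_rev)
new_pieces = compress_in_blocks(new_rev)

# Count the compressed pieces that are byte-for-byte identical.
unchanged = sum(a == b for a, b in zip(old_pieces, new_pieces))
print(f"{unchanged} of {len(new_pieces)} compressed pieces unchanged")

# The concatenated pieces are still one valid DEFLATE stream.
assert zlib.decompress(b"".join(new_pieces), wbits=-15) == new_rev
```

Only the piece covering the edited byte should change, which is what
gives a delta algorithm something to match against. Note that fixed
offsets only help with same-length edits like this one: an insertion
shifts every later block boundary, so a real implementation would need
to pick its reset points from the content rather than from the offset.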

Even so, there are things that can be done today. I made a couple of
trivial Microsoft Word documents, one containing the characters
"abc" and one containing "abcdef". I saved the files in .docx
and in the 2003 flat XML format. The .docx files produced a delta of
3262 bytes; the .xml files produced a delta of just 358
bytes.

OpenOffice/LibreOffice support flat versions of their formats (e.g.
.fodt) that are not compressed and can also be stored more efficiently
in Subversion. LibreOffice even has a wiki page about this:
https://wiki.documentfoundation.org/Libreoffice_and_subversion
Received on 2013-02-28 19:58:45 CET

This is an archived mail posted to the Subversion Dev mailing list.
