Re: FSFS format7 and compressed XML bundles
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Thu, 28 Feb 2013 19:53:39 +0000 (GMT)
Ben Reser wrote:
> Speaking with Julian here at ApacheCon he mentioned that gzip has a
Use of such a zip format would be ideal -- Subversion's binary-delta would then calculate an excellent delta as long as each inserted chunk is smaller than the delta window size (currently 100 KB; 1 MB in Stefan's proposal).
I'm not sure about the details of how the restartable compression works, but it somehow selects points in the uncompressed data that don't depend on the absolute byte offset from the start of the file, and resets the compression at those points.
As I understand it, only the compressor needs the special logic, and the resulting compressed file is still in the same format and fully compatible with the standard decompression libraries.
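To make the idea concrete, here is a minimal Python sketch of content-defined reset points on top of the standard zlib module. It is not the actual "rsyncable" patch: the names `boundaries` and `restartable_deflate`, the rolling-sum rule, and the `WINDOW` / `MODULUS` values are all assumptions chosen for illustration. It only shows that reset points can be picked from the local content, and that the output is still readable by an unmodified decompressor.

```python
import zlib

WINDOW = 4096   # bytes covered by the rolling sum (assumed value)
MODULUS = 4096  # reset roughly once per MODULUS bytes on average (assumed value)


def boundaries(data, window=WINDOW, modulus=MODULUS):
    """Return end offsets where the sum of the last `window` bytes is 0 mod `modulus`.

    Each decision depends only on the preceding `window` bytes, never on
    the absolute offset from the start of the file.
    """
    cuts = []
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte that left the window
        if i + 1 >= window and rolling % modulus == 0:
            cuts.append(i + 1)
    return cuts


def restartable_deflate(data, window=WINDOW, modulus=MODULUS, level=9):
    """Compress `data`, resetting the compressor state at each boundary."""
    comp = zlib.compressobj(level)
    out = []
    start = 0
    for cut in boundaries(data, window, modulus):
        out.append(comp.compress(data[start:cut]))
        # Z_FULL_FLUSH ends the current block and discards the match history,
        # so later output no longer refers back to earlier input.
        out.append(comp.flush(zlib.Z_FULL_FLUSH))
        start = cut
    out.append(comp.compress(data[start:]))
    out.append(comp.flush())  # finish the stream
    return b"".join(out)
```

A plain `zlib.decompress(restartable_deflate(data))` returns the original data, so only the compressing side carries the extra logic. Inserting bytes near the start of the input shifts the later reset points by the same amount rather than moving them relative to the content.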
But unfortunately, although patches for this "restartable" or "rsyncable" mode of compression have been around for years, and it can have a very low overhead, it doesn't yet seem to have been implemented in the common compression libraries (such as zlib), and OpenOffice doesn't offer that mode.
Therefore this is not a practical solution at the moment.
> be saved between revisions of the file. gzip uses the same DEFLATE
Yes, a client-side plug-in -- either to Subversion or to OpenOffice -- seems to me the best practical solution.
There exists a plug-in to OpenOffice, "OOoSVN", which, when you want to commit the current version of the doc that you are editing, uncompresses the doc file into a tree of files in its own private svn working copy (that it creates in your home directory) and commits that. Similarly, to update your doc to an old version, or to retrieve two versions and diff them, it updates this hidden WC and then compresses the files in the WC into a ".odt" or whatever, and lets OpenOffice load or diff that file.
I have tried "OOoSVN" and it works but it is very crude -- the user interface is poor and it is not flexible -- it only supports a local dedicated svn repository, for example.
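The unpack/repack round trip that such a plug-in performs can be sketched in a few lines of Python. This is not OOoSVN's actual code: the function names `unpack_document` and `repack_document` are hypothetical, and the svn update/commit steps on the unpacked tree are left out. The sketch assumes the document is an ODF-style zip package, where the `mimetype` entry is expected to come first and be stored uncompressed.

```python
import zipfile
from pathlib import Path


def unpack_document(doc_path, tree_dir):
    """Extract every member of the zip-based document into `tree_dir`."""
    with zipfile.ZipFile(doc_path) as zf:
        zf.extractall(tree_dir)


def repack_document(tree_dir, doc_path):
    """Rebuild a document from the files in `tree_dir`, skipping svn metadata."""
    root = Path(tree_dir)
    files = sorted(
        p for p in root.rglob("*")
        if p.is_file() and ".svn" not in p.relative_to(root).parts
    )
    # Stable sort: move "mimetype" to the front, keep the rest in name order.
    files.sort(key=lambda p: p.relative_to(root).as_posix() != "mimetype")
    with zipfile.ZipFile(doc_path, "w") as zf:
        for path in files:
            name = path.relative_to(root).as_posix()
            # Assumption: only "mimetype" must be stored; everything else is deflated.
            method = zipfile.ZIP_STORED if name == "mimetype" else zipfile.ZIP_DEFLATED
            zf.write(path, name, compress_type=method)
```

With the files versioned individually and uncompressed, Subversion's delta sees the real XML changes. The cost is that the repacked file is generally not byte-identical to what the application originally saved.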
> Even still there are things that can be done today. I made a couple
We should talk to the OpenOffice folks and see if we can convince them of the value of using restartable compression by default, and find out how feasible that is. It would be great if that Wiki page could even say, "We'd like to use restartable compression for this reason but we need the compression library developers to make it available."
But for a practical solution until restartable compression becomes the norm (if it ever does), if you (Magnus) would like to help by designing some kind of solution, that would be great. Please do keep discussing it here if you have any thoughts in this direction. FWIW I think it's an important and interesting issue.
- Julian
This is an archived mail posted to the Subversion Dev mailing list.