
Re: Proposal - option to store unzipped office documents on server side.

From: Philip Martin <philip_at_codematters.co.uk>
Date: Fri, 04 Aug 2017 13:12:14 +0100

Paul Hammant <paul_at_hammant.org> writes:

> Git doesn't store deltas, and uses a DEFLATE algorithm for
> storage. Diffs are meaningless on binary files, of course.

I don't know about git but Subversion does quite a good job on some
binary files. Take the compressed tarballs of a couple of Subversion
tags:

   $ svn export http://svn.apache.org/repos/asf/subversion/tags/1.9.5
   $ svn export http://svn.apache.org/repos/asf/subversion/tags/1.9.4
   $ tar cfz foo1.tar.gz 1.9.5
   $ tar cfz foo2.tar.gz 1.9.4
   $ svnadmin create repo
   $ svnmucc -mm -U file://`pwd`/repo put foo1.tar.gz f.tgz
   $ svnmucc -mm -U file://`pwd`/repo put foo2.tar.gz f.tgz

How big are the tarballs?

   $ ls -lh foo*
   -rw-r--r-- 1 pm pm 15M Aug 4 13:00 foo1.tar.gz
   -rw-r--r-- 1 pm pm 15M Aug 4 13:00 foo2.tar.gz

How big in the repository?

   $ ls -lh repo/db/revs/0/[12]
   -r--r--r-- 1 pm pm 15M Aug 4 13:00 repo/db/revs/0/1
   -r--r--r-- 1 pm pm 13M Aug 4 13:00 repo/db/revs/0/2

That saves about 2M. But we can do better if we compress knowing that
deltification will be used:

   $ tar cf foo1.tar 1.9.5
   $ tar cf foo2.tar 1.9.4
   $ gzip --rsyncable foo1.tar
   $ gzip --rsyncable foo2.tar
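Not every gzip build accepts `--rsyncable`; some older or non-GNU versions reject it. Below is a minimal sketch of a wrapper that uses the option when it is available and otherwise falls back to plain gzip. The function name `gzip_for_delta` is made up for illustration. Unlike the commands above, it keeps the uncompressed input file.

```shell
# gzip_for_delta FILE
# Writes FILE.gz and leaves FILE in place.
# Hypothetical helper name; only standard gzip options are used.
gzip_for_delta() {
    # Probe: does this gzip accept --rsyncable? Compress empty input and discard it.
    if gzip --rsyncable -c < /dev/null > /dev/null 2>&1; then
        gzip --rsyncable -c < "$1" > "$1.gz"
    else
        # Fall back to ordinary compression; deltification will be less effective.
        echo "gzip: --rsyncable not supported, using plain gzip" >&2
        gzip -c < "$1" > "$1.gz"
    fi
}
```

For example, `gzip_for_delta foo1.tar` would produce `foo1.tar.gz` next to `foo1.tar`.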

The resulting tarballs are a little bigger:

   -rw-r--r-- 1 pm pm 16M Aug 4 13:05 foo1.tar.gz
   -rw-r--r-- 1 pm pm 16M Aug 4 13:05 foo2.tar.gz

but Subversion can do better deltification:

   -r--r--r-- 1 pm pm 16M Aug 4 13:05 repo/db/revs/0/1
   -r--r--r-- 1 pm pm 5.6M Aug 4 13:05 repo/db/revs/0/2

We have stored two compressed tarballs of roughly 15MB each in about
21MB of repository storage.
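Because `ls -lh` rounds its sizes, exact byte counts make the saving easier to judge. Below is a minimal sketch for a POSIX shell; the function name `report_saving` is hypothetical and not part of Subversion. Note that a revision file also holds some revision metadata, so the figure is only approximate.

```shell
# report_saving INPUT STORED
# Prints how many bytes smaller STORED is than INPUT, with an integer percentage.
# Hypothetical helper for illustration only.
report_saving() {
    input_bytes=$(wc -c < "$1") || return 1
    stored_bytes=$(wc -c < "$2") || return 1
    # Some wc implementations pad their output with spaces; normalise to plain integers.
    input_bytes=$((input_bytes))
    stored_bytes=$((stored_bytes))
    if [ "$input_bytes" -eq 0 ]; then
        echo "input is empty" >&2
        return 1
    fi
    saved=$((input_bytes - stored_bytes))
    echo "$saved bytes saved ($((saved * 100 / input_bytes))%)"
}
```

With the files from this example it could be called as `report_saving foo2.tar.gz repo/db/revs/0/2`.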

-- 
Philip
Received on 2017-08-04 14:12:24 CEST

This is an archived mail posted to the Subversion Dev mailing list.