
Re: Compressed Pristines (Design Doc)

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: Thu, 22 Mar 2012 21:43:00 +0100

Hi Ash,

Thanks for picking up the initiative to implement this feature.

On Thu, Mar 22, 2012 at 7:01 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:

> On Thu, Mar 22, 2012 at 18:30, Daniel Shahaf <danielsh_at_elego.de> wrote:
> > OK, I've had a cruise through now.
> >
> > First of all I have to say it's an order of magnitude larger than what
> > I'd imagined it would be. That makes the "move it elsewhere" idea I'd
> > had less practical than I'd predicted. I'm also not intending to take
> > you up on your offer to proxy me to the doc, though thanks for making it.
> >
> > Design-wise I'm a bit surprised that the choice ended up being rolling
> > a custom file format.
> >
> > Thanks for your work.
> >
> +1. I believe we should implement compressed pristines in a simple way:
> just compress the pristine files themselves, without inventing some new
> format.

Like the others, I'm surprised we seem to be going with a custom file format.
You claim that source files are generally small, so compressing them
individually yields little or no benefit, because each file already occupies
less than one block.

To check that claim, I took the pristines directory from my Subversion
working copy and did some experimenting. The results are below:

 $ ls -ls uncompressed-pristines/*/*.svn-base | awk '{ tot += $1; } END { print "total size: " tot; }'
total size: 188724

 $ cp -Rp uncompressed-pristines/ compressed-pristines
 $ gzip compressed-pristines/*/*.svn-base
 $ ls -ls compressed-pristines/*/*.svn-base.gz | awk '{ tot += $1; } END { print "total size: " tot; }'
total size: 52320

 $ cat compressed-pristines/*/*.svn-base.gz > combined-compressed-file
 $ ls -ls combined-compressed-file
41812 ....
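
The measurement above can be wrapped in a small function so the same sum is applied to each set of files. This is only a sketch: `total_blocks` is a hypothetical helper name, and it assumes the first column of `ls -s` is the allocated block count (the unit is 1024 bytes with GNU ls and 512 bytes on BSD, so only the ratios are comparable across systems).

```shell
# Hypothetical helper: sum the allocated-block column that `ls -s`
# prints first on each line, for all files passed as arguments.
total_blocks() {
    ls -s -- "$@" | awk '{ tot += $1 } END { print tot + 0 }'
}

# Usage, with the directory names from the transcript above:
#   total_blocks uncompressed-pristines/*/*.svn-base
#   total_blocks compressed-pristines/*/*.svn-base.gz
#   total_blocks combined-compressed-file
```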

So, for the Subversion pristines in my working copy, plain per-file gzip
reduces the number of allocated blocks from 100% to 27%. To be honest, I
doubt the complexity we'd be importing just to get from 27% down to 22% is
really worth it: the savings from simple compression are already tremendous.
Won't the creation of a custom storage format just serve to destabilize our
working copy?
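
For reference, the percentages follow directly from the block totals in the transcript. A minimal sketch of the arithmetic, where `pct` is a hypothetical helper name and the three totals are the ones measured above:

```shell
# Hypothetical helper: print part/whole as a percentage with one decimal.
pct() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

pct 52320 188724   # per-file gzip relative to uncompressed
pct 41812 188724   # single combined file relative to uncompressed
```

This prints 27.7 and 22.2, in line with the 27% and 22% quoted above, so the extra gain from combining files is about 5.5 percentage points of the original block count.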

Do you have data that prompted you to design this custom format?


Received on 2012-03-22 21:43:32 CET

This is an archived mail posted to the Subversion Dev mailing list.