Re: Auto-cleaning of log files?

From: Tobias Ringstrom <tori_at_ringstrom.mine.nu>
Date: 2003-04-04 18:22:23 CEST

On 4 Apr 2003, Ben Collins-Sussman wrote:

> Tobias Ringstrom <tori@ringstrom.mine.nu> writes:
>
> > As I mentioned earlier I use svnadmin dump to perform backups. I do that
> > because I imagine that the dump format is more likely to be readable one
> > year from now, and also because the result is actually smaller than a
> > backup of the whole db.
>
> I don't understand how this is possible. The dumpfile format *never*
> expresses textual deltas. If a file changes, the entire file is
> dumped. How can the dumpfile be smaller than the repos, which is
> compressing like crazy?

The difference is not huge. In one case the compressed dump took 2.3 MiB
and the compressed tar file took 2.8 MiB, i.e. a difference of roughly 20%.

<speculation>

Part of it comes from the fact that one log file cannot be removed because
it is still in use somehow, but that accounts for only 43 kiB, so it is not
the main part.

Where the rest comes from I do not know. Perhaps the DB contains indices
to speed up searching or something.

It can also come from the behaviour of Lempel-Ziv-type compression
algorithms, which are more efficient on long streams of data. Example:

~/u> ls -l big
-rw-rw-r-- 1 tori tori 4274832 apr 4 18:16 big
~/u> cat big | bzip2 -9 | wc
   5711 33091 1484337
~/u> split big
~/u> gzip -9 x*
~/u> ls -l x*
-rw-rw-r-- 1 tori tori 162812 apr 4 18:16 xaa.gz
-rw-rw-r-- 1 tori tori 261809 apr 4 18:16 xab.gz
-rw-rw-r-- 1 tori tori 118616 apr 4 18:16 xac.gz
-rw-rw-r-- 1 tori tori 153492 apr 4 18:16 xad.gz
-rw-rw-r-- 1 tori tori 116379 apr 4 18:16 xae.gz
-rw-rw-r-- 1 tori tori 89019 apr 4 18:16 xaf.gz
-rw-rw-r-- 1 tori tori 196817 apr 4 18:16 xag.gz
-rw-rw-r-- 1 tori tori 231401 apr 4 18:16 xah.gz
-rw-rw-r-- 1 tori tori 232841 apr 4 18:16 xai.gz
~/u> cat x* | bzip2 -9 | wc
   6006 35130 1565679
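
The transcript above runs bzip2 on the whole file but gzip on the pieces,
so two different compressors are being compared. As a rough illustration
of the long-stream effect on its own, here is a minimal Python sketch that
uses the same compressor on both sides: zlib (DEFLATE, the Lempel-Ziv-based
method that gzip also uses). It is not part of the original measurements.
The test data, the piece size and the function name compressed_sizes are
made up for illustration.

```python
import random
import zlib


def compressed_sizes(data, piece_size, level=9):
    """Return (bytes when compressed as one stream,
    total bytes when each piece is compressed separately)."""
    whole = len(zlib.compress(data, level))
    pieces = sum(
        len(zlib.compress(data[i:i + piece_size], level))
        for i in range(0, len(data), piece_size)
    )
    return whole, pieces


# Hypothetical test data: one 1000-byte pseudo-random block repeated 200
# times. Each piece on its own looks like noise; the redundancy is only
# visible across pieces.
rng = random.Random(0)
block = bytes(rng.randrange(256) for _ in range(1000))
data = block * 200

whole, pieces = compressed_sizes(data, piece_size=1000)
print("one stream:", whole, "bytes")
print("separate pieces:", pieces, "bytes in total")
```

This data is deliberately extreme, so the gap is far larger than in the
transcript. The point is only that a compressor which restarts for every
piece cannot reuse what it saw in earlier pieces.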

</speculation>

/Tobias

Received on Fri Apr 4 18:23:16 2003

This is an archived mail posted to the Subversion Dev mailing list.