
Re: admin driven deltification

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2003-11-14 15:39:47 CET

solo turn <soloturn99@yahoo.com> writes:

> karl, you wrote, that space aware admins should set a deltification
> in the post-commit hook. does this mean:
> - the user gets the response as late as up to know?

To know what? As Karl mentioned, aside from the painful side-effects
of the time it takes to do deltification in-process, the operation
isn't user visible. There is no response, no printf() that says
"Deltified your last revision, sir."

> - if you do not run it in post-commit does it slow down "svn up" of
> other users, or is it just a server space issue?

If you don't deltify, you should actually have faster 'svn up's of
non-HEAD revisions than if you do. But the price is disk usage.

> would it make sense to run deltification by a cron job (and remove
> it from post-commit), and doing packaging for a specific os should
> include in providing and registering such a job by default?

To run deltification as a cron-job is fine, but it means a few things:

1. (As noted already) you have to install that cron entry for each
   new repository that you make.

2. You have to keep metadata about which revisions you already
   deltified (or waste the time of deltifying them again).

3. Finally (and this is kinda hard to explain), the longer you go
   without running deltification, the larger your repository gets, and
   administrators should understand that deltification does *not*
   immediately reduce the footprint of your repository. That's right.
   If you run 'svnadmin deltify' on your repository, you can expect it
   *not* to shrink. What happens during/after deltification is that
   because you are now storing less data in a database file than you
   were before, Berkeley DB simply marks a bunch of its internal
   pages as "free", and those are the first to be re-used when
   Berkeley needs to store more data in the repository.

   This is why I made 'svnadmin load' go ahead and deltify each
   loaded revision. When I didn't, you had a repository that was
   *huge* on-disk -- so huge that if you then deltified the whole
   thing, and then instituted a per-commit deltification policy, you
   might *never* actually make use of all that extra allocated space
   (at least not for something like 3x the number of loaded
   revisions).
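
   To make the free-page behavior concrete, here is a toy model in
   Python. It is only a sketch of the idea described above, not
   Berkeley DB's actual allocator: the PagedFile class and the page
   counts are made up for illustration. The one assumption it encodes
   is that the file never gives pages back to the OS and that freed
   pages are reused before the file grows.

```python
class PagedFile:
    """Toy model (hypothetical) of a database file that never shrinks."""

    def __init__(self):
        self.total_pages = 0   # on-disk size in pages; only ever grows
        self.free_pages = 0    # pages marked free and available for reuse

    @property
    def used_pages(self):
        return self.total_pages - self.free_pages

    def store(self, pages):
        """Store data, reusing free pages before growing the file."""
        reused = min(pages, self.free_pages)
        self.free_pages -= reused
        self.total_pages += pages - reused

    def release(self, pages):
        """Mark pages as free; the file itself does not shrink."""
        if pages > self.used_pages:
            raise ValueError("cannot release more pages than are in use")
        self.free_pages += pages


db = PagedFile()
db.store(100)                         # load revisions as fulltexts
db.release(70)                        # deltify: same data now fits in 30 pages
size_after_deltify = db.total_pages   # footprint is unchanged
db.store(50)                          # later commits fit in the freed pages
size_after_commits = db.total_pages   # still no growth
db.store(40)                          # only 20 free pages left, so the file grows
size_after_growth = db.total_pages
```

   In this model the footprint stays at 100 pages through the
   deltification and the next 50 pages of commits, and only starts
   growing once the free pages are used up. The bigger the backlog
   you let accumulate before deltifying, the longer that takes.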

> another thing i am unclear of: is looking for objects which need to
> be deltified performance relevant? up to now you exactly know the
> object, you "just" need to do the deltification.

No. I almost literally just took the internal-use function that did
deltification before and just gave it a public name and a spot in
svn_fs.h. We only deltify nodes made mutable (modified) during the
commit. But since you can't make a node mutable without making its
parent directory mutable, we know that all things changed during a
commit are necessarily connected to each other as a tree of mutable
nodes. So, if we walk the revision tree, only recursing on things
that were mutable in the commit, we'll hit them all. There's nothing
really to "look up".
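
The walk just described can be sketched roughly as follows. This is
Python rather than the C in the filesystem code, and the Node class,
its mutable flag, and walk_mutable() are hypothetical stand-ins, not
anything declared in svn_fs.h. The only thing it relies on is the
property stated above: every node changed in the commit has a mutable
parent, so pruning at the first immutable node never misses anything.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a node in a revision tree."""
    name: str
    mutable: bool = False                       # True if changed in this commit
    children: list = field(default_factory=list)


def walk_mutable(node, visit, path="/"):
    """Call visit(path) for every mutable node, pruning immutable subtrees."""
    if not node.mutable:
        return  # nothing under an immutable node changed in this commit
    visit(path)
    for child in node.children:
        walk_mutable(child, visit, path.rstrip("/") + "/" + child.name)


# A commit that modified only /trunk/src/main.c.
root = Node("", True, [
    Node("trunk", True, [
        Node("README"),
        Node("src", True, [
            Node("main.c", True),
            Node("util.c"),
        ]),
    ]),
    Node("tags", False, [
        Node("v1", False, [Node("main.c")]),
    ]),
])

visited = []
walk_mutable(root, visited.append)   # visit() would be the deltify step
```

Here the walk touches /, /trunk, /trunk/src and /trunk/src/main.c,
and never descends into /tags at all.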

Received on Fri Nov 14 15:41:19 2003

This is an archived mail posted to the Subversion Dev mailing list.