Re: FSFS "rev cache" operation [was: obliterate in trunk]

From: Branko Cibej <brane_at_xbc.nu>
Date: Wed, 14 Oct 2009 07:34:07 +0200

David Glasser wrote:
> Well, the actual problem here is that the rep cache doesn't have a ref count.
> So let's say you want to obliterate the node /foo/bar@1234. Its text
> may be in the rep r1234/9876. If you actually want to remove that
> data from the backend (and not just mask it from clients), you need to
> know if any other nodes (which may not be related to this node via
> ancestry relations) use that rep. Since the DB doesn't even have a
> refcount, you can't even know if it's safe to wipe the rep text (let
> alone where the other uses are).
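
To make the quoted problem concrete, here is a toy model in Python. Everything in it is an assumption for illustration: the node names, the rep keys, and the in-memory dict stand in for the real FSFS structures, which look nothing like this. It only shows that without a count the question "who else uses this rep?" needs a full walk, while a count answers "is it safe to wipe?" directly.

```python
from collections import Counter

# Hypothetical toy data: node-revision -> representation key.
# The second entry shares a rep with the first without any ancestry link,
# which is what rep-sharing makes possible.
node_reps = {
    "/foo/bar@1234": "r1234/9876",
    "/unrelated/copy@1300": "r1234/9876",
    "/foo/baz@1235": "r1235/42",
}

def users_by_scan(rep):
    """Without a refcount: walk every node-revision to find users of a rep."""
    return sorted(node for node, r in node_reps.items() if r == rep)

# What a refcount column would give us: number of users per rep.
refcounts = Counter(node_reps.values())

def safe_to_wipe(rep, obliterated_node):
    """With a refcount: wipe only if the obliterated node is the sole user."""
    return node_reps[obliterated_node] == rep and refcounts[rep] == 1
```

In this model safe_to_wipe("r1234/9876", "/foo/bar@1234") is false because an unrelated node still points at the same rep, and only the slow users_by_scan() walk can say which one.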

You could possibly solve this by adding another flag column to the
rep-cache table ... (lots of hand-waving now):

    * svn_fs_obliterate would just say, "this rep may have been
      obliterated". New commits that see this hint keep using the same rep.
    * A separate (periodic?) scanner would look at each such rep-cache
      entry, mark it as "now obliterating" and proceed to scan the
      repository to determine if any references remain.
    * Wave hands, prevent races with new commits that need this
      representation. It could be as simple as letting new commits just
      change that flag to "not obliterated" and let the scanner notice
      that before it actually tries to remove data.
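
The three steps above can be sketched as a small state machine over a flag column. This is only a sketch under loud assumptions: the table layout, the numeric flag values, and every function name are made up for illustration and are not the real rep-cache schema or any Subversion API. The `before_delete` hook exists only to simulate a commit racing with the scanner.

```python
import sqlite3

# Hypothetical flag values; the proposal only names the states informally.
NOT_OBLITERATED, MAYBE_OBLITERATED, NOW_OBLITERATING = 0, 1, 2

db = sqlite3.connect(":memory:")
# Illustrative schema, not the real rep-cache table layout.
db.execute("CREATE TABLE rep_cache "
           "(hash TEXT PRIMARY KEY, rep TEXT, flag INTEGER DEFAULT 0)")
db.execute("INSERT INTO rep_cache (hash, rep) VALUES ('abc', 'r1234/9876')")
db.commit()

def set_flag(h, new, expected=None):
    """Set the flag; with `expected`, only if the current flag matches."""
    if expected is None:
        cur = db.execute("UPDATE rep_cache SET flag = ? WHERE hash = ?",
                         (new, h))
    else:
        cur = db.execute("UPDATE rep_cache SET flag = ? "
                         "WHERE hash = ? AND flag = ?", (new, h, expected))
    db.commit()
    return cur.rowcount == 1

def obliterate_hint(h):
    """Step 1: obliterate only records that the rep may be unreferenced."""
    set_flag(h, MAYBE_OBLITERATED)

def commit_reuses(h):
    """A new commit that needs the rep resets the flag and keeps sharing."""
    row = db.execute("SELECT rep FROM rep_cache WHERE hash = ?",
                     (h,)).fetchone()
    if row is None:
        return None              # entry is gone; caller writes a fresh rep
    set_flag(h, NOT_OBLITERATED)
    return row[0]

def scan(h, references_remain, before_delete=lambda: None):
    """Steps 2 and 3: claim the entry, walk the repository, then remove
    the entry only if no commit reset the flag in the meantime."""
    if not set_flag(h, NOW_OBLITERATING, expected=MAYBE_OBLITERATED):
        return "skipped"         # not hinted, nothing to do
    if references_remain():      # stands in for the slow repository walk
        set_flag(h, NOT_OBLITERATED, expected=NOW_OBLITERATING)
        return "kept"
    before_delete()              # test hook: simulate a racing commit
    cur = db.execute("DELETE FROM rep_cache WHERE hash = ? AND flag = ?",
                     (h, NOW_OBLITERATING))
    db.commit()
    return "removed" if cur.rowcount == 1 else "kept"
```

The race handling lives in the conditional DELETE: if a commit flipped the flag back while the scanner was walking, the delete matches no row and the rep survives. Actually removing the rep text from the revision file is left out entirely.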

> As Greg said, you can basically get around this by doing a bunch of
> slow repository walks to try to find other places that use the same
> rep. (Of course half the reason that we want an obliterate feature is
> that people find dump/load to be too slow :) )

These scans should obviously be offline, or rather, they shouldn't block
normal repository operations. This would require some more admin work to
set up a repository, but only when you use rep-sharing *and* need the
space-saving obliterate. And with proper design, the repository admin
could enable such periodic scans at any time after the repository was
created.

Better yet, when a poor admin notices that her lusers tend to obliterate
every second commit, she could turn off rep-sharing to make obliterate
more efficient. :D

-- Brane

Received on 2009-10-14 07:34:36 CEST

This is an archived mail posted to the Subversion Dev mailing list.
