Re: FSFS "rev cache" operation [was: obliterate in trunk]

From: Branko Cibej <brane_at_xbc.nu>
Date: Wed, 14 Oct 2009 07:34:07 +0200

David Glasser wrote:
> Well, the actual problem here is that the rep cache doesn't have a ref count.
>
> So let's say you want to obliterate the node /foo/bar@1234. Its text
> may be in the rep r1234/9876. If you actually want to remove that
> data from the backend (and not just mask it from clients), you need to
> know if any other nodes (which may not be related to this node via
> ancestry relations) use that rep. Since the DB doesn't even have a
> refcount, you can't even know if it's safe to wipe the rep text (let
> alone where the other uses are).
>

You could possibly solve this by adding another flag column to the
rep-cache table ... (lots of hand-waving now):

    * svn_fs_obliterate would just say, "this rep may have been
      obliterated". New commits that see this hint keep using the same rep.
    * A separate (periodic?) scanner would look at each such rep-cache
      entry, mark it as "now obliterating" and proceed to scan the
      repository to determine if any references remain.
    * Wave hands, prevent races with new commits that need this
      representation. It could be as simple as letting new commits just
      change that flag to "not obliterated" and let the scanner notice
      that before it actually tries to remove data.
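
To make the hand-waving above a bit more concrete, here is a minimal
sketch of the flag protocol in Python with an in-memory SQLite table.
Everything in it is an assumption made for illustration: the "state"
column, the three flag values, the function names, and the
still_referenced callback (a stand-in for the slow repository walk) are
hypothetical, and the table is a simplified stand-in rather than the real
rep-cache schema. It only models the cache row, not the removal of the
rep text from the revision file.

```python
import sqlite3

# Hypothetical flag values for the proposed extra column.
LIVE, MAYBE_OBLITERATED, OBLITERATING = 0, 1, 2


def open_cache():
    """Create a simplified stand-in for the rep-cache table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE rep_cache ("
        " hash TEXT PRIMARY KEY,"
        " revision INTEGER NOT NULL,"
        " offset INTEGER NOT NULL,"
        " state INTEGER NOT NULL DEFAULT 0)"  # the proposed flag column
    )
    return db


def obliterate_hint(db, rep_hash):
    """Obliterate step: only record that the rep may now be unreferenced."""
    with db:
        db.execute(
            "UPDATE rep_cache SET state = ? WHERE hash = ?",
            (MAYBE_OBLITERATED, rep_hash),
        )


def commit_reuses(db, rep_hash):
    """A new commit that shares the rep flips the flag back to LIVE."""
    with db:
        cur = db.execute(
            "UPDATE rep_cache SET state = ? WHERE hash = ?",
            (LIVE, rep_hash),
        )
    return cur.rowcount == 1


def scan(db, rep_hash, still_referenced):
    """Scanner step: claim the entry, walk the repository, then remove the
    entry only if no commit has reset the flag in the meantime."""
    with db:
        cur = db.execute(
            "UPDATE rep_cache SET state = ? WHERE hash = ? AND state = ?",
            (OBLITERATING, rep_hash, MAYBE_OBLITERATED),
        )
    if cur.rowcount == 0:
        return "skipped"  # entry was never hinted, or is already gone

    if still_referenced(rep_hash):  # slow walk, outside any transaction
        with db:
            db.execute(
                "UPDATE rep_cache SET state = ? WHERE hash = ? AND state = ?",
                (LIVE, rep_hash, OBLITERATING),
            )
        return "kept"

    # Conditional delete: matches nothing if a commit reset the flag
    # while the walk was running.
    with db:
        cur = db.execute(
            "DELETE FROM rep_cache WHERE hash = ? AND state = ?",
            (rep_hash, OBLITERATING),
        )
    return "removed" if cur.rowcount == 1 else "kept"
```

The point of the sketch is that the scanner never removes an entry
unconditionally: both its claim and its final delete are guarded by the
current flag value, so a commit that reuses the rep during the walk wins
the race simply by resetting the flag.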

> As Greg said, you can basically get around this by doing a bunch of
> slow repository walks to try to find other places that use the same
> rep. (Of course half the reason that we want an obliterate feature is
> that people find dump/load to be too slow :) )
>

These scans should obviously be offline, or rather, they shouldn't block
normal repository operations. This would require some more admin work to
set up a repository, but only when you use rep-sharing *and* need the
space-saving obliterate. And with proper design, the repository admin
could enable such periodic scans at any time after the repository was
created.

Better yet, when a poor admin notices that her lusers tend to obliterate
every second commit, she could turn off rep-sharing to make obliterate
more efficient. :D

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2407408
Received on 2009-10-14 07:34:36 CEST
