
Re: fs-rep-sharing branch

From: David Glasser <glasser_at_davidglasser.net>
Date: Wed, 22 Oct 2008 09:31:36 -0700

While I'm at it, a few more bits of belated code-review:

- If a commit gets aborted (or the server crashes, or whatever) after
having written new references to the database, the database will still
contain the references; future commits of files with the same contents
will then use the garbage at those offsets. You can probably fix this
by opening a transaction at the top of commit_body and committing it after
updating "current". (If there's a crash between updating "current"
and committing the transaction, then the only downside is that you
fail to do a bit of sharing, which is OK.)

- svn_fs_fs__inc_rep_reuse (unlike svn_fs_fs__set_rep_reference) is
used outside of the control of the FSFS lock. Thus its
read-modify-write can have race conditions, leading to two different
rep keys with the same reuse number. Now, I'm not 100% sure how much
of a problem this is; it would be good to document the point of the
reuse number somewhere (unless I'm missing it). But I think it's to
enable our dumb
compare-if-a-node-has-changed-by-if-its-rep-key-has-changed thing to
work, right? And so there could be a subtle, difficult-to-diagnose
problem if two reps end up with the same reuse number. I'm not sure
how to fix that; the obvious thing to try is wrapping the read and
write inside a transaction (with retry loop), but I don't remember if
SQLite transactions work that way. Alternatively, maybe SQLite has
special handling for atomically incrementing counters?

- At some point before the next release, 'structure' needs to be updated.
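
To make the first two points concrete, here is a minimal sketch in
Python using the standard sqlite3 module. It is only an illustration
under stated assumptions: the rep_cache table, its columns, and the
update_current callback are hypothetical stand-ins, not the branch's
actual schema or code. It shows (1) writing references inside a
transaction that is committed only after "current" is updated, and
(2) incrementing a counter with a single UPDATE inside a
BEGIN IMMEDIATE transaction, so no other writer can slip in between
the increment and the read-back.

```python
import sqlite3

# Hypothetical schema; the real rep-sharing index may look different.
SCHEMA = """
CREATE TABLE rep_cache (
    hash        TEXT PRIMARY KEY,
    revision    INTEGER NOT NULL,
    offset      INTEGER NOT NULL,
    reuse_count INTEGER NOT NULL DEFAULT 0
);
"""


def open_db(path=":memory:"):
    # isolation_level=None means autocommit mode, so the explicit
    # BEGIN/COMMIT/ROLLBACK below are the only transaction boundaries.
    db = sqlite3.connect(path, isolation_level=None)
    db.executescript(SCHEMA)
    return db


def commit_body(db, new_reps, update_current):
    """Point 1: keep the references only if "current" gets updated."""
    db.execute("BEGIN IMMEDIATE")
    try:
        db.executemany(
            "INSERT OR IGNORE INTO rep_cache (hash, revision, offset) "
            "VALUES (?, ?, ?)",
            new_reps,
        )
        # Stand-in for writing the revision file and bumping "current".
        update_current()
        db.execute("COMMIT")
    except BaseException:
        # An aborted commit leaves no references behind.
        db.execute("ROLLBACK")
        raise


def inc_rep_reuse(db, rep_hash):
    """Point 2: bump the counter without a separate read-modify-write."""
    # BEGIN IMMEDIATE takes the write lock up front, so concurrent
    # writers wait until this transaction finishes.
    db.execute("BEGIN IMMEDIATE")
    try:
        db.execute(
            "UPDATE rep_cache SET reuse_count = reuse_count + 1 "
            "WHERE hash = ?",
            (rep_hash,),
        )
        row = db.execute(
            "SELECT reuse_count FROM rep_cache WHERE hash = ?",
            (rep_hash,),
        ).fetchone()
        db.execute("COMMIT")
    except BaseException:
        db.execute("ROLLBACK")
        raise
    return row[0] if row else None
```

If the process dies after update_current() but before COMMIT, SQLite
rolls the transaction back on the next open, which matches the case
described above: a bit of sharing is lost, but no reference points at
garbage.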

--dave

On Mon, Oct 6, 2008 at 8:59 PM, Hyrum K. Wright
<hyrum_wright_at_mail.utexas.edu> wrote:
> The fs-rep-sharing branch is functionally complete, and I'd like to get the
> branch merged to trunk soon. These are the stats for various copies of our
> repository for the different branch/backend combinations.
>
> BDB:   1.5:          1.4GB
>        trunk:        627MB
>        reps-shared:  490MB
>
> FSFS:  1.5:          586MB
>        trunk:        578MB
>        reps-shared:  523MB
>
> The effect is quite pronounced on BDB, with around a 20% space savings compared
> with our current trunk (and over 67% compared with 1.5!) FSFS doesn't show as
> much improvement, partly due to the size of the index required to enable
> rep-sharing, partly due to decreased sharing opportunities in same-revision and
> parallel revision objects, and mostly due to the absolute floor on repo size due
> to inode usage.
>
> We may be able to tune the FSFS implementation just a bit. For instance,
> directory content representations may be unlikely to be shared, in which case
> we shouldn't bother.
>
> The remaining issue is the failing blame tests. Blame tests 10 and 11, which
> test 'blame -g', both fail for both backends. Before the recent commits to add
> rep-sharing to fsfs, the tests only failed for bdb. I'm slightly puzzled here
> because 'blame -g' should be FS-agnostic. If anybody has some insight, I
> welcome it.
>
> [Note: Because SQLite is still not an official dependency, to compile the
> rep-sharing stuff with FSFS, you'll need to add -DENABLE_SQLITE_TESTING to the
> CPPFLAGS when configuring.]
>
> -Hyrum
>
>
>

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
Received on 2008-10-22 18:31:50 CEST

This is an archived mail posted to the Subversion Dev mailing list.
