
Re: Status of wc-propcaching branch

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2005-11-28 14:58:10 CET

Peter N. Lundblad wrote:
> Hi,
> I'd like to give a short status report on the wc-propcaching branch. The
> original plan for wc-propcaching is now implemented. And it seems to work
> pretty well.
> The following things have changed regarding how properties are stored:

Is this set of changes, and/or the new situation, documented anywhere more
permanent than here? I'm going to review this as if it's a log message.

> - There is no base-props file if there are no properties.

Excellent. Presumably "if there are no _base_ properties", so this file can
never be present and empty? (An alternative could be "if there are no base or
working properties".)

> - There is no working props file if there are no prop changes.

Excellent. So, this file can be present and empty, iff there were base props
and they have all been deleted locally?

> - Three new fields have been added to the entries file:
> - has-props keeps track of whether the entry has any (working) props.

So, to check I understand properly, this is a boolean field, present iff:
   (working-props file is present and non-empty)
   || ((working-props file is absent)
       && (base-props file is present (and, by definition, non-empty)))
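As a minimal sketch of the condition above (my reading of it, not code from the branch), the check could be written as follows. The function name and the convention of passing None for an absent file are assumptions made purely for illustration.

```python
from typing import Optional


def has_props(working_count: Optional[int], base_count: Optional[int]) -> bool:
    """Return True if the entry has any working properties.

    Hypothetical helper: each argument is None when the corresponding
    props file is absent, otherwise the number of properties stored in it.
    """
    if working_count is not None:
        # A working-props file exists, so it alone decides the answer.
        return working_count > 0
    # No working-props file means no local prop changes, so the working
    # props equal the base props. A base-props file is never empty, so
    # its mere presence is enough.
    return base_count is not None
```

For example, has_props(0, 2) is False: the entry had two base props and all of them have been deleted locally.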

> - cached-props: is a space-separated list of property names.
> If a property is mentioned here, the working props for this entry has
> a property of this name. Only svn:needs-lock, svn:special and
> svn:externals may be present in this field.

And if one of those three properties is not mentioned, does that mean the
property is not present in the working props? So this caches the presence or
absence of those three particular properties? (See below.)

> - prop-mods: Is true or false (attribute absent) depending on whether
> this entry has property modifications.

Wouldn't the name "has-prop-mods" be better for a boolean (complementing
"has-props")? Otherwise its name implies it's a list of modifications.

So, this is present iff the "working props" file is present?

> - The prop-time field isn't present anymore.


> It has been suggested to store properties in one single file per
> directory, both for regular props and wcprops. I think that seems like a
> good idea, but I think it falls outside of the scope of the propcaching
> branch. Also, I want to merge this work as early as possible in the 1.4
> cycle to get it wider tested.


> What we are waiting for now is that Erik wants to require checksums on all
> files. Since that would require even another WC format bump, we think it
> is best to do that on wc-propcaching before merging.

That makes me uncomfortable. Maybe it's a small and simple change, but it's
got nothing to do with prop-caching. Saying that format numbers are cheap, and
then doing this to avoid another bump, is inconsistent. Is there any other way
we could work around this format-number-bumping issue? Please could we either
keep bumps for released versions, and provide developers with an alternative
way to get their WCs upgraded and working yet end up with a format number "5",
or just do a bump for each new WC feature? Mixing different features in the
same branch could get ugly and isn't scalable.

> So, in short, I think wc-propcaching is approaching its merge back to
> trunk and I want to encourage people to review it. If no one objects, I
> want to merge as soon as Erik's work is done. (Also, if anyone has a good
> reason to add (or remove) any property from the cached-props field, it is
> easier to do so before people start using this code in their working
> copies.)

Apologies for making this comment at this late stage.

We have:

   cached-props = "svn:special"

which means:

   svn:needs-lock is absent, svn:special is present, svn:externals is absent.

Note that the field asserts that certain properties are absent (those not
mentioned in it), as well as that certain properties are present (those
mentioned in it).

The library has built-in knowledge of which three specific properties they are.
Calling the field "cached-props" seems wrong. It implies a generic cache, in
which the presence or absence of an item should indicate only whether that item
happens to have been cached. Writing the name of the property for one boolean
state ("present") but not for the other state ("absent") is asymmetric.
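To make the asymmetry concrete, here is a rough sketch of how a reader of the current field has to interpret it. The constant and function names are hypothetical, not taken from the branch. The point is that absence can only be inferred by consulting a hard-coded list.

```python
# Assumption: the three names the library knows about, per the description above.
CACHEABLE_PROPS = ("svn:needs-lock", "svn:special", "svn:externals")


def decode_cached_props(field: str) -> dict:
    """Map each of the three known property names to present/absent."""
    named = set(field.split())
    unknown = named - set(CACHEABLE_PROPS)
    if unknown:
        raise ValueError("unexpected names in cached-props: %s" % sorted(unknown))
    # A name missing from the field is read as "absent", not "not cached".
    return {name: name in named for name in CACHEABLE_PROPS}
```

With cached-props = "svn:special", this yields absent, present, absent for the three names in order.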

I feel that either the field should be named so as to identify what it is
caching, e.g.:

   has-needslock-special-externals = "0 1 0"

or it should be a generic cache, e.g.:

   props-presence-cache = "svn:needs-lock=0 svn:special=1 svn:externals=0"

or

   cached-props-present = "svn:special"
   cached-props-absent = "svn:needs-lock svn:externals"

or

   cached-props-names = "svn:needs-lock svn:special svn:externals"
   cached-props-presence = "0 1 0"

It seems to me that a generic cache is strongly preferable from a design point
of view.
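As an illustration only, a reader for the "name=0/1" form of a generic cache might look like the sketch below. The function names are hypothetical, and the field format is just the one proposed in this mail, not anything implemented. The useful property is the third state: a name that does not appear is simply not cached, so the caller has to read the props file.

```python
from typing import Dict, Optional


def parse_presence_cache(field: str) -> Dict[str, bool]:
    """Parse a space-separated list of 'name=0' / 'name=1' items."""
    cache: Dict[str, bool] = {}
    for item in field.split():
        # Split on the last '=' so that names containing ':' are untouched.
        name, sep, flag = item.rpartition("=")
        if not sep or not name or flag not in ("0", "1"):
            raise ValueError("malformed cache item: %r" % item)
        cache[name] = flag == "1"
    return cache


def lookup(cache: Dict[str, bool], name: str) -> Optional[bool]:
    """True = present, False = absent, None = not cached."""
    return cache.get(name)
```

With this shape, adding or removing a cached property name needs no built-in list on the reading side.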

Given that I'm late with saying this, and you've already implemented a cache
for three specific values, may I persuade you to at least change the field so
that it doesn't look like a generic cache?

> In each column below, there are two numbers. The first indicates the
> performance improvements when the disk cache was flushed. The second
> number is an approx. average of four runs of the same command after the
> first one, i.e. when the data is in memory.
>             GCC tree     GCC tree w/prop   svn tree
> svn st      11%  33%     64%  92%          53%  25%
> svn diff    12%  27%     30%  45%          40%   0% (*)
> svn ci      40%  86%     80%  95%          78%  50%

By "performance improvements" you seem to be talking about speed here; the disk
space is the other important factor. Since you describe some of these as
"dramatic", I assume these percentages are reduction in wall-clock time, so
that "95%" means twenty times faster, rather than speed increases in which case
"95%" would mean nearly twice as fast.
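The two readings differ a lot at the high end, which a quick calculation shows. This is only the arithmetic behind my assumption; the function names are made up for the example.

```python
def speedup_from_time_reduction(percent: float) -> float:
    """Speed factor if 'percent' is the reduction in wall-clock time."""
    return 100.0 / (100.0 - percent)


def speedup_from_speed_increase(percent: float) -> float:
    """Speed factor if 'percent' is the increase in speed."""
    return 1.0 + percent / 100.0


# Reading "95%" as a time reduction: the run takes 5% of the old time.
print(speedup_from_time_reduction(95))   # 20.0
# Reading "95%" as a speed increase: just under twice as fast.
print(speedup_from_speed_increase(95))   # 1.95
```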

> What I think is interesting is that we have improved performance for all
> operations. On some operations (i.e. commit), we have dramatic
> improvements. In summary, I feel that this work has been worth it.

Yup, it certainly both feels intuitively and looks from the numbers that this
is very worthwhile.

- Julian

Received on Mon Nov 28 15:06:32 2005

This is an archived mail posted to the Subversion Dev mailing list.