[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: wc-ng question -- is this an optimization?

From: Greg Stein <gstein_at_gmail.com>
Date: Sun, 21 Feb 2010 21:08:53 -0500

On Fri, Feb 5, 2010 at 04:31, Julian Foad <julian.foad_at_wandisco.com> wrote:
> On Fri, 2010-02-05 at 03:01 +0100, Neels J Hofmeyr wrote:
>> In this commit, I run svn_wc__db_read_info() in check_tree_conflicts(), when
>> half or more of its callers have already called svn_wc__db_read_info().
>> So would it be an optimization to pass STATUS, maybe more, as args to
>> check_tree_conflict(), rather than calling svn_wc__db_read_info() twice?
> Let's do a mental experiment. Think about what it would be like if you
> chose to pass several of the values that had already been read from the
> database. That would start to get messy if they were all separate
> params, so we would build a structure that held the values, and pass a
> pointer to it. We could generalize the structure to hold all possible
> __db_read_info() data, with each value optionally present or not (marked
> by an invalid value). It would effectively be a cache. It could also
> hold a pointer to the DB it came from, so we wouldn't need to pass that
> separately, and we could call the pointer to the struct "a handle to the
> DB". See where I'm going?
> It seems to me that we should rely on svn_wc__db_read_info() to be
> efficient. If it needs caching to be fast, it should implement the
> equivalent of that hypothetical struct internally.

Yes, this is the intent behind the API design. We're create
optimizations internally when possible, and expose more APIs/data when
we cannot. The "many params" style is also to avoid passing structures
that are simply gloms of parameters. That results in a style of coding
where you cannot tell whether any particular value is necessary, and
(in the case of entry_modify) whether any given value is actually
*present* (look at flag bits!). It ends up making the system very
opaque and hard to analyze the data flows: who needs what.

You may also note that we've already added a simplified version of
read_info() called read_kind(). We may end up with one or two more,
once we're done and we see the usage patterns.

(tho, honestly, ref: read_kind, I'm still a bit queasy with it
returning node_unknown, but we can eval that later)

On the larger topic of passing-params-via-struct... I am a firm
believer against that. ISTR that the merge_baton was constructed for
this very purpose. And again, it makes it very difficult to reason
about the operation of the merge code. I also note that it tends to
create massive functions in the merge code, rather than smaller
functions taking targeted parameters. If somebody wants to nuke that
"pattern" of coding, then I will personaly buy you several beers (or
some alternative drink of your choice).

Received on 2010-02-22 03:09:29 CET

This is an archived mail posted to the Subversion Dev mailing list.