[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: API review - svn_ra_get_mergeinfo2()

From: Paul Burba <ptburba_at_gmail.com>
Date: Tue, 12 Jul 2011 12:20:57 -0400

On Fri, Jul 8, 2011 at 1:04 PM, Julian Foad <julian.foad_at_wandisco.com> wrote:
> On Fri, 2011-07-01, Paul Burba wrote:
>> Paul Burba wrote:
>> > Julian Foad wrote:
>> >> I will just ask once more: as a matter of principle, are we comfortable
>> >> it's OK to provide only an indication that "the server did in fact do
>> >> this for you this time", but not to have a way of finding out in general
>> >> whether the server is capable of doing this?
>> >
>> > On further reflection, I think using a server capabilities check is
>> > the better approach:
> [...]
>> > I'll add the new capability if there are no objections.
>>
>> Done http://svn.apache.org/viewvc?view=revision&revision=1141981
>
> Hi Paul. That looks good. I have some queries and suggestions about
> the details, not all related specifically to this change, which I'm
> attaching in the form of a patch (not to be committed as-is), which I'd
> like your feedback on.

Hi Julian,

Sorry for the long wait, your questions have prompted some extensive
thinking about this issue #3668 and issue #3669 code.

I've included the feedback on the patch inline at the end of this message.

> I wonder, do any callers really want to receive invalid mergeinfo? Is
> this optional just because the server-side processing is slow and some
> callers don't care?

It doesn't matter to some callers (merge.c:calculate_left_hand_side(),
merge.c:find_unmerged_mergeinfo()), so they don't impose the overhead
on the server...

...but sometimes we really do want the invalid mergeinfo -- see my
comments on your patch for merge.c:get_full_mergeinfo().

> I noticed some functions take an 'ignore_invalid_mergeinfo' parameter;
> that confused me because I thought it was referring to the same thing
> but instead it refers to a syntatically invalid svn:mergeinfo property
> value.

> By contrast, what we're dealing with in the 'invalid_inherited'
> case is mergeinfo that's syntactically valid but semantically invalid or
> at least redundant because it refers to non-existent path-revs. I
> wonder if it would be worth either using a different term (I know there
> are loads of terms already, which can be confusing in itself)

It would be more accurate to compare the boolean
ignore_invalid_mergeinfo parameter to the boolean
validate_inherited_mergeinfo parameter. Those are both input
parameters where 'invalid_inherited' is an svn_mergeinfo_t output
parameter. I'm very open to any suggested name changes, but there's
no getting away from the reality that there are a lot of terms.

I have been guilty of occasionally referring to non-existent path-rev
mergeinfo (which is otherwise syntactically valid) as "invalid" in
some comments. I'll clean that up to the extent possible, but I do
rely on context at times to point to which type of "invalid" I'm
talking about (it is such a useful word after all :-)

> or
> otherwise working towards simply eliminating this kind of mergeinfo from
> our attention altogether by never exposing it and never giving the
> caller the choice.

Your question has prompted me to change the default behavior of
svn_client__get_wc_or_repos_mergeinfo and
svn_client__get_wc_or_repos_mergeinfo_catalog such that they always
request inherited mergeinfo validation *if* contacting the server. I
can't see that any callers actually rely on the previous behavior and
I'd rather default to correct mergeinfo and deal with the problems (if
any) that causes.

[[[
> Index: subversion/libsvn_client/merge.c
> ===================================================================
> --- subversion/libsvn_client/merge.c (revision 1145163)
> +++ subversion/libsvn_client/merge.c (working copy)
> @@ -3164,22 +3164,24 @@
> mergeinfo.
>
> Query the repository for the mergeinfo TARGET_ABSPATH inherits at its
> - base revision and set *VALIDATED to indicate to the caller if we can
> - determine what portions of that inherited mergeinfo are invalid.
> + base revision.
>
> If no mergeinfo is inherited set *INVALID_INHERITED_MERGEINFO to NULL.
>
> If only empty mergeinfo is inherited set *INVALID_INHERITED_MERGEINFO to
> an empty hash.
>
> - If non-empty inherited mergeinfo is inherited then, if the server
> + If non-empty mergeinfo is inherited then, if the server
> supports the SVN_RA_CAPABILITY_VALIDATE_INHERITED_MERGEINFO capability,
> remove all valid path-revisions from the raw inherited mergeinfo, and set
> *INVALID_INHERITED_MERGEINFO to the remainder.
>
> + If non-empty mergeinfo is inherited but the server does not
> + support the SVN_RA_CAPABILITY_VALIDATE_INHERITED_MERGEINFO capability,
> + then ... ###?

If the server does not have this capability and non-empty mergeinfo is
inherited, then *INVALID_INHERITED_MERGEINFO would be set to an empty
hash...*BUT* this function is only intended to be called if the server
has that capability. I'll add that to the doc string.

> Note that if validation occurs, but all inherited mergeinfo describes
> - non-existent paths, then *INVALID_INHERITED_MERGEINFO is set to an empty
> - hash.
> + existent paths, then *INVALID_INHERITED_MERGEINFO is set to an empty hash.

Your change is the complete opposite of what the code is doing, but
perhaps it's because the original wording isn't clear enough. How's
this instead:

Note that if validation occurs and all inherited mergeinfo is
discarded as non-existent, then *INVALID_INHERITED_MERGEINFO is set to
an empty hash.

> RA_SESSION is an open session that points to TARGET_ABSPATH's repository
> location or to the location of one of TARGET_ABSPATH's parents. It may
> @@ -3187,6 +3189,8 @@
>
> RESULT_POOL is used to allocate *INVALID_INHERITED_MERGEINFO, SCRATCH_POOL
> is used for any temporary allocations. */
> +/* ### JAF: This function looks redundant. See the note "What are we doing
> + here?" at its call site. */

I reworked this entire doc string in r1145653. Hopefully it makes bit
more sense now.

> static svn_error_t *
> get_invalid_inherited_mergeinfo(svn_mergeinfo_t *invalid_inherited_mergeinfo,
> svn_ra_session_t *ra_session,
> @@ -3311,6 +3315,10 @@
> inherit, ra_session,
> target_abspath,
> ctx, scratch_pool));
> + /* ### JAF: Why is the variable called "inherited" when the output
> + * parameter from that function is called "indirect"? I'm unsure
> + * whether these words mean the same thing or else whether one
> + * implies the other. */
> if (indirect)
> *indirect = inherited;

They are synonymous, I purged use of the 'indirect' term in r1145202

> @@ -3320,6 +3328,18 @@
> {
> svn_mergeinfo_t invalid_inherited_mergeinfo;
>
> + /* ### JAF: What are we doing here? We're asking for the *invalid*
> + * inherited mergeinfo (of TARGET_ABSPATH), and removing that from
> + * the total mergeinfo (of TARGET_ABSPATH) that was obtained from
> + * svn_client__get_wc_or_repos_mergeinfo(), so that we end up with
> + * what's known as 'validated' mergeinfo.

What's going on here is part of
http://subversion.tigris.org/issues/show_bug.cgi?id=3669, specifically
handling the case where the target of a subtree merge inherits its
mergeinfo from a WC parent and we don't know if that inherited
mergeinfo contains non-existent merge sources.

A simple case:

1) Given a ^/trunk and ^/branches/branch1 created from ^/trunk_at_100

2) Given an infinite depth checkout of ^/branches/branch1_at_200 called b1-wc

3) Assume the only explicit mergeinfo anywhere in the on ^/branches/branch1

Now suppose we want to do this merge:

svn merge ^/trunk/docs/README b1-wc/docs/README

b1-wc/docs/README inherits its mergeinfo from b1-wc, but we have no
way of knowing if that inherited mergeinfo describes existent merge
sources. We want to remove the non-existent parts, but how? Well we
can ask the repository for the README's inherited mergeinfo from the
repository and request validation and use that, but what if the
mergeinfo inherited from the working copy contains local mergeinfo
changes? We want to keep those (although that mergeinfo is subject to
problems, see http://subversion.tigris.org/issues/show_bug.cgi?id=3756).
 So we want to keep what README inherits from its working copy parent
less any inherited mergeinfo that the repository can confirm describes
non-existent paths.

In r1145653 I conditioned this block so it only occurs when mergeinfo
is inherited from the working copy. We could further optimize this by
only entering the block if the target inherits mergeinfo from a
working copy parent with local mergeinfo changes, but I want to get
your feedback first.

> But this is the only
> + * use of get_invalid_inherited_mergeinfo() - to remove the
> + * invalid mergeinfo from the result of another function. It
> + * seems this should be done (and could be done more efficiently)
> + * by asking svn_client__get_wc_or_repos_mergeinfo() to ignore
> + * invalid mergeinfo, and indeed its counterpart
> + * svn_client__get_wc_or_repos_mergeinfo_catalog(), which it
> + * calls, already has an option to do so. */

As I think you realized based on your other questions, the
ignore_invalid_mergeinfo parameter has nothing to do with validating
inherited mergeinfo parameter.

> SVN_ERR(get_invalid_inherited_mergeinfo(
> &invalid_inherited_mergeinfo,
> ra_session, target_abspath, ctx,
> Index: subversion/libsvn_client/mergeinfo.c
> ===================================================================
> --- subversion/libsvn_client/mergeinfo.c (revision 1145163)
> +++ subversion/libsvn_client/mergeinfo.c (working copy)
> @@ -610,6 +610,7 @@
> const char *session_url = NULL;
> apr_pool_t *sesspool = NULL;
> svn_boolean_t validate_inherited_mergeinfo = FALSE;
> + /* ### JAF: Shouldn't we use a passed-in parameter instead? */

I got rid of the needless local variable in r1145653 but I don't
follow what you want re a passed-in parameter.

> if (ra_session)
> {
> @@ -1067,6 +1068,7 @@
> {
> svn_mergeinfo_catalog_t tmp_catalog;
> svn_boolean_t validate_inherited_mergeinfo = FALSE;
> + /* ### JAF: Shouldn't we use a passed-in parameter instead? */

Ditto.

> rev = peg_rev;
> SVN_ERR(svn_client__get_repos_mergeinfo_catalog(
> Index: subversion/libsvn_client/mergeinfo.h
> ===================================================================
> --- subversion/libsvn_client/mergeinfo.h (revision 1145163)
> +++ subversion/libsvn_client/mergeinfo.h (working copy)
> @@ -168,17 +168,11 @@
> doesn't support a mergeinfo capability and SQUELCH_INCAPABLE is
> TRUE, set *TARGET_MERGEINFO to NULL.
>
> - If the *TARGET_MERGEINFO for REL_PATH path is inherited and
> - *VALIDATE_INHERITED_MERGEINFO is TRUE, then *TARGET_MERGEINFO
> - will only contain merge source path-revisions that actually
> - exist in repository.
> -
> - If the *TARGET_MERGEINFO for REL_PATH path is inherited,
> + If the mergeinfo for REL_PATH path is inherited,
> VALIDATE_INHERITED_MERGEINFO is TRUE, and the server supports
> the #SVN_RA_CAPABILITY_VALIDATE_INHERITED_MERGEINFO capability,
> - then request that the server validate the mergeinfo in
> - *TARGET_MERGEINFO, so it contains only merge source path-revisions
> - that actually exist in repository. */
> + then *TARGET_MERGEINFO will only contain merge source path-revisions
> + that actually exist in the repository. */

Committed this bit in r1145269.

> svn_error_t *
> svn_client__get_repos_mergeinfo(svn_ra_session_t *ra_session,
> svn_mergeinfo_t *target_mergeinfo,
> Index: subversion/libsvn_ra_neon/options.c
> ===================================================================
> --- subversion/libsvn_ra_neon/options.c (revision 1145163)
> +++ subversion/libsvn_ra_neon/options.c (working copy)
> @@ -450,14 +450,8 @@
> else
> cap_result = capability_yes;
>
> - if (strcmp(capability, SVN_RA_CAPABILITY_MERGEINFO) == 0)
> - apr_hash_set(ras->capabilities,
> - SVN_RA_CAPABILITY_MERGEINFO, APR_HASH_KEY_STRING,
> - cap_result);
> - else
> - apr_hash_set(ras->capabilities,
> - SVN_RA_CAPABILITY_VALIDATE_INHERITED_MERGEINFO,
> - APR_HASH_KEY_STRING, cap_result);
> + apr_hash_set(ras->capabilities,
> + capability, APR_HASH_KEY_STRING, cap_result);

There are no guarantees that the hash key you use here, the CAPABILITY
argument to svn_ra_neon__has_capability(), has the necessary lifetime.
 What if the caller of svn_ra_has_capability didn't pass a string
literal? Unlikely I know, but possible.

> }
> else
> {
> Index: subversion/libsvn_ra_serf/options.c
> ===================================================================
> --- subversion/libsvn_ra_serf/options.c (revision 1145163)
> +++ subversion/libsvn_ra_serf/options.c (working copy)
> @@ -611,14 +611,8 @@
> else
> cap_result = capability_yes;
>
> - if (strcmp(capability, SVN_RA_CAPABILITY_MERGEINFO) == 0)
> - apr_hash_set(serf_sess->capabilities,
> - SVN_RA_CAPABILITY_MERGEINFO, APR_HASH_KEY_STRING,
> - cap_result);
> - else
> - apr_hash_set(serf_sess->capabilities,
> - SVN_RA_CAPABILITY_VALIDATE_INHERITED_MERGEINFO,
> - APR_HASH_KEY_STRING, cap_result);
> + apr_hash_set(serf_sess->capabilities,
> + capability, APR_HASH_KEY_STRING, cap_result);

Same concern as with svn_ra_neon__has_capability().

> }
> else
> {
>
]]]
Received on 2011-07-12 18:21:32 CEST

This is an archived mail posted to the Subversion Dev mailing list.