I'll write a larger reply tomorrow (after looking through the code a bit more), but I think this is about how I feel about this. Great summary!
From: "Julian Foad" <julianfoad_at_gmail.com>
Sent: 6-9-2015 15:15
To: "Bert Huijben" <bert_at_qqmail.nl>
Cc: "dev" <dev_at_subversion.apache.org>
Subject: Re: svnsync crash on empty repo
I have had an idea about what may be bothering you. Is it that you
don't like us to have any interface where a valid revnum value could
be greater than HEAD, because it's counter-intuitive and we rarely do
it and in most cases a valid revnum cannot be greater than head?
If that's the case, I have sympathy for that reasoning. For new APIs,
I would certainly agree.
For replay_range we could:
* require a non-empty range, breaking strict backward compatibility;
* make it accept an empty range and return without doing anything
(choosing to implement the test either as (start_rev > end_rev) or as
(start_rev == end_rev + 1));
* rev the API, and make the old version accept an empty range, and
the new version reject an empty range;
* rev the API, and make the old version accept an empty range as
before, and make the new version take an "exclusive" instead of an
"inclusive" start revision, and accept an empty range (which would be
represented with start_rev == end_rev and neither of these would ever
be greater than HEAD).
To make the (existing) API take an empty range while ensuring that a
greater-than-head revnum will never be passed through to the RA layer
implementations, the check and the early return could be implemented
in svn_ra_replay_range() directly.
We can of course decide to break API compatibility and update the
caller, but we should not do that without considering whether we have
On 6 September 2015 at 09:48, Julian Foad <julianfoad_at_gmail.com> wrote:
> Bert Huijben wrote:
>> Julian Foad wrote:
>>> Bert wrote:
>>>> I don't think we should fix this with a 'revision+1' to explicitly allow
>>>> many bad ranges,
>>> An empty range isn't inherently "bad". Allowing an empty range isn't
>>> bad. Being able to specify an empty range in many different ways isn't
>>> bad. Changing a released API to make a previously no-op case now throw
>>> an assertion failure breaks our compatibility guarantees. That is bad.
>> An empty range isn't bad, but your patch appears to also explicitly allow undefined ranges, if I read it correctly.
>> I would call A range of r1 upto r0 defined (empty), but I would call a range or r4 upto r3 with current HEAD r3 undefined.
>> We use r1 and r0 as special in quite a few places where we really just want to point at the initial revision,
> I am totally against the idea of treating r1 as a special value.
> I didn't think we used r1 as special anywhere. Can you point to any of
> those places, please?
> (There was a case added to the mergeinfo parsing a few years ago that
> treated r1 as special -- treating "change r1" as an invalid merge
> source -- and I removed that special case because it was illogical.
> That is the only case I recall seeing.)
>> I see no problem with that but I don't want to introduce a definition of 'HEAD+1' as there is no we can declare stable behavior on that... It may even change partway through an operation, the moment that revision is committed.
> I am not proposing that "HEAD+1" is a special value. I propose that
> "r4 up to r3" specifies an empty range because there are no integers R
> where (R >= 4 and R <= 3). It has nothing to do with the value of
> HEAD, so doesn't matter if HEAD changes.
> Logically a range of revisions (in the sense of *changes*) is defined
> by a starting state (aka snapshot) and an ending state. When we want
> the 'replay_range' API to replay the single change between (state) r2
> and (state) r3, the API is defined in such a way that we have to pass
> (starting_state + 1, ending_state), that is (r3, r3).
> Now consider if we request to replay an empty range. To replay the
> empty range between (state) r3 and (state) r3, we have to pass in
> parameters (r3+1, r3). That doesn't mean we're actually going to
> dereference the state (r3+1). If HEAD is r3, dereferencing r4 would be
> an error. But the API only takes this input (r4) in order to refer to
> a starting state r3, because that's how the API was defined. It
> doesn't dereference r4, so no problem. Like how a pointer to the end
> of a memory block in C is defined as valid even if the byte it points
> to is outside the allocated block, as long as we don't dereference it.
> The only place where the value of HEAD matters is when you want to
> check that the revision range we specified is no higher than HEAD.
> Think of it logically. If we specified an empty range using (R+1, R)
> then we have not specified a range that goes higher than HEAD even if
> (R+1) == (HEAD+1). Therefore, HEAD+1 is acceptable as the "start"
> parameter *iff* the "end" parameter is is less than "start".
>> (It would be nicer if we used SVN_INVALID_REVNUM in more cases as the magic MIN value, but at least ra_svn will assert if we would try to do that, as it doesn't allow transferring r-1 as a normal serialized revision. So fixing that requires a timemachine)
>> Svnrdump had a bug where it actually passed HEAD+1 as base revision of a commit in a specific case... It is better to catch these errors on the client than to rely on not having concurrent commits to define behavior of our tools.
> I'm not suggesting to rely on not having concurrent commits!
> BTW what's your opinion on maintaining backward compatibility of this
> API? One option, if you want to disallow specifying an empty range, is
> to create a new API and keep the old one as it was.
> - Julian
Received on 2015-09-06 21:29:54 CEST