TL;DR: I'm debugging this issue, and finding multiple problems. This is 
a summary of where I have got to so far.
http://subversion.apache.org/issue/4840
I am investigating an assertion failure.  I received a stack trace 
without parameter values, from which I quote the relevant part:
#4  svn_sort__array_insert (array=?, new_element=?, insert_index=?)
     at subversion/libsvn_subr/sorts.c:310
     assert(0 <= insert_index && insert_index <= array->nelts);
#5  svn_rangelist_merge2 (rangelist=?, chg=?,)
     at subversion/libsvn_subr/mergeinfo.c:1231
     svn_sort__array_insert(rangelist, &range_copy, i++);
#6  svn_mergeinfo_merge2 (mergeinfo=?, changes=?,)
     at subversion/libsvn_subr/mergeinfo.c:1838
#7  combine_forked_mergeinfo_props (...)
     at subversion/libsvn_wc/props.c:138
#8  apply_single_mergeinfo_prop_change (...)
     at subversion/libsvn_wc/props.c:1087
This was reported in Subversion 1.10.4. I have reproduced with 1.10.6 
and trunk_at_1871000.
There seem to be multiple problems contributing to this.
* Ill-defined canonical form for a rangelist.
We have (the same in 1.10 and trunk):
   - svn_rangelist__is_canonical()
     "all ranges in RANGELIST are in ascending order and do not overlap 
and are not adjacent" and ranges are non-empty.
   - svn_rangelist__canonicalize()
     "sort the ranges, and combine adjacent or overlapping ranges into 
single ranges. If overlapping ranges have different inheritability, 
return an error."
   - The inputs to svn_rangelist_merge2() "must be sorted according to 
svn_sort_compare_ranges" but need not be "compacted"; the exact 
specification is unclear.
A rangelist returned by 'canonicalize' is not necessarily 'canonical' as 
defined by svn_rangelist__is_canonical().  (We have a similar situation 
with other functional areas in Subversion such as path canonicalization.)
By code inspection, it looks like if any of the individual input ranges 
to svn_rangelist__canonicalize() are non-canonical (reversed or empty) 
then it gives undefined results (an error return, a plausible result or 
a totally bogus result, depending on the inputs).
By random-input testing svn_rangelist_merge2() with general 
non-canonical inputs, I can reproduce this particular case and some 
other failure modes (in 1.10 and trunk).
By random-input testing svn_rangelist_merge2() with non-canonical inputs 
that obey the doc string's "sorted" requirement, I can reproduce one 
failure mode: non-canonical output from canonical inputs (in 1.10 and 
trunk).
To reproduce the particular reported failure mode, I had to compile with 
some assertions disabled as they would be in a release build.
The function svn_rangelist_merge2() ends with a debug-mode-only 
assertion that output is canonical, which strongly implies it does not 
really need to support non-canonical inputs (that obey a lesser "sorted" 
criterion).
Conclusions so far:
   - The input requirements for svn_rangelist_merge2() (and other 
functions) should be simple and clear, and enforced in code.
   - We need test coverage for svn_rangelist_merge2() over the full 
range of inputs that meet those requirements.
   - Non-canonical output from canonical inputs is an error that needs 
to be fixed.
   - Apparently, svn_rangelist_merge2() only needs to support canonical 
inputs, so should be defined and enforced as such.
   - As the reported failure mode only occurs with inputs violating the 
requirements, presumably the fault is in the caller 
(svn_mergeinfo_merge2) or its callers, so I will have to pursue my 
search there.
* Rangelist "merge" is a set-union operation.
Both the naming and the implementation of svn_rangelist_merge2() are 
obscure.  The desired functionality is a set-union operation, or rather 
two set-unions:
     let set RRB be the set of revisions that apply to a base path, 
which includes both the 'non-inheritable' and 'inheritable' ranges; and
     let set RRC be the set of revisions that are inheritable to child 
paths, which means only the 'inheritable' ranges; then
     rangelist_merge(input1, input2):
       output.RRB = union(input1.RRB, input2.RRB)
       output.RRC = union(input1.RRC, input2.RRC)
assuming no "reversed" ranges are to be supported.
The current implementation is 275 lines long and contains 22 "if" 
statements.  No wonder it is broken.
An implementation based on set unions would surely be simple and reliable.
* The function range_to_string(), when given a non-canonical 
representation of an empty range such as .start == .end == 10, yields 
the totally wrong output "10-11".  This can be very confusing when seen 
in error messages and debugging.  I propose to fix this to issue a clear 
debugging output, like "[empty-range_at_10]", in this case.
* Poor and inconsistent error handling in svn_sort__array_insert and 
_delete.
svn_sort__array_insert() aborts on out-of-range inputs. 
svn_sort__array_delete() silently ignores out-of-range inputs, doing 
nothing.
I propose to change them both to report a (catchable) error on 
out-of-range inputs.
(Error checking is generally in keeping with our overall design 
principles.  Silently allowing certain kinds of bad input is an 
alternative and sometimes appropriate design option for library 
functions, in cases where the expected behaviour can be specified 
clearly and simply.  For example, we could specify that 'delete' shall 
delete array elements with indexes in the intersection of the existing 
indexes and the specified indexes, and ignore any other specified 
indexes.  However that's not what _delete is currently doing: it is 
ignoring the whole request if any part of it is out of range.)
* Our testing is clearly inadequate.
It shows me that we need to be using random-input testing on functions 
like this (functions in the category of mergeinfo arithmetic).
I propose to add some, starting with svn_rangelist_merge2().
=== Next steps ===
I will send patches demonstrating the bugs soon, not right now.
(Right now I have everything mixed up in a single WC for debugging.)
Then I will send patches for some of the other issues mentioned.
Then I will investigate further up the call stack in a similar way.
- Julian
Received on 2019-12-11 23:03:43 CET