[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

[#4840] Merge assertion failure in svn_sort__array_insert

From: Julian Foad <julianfoad_at_apache.org>
Date: Wed, 11 Dec 2019 22:03:40 +0000

TL;DR: I'm debugging this issue, and finding multiple problems. This is
a summary of where I have got to so far.

http://subversion.apache.org/issue/4840

I am investigating an assertion failure. I received a stack trace
without parameter values, from which I quote the relevant part:

#4 svn_sort__array_insert (array=?, new_element=?, insert_index=?)
     at subversion/libsvn_subr/sorts.c:310
     assert(0 <= insert_index && insert_index <= array->nelts);

#5 svn_rangelist_merge2 (rangelist=?, chg=?,)
     at subversion/libsvn_subr/mergeinfo.c:1231
     svn_sort__array_insert(rangelist, &range_copy, i++);

#6 svn_mergeinfo_merge2 (mergeinfo=?, changes=?,)
     at subversion/libsvn_subr/mergeinfo.c:1838
#7 combine_forked_mergeinfo_props (...)
     at subversion/libsvn_wc/props.c:138
#8 apply_single_mergeinfo_prop_change (...)
     at subversion/libsvn_wc/props.c:1087

This was reported in Subversion 1.10.4. I have reproduced with 1.10.6
and trunk_at_1871000.

There seem to be multiple problems contributing to this.

* Ill-defined canonical form for a rangelist.

We have (the same in 1.10 and trunk):

   - svn_rangelist__is_canonical()
     "all ranges in RANGELIST are in ascending order and do not overlap
and are not adjacent" and ranges are non-empty.

   - svn_rangelist__canonicalize()
     "sort the ranges, and combine adjacent or overlapping ranges into
single ranges. If overlapping ranges have different inheritability,
return an error."

   - The inputs to svn_rangelist_merge2() "must be sorted according to
svn_sort_compare_ranges" but need not be "compacted"; the exact
specification is unclear.

A rangelist returned by 'canonicalize' is not necessarily 'canonical' as
defined by svn_rangelist__is_canonical(). (We have a similar situation
with other functional areas in Subversion such as path canonicalization.)

By code inspection, it looks like if any of the individual input ranges
to svn_rangelist__canonicalize() are non-canonical (reversed or empty)
then it gives undefined results (an error return, a plausible result or
a totally bogus result, depending on the inputs).

By random-input testing svn_rangelist_merge2() with general
non-canonical inputs, I can reproduce this particular case and some
other failure modes (in 1.10 and trunk).

By random-input testing svn_rangelist_merge2() with non-canonical inputs
that obey the doc string's "sorted" requirement, I can reproduce one
failure mode: non-canonical output from canonical inputs (in 1.10 and
trunk).

To reproduce the particular reported failure mode, I had to compile with
some assertions disabled as they would be in a release build.

The function svn_rangelist_merge2() ends with a debug-mode-only
assertion that output is canonical, which strongly implies it does not
really need to support non-canonical inputs (that obey a lesser "sorted"
criterion).

Conclusions so far:
   - The input requirements for svn_rangelist_merge2() (and other
functions) should be simple and clear, and enforced in code.
   - We need test coverage for svn_rangelist_merge2() over the full
range of inputs that meet those requirements.
   - Non-canonical output from canonical inputs is an error that needs
to be fixed.
   - Apparently, svn_rangelist_merge2() only needs to support canonical
inputs, so should be defined and enforced as such.
   - As the reported failure mode only occurs with inputs violating the
requirements, presumably the fault is in the caller
(svn_mergeinfo_merge2) or its callers, so I will have to pursue my
search there.

* Rangelist "merge" is a set-union operation.

Both the naming and the implementation of svn_rangelist_merge2() are
obscure. The desired functionality is a set-union operation, or rather
two set-unions:

     let set RRB be the set of revisions that apply to a base path,
which includes both the 'non-inheritable' and 'inheritable' ranges; and
     let set RRC be the set of revisions that are inheritable to child
paths, which means only the 'inheritable' ranges; then

     rangelist_merge(input1, input2):
       output.RRB = union(input1.RRB, input2.RRB)
       output.RRC = union(input1.RRC, input2.RRC)

assuming no "reversed" ranges are to be supported.

The current implementation is 275 lines long and contains 22 "if"
statements. No wonder it is broken.

An implementation based on set unions would surely be simple and reliable.

* The function range_to_string(), when given a non-canonical
representation of an empty range such as .start == .end == 10, yields
the totally wrong output "10-11". This can be very confusing when seen
in error messages and debugging. I propose to fix this to issue a clear
debugging output, like "[empty-range_at_10]", in this case.

* Poor and inconsistent error handling in svn_sort__array_insert and
_delete.

svn_sort__array_insert() aborts on out-of-range inputs.
svn_sort__array_delete() silently ignores out-of-range inputs, doing
nothing.

I propose to change them both to report a (catchable) error on
out-of-range inputs.

(Error checking is generally in keeping with our overall design
principles. Silently allowing certain kinds of bad input is an
alternative and sometimes appropriate design option for library
functions, in cases where the expected behaviour can be specified
clearly and simply. For example, we could specify that 'delete' shall
delete array elements with indexes in the intersection of the existing
indexes and the specified indexes, and ignore any other specified
indexes. However that's not what _delete is currently doing: it is
ignoring the whole request if any part of it is out of range.)

* Our testing is clearly inadequate.

It shows me that we need to be using random-input testing on functions
like this (functions in the category of mergeinfo arithmetic).

I propose to add some, starting with svn_rangelist_merge2().

=== Next steps ===

I will send patches demonstrating the bugs soon, not right now.

(Right now I have everything mixed up in a single WC for debugging.)

Then I will send patches for some of the other issues mentioned.

Then I will investigate further up the call stack in a similar way.

- Julian
Received on 2019-12-11 23:03:43 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.