Re: Symmetry between dump and load

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 19 Dec 2014 14:06:29 +0000

Branko Čibej <brane_at_wandisco.com>
> Julian Foad wrote:
>> I believe the following symmetries should be true, and testable, and we
>> should test them.
>>
>> For any valid repository:
>>
>>   * we can dump it
>>   * we can load the dump file into a new repository
>>   * the new repo is equivalent to the old repo
>>
>> For any valid dump file:
>>
>>   * we can load it into a new repository
>>   * we can dump that repository
>>   * the new dump file is equivalent to the old dump file
>
> I agree that this should be our goal. However, consider that some of
> these symmetries depend on specific features of the repository
> implementation.
>
> For example, at some point you mentioned dump files with non-UTF-8
> paths. Such dump files are clearly invalid, since we've maintained the
> restriction that all strings used internally must be encoded in UTF-8.
> So, such a dump file can only be the result of manual fiddling, or a bug
> in some version of some repository back-end implementation. A different
> and/or fixed backend will not accept non-UTF-8 paths at all; thus, we
> cannot maintain this particular symmetry.

Yes, exactly. Testing would let us discover this kind of problem. The answer is not necessarily to prioritize total symmetry across all versions and implement 'fixes'. Rather, part of the goal of testing is to discover such asymmetries so that we can be aware of them, document them, and decide what further action to take, if any. That may mean accepting the asymmetry and adjusting the tests to account for it.

> Conversely, if we decide that maintaining strict dump/load symmetry is
> more important, we're (unnecessarily, IMO) complicating future development
> (e.g., the idea that repos path lookup should preserve but ignore
> differences in Unicode character representation).

I don't propose to maintain strict symmetry in all cases. The point is to *discover* issues, to make them visible, and then decide, for each issue, whether we should declare it a bug to be fixed or accept and document the asymmetry and adjust the tests for it.
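
As a rough illustration only (this is not code from the Subversion test suite), a dump -> load -> dump check of this kind might be sketched in Python as below. The function names and the `normalize` hook are hypothetical; the sketch assumes `svnadmin` is on PATH and uses only its `create`, `dump` and `load` subcommands. The `normalize` hook is where an asymmetry that has been accepted and documented would be filtered out before comparing.

```python
import subprocess
import tempfile
from pathlib import Path


def svnadmin_dump(repo: Path) -> bytes:
    """Return the dump stream of `repo` (assumes `svnadmin` is on PATH)."""
    done = subprocess.run(["svnadmin", "dump", "--quiet", str(repo)],
                          check=True, stdout=subprocess.PIPE)
    return done.stdout


def svnadmin_load(dump: bytes, repo: Path) -> None:
    """Create a new repository at `repo` and load `dump` into it."""
    subprocess.run(["svnadmin", "create", str(repo)], check=True)
    subprocess.run(["svnadmin", "load", "--quiet", str(repo)],
                   input=dump, check=True, stdout=subprocess.DEVNULL)


def check_dump_roundtrip(dump: bytes,
                         load=svnadmin_load,
                         redump=svnadmin_dump,
                         normalize=lambda data: data) -> bool:
    """Load `dump` into a fresh repository, dump it again, and compare.

    `normalize` is a hypothetical hook: it should strip or canonicalize
    whatever differences have been accepted as documented asymmetries.
    """
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "repo"
        load(dump, repo)
        second = redump(repo)
    return normalize(dump) == normalize(second)
```

A failing check is then a prompt for a decision (bug, or documented asymmetry handled in `normalize`), not automatically a defect.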

> I'm sure there are other cases where maintaining strict symmetry will
> turn out to be too constraining. An example from your own bailiwick:
> when we store mergeinfo in a more reasonable structure than a versioned
> property, a load from an older dumpfile will most likely lose details
> of exactly how the mergeinfo was represented; even though a later dump
> may produce svn:mergeinfo values that are different but semantically
> equivalent to the original.

Yes, sure, that's an entirely reasonable course.
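
To make "different but semantically equivalent" concrete for a test, one could compare svn:mergeinfo values by meaning rather than byte for byte. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the usual textual form of one `/path:revision-list` entry per line, with inclusive `N-M` ranges and a trailing `*` marking a non-inheritable range.

```python
def parse_mergeinfo(value: str) -> dict:
    """Parse an svn:mergeinfo value into {path: {(revision, inheritable)}}."""
    result = {}
    for line in value.splitlines():
        if not line.strip():
            continue
        # Split on the last colon, so a colon inside the path is preserved.
        path, _, ranges = line.rpartition(":")
        revs = result.setdefault(path, set())
        for item in ranges.split(","):
            item = item.strip()
            inheritable = not item.endswith("*")
            item = item.rstrip("*")
            start, _, end = item.partition("-")
            last = int(end) if end else int(start)
            for rev in range(int(start), last + 1):
                revs.add((rev, inheritable))
    return result


def mergeinfo_equivalent(a: str, b: str) -> bool:
    """True if two mergeinfo values describe the same merged revisions."""
    return parse_mergeinfo(a) == parse_mergeinfo(b)
```

With this comparison, `/trunk:1-3,5` and `/trunk:1,2,3,5` count as equal, while `/trunk:1-3` and `/trunk:1-3*` do not.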

> Clearly, dump/load symmetry can be preserved even in the cases I
> mentioned, at the cost of maintaining more complex metadata (and related
> code) in the repository back-end. The question we have to answer is:
> what's the point, as long as semantics are not affected?

The point is not that strict symmetry is the number 1 priority, but rather that we use a general goal of symmetry to help us find and address problems.

- Julian
Received on 2014-12-19 15:07:22 CET
