[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

From: Daniel Shahaf <danielsh_at_apache.org>
Date: Wed, 25 Jan 2017 05:04:32 +0000

[to skimmers: grep for 'of independent interest' for a dumpstream
enhancement idea]

C. Michael Pilato wrote on Tue, Jan 24, 2017 at 12:37:34 -0500:
> On 01/24/2017 12:01 PM, Daniel Shahaf wrote:
> > The bug is about 1.9 using a different order to 1.8. If we make
> > svnadmin use the 1.8 unconditionally, then 1.9 will use a different
> > order to 1.10, which essentially recreates the bug for other users.
> >
> > That is: the option serves to allow admins to choose which of the two
> > orders to be consistent with, 1.8 or 1.9.
> >
> > An alternative to having this option would be a dumpfile comparator that
> > ignores header order differences.
> As a bug report alone, this one seems pretty easy: Closed/INVALID.
> Dumpfile headers were never promised in a particular order, therefore
> that their order should differ in one version than in another is an
> interesting factoid, but not actionable *as a defect*. I think it
> unwise to introduce an option to dictate that new code should try to
> adhere to an old promise that wasn't.

If you mean in general, then I agree: not every behaviour of a release
libsvn is an API promise of it. That's why we should pick and choose
which behaviours to retroactively support.

In this case, Luke has existing data generated with 1.8 that he wants to
verify. Patching the 1.10 producer to emit files that are bytewise
identical to the files produced by 1.8, so that they may be verified
with plain cmp(1), is a very robust approach: if the cmp(1) check
passes, the then the repository administrator needn't worry about bugs
in the new code in the producer, and can be confident his data is well.

In contrast, if a shiny new 'svndumpcmp' tool returns 0, a good
administrator wouldn't trust that fully: he'd take into account the
possibility of a false negative answer due to a bug. That's why I think
it makes sense, in this particular case, to have a compatibility option.
(See also the last reply hunk about textdeltas)

> Now, it's completely reasonable to introduce into the current release a
> promise regarding future header ordering, though. And it is completely
> reasonable to backport an enforcement of that ordering (minus its
> attached promise?) to older releases for the benefit of users that
> care. And it may very well be that "the 1.8 ordering" is the very
> ordering you'd settle on, but perhaps not. But to even get here, I
> think folks have to decide if this is, in fact, a promise that
> Subversion wants to make.

As a datapoint, we've added a similar promise once: we're now writing
serialized hashes in alphabetical ordering, precisely so the result
would compare as equal not only after svn_hash_read2() parsing but also
as unparsed svn_string_t's. I unfortunately don't recall which hashes
are the ones we dump like this — I think it was some dirent or property
hash in fsfs or in 'svnadmin dump'.

> And if so, how universally? Does this order apply to svndumpfilter
> and svnrdump, too?

That's a good question. Luke?

> In all these scenarios though, nothing can be done about released code
> save, as you suggest, Daniel, to introduce some external comparator.

About the 'external comparator' option: the dumpstream format includes
the checksum of the fulltext resulting from delta application, but not
the checksum of the serialized delta itself (only its length), so
a 'dumpfile comparator' wouldn't be able to detect, say, bitflips
somewhere inside the serialized deltas. (Not without loading at least
one of the dump to an empty repository)

This might be an enhancement of independent interest, as they say...

Thanks for the review, Mike!


Received on 2017-01-25 06:08:42 CET

This is an archived mail posted to the Subversion Dev mailing list.