[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Tue, 27 Oct 2015 02:52:59 +0100

Bert,

As the OP of this mail-thread, which spun out of the discovery of a
loss of information by 'dump' in 1.9 [1], I'd like to add some things.

I found out about this problem during the Berlin hackathon, when I
tested various dumped/loaded repositories. The loss of information is
real, and is IMO significant (we're losing a, possibly intended,
relationship between a log message and a particular path [2]).

On Mon, Oct 26, 2015 at 6:16 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
>
>
>> -----Original Message-----
>> From: Evgeny Kotkov [mailto:evgeny.kotkov_at_visualsvn.com]
>> Sent: maandag 26 oktober 2015 17:45
>> To: Bert Huijben <bert_at_qqmail.nl>; Stefan Fuhrmann
>> <stefan.fuhrmann_at_wandisco.com>
>> Cc: Johan Corveleyn <jcorvel_at_gmail.com>; Julian Foad
>> <julianfoad_at_btopenworld.com>; dev <dev_at_subversion.apache.org>
>> Subject: Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9
>
>
>> This means that after r1572363 and r1573111, svn_ra_get_file_revs2() and
>> svn_repos_get_file_revs2() were skipping some of the "interesting"
>> revisions,
>> according to the FS API defining the concept. Moreover, this behavior could
>> be inconsistent even within a single function like svn_ra_get_file_revs2()
>> that calls svn_ra_get_log2() for old servers, as get-log notices revisions
>> with empty deltas.
>>
>> I think that it's another example of where r1572363 and r1573111 introduce
>> an
>> inconsistent and unwanted behavior change.
>
> And 1.9.x assumes that the old behavior is a bug... and in many cases I agree.

Did the old behavior have serious bugs that were visible to users?
Evgeny seemed to think not [3], and no-one said otherwise. And even
so: is it okay to introduce new bugs while fixing old ones? IMO a bug
in 'dump' is a big deal, because it changes your repository / history.
Moreover, people who do a dump/load might not notice the change until
years later, after they have piled up tons more history on top of it.

Maybe there is some doubt whether this is a bug or a feature, but
while we're in doubt I think the safest option is the 0-option: keep
the old and known behavior (or rollback to it), which didn't lose this
information.

> This is exactly where the document Julian wrote comes in.

As I said earlier in this thread, I think that document [4] is a great
effort. But if you read it carefully, you'll see it does not
contradict having no-op changes in the repository history, and
exposing them for instance through 'svn log'. If we're supporting
those (and we have until 1.8), we must be able to dump them.

See specifically the final section titled "ARBITRARY DIFF VS.
SINGLE-REVISION CHANGE ", where Julian argues:

[[[
As best I understand it, the idea of recording a no-op-change is meaningful
and relatively straightforward to define at the level of a single state
transition. We think of a commit as such a transition, and it is, but as
mentioned above it's not in general the exact same transition that the
client described.

Attempting to derive a notion of 'no-op-change' that applies to a
difference taken between an arbitrary pair of points in the version
history, on the other hand, is not at all straightforward, and we do not
have a concept of its meaning in relation to merging and so on.

Now, the "svn log -v" output clearly applies to a single commit, a single
state transition, and thus we find the indication of no-op changes there to
be somewhat satisfactory. The code that generates this output, on the other
hand, uses APIs that compare arbitrary points in history, such as

    svn_fs_contents_changed(root1:path1, root2:path2)
    svn_fs_props_changed

Comparing arbitrary points in history is an operation that, throughout
pretty much all of the version control system, is used really only when we
want and need to know about real changes. Hence the definition of a new
pair of APIs,

    svn_fs_contents_different
    svn_fs_props_different

to specifically provide that meaning.

What purpose remains for the original _changed() APIs, then? At first it
wasn't clear there was any real purpose, but if we want "svn log" etc. to
continue as before, then we need something like them. Except for this
purpose we don't need APIs that compare two arbitrary states; we need APIs
that compare two successive states, because this 'touched' concept only
makes sense in this context.
]]]

Whether we go for a complete redesign of the APIs or not, the above
text nicely explains some different ways of looking at this, and gives
"no-op changes" a place in our feature set.

> If we wanted 1.9.x to behave in all ways identical to 1.8.x, we wouldn't have created 1.9. We would have never released something different than the old thing. Stefan spend quite some time in improving things, and upto now most users agreed that this was an improvement. (The time to speak up was during the release candidates)
>
>
> Every new feature or bugfix changes behavior.
> Just 'thinking that this is another inconsistent behavior change' doesn't make a new argument on why this behavior change should be backported to 1.9.x.
>

In my opinion the changed behavior of dump is a bug, not just a
behavior change. Unfortunately, I only found the bug after release.
But even if you don't think it's a bug, it was definitely an
unexpected side-effect of the refactoring done by stefan2.

Stefan proposed another way of fixing this, different from Evgeny's
patch, but both agreed that the dump behavior was a bug and that it
should be fixed. Julian too agreed that the change (the new code)
should be reverted [5].

>
> I don't think reporting something as changed, when it is clearly not changed is a good thing.
>
> We should decide when we want to see something as 'changed' and what definition of 'changed' should be used where.
>
> Just going back to 1.8 is not the way to approach this.
> That just changes one 'somehow broken implementation' (in one definition) with a 'somehow broken implementation' (with a different definition).
> We should define what behavior we really want (where)... and document why we want that behavior there. Until then I don't think we should backport anything.
>
> Both the 1.8.x behavior and the 1.9.x behavior are released... Going back to 1.8.x is not going to fix everybody's usecases.

Okay, well, I agree we should eventually go for a clear specification
and design, and then implement that. But in the meantime we have a
real bug in 1.9 [1] which can cause damage (or in any case "doubtful
changes") to repositories (when an admin performs a dump+load). I
would prefer that we try to fix the dump bug and backport it as soon
as possible (getting us back to a good working state), and then take
time to work out the long term solution.

[1] Issue #4598 "No-op changes no longer dumped by 'svnadmin dump' in
1.9", http://subversion.tigris.org/issues/show_bug.cgi?id=4598

[2] http://svn.haxx.se/dev/archive-2015-09/0290.shtml

[3] http://svn.haxx.se/dev/archive-2015-10/0085.shtml

[4] http://svn.haxx.se/dev/archive-2015-10/0082.shtml

[5] http://svn.haxx.se/dev/archive-2015-09/0292.shtml

-- 
Johan
Received on 2015-10-27 02:53:37 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.