
RE: No-op changes no longer dumped by 'svnadmin dump' in 1.9

From: Bert Huijben <bert_at_qqmail.nl>
Date: Tue, 27 Oct 2015 09:31:18 +0100

Summarizing a few mental notes:


Broken behavior (1 revision to the direct next/previous)

· Replay / svnadmin dump
(svnadmin dump is tightly integrated with replay. Svnrdump works on top of replay)

· Log

Mostly caused by having the old behavior for years… never intended/designed, but ‘works’.


=> Needs design/documentation + tests + backport




Good behavior (calculating deltas of multiple changes):

· Update/checkout/switch/diff/merge

No-op changes make no sense when we collapse changes

· What about last-* values?



Design issues:

· Editor v2

o can’t express no-op changes yet

o Is used as a 100% replacement for the delta editor even in these cases

o Partially released in 1.9 via JavaHL

· (Editor v3??)



File revisions:

· 1.9 behavior works best for ‘svn annotate’ (avoids calculations on not-changed files)

· 1.5-1.8 behavior required for ‘svn annotate -g’ (Wants to know all revisions. Uses that to trigger behavior. Deltas sometimes between unexpected revisions)

API only used for blame or otherwise determining actual changes.
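To make the two file-revision behaviors concrete, here is a minimal toy sketch in Python. It is not Subversion code: `file_revs` and its `include_noop` flag are hypothetical names invented for illustration, and the history is just a list of (revision, content) pairs for the revisions in which the path was touched.

```python
def file_revs(touched_history, include_noop):
    """Return the revisions reported for one file.

    touched_history: list of (revision, content) pairs, one per revision
    in which the path was touched (hypothetical toy model).
    include_noop=True  models the 1.5-1.8 style: every touching revision.
    include_noop=False models the 1.9 style: only revisions whose content
    differs from the previously reported state.
    """
    reported = []
    previous = None
    for rev, content in touched_history:
        if include_noop or content != previous:
            reported.append(rev)
        previous = content
    return reported


# r4 touches the file without changing its content (a no-op change).
history = [(1, "a"), (4, "a"), (7, "b")]
all_revs = file_revs(history, include_noop=True)
real_revs = file_revs(history, include_noop=False)
```

In this model plain annotate only needs `real_revs`, while something like 'annotate -g' that wants to see every revision would need `all_revs`.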





From: bert_at_qqmail.nl [mailto:bert_at_qqmail.nl]
Sent: Tuesday, 27 October 2015 08:44
To: Johan Corveleyn <jcorvel_at_gmail.com>
Cc: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>; Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>; Julian Foad <julianfoad_at_btopenworld.com>; dev <dev_at_subversion.apache.org>
Subject: RE: No-op changes no longer dumped by 'svnadmin dump' in 1.9


But as Julian and Branko pointed out, Subversion's update operation works by calculating deltas over the actual changes. Seeing non-changes as changes there introduces unwanted behavior.


Going back to the old code that assumes something is changed in these cases + in some unknown/undocumented/unintended other cases is not the way to design our software.


We should *decide* when we want which behavior. We should not decide we want to go back to that unknown/undocumented/unintended behavior everywhere, without documenting this.



Just going back *everywhere* without deciding why will make it impossible to improve Subversion in the future as we will always break things.



We never designed the old behavior; we just used the functions that were there. If we want it back we should document it, probably add regression tests... and determine in which places we want it.



The original request is about legacy behavior of a cvs import during svn log.



But this 'go back to 1.8' suggestion changes Subversion everywhere. It will make 'svn annotate' slower, make 'svn update' slower and report more tree conflicts, etc.


Handling non-changes as changes makes a lot less sense in those cases... while we already have the code to fix those cases.



We should revert the behavior where it makes sense. Reverting it everywhere 'just because' doesn't make sense.








From: Johan Corveleyn
Sent: Tuesday, 27 October 2015 02:53
To: Bert Huijben
Cc: Evgeny Kotkov;Stefan Fuhrmann;Julian Foad;dev
Subject: Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9





As the OP of this mail-thread, which spun out of the discovery of a
loss of information by 'dump' in 1.9 [1], I'd like to add some things.

I found out about this problem during the Berlin hackathon, when I
tested various dumped/loaded repositories. The loss of information is
real, and is IMO significant (we're losing a, possibly intended,
relationship between a log message and a particular path [2]).
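A toy sketch of that information loss, in Python. Nothing here is Subversion's dump format or API: `dump`, `keep_noop` and `log_paths` are hypothetical names, and a revision is modelled as a log message plus a map of path to (old content, new content).

```python
def dump(revisions, keep_noop):
    """Copy the revisions, optionally dropping changes where old == new.

    keep_noop=True  models a dump that preserves no-op changes.
    keep_noop=False models a dump that silently drops them.
    """
    dumped = []
    for rev in revisions:
        changes = {
            path: (old, new)
            for path, (old, new) in rev["changes"].items()
            if keep_noop or old != new
        }
        dumped.append({"log": rev["log"], "changes": changes})
    return dumped


def log_paths(revisions, revnum):
    """Paths a verbose log would list for one revision (toy model)."""
    return sorted(revisions[revnum]["changes"])


repo = [
    {"log": "initial import", "changes": {"/trunk/a": (None, "x")}},
    # A no-op change: the commit names /trunk/a but the content is unchanged.
    {"log": "note about /trunk/a", "changes": {"/trunk/a": ("x", "x")}},
]

with_noop = log_paths(dump(repo, keep_noop=True), 1)
without_noop = log_paths(dump(repo, keep_noop=False), 1)
```

With no-op changes dropped, the second log message survives the dump+load but is no longer attached to any path.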


On Mon, Oct 26, 2015 at 6:16 PM, Bert Huijben <bert_at_qqmail.nl> wrote:



>> -----Original Message-----
>> From: Evgeny Kotkov [mailto:evgeny.kotkov_at_visualsvn.com]
>> Sent: Monday, 26 October 2015 17:45
>> To: Bert Huijben <bert_at_qqmail.nl>; Stefan Fuhrmann
>> <stefan.fuhrmann_at_wandisco.com>
>> Cc: Johan Corveleyn <jcorvel_at_gmail.com>; Julian Foad
>> <julianfoad_at_btopenworld.com>; dev <dev_at_subversion.apache.org>
>> Subject: Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9
>>
>> This means that after r1572363 and r1573111, svn_ra_get_file_revs2() and
>> svn_repos_get_file_revs2() were skipping some of the "interesting"
>> revisions, according to the FS API defining the concept. Moreover, this
>> behavior could be inconsistent even within a single function like
>> svn_ra_get_file_revs2() that calls svn_ra_get_log2() for old servers, as
>> get-log notices revisions with empty deltas.
>>
>> I think that it's another example of where r1572363 and r1573111 introduce
>> an inconsistent and unwanted behavior change.


> And 1.9.x assumes that the old behavior is a bug... and in many cases I agree.


Did the old behavior have serious bugs that were visible to users?
Evgeny seemed to think not [3], and no-one said otherwise. And even
so: is it okay to introduce new bugs while fixing old ones? IMO a bug
in 'dump' is a big deal, because it changes your repository / history.
Moreover, people who do a dump/load might not notice the change until
years later, after they have piled up tons more history on top of it.

Maybe there is some doubt whether this is a bug or a feature, but
while we're in doubt I think the safest option is the 0-option: keep
the old and known behavior (or rollback to it), which didn't lose this



> This is exactly where the document Julian wrote comes in.


As I said earlier in this thread, I think that document [4] is a great
effort. But if you read it carefully, you'll see it does not
contradict having no-op changes in the repository history, and
exposing them for instance through 'svn log'. If we're supporting
those (and we have until 1.8), we must be able to dump them.

See specifically the final section titled "ARBITRARY DIFF VS.
SINGLE-REVISION CHANGE", where Julian argues:



As best I understand it, the idea of recording a no-op-change is meaningful
and relatively straightforward to define at the level of a single state
transition. We think of a commit as such a transition, and it is, but as
mentioned above it's not in general the exact same transition that the
client described.

Attempting to derive a notion of 'no-op-change' that applies to a
difference taken between an arbitrary pair of points in the version
history, on the other hand, is not at all straightforward, and we do not
have a concept of its meaning in relation to merging and so on.

Now, the "svn log -v" output clearly applies to a single commit, a single
state transition, and thus we find the indication of no-op changes there to
be somewhat satisfactory. The code that generates this output, on the other
hand, uses APIs that compare arbitrary points in history, such as


    svn_fs_contents_changed(root1:path1, root2:path2)



Comparing arbitrary points in history is an operation that, throughout
pretty much all of the version control system, is used really only when we
want and need to know about real changes. Hence the definition of a new
pair of APIs,





to specifically provide that meaning.


What purpose remains for the original _changed() APIs, then? At first it
wasn't clear there was any real purpose, but if we want "svn log" etc. to
continue as before, then we need something like them. Except for this
purpose we don't need APIs that compare two arbitrary states; we need APIs
that compare two successive states, because this 'touched' concept only
makes sense in this context.
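The distinction drawn in the quoted text can be sketched with a toy model in Python. This is not the Subversion FS API: `Revision`, `touched_in` and `contents_different` are hypothetical names chosen for illustration. Each revision stores its full path-to-content map plus the set of paths the commit recorded as touched.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    """One committed state (toy model, not Subversion's data structures)."""
    contents: dict                              # path -> content in this revision
    touched: set = field(default_factory=set)   # paths the commit recorded


def touched_in(history, rev, path):
    """Single-transition question: did commit `rev` record `path`?

    Only meaningful for one commit, i.e. the step from rev - 1 to rev.
    True even when the content is identical (a no-op change).
    """
    return path in history[rev].touched


def contents_different(history, rev_a, rev_b, path):
    """Arbitrary-pair question: does the content really differ?"""
    return history[rev_a].contents.get(path) != history[rev_b].contents.get(path)


history = [
    Revision({}),                              # r0: empty repository
    Revision({"a.txt": "hello"}, {"a.txt"}),   # r1: file added
    Revision({"a.txt": "hello"}, {"a.txt"}),   # r2: no-op change
    Revision({"a.txt": "bye"}, {"a.txt"}),     # r3: real change
]

# r2 is a no-op change: touched, yet not different from r1.
r2_is_noop = (touched_in(history, 2, "a.txt")
              and not contents_different(history, 1, 2, "a.txt"))
```

Note that `touched_in` takes a single revision, while `contents_different` takes any pair; in this model, asking whether a path was 'touched' between r1 and r3 has no obvious answer, which is the point the quoted section makes.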



Whether we go for a complete redesign of the APIs or not, the above
text nicely explains some different ways of looking at this, and gives
"no-op changes" a place in our feature set.


> If we wanted 1.9.x to behave in all ways identical to 1.8.x, we wouldn't have created 1.9. We would have never released something different from the old thing. Stefan spent quite some time improving things, and up to now most users agreed that this was an improvement. (The time to speak up was during the release candidates)



> Every new feature or bugfix changes behavior.

> Just 'thinking that this is another inconsistent behavior change' doesn't make a new argument on why this behavior change should be backported to 1.9.x.



In my opinion the changed behavior of dump is a bug, not just a
behavior change. Unfortunately, I only found the bug after release.
But even if you don't think it's a bug, it was definitely an
unexpected side-effect of the refactoring done by stefan2.

Stefan proposed another way of fixing this, different from Evgeny's
patch, but both agreed that the dump behavior was a bug and that it
should be fixed. Julian too agreed that the change (the new code)
should be reverted [5].



> I don't think reporting something as changed, when it is clearly not changed is a good thing.


> We should decide when we want to see something as 'changed' and what definition of 'changed' should be used where.


> Just going back to 1.8 is not the way to approach this.

> That just replaces one 'somehow broken implementation' (in one definition) with another 'somehow broken implementation' (with a different definition).

> We should define what behavior we really want (where)... and document why we want that behavior there. Until then I don't think we should backport anything.


> Both the 1.8.x behavior and the 1.9.x behavior are released... Going back to 1.8.x is not going to fix everybody's use cases.


Okay, well, I agree we should eventually go for a clear specification
and design, and then implement that. But in the meantime we have a
real bug in 1.9 [1] which can cause damage (or in any case "doubtful
changes") to repositories (when an admin performs a dump+load). I
would prefer that we try to fix the dump bug and backport it as soon
as possible (getting us back to a good working state), and then take
time to work out the long term solution.



[1] Issue #4598 "No-op changes no longer dumped by 'svnadmin dump' in
1.9", http://subversion.tigris.org/issues/show_bug.cgi?id=4598


[2] http://svn.haxx.se/dev/archive-2015-09/0290.shtml


[3] http://svn.haxx.se/dev/archive-2015-10/0085.shtml


[4] http://svn.haxx.se/dev/archive-2015-10/0082.shtml


[5] http://svn.haxx.se/dev/archive-2015-09/0292.shtml


Received on 2015-10-27 09:31:46 CET

This is an archived mail posted to the Subversion Dev mailing list.