[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9

From: Ivan Zhakov <ivan_at_visualsvn.com>
Date: Tue, 27 Oct 2015 13:01:48 +0300

On 27 October 2015 at 04:52, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> Bert,
> As the OP of this mail-thread, which spun out of the discovery of a
> loss of information by 'dump' in 1.9 [1], I'd like to add some things.
> I found out about this problem during the Berlin hackathon, when I
> tested various dumped/loaded repositories. The loss of information is
> real, and is IMO significant (we're losing a, possibly intended,
> relationship between a log message and a particular path [2]).
> On Mon, Oct 26, 2015 at 6:16 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
>>> -----Original Message-----
>>> From: Evgeny Kotkov [mailto:evgeny.kotkov_at_visualsvn.com]
>>> Sent: maandag 26 oktober 2015 17:45
>>> To: Bert Huijben <bert_at_qqmail.nl>; Stefan Fuhrmann
>>> <stefan.fuhrmann_at_wandisco.com>
>>> Cc: Johan Corveleyn <jcorvel_at_gmail.com>; Julian Foad
>>> <julianfoad_at_btopenworld.com>; dev <dev_at_subversion.apache.org>
>>> Subject: Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9
>>> This means that after r1572363 and r1573111, svn_ra_get_file_revs2() and
>>> svn_repos_get_file_revs2() were skipping some of the "interesting"
>>> revisions,
>>> according to the FS API defining the concept. Moreover, this behavior could
>>> be inconsistent even within a single function like svn_ra_get_file_revs2()
>>> that calls svn_ra_get_log2() for old servers, as get-log notices revisions
>>> with empty deltas.
>>> I think that it's another example of where r1572363 and r1573111 introduce
>>> an
>>> inconsistent and unwanted behavior change.
>> And 1.9.x assumes that the old behavior is a bug... and in many cases I agree.
> Did the old behavior have serious bugs that were visible to users?
> Evgeny seemed to think not [3], and no-one said otherwise. And even
> so: is it okay to introduce new bugs while fixing old ones? IMO a bug
> in 'dump' is a big deal, because it changes your repository / history.
> Moreover, people who do a dump/load might not notice the change until
> years later, after they have piled up tons more history on top of it.
> Maybe there is some doubt whether this is a bug or a feature, but
> while we're in doubt I think the safest option is the 0-option: keep
> the old and known behavior (or rollback to it), which didn't lose this
> information.
>> This is exactly where the document Julian wrote comes in.
> As I said earlier in this thread, I think that document [4] is a great
> effort. But if you read it carefully, you'll see it does not
> contradict having no-op changes in the repository history, and
> exposing them for instance through 'svn log'. If we're supporting
> those (and we have until 1.8), we must be able to dump them.
> See specifically the final section titled "ARBITRARY DIFF VS.
> SINGLE-REVISION CHANGE ", where Julian argues:
> [[[
> As best I understand it, the idea of recording a no-op-change is meaningful
> and relatively straightforward to define at the level of a single state
> transition. We think of a commit as such a transition, and it is, but as
> mentioned above it's not in general the exact same transition that the
> client described.
> Attempting to derive a notion of 'no-op-change' that applies to a
> difference taken between an arbitrary pair of points in the version
> history, on the other hand, is not at all straightforward, and we do not
> have a concept of its meaning in relation to merging and so on.
> Now, the "svn log -v" output clearly applies to a single commit, a single
> state transition, and thus we find the indication of no-op changes there to
> be somewhat satisfactory. The code that generates this output, on the other
> hand, uses APIs that compare arbitrary points in history, such as
> svn_fs_contents_changed(root1:path1, root2:path2)
> svn_fs_props_changed
> Comparing arbitrary points in history is an operation that, throughout
> pretty much all of the version control system, is used really only when we
> want and need to know about real changes. Hence the definition of a new
> pair of APIs,
> svn_fs_contents_different
> svn_fs_props_different
> to specifically provide that meaning.
> What purpose remains for the original _changed() APIs, then? At first it
> wasn't clear there was any real purpose, but if we want "svn log" etc. to
> continue as before, then we need something like them. Except for this
> purpose we don't need APIs that compare two arbitrary states; we need APIs
> that compare two successive states, because this 'touched' concept only
> makes sense in this context.
> ]]]
> Whether we go for a complete redesign of the APIs or not, the above
> text nicely explains some different ways of looking at this, and gives
> "no-op changes" a place in our feature set.
>> If we wanted 1.9.x to behave in all ways identical to 1.8.x, we wouldn't
>> have created 1.9. We would have never released something different than
>> the old thing. Stefan spend quite some time in improving things, and
>> upto now most users agreed that this was an improvement. (The time to
>> speak up was during the release candidates)
>> Every new feature or bugfix changes behavior.
>> Just 'thinking that this is another inconsistent behavior change' doesn't
>> make a new argument on why this behavior change should be backported to
>> 1.9.x.
> In my opinion the changed behavior of dump is a bug, not just a
> behavior change. Unfortunately, I only found the bug after release.
> But even if you don't think it's a bug, it was definitely an
> unexpected side-effect of the refactoring done by stefan2.
> Stefan proposed another way of fixing this, different from Evgeny's
> patch, but both agreed that the dump behavior was a bug and that it
> should be fixed. Julian too agreed that the change (the new code)
> should be reverted [5].
>> I don't think reporting something as changed, when it is clearly not
>> changed is a good thing.
>> We should decide when we want to see something as 'changed' and what
>> definition of 'changed' should be used where.
>> Just going back to 1.8 is not the way to approach this.
>> That just changes one 'somehow broken implementation' (in one definition)
>> with a 'somehow broken implementation' (with a different definition).
>> We should define what behavior we really want (where)... and document why
>> we want that behavior there. Until then I don't think we should backport
>> anything.
>> Both the 1.8.x behavior and the 1.9.x behavior are released... Going back
>> to 1.8.x is not going to fix everybody's usecases.
> Okay, well, I agree we should eventually go for a clear specification
> and design, and then implement that. But in the meantime we have a
> real bug in 1.9 [1] which can cause damage (or in any case "doubtful
> changes") to repositories (when an admin performs a dump+load). I
> would prefer that we try to fix the dump bug and backport it as soon
> as possible (getting us back to a good working state), and then take
> time to work out the long term solution.
+1 on all above.

Ivan Zhakov
Received on 2015-10-27 11:02:24 CET

This is an archived mail posted to the Subversion Dev mailing list.