[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Fri, 2 Oct 2015 13:06:16 +0200

On Tue, Sep 29, 2015 at 3:09 PM, Julian Foad <julianfoad_at_gmail.com> wrote:

> Thanks for the comments, Stefan.
>
> Stefan Fuhrmann wrote:
> > On Mon, Sep 28, 2015, Stefan Fuhrmann wrote:
> >> I suggest that we apply the patch below. It is more efficient
> >> that the original behaviour but ensures that "no-op" changes
> >> will be retained.
> >
> > Now that I've read through the whole thread, here my view
> > on this problem plus an explanation why I think my patch is
> > the right approach.
> >
> > First some facts and things that I believe:
> >
> > * Internally, FSFS can record no-op prop and no-op content
> > changes (the delta between new representation and the
> > respective proplist / text of its predecessor is empty). There
> > is even the possibility of a no-op node change, i.e. a new
> > node without touching text nor props (not happening atm).
>
> When you say "not happening ATM", are you referring to 1.9 or all
> released versions of FSFS or something else?
>

It does not happen in FSFS, is not planned to happen
and probably did never happen in FSFS. In FSX, though,
it is a side-effect of rep-sharing: Because the reps in the
noderev struct don't have unifiers any more, they become
identical to the entries in the predecessor node.

> * Relying on implementation details of a lower layer is bad.
> >
> > * Via FS and Repos layer API, new instances of no-op changes
> > to either text and properties can be created.
> > * Dump files (produced by 3rd party tools or non-incremental
> > svnadmin dumps) are another source of those no-op changes.
> > * SVN editors don't produce no-op changes.
>
> An svn_delta_editor can implicitly describe all sorts of no-op
> changes. For example, if I add svn_delta__get_debug_editor() into
> svnmucc's commit, we can make no-op changes to a file's text and
> properties as follows:
>
[snip]

I see. I didn't have svnmucc on my radar. It actually supports
my point: "normal" SVN usage via the svn client should never
produce no-op changes but there still are other legal interfaces
that may create new instances.

So, it is not a problem of the past (e.g. created by CVS->SVN
conversions) that could be fixed by e.g. updating the log messages.
We will have to deal with them in the future and need to preserve
that information.

> * A commit links user-controlled revision meta data (e.g. comment)
> > to a user-controlled list of changes. That link shall not be
> > lost when tweak implementation details.
>
> When you say "list of changes", I agree if you mean the "difference
> between snapshots" in principle, rather than the specific list that's
> stored at the end of a revision file.
>

Yes, and there is a clear mismatch in the output between diff
and log. It is due to the changes list being (mostly?) redundant
information with a tendency to become inconsistent is some
sense for edge cases like no-op changes.

> > * Dump / load is a compatibility interface between SVN versions
> > and allows for forward and backward migration.
> > * A plain node modification (no text nor prop change) will be
> > ignored by "svnadmin load". Even if we changed that on /trunk,
> > it would not fix old releases.
>
> What's a "plain node modification (no text nor prop change)"?
> Ambiguous w.r.t. no-op changes.
>

A dump file may say "modified node X" but not give the new
props nor the new contents. This is exactly what happens today
for no-op file modifications where neither text nor props actually
differ from their predecessors.

The loader will not create the new node upon seeing the "modified"
record but only as a side-effect of an actual text / prop set - which
also populates the changed paths list in the target repo.

> > From that, we can derive a few requirements:
> >
> > * If the a revision contains a "M" change, svnadmin must produce
> > a dump file that causes previous releases to produce the "M"
> > change - within reason.
>
> The problem is, we don't have an agreed definition of an 'M' change.
>

My point here is that we only need to agree that we don't want
to lose the information contained in the association of the rev
meta data and the changed paths list. The patch is simply stops
us losing some information in a somewhat rare case.

Beyond that, we may agree that there shall be no future instances
of 'M' with empty deltas. Personally, I don't think that empty deltas
should break any part of SVN. E.g. the result of a merge can be an
empty delta (which we then do not commit as a individual change
but may only track though mergeinfo). IOW, preventing no-op
changes may end up not actually eliminating edge cases from the
more complicated parts of SVN.

> > * Load shall produce no-op changes for the same data as in previous
> > releases - within reason. If the FS API would no longer support
> > creating no-op changes at some point in the future, the load process
> > should issue a warning.
> > * The dump file contents shall be independent of FS implementation
> > details, i.e. use "strict" FS API functions.
>
> I'm not sure to what you're referring here. The 'strict' flag in the
> text and props comparison functions? That would omit no-op changes.
>

What I mean is to use the parts of the FS API whose behaviour is
not implementation-specific. I'm not 100% sure whether we will be
fine with applying the same logic as in my patch but using the API
variants that allow for false positives. It feels better to have a more
predictable / stable output.

> > The
> > text / proplist asymmetry in the patch is due to the fact that all
> > files due have a text but maybe no proplist representation. So,
> > dumping the text is always safe.
> >
> > V2 of the patch also covers no-op prop changes to directories.
>
> I don't see that we've decided whether we want to preserve no-op text
> changes, no-op prop changes, no-op node changes, and/or other.
>

Right now, we are losing information. In rare cases and probably
nothing too important but still. This aspect makes me consider
the current behaviour a bug. Whether creating that situation in
the first place was o.k. nor not is a separate issue.

> It looks like your patch is aimed at preserving the no-op node change.
> This correlates with an 'M' line in the 'svn log -v' output.
>

Exactly.

> Do we also care about no-op text and property changes, such as are
> reported in 'svn log -v --xml'? More precisely, we should define what
> we care about in terms of an API.
>

Good point, log -v --xml could say "modified" but not set the
flags for either prop nor text change. That would need fixing,
I guess and it would be quite expensive in terms of CPU load.
But it makes sense at least as an option.

-- Stefan^2.
Received on 2015-10-02 13:06:23 CEST

This is an archived mail posted to the Subversion Dev mailing list.