Hello Eric.
TL;DR: I explain why I am convinced no-op changes don't belong in the
Subversion versioning semantics. With your work on Subversion
repository and dump stream semantics, is this something you can offer a
view on? I have previously failed to convince the developer community [1].
In examining the Subversion versioned data semantics and how the
protocols and APIs represent them, I have come across a number of kinds
of what could be called a "no-op change" or perhaps better described as
"I touched this but did not change its value".
Example:
- I changed the text of file F from T1 to T1;
- now "svn log -v" tells me the text of F was "modified";
- some variants of "svn diff" show no output;
- some variants of "svn diff" show a diff header with no body.
That is the best known user-visible example. Other kinds are possible
too, and a number of examples exist on the server side, e.g. [#4623].
A Subversion client generally does not send no-op changes to a
repository, but in certain cases it does. A Subversion repository
generally does not record and play back any no-op changes that may be
sent to it, but in certain cases it does.
I am convinced "no-op changes" should be considered meaningless and
removed from the data model presented to the user. In protocols and
APIs, a no-op change should be considered a non-canonical form and a
transient implementation detail of that particular interface, and
implementations should not attempt to preserve it.
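
To make "non-canonical form" concrete, here is a minimal sketch of dropping no-op entries from a change set before passing it on. The model (plain dicts of path to value) and the name canonicalize are assumptions for illustration, not part of any Subversion API.

```python
# Hypothetical model, not a Subversion API: 'base' maps each path to its
# current value, and 'changes' maps each touched path to its new value.

def canonicalize(base, changes):
    """Return the changes with every no-op entry dropped."""
    return {path: new for path, new in changes.items()
            if path not in base or base[path] != new}

base = {"F": "T1", "G": "T2"}
sent = {"F": "T1", "G": "T3"}     # F is touched but keeps the value T1

canonical = canonicalize(base, sent)
print(canonical)
```

Canonicalizing an already canonical change set leaves it unchanged, which is the property one would want of a canonical form.
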
In the rest of this note I try to explain some angles to the issue.
The Subversion system is built on a main design principle of tree
snapshots and differencing and merging of trees. A no-op change is
out-of-tree metadata about certain pairs of trees. Carrying such
metadata around the system in general is fundamentally incompatible with
that principle.
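
A minimal sketch of why a no-op change lives outside the trees: if a snapshot is modelled as a mapping from path to value (an assumed, simplified model; diff_trees is a hypothetical helper, not a Subversion function), then differencing two snapshots has no way to report that F was "touched".

```python
# Hypothetical model: a tree snapshot is a mapping from path to
# (text, properties). None of these names come from Subversion's APIs.

def diff_trees(old, new):
    """Return the set of paths whose value differs between two snapshots."""
    changed = set()
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changed.add(path)
    return changed

r1 = {"F": ("T1", {"P": "V1"}), "G": ("T2", {})}
# r2 is made by "changing" F's text from T1 to T1 and really changing G.
r2 = {"F": ("T1", {"P": "V1"}), "G": ("T3", {})}

changed = diff_trees(r1, r2)
print(sorted(changed))
```

Only G is reported. Any record that F was touched would have to be stored beside the two snapshots rather than derived from them.
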
One practical reason the existing system does not preserve that metadata
is that, with very few exceptions, the existing interfaces convey
no-op changes only implicitly, as a side effect of how their explicit
operations are formulated, and so one differs from another. For
example, an interface that represents a file change through multiple
optional operations, one of which is "on file F, property P changes to
value V" can convey "property P1's value no-op-changed from V1 to V1,
while property P2's value was not touched" if we invoke the "change
property" operation for property P1 but do not invoke it for P2. On the
other hand, an interface that represents a file change as a single
operation, "new file := {text, {properties}}" cannot; the only no-op
change it can convey is at a coarser granularity, "file F no-op-changed
its value from {T1,PROPS1} to {T1,PROPS1}". The kinds of no-op change
an interface can convey are an implementation detail of that
particular interface, and so cannot be expected to match those of any other
interface unless explicitly required and tested, which they mostly are not.
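
The two interface styles can be sketched as follows. Both classes are hypothetical and deliberately tiny; they are not Subversion's editor interfaces, and the "touched" attributes merely stand in for whatever a real interface would incidentally reveal.

```python
# Hypothetical interfaces for illustration only; not Subversion's editor API.

class FineGrainedEditor:
    """A file change is a series of optional per-property operations."""
    def __init__(self, base_props):
        self.props = dict(base_props)
        self.touched = set()   # incidental record of which operations ran

    def change_prop(self, name, value):
        self.props[name] = value
        self.touched.add(name)


class WholeFileEditor:
    """A file change is one 'new file := {text, {properties}}' operation."""
    def __init__(self, base_text, base_props):
        self.text, self.props = base_text, dict(base_props)
        self.touched = False

    def set_file(self, text, props):
        self.text, self.props = text, dict(props)
        self.touched = True


base_props = {"P1": "V1", "P2": "V2"}

fine = FineGrainedEditor(base_props)
fine.change_prop("P1", "V1")          # no-op change of P1; P2 not touched

whole = WholeFileEditor("T1", base_props)
whole.set_file("T1", base_props)      # no-op change of the whole file

print(fine.touched)    # per-property granularity
print(whole.touched)   # per-file granularity only
```

Both editors end with identical file content, yet the first can distinguish "P1 touched, P2 not" while the second can only say "file touched". Translating from one to the other necessarily loses or invents that distinction.
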
Because the existing interfaces convey no-op-change information only
incidentally, the system cannot be expected to preserve any particular
no-op change when data flows through multiple interfaces, through
commit, checkout, branch, diff, merge, and so on. Subversion preserves
only some of them, and only within very limited scopes (such as the "file
changed" flag in the "changed paths" list in "svn log -v").
Some of the existing svn protocols and APIs explicitly preserve certain
no-op changes. For example, one user reported [2] that in their svn
history (converted from CVS) they would "hate to lose" the historical
record that "svn log -v" reports "file text changed" for a certain no-op
file change. When I eliminated this no-op change from "dump" without
due care for backward compatibility, that was considered a regression and
reverted [#4598]. There are valid arguments for preserving backward
compatibility in some places. However, I propose such behaviour should
be considered obsolete and broken, and a migration path should be
planned to get away from it.
The snapshots argument is diluted because we already have at least one
other kind of metadata outside a pure tree-snapshots system: the
"copy-from" links. I am not immediately planning to ditch copy-from
links, though I think there are good reasons, analogous to the reasoning
about no-op changes, to replace them in a possible future system. I
have given some thought to it. That would be a more visible change to
the system, of course, though not so much as it might first appear.
The example of a no-op file text change is a simple one. An example
with deeper implications is a directory copy combined with replacing one
of its implicitly copied children with an explicit copy of that child from
the same source from which it was implicitly copied. Addressing a case like
this may be as simple as declaring one version as the canonical form, or
may require further travel down the road of copy-from semantics.
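
The directory-copy example can be sketched under an assumed, simplified model in which a tree is a dict of path to content and copy_subtree is a hypothetical helper (copy-from links are deliberately not modelled):

```python
# Hypothetical model: a tree is a dict of path -> content; a copy duplicates
# every path at or under the source. Not a Subversion API.

def copy_subtree(tree, src, dst):
    """Return a new tree with everything at or under src also under dst."""
    result = dict(tree)
    for path, content in tree.items():
        if path == src or path.startswith(src + "/"):
            result[dst + path[len(src):]] = content
    return result

source = {"A": None, "A/f": "T1", "A/g": "T2"}

# Form 1: copy directory A to B; the child B/f is copied implicitly.
form1 = copy_subtree(source, "A", "B")

# Form 2: the same copy, then replace B/f with an explicit copy of A/f.
form2 = copy_subtree(source, "A", "B")
del form2["B/f"]
form2 = copy_subtree(form2, "A/f", "B/f")

print(form1 == form2)
```

As tree snapshots the two forms are identical; they differ only in the recorded operations, which is why choosing one as the canonical form is a plausible remedy.
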
In conclusion, I believe svn would be a better system -- more
predictable, testable, composable, etc.; more generally dependable --
and would lose no significant value at all, if we were to explicitly
remove no-op changes.
Does this all ring true and obvious to you, or can you explain better
what I am getting at and what I'm missing?
- Julian
[1] Email: "No no-op changes", from me to dev@, 2014-09-19,
https://svn.haxx.se/dev/archive-2014-09/0082.shtml
https://mail-archives.apache.org/mod_mbox/subversion-dev/201409.mbox/%3c1411138196.98623.YahooMailNeo@web87703.mail.ir2.yahoo.com%3e
[2] Email: "No-op changes no longer dumped by 'svnadmin dump' in 1.9",
from Johan Corveleyn to dev@, 2015-09-21,
https://svn.haxx.se/dev/archive-2015-09/0269.shtml
http://mail-archives.apache.org/mod_mbox/subversion-dev/201509.mbox/%3CCAB84uBVe8QnEpbPVAb__yQjiDDoYjFn2+M9mPcdBXZCwMCpOLw@mail.gmail.com%3E
[#4598] "No-op changes no longer dumped by 'svnadmin dump' in 1.9",
https://subversion.apache.org/issue/4598
https://issues.apache.org/jira/browse/SVN-4598
[#4623] "no-op prop change not preserved across dump/load"
https://subversion.apache.org/issue/4623
https://issues.apache.org/jira/browse/SVN-4623
Received on 2019-10-11 16:56:33 CEST