
Svn delta editor API: Missing methods: remove-node-props, remove-children

From: Julian Foad <julianfoad_at_apache.org>
Date: Fri, 09 Nov 2018 12:47:16 +0000

The delta-editor API should have a method to remove all node props of a node, and a method to remove all children of a directory.

== Existing Use Case: Loading a Dump Stream ==

In the non-deltas dump-stream format, each node record carries the full list of node props. The parser (svn_repos_parse_dumpstream3) converts this to consumer callbacks by first calling the consumer's remove_node_props() method and then set_node_property() for each property. The dump stream consumers in both 'svnadmin load' and 'svnrdump load' implement remove_node_props() by querying for all node props and then deleting each one. Then they process the set_node_property() calls and reinstate some or all of the properties. The redundant deletes could of course be optimized out, but the query cannot.

In 'svnadmin load' the performance penalty is small, as it has fast access to the FSFS API. In 'svnrdump load', however, this adds an extra network round-trip on each node.
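The remove_node_props() emulation described above can be modelled roughly as follows. This is a minimal Python sketch, not Subversion code: FakeTarget, its methods and the query counter are hypothetical stand-ins for a repository reached over the network, and the sketch assumes that only the query costs an extra round-trip.

```python
class FakeTarget:
    """Hypothetical stand-in for a repository reached over the network."""

    def __init__(self, props):
        self.props = dict(props)
        self.queries = 0  # out-of-band round-trips made by the consumer

    def query_props(self):
        # Assumption: this is the only call that costs an extra round-trip.
        self.queries += 1
        return dict(self.props)

    def set_prop(self, name, value):
        self.props[name] = value

    def delete_prop(self, name):
        self.props.pop(name, None)


def remove_node_props(target):
    """Emulate 'remove all node props': query first, then delete one by one."""
    for name in target.query_props():
        target.delete_prop(name)


def load_node_props(target, full_props):
    """Apply a non-deltas node record: clear everything, then set each prop."""
    remove_node_props(target)
    for name, value in full_props.items():
        target.set_prop(name, value)


target = FakeTarget({"svn:eol-style": "native", "owner": "alice"})
load_node_props(target, {"svn:eol-style": "native"})
print(target.props, target.queries)
```

In this model the delete and re-set of svn:eol-style is redundant and could be skipped, but the single query per node is needed whenever the consumer does not already know the current properties.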

The non-deltas dump format is largely superseded by the deltas format, and so the performance characteristics of this use case may be unimportant, yet the code to support it must still be maintained.

== Why add more methods? ==

When a driver requires these operations, and the editor lacks them, the driver needs to work around the lack by first querying the target for the current list and then issuing a "delete" for each item in the current list that is not going to be in the final list. The query step adds:

  * a performance penalty (an extra round-trip)
  * a complexity penalty (a query interface to the target that must be implemented outside the delta editor).
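Seen from the driver's side, the workaround amounts to something like the sketch below. All names here are hypothetical, including the ("remove_all",) operation, which stands for the missing editor method; query_current stands for whatever out-of-band interface the driver has to the target.

```python
def deletes_after_query(query_current, final_names):
    """Workaround: fetch the current names, then delete those not being kept."""
    current = query_current()  # the extra round-trip, outside the editor
    return [("delete", name) for name in sorted(current)
            if name not in final_names]


def deletes_with_remove_all():
    """With a hypothetical remove-all operation, no query is needed."""
    return [("remove_all",)]


ops = deletes_after_query(lambda: {"a.c", "b.c", "old.c"}, {"a.c", "b.c"})
print(ops)
print(deletes_with_remove_all())
```

The operations that establish the final list follow in both cases; only the first variant needs the query callback.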

== How about a single Set-All method? ==

Instead of having both "remove-all-node-props" and "set-one-node-prop" methods, an API could have a single method to set all node props at once. (Similarly, in principle, for children.)

If the API provides only the ability to declare all properties (or children) at once, then a caller that wants to make a change to one property (or child) and does not already know the full set is forced to first obtain the full set by another (out-of-band) method. The caller does not just need to know enough metadata to describe the change by indexing into the receiver's data, but must supply a complete copy of the data.

That would cause the same kind of problem. Existing examples are: loading a dump stream in deltas format, implementing the "svn mkdir URL" command, and any code that performs a conversion from the current editor semantics to the new semantics.
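As a rough illustration of why a Set-All-only interface pushes the query back onto the caller, consider this sketch. SetAllReceiver and its methods are hypothetical and do not correspond to any Subversion API; fetch_props() stands for the out-of-band query.

```python
class SetAllReceiver:
    """Hypothetical receiver whose only property operation is 'set all'."""

    def __init__(self, props):
        self._props = dict(props)
        self.fetches = 0  # out-of-band queries made by callers

    def fetch_props(self):
        # Out-of-band query: not part of the edit itself.
        self.fetches += 1
        return dict(self._props)

    def set_all_props(self, props):
        self._props = dict(props)


def change_one_prop(receiver, name, value):
    """Changing a single property still requires a complete copy of the set."""
    full = receiver.fetch_props()
    full[name] = value
    receiver.set_all_props(full)


receiver = SetAllReceiver({"owner": "alice", "svn:eol-style": "native"})
change_one_prop(receiver, "owner", "bob")
print(receiver.fetch_props())
```

A caller that already holds the full set can skip the fetch; the cost falls on callers that only know the change they want to make.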

== Comparison with Ev2 ==

Ev2 (svn_editor_t, experimental) uses full "Set-All" methods. It says "Note that the driver must have a perfect understanding of the tree which the receiver will be applying edits upon." For example, svn_editor_alter_directory() can specify the complete list of children and the complete list of properties. Either children or properties can be omitted if they are unchanged, but a partial change cannot be expressed without supplying the full context.

Ev2 uses "shim callbacks" to fetch the required out-of-band data to implement the semantic conversion. Each implementation of an Ev2 editor needs to supply these callbacks in case the driver needs them.

== Comparison with Text Deltas ==

The text delta does not have the same problem. The delta is able to represent a change of the form "replace all the existing text with this new text X" without having to know in advance anything about the existing text.
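A much simplified model shows why. The sketch below is not the svndiff format; the operation names "copy_source" and "new_data" are assumptions chosen for illustration. A delta consisting only of new data gives the same result whatever the source was.

```python
def apply_delta(source: bytes, ops) -> bytes:
    """Apply a list of simplified delta operations to a source text."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy_source":
            # Copy a byte range from the existing text.
            _, offset, length = op
            out += source[offset:offset + length]
        elif op[0] == "new_data":
            # Insert bytes carried by the delta itself.
            out += op[1]
        else:
            raise ValueError(f"unknown op: {op[0]!r}")
    return bytes(out)


# "Replace all the existing text with X" needs no knowledge of the source.
replace_all = [("new_data", b"X\n")]
print(apply_delta(b"old text\n", replace_all))
print(apply_delta(b"", replace_all))

# A partial change refers to the source only by offset and length.
keep_first_word = [("copy_source", 0, 4), ("new_data", b"news\n")]
print(apply_delta(b"old text\n", keep_first_word))
```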

== Use Something Like Text Delta ==

A more universal API should be able to express property changes with the same expressive power that a text delta has. Imagine a text delta operating on the serialized complete set of properties. This, or its functional equivalent, would allow not just a complete replacement of a whole property or set of properties, but efficiently adding revisions to svn:mergeinfo (never mind that mergeinfo should ideally not be stored in a user-space property) or renaming a property that has a large value.
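One possible "functional equivalent" is a small set of property-level operations, sketched below. This is purely an assumption for illustration, not a proposed or existing Subversion interface; the operation names are hypothetical, and treating a mergeinfo update as a plain text append is a simplification.

```python
def apply_prop_ops(props, ops):
    """Apply hypothetical property-delta operations to a property dict."""
    result = dict(props)
    for op in ops:
        kind = op[0]
        if kind == "clear":
            result.clear()
        elif kind == "set":
            _, name, value = op
            result[name] = value
        elif kind == "delete":
            _, name = op
            result.pop(name, None)
        elif kind == "rename":
            # The value is never transmitted, however large it is.
            _, old, new = op
            result[new] = result.pop(old)
        elif kind == "append":
            # Only the added text is transmitted.
            _, name, suffix = op
            result[name] = result.get(name, "") + suffix
        else:
            raise ValueError(f"unknown op: {kind!r}")
    return result


before = {"svn:mergeinfo": "/trunk:1-10", "big": "x" * 100000}
after = apply_prop_ops(before, [
    ("append", "svn:mergeinfo", ",12"),
    ("rename", "big", "large"),
])
print(after["svn:mergeinfo"], sorted(after))
```

In this model a complete replacement is ("clear",) followed by "set" operations, and none of the operations requires the driver to know the existing values.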

== What to Do? ==

I am not currently planning to make such a change, just pointing it out for when and if we do make changes to the editor API.

-- 
- Julian
Received on 2018-11-09 13:47:24 CET

This is an archived mail posted to the Subversion Dev mailing list.
