Re: SVNSERVE Tests Failing

From: Karl Fogel <kfogel_at_newton.ch.collab.net>
Date: 2003-02-11 17:11:09 CET

Regarding the `apply_text' change, I wrote:
> Hmmm. I won't be doing more coding on this before Monday, at which
> time I'll be flying out to CA to be in the CollabNet office for a
> week, where Greg Stein is. We've been talking a lot about this
> change, he's very familiar with it; will ponder and talk with him.
> (Not trying to avoid list discussion, it's just that I won't be on the
> list again until sometime late Monday or early Tuesday.)

Greg Stein and I did talk about this some more. He pointed out that
the editor will be carrying more information, so naturally the
connection between two editors must grow (in complexity or
"bandwidth") to carry that information too. In other words, the
change doesn't destroy the piping property, it just means that the
connectors have to get a bit more sophisticated, in order to send all
the information across the pipe.

That said, I've been thinking some more about this. The rest of this
mail retraces those thoughts; sorry for the length, but thoroughness
is important here.

First of all, things would probably be clearer without the `base'
stream in the interface. It's not strictly necessary, and perhaps
should go away. Consider:

OLD INTERFACE:
--------------
   driver of apply_textdelta()
   deltifies according to some
   BASE, then pushes deltas
   at the editor =========================> driver on the other side
                                            receives windows, pushes
                                            them directly at the
                                            editor over here, which
                                            uses the same implied
                                            BASE to recover fulltext.

NEW INTERFACE:
--------------
   driver of apply_text()
   hands BASE and TARGET
   streams to editor, which
   optionally deltifies TARGET
   and sends that across to ==============> driver on the other side,
                                            which can use the same
                                            implied BASE to recover
                                            the fulltext of TARGET,
                                            which it then hands to
                                            the editor over here.

(Strictly speaking, the driver on the right side of the new-style
interface could also hand BASE to that editor, but it's usually
useless to do so, which is why the apply_text() interface explicitly
says that BASE is always optional.)
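
To make the comparison concrete, here is a rough Python sketch of the
two shapes. It is only an illustration: the class names, the method
signatures, and the toy prefix/suffix "delta" are all assumptions made
for this example, not Subversion's actual C interface or its svndiff
format.

```python
import io


def make_delta(base, target):
    """Toy delta (an assumption, not svndiff): shared prefix length,
    shared suffix length, and the differing middle of TARGET."""
    limit = min(len(base), len(target))
    p = 0
    while p < limit and base[p] == target[p]:
        p += 1
    s = 0
    while s < limit - p and base[-1 - s] == target[-1 - s]:
        s += 1
    return (p, s, target[p:len(target) - s])


def apply_delta(base, delta):
    """Recover TARGET from BASE plus a toy delta."""
    p, s, middle = delta
    return base[:p] + middle + base[len(base) - s:]


class OldStyleEditor:
    """Old shape: the driver pushes a delta; the editor recovers the
    fulltext using a BASE it already knows (the implied ancestor)."""

    def __init__(self, implied_base):
        self.implied_base = implied_base
        self.fulltext = None

    def apply_textdelta(self, delta):
        self.fulltext = apply_delta(self.implied_base, delta)


class NewStyleEditor:
    """New shape: the driver hands over streams; deltifying is the
    editor's own decision, and BASE is optional."""

    def __init__(self):
        self.outgoing = None  # what this editor would send onward

    def apply_text(self, target_stream, base_stream=None):
        target = target_stream.read()
        if base_stream is None:
            self.outgoing = ("fulltext", target)
        else:
            self.outgoing = ("delta", make_delta(base_stream.read(), target))


BASE = b"hello world\n"
TARGET = b"hello brave world\n"

# Old interface: the driver deltifies, then pushes the delta at the editor.
old_editor = OldStyleEditor(implied_base=BASE)
old_editor.apply_textdelta(make_delta(BASE, TARGET))

# New interface: the driver only supplies streams.
new_editor = NewStyleEditor()
new_editor.apply_text(io.BytesIO(TARGET), io.BytesIO(BASE))
```

In both cases the same BASE is needed on each side of the connection;
the only difference is which component touches it.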

The point I'm trying to bring out with this comparison is that the
base stream just happens to be explicit in the new interface, but it's
not necessary that it be so. It could be implicit, as it was in the
old interface. The documentation for both apply_text() and
apply_textdelta() contains the following:

  /*
   * @a file_baton indicates the file we're creating or updating, and the
   * ancestor file on which it is based; it is the baton set by some
   * prior @c add_file or @c open_file callback.
   */

...which kind of implies that the base stream always produces not just
some random text, but specifically the text of the ancestor -- if it
were anything else, the driver on the right-hand side would probably
guess wrong when it tried to obtain the base stream locally, and would
therefore undeltify against the wrong data.

Since the source of the base stream is implied anyway, that makes me
think we shouldn't be passing in the base stream as a parameter.
Instead, apply_text() should just take the target_stream. If it
*chooses* to deltify that target stream according to an implied
ancestor, that's fine, because the driver on the other side will know
how to undeltify according to the same ancestor -- that driver is the
first editor's partner, after all. It's not driving the editor, it's
being driven by it, thus bits from the editor's state (such as the
implied ancestor) are available to it.

Then the new interface wouldn't be carrying any more information than
the old interface. It would be carrying the same information, just in
a different way. Deltification would still become strictly a decision
of the connecting pipe editor, not a requirement of the interface, and
that's a good thing IMHO. But there would be no need to name the base
stream when calling apply_text(). If the connection is to be
efficient, the base's identity is implied anyway, so we should just
use that base.
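
A minimal sketch of that "implied base" variant, again in Python and
again purely illustrative: apply_text() takes only the target stream,
the connecting pipe editor decides for itself whether to deltify, and
its partner driver undeltifies against the same ancestor. The names
PipeEditor, PartnerDriver, and ReceivingEditor, the use of a path in
place of a file baton, and the toy delta are all assumptions made for
the example.

```python
import io


def make_delta(base, target):
    """Toy delta (assumption): prefix length, suffix length, middle."""
    limit = min(len(base), len(target))
    p = 0
    while p < limit and base[p] == target[p]:
        p += 1
    s = 0
    while s < limit - p and base[-1 - s] == target[-1 - s]:
        s += 1
    return (p, s, target[p:len(target) - s])


def apply_delta(base, delta):
    p, s, middle = delta
    return base[:p] + middle + base[len(base) - s:]


class ReceivingEditor:
    """The editor at the far end; it only ever sees fulltext."""

    def __init__(self):
        self.files = {}

    def apply_text(self, path, target_stream):
        self.files[path] = target_stream.read()


class PartnerDriver:
    """Far end of the pipe: undeltifies against the implied ancestor,
    then drives the real editor."""

    def __init__(self, editor, ancestors):
        self.editor = editor
        self.ancestors = ancestors
        self.wire = []  # record of what crossed the connection

    def receive(self, path, kind, payload):
        self.wire.append((path, kind))
        if kind == "delta":
            text = apply_delta(self.ancestors[path], payload)
        else:
            text = payload
        self.editor.apply_text(path, io.BytesIO(text))


class PipeEditor:
    """Near end of the pipe: no base parameter; it deltifies only if it
    happens to know an ancestor for the file."""

    def __init__(self, partner, ancestors):
        self.partner = partner
        self.ancestors = ancestors

    def apply_text(self, path, target_stream):
        target = target_stream.read()
        base = self.ancestors.get(path)
        if base is None:
            self.partner.receive(path, "fulltext", target)
        else:
            self.partner.receive(path, "delta", make_delta(base, target))


# Both ends are assumed to be able to find the same ancestor text.
ancestors = {"trunk/foo.c": b"int main() { return 0; }\n"}
new_text = b"int main() { return 1; }\n"

receiver = ReceivingEditor()
partner = PartnerDriver(receiver, ancestors)
pipe = PipeEditor(partner, ancestors)

pipe.apply_text("trunk/foo.c", io.BytesIO(new_text))        # has an ancestor
pipe.apply_text("trunk/new.c", io.BytesIO(b"/* new */\n"))  # no ancestor
```

Note that the sketch simply assumes both ends can look up the same
ancestor; how the far end actually gets at it is exactly the open
question.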

Oh. But I think I see a concern... Maybe the same concern you were
getting at before?

The problem with "implied base" is that it shifts data use away from
data source. In the diagram below, data flow is from left to right.
For example, in a commit, the "source" data on the far left is the
text base plus modified working file, and the "dest" on the far right
is the filesystem, the base node plus the new node resulting from this
edit. The "X" marks indicate which component is responsible for
deltifying/undeltifying using base data:

             s      d   e ------------------> d   e      d
             o      r   d ------------------> r   d      e
             u      i   i ------------------> i   i      s
             r      v   t ------------------> v   t      t
             c      e   o ------------------> e   o
             e      r   r ------------------> r   r
  .................................................................
  old api:          X                             X

  new api:              X                     X

As the diagram shows, we've shifted the need for base text inward,
away from the sources of base texts.

So the trouble with the new api is that the components that use
the base text don't directly "touch" the base's source (that is, the
working copy or the repository).

Now, on the left side, we've gotten around that problem by having the
driver create a stream to supply the base text to the editor. But how
will we get around it on the right side? I suppose the driver could
reach into "dest" to get a base by which to undeltify data... which it
would then pass to the rightmost editor, who writes it into the dest!
Is that weird, though? Should the rightmost driver be independent of
dest, and get everything it needs solely from the information coming
from its left?

I don't know the answer yet; would love some feedback. Are the
problems identified here the same ones you were thinking of?

In any case, it's clear to me that this change cannot continue to be
designed in phone calls and whiteboard sessions between me, Greg
Stein, and the Chicago mafia. It needs to be discussed on the list,
until we're sure we've got all bases covered.

Issue #510 does not block any pre-1.0 issues; it's pre-1.0 only
because it's an API finalization issue. Since it doesn't block
anything, I'm going to move it out of the 0.18 milestone while we
discuss. (If we don't figure something out soon, I can move some of
the changes already made over to a branch.)

-Karl

> Greg Hudson <ghudson@MIT.EDU> writes:
> > Before apply_text, the editor interface worked like plumbing. You could
> > take an arbitrary driver and plug it into an arbitrary editor, more or
> > less. More interestingly, you could insert a connecting pipe between
> > driver and editor:
> >
> > driver --> [editor --> network --> driver] --> editor
> >
> > libsvn_ra_svn/editor.c is such a connecting pipe. So was the XML
> > editor, before we got rid of it.
> >
> > With apply_text, there is no way to build such a connecting pipe without
> > sending full texts over the wire. The receiving end of the connecting
> > pipe no longer has enough information to make editing calls.
> >
> > I can see why apply_text is appealing. It moves a little bit of code
> > from driver to editor, and it lets us avoid deltifying over ra_local for
> > free (by omitting code, rather than by adding conditionals). But it
> > destroys an important property of the editor interface and, as a result,
> > will significantly complicate the ra_svn code if it stands.
>
> Yah. It wasn't just that, it was also that it made it easy to not
> deltify for new imports (which we formerly had to do). It also
> changes the interface from driver-push (pushing to window handlers,
> that is) to editor-pull, although that could probably be accomplished
> without changing the input type.
>
> I was aware we were losing this property, but assumed (perhaps rashly)
> that it wasn't a real loss, because the driver could reconstruct the
> correct stream.
>
> Hmmm. I won't be doing more coding on this before Monday, at which
> time I'll be flying out to CA to be in the CollabNet office for a
> week, where Greg Stein is. We've been talking a lot about this
> change, he's very familiar with it; will ponder and talk with him.
> (Not trying to avoid list discussion, it's just that I won't be on the
> list again until sometime late Monday or early Tuesday.)
>
> > If absolutely no one else agrees with me, I'll withdraw my veto. I
> > think I have a strong argument here, though.
>
> You do. Need to think; I grok the objection, I just don't fully grok
> the solution space yet.
>
> I guess the summary of this email is: "Hmmm. Objection understood.
> Off to think now." :-)
>
> If we do end up trying to press forward with the change, I will make
> sure that ra_svn's editor is the first converted, so that the
> complications become apparent as early as possible.

Received on Tue Feb 11 17:42:53 2003