[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

ra_svn replay-range + svnsync + mod_dav_svn = corrupted revisions!

From: David Glasser <glasser_at_davidglasser.net>
Date: Mon, 10 Aug 2009 19:55:26 -0700

This is a pretty hilarious bug that relies on a combination of a bug
in the ra_svn replay-range, a bug in svnsync, and the unfortunate fact
that mod_dav_svn commits are semi-stateless.

Problem: if you run svnsync to sync from an svnserve server to a
mod_dav_svn server, and an error occurs during one of the actions of
the commit, the revision may end up being committed anyway.

This means that if for some reason you have an svnsync mirror with a
tiny bit of skew from the original (eg, a missing file) you'll start
committing revisions with more and more and more missing files! Wee!


BUG #1: The ra_svn client for replay_range handles errors poorly.

In ra_svn_replay_range, if there is an error driving the "commit
editor" (but see BUG #2 for an important twist here) (inside
svn_ra_svn_drive_editor2), the driver will correctly call abort_edit
on the commit editor. However, in this case, svn_ra_svn_drive_editor2
returns SVN_NO_ERROR and ra_svn_replay_range goes on to call
revfinish_func and go on to the next revision. And revfinish_func is
svnsync/main.c replay_rev_finished, which calls close_edit on the
editor. So if some call in the editor (say, an apply_textdelta)
fails, an ra_svn client will call first abort_edit and then

SUGGESTED SOLUTION: See that NULL in the call to
svn_ra_svn_drive_editor2? Pass in a pointer to a false local variable
called 'aborted', and if it gets to set to true by the call, bail out
immediately instead of continuing.

But wait! OK, so ra_svn is illegally calling close_edit (ie,
"commit") after the abort_edit. But shouldn't the abort_edit
actually, well, abort the edit?


BUG #2: svnsync's "sync editor" drops abort_edit on the floor.

svnsync wraps the RA commit editor with a "sync editor"
(get_sync_editor), which mostly just does things like tweak copyfrom
paths. This editor doesn't define an abort_edit function! So the
abort_edit that ra_svn sends gets completely ignored; as far as the
destination repository's RA commit editor is concerned, it got
whatever edits happened before the bad edit, followed by the buggy
edit which it returned an error to, followed by a close_edit. So
it'll try to commit! (And copy revprops!)

SUGGESTED SOLUTION: Define an abort_edit in get_sync_editor which
passes abort_edit through to the RA commit editor. This should make
the later commit fail (in the absence of fixing bug #1).


DESIGN FLAW: mod_dav_svn activities are overly resilient to errors.

When writing to svnserve, as soon as any editor command receives an
error, the server instantly aborts the transaction. DAV/SVN does not
work the same way. There's no reason that after an open_file fails,
the transaction shouldn't be automatically aborted... either by the
server (as soon as it serves an error request under the activity URL)
or by the ra_neon client (before it returns to the application code).
But it doesn't, and so the two bugs above allow for disastrous

SUGGESTED SOLUTION: All commit editors, but especially ra_neon and
ra_serf, should track whether or not any of their operations returned
an error and if so, turn close_edit into abort_edit. (In fact,
perhaps do this for *every* operation.)

I don't have time to fix these bugs, but Ben says he'll take it on.


glasser_at_davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/
Received on 2009-08-11 04:56:43 CEST

This is an archived mail posted to the Subversion Dev mailing list.