maxb@tigris.org writes:
> Log:
> Fix a nasty race condition in 'svnadmin load', which could result in incorrect
> copyfrom revisions being chosen if other commits were occurring at the same
> time as the load - a common occurrence in large active repositories, like the
> ASF main repository.
>
> * subversion/libsvn_repos/load.c (new_revision_record): Move the recording of
> the dump-file-revision -> in-repos-revision mapping from here, where the
> in-repos-revision is not yet known, and was being guessed as HEAD+1, to ...
> (close_revision): ... here, where the actual in-repos-revision is known.
>
>
> --- trunk/subversion/libsvn_repos/load.c (original)
> +++ trunk/subversion/libsvn_repos/load.c Wed Dec 15 20:28:24 2004
> @@ -801,22 +801,16 @@
> struct parse_baton *pb = parse_baton;
> struct revision_baton *rb;
> svn_revnum_t head_rev;
> - svn_revnum_t *old_rev, *new_rev;
>
> rb = make_revision_baton (headers, pb, pool);
> SVN_ERR (svn_fs_youngest_rev (&head_rev, pb->fs, pool));
>
> - /* Calculate the revision 'offset' for finding copyfrom sources.
> + /* FIXME: This is a lame fallback loading multiple segments of dump in
> + several seperate operations. It is highly susceptible to race conditions.
> + Calculate the revision 'offset' for finding copyfrom sources.
> It might be positive or negative. */
> rb->rev_offset = (rb->rev) - (head_rev + 1);
It's not clear to me what the possible race condition is here. Could
you describe it?
I wouldn't expect there to be any race conditions, if we do this
correctly, no matter how quickly new commits are coming in during our
'svnadmin load'. From the incoming dumpstream, the source rev of a
copy gets mapped to a possibly-different revision in the new
repository. Then later, if we see copies from that source rev in the
dumpstream, we just adjust them according to the rev_map. Since we
store a new mapping in the rev_map *after* a commit (from the
dumpstream) is made, how can there be a race condition?
> - /* Store the new revision for finding copyfrom sources. */
> - old_rev = apr_palloc (pb->pool, sizeof(svn_revnum_t) * 2);
> - new_rev = old_rev + 1;
> - *old_rev = rb->rev;
> - *new_rev = head_rev + 1;
> - apr_hash_set (pb->rev_map, old_rev, sizeof(svn_revnum_t), new_rev);
> -
> if (rb->rev > 0)
> {
> /* Create a new fs txn. */
> @@ -1110,13 +1104,18 @@
> struct revision_baton *rb = baton;
> struct parse_baton *pb = rb->pb;
> const char *conflict_msg = NULL;
> - svn_revnum_t new_rev;
> + svn_revnum_t *old_rev, *new_rev;
> svn_error_t *err;
>
> if (rb->rev <= 0)
> return SVN_NO_ERROR;
>
> - err = svn_fs_commit_txn (&conflict_msg, &new_rev, rb->txn, rb->pool);
> + /* Prepare memory for saving dump-rev -> in-repos-rev mapping. */
> + old_rev = apr_palloc (pb->pool, sizeof(svn_revnum_t) * 2);
> + new_rev = old_rev + 1;
> + *old_rev = rb->rev;
> +
> + err = svn_fs_commit_txn (&conflict_msg, &(*new_rev), rb->txn, rb->pool);
Wow, I realize you didn't come up with this code, but does anyone here
think the loss of clarity really worth saving one apr_palloc() call?
(Well, obviously the original committer does...)
> if (err)
> {
> @@ -1127,29 +1126,34 @@
> return err;
> }
>
> + /* After a successful commit, must record the dump-rev -> in-repos-rev
> + mapping, so that copyfrom instructions in the dump file can look up the
> + correct repository revision to copy from. */
> + apr_hash_set (pb->rev_map, old_rev, sizeof(svn_revnum_t), new_rev);
> +
This is what I was talking about above.
-K
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Dec 17 20:02:37 2004