[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: FSFS7: 'svnadmin hotcopy' requires write access to the source

From: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
Date: Fri, 19 Jun 2015 21:37:22 +0300

Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:

> Well, it "works" based on SVN's concept of OS-side readonlyness.
> svnadmin_tests.py 51 passes on Windows.

Did you try the test after doing 'svn merge -c-r1686232'? I tried this on both
Windows and Unix, and the test passed, so it's testing something else than the
discussed issue with db/pack-lock permissions:

  ll subversion/tests/cmdline/svn-test-work/repositories/svnadmin_tests-51/db

  drwxrwsrwx 6 user user 4096 Jun 19 19:57 ./
  drwxrwxr-x 6 user user 4096 Jun 19 19:57 ../
  -rw-rw-rw- 1 user user 2 Jun 19 19:57 current
  -rw-rw-rw- 1 user user 41 Jun 19 19:57 format
  -rw-rw-rw- 1 user user 1500 Jun 19 19:57 fsfs.conf
  -rw-rw-rw- 1 user user 5 Jun 19 19:57 fs-type
  ...

> IIRC, post-commit is always asynchronous and would not prevent further
> commits unless you write specific hook code to e.g. block in the
> pre-commit.
>
> However, you do have a point here. The 'svnadmin pack' instances run
> from the post-commit hook will pile up (one per new revision) while
> hotcopy runs. Under normal circumstances that should not be a problem
> (tens of relatively small, blocked processes).

[...]

They are still synchronous from the client's point of view. Are we going to
tell people that their clients hanging with "Committing transaction..." are
fine, because other people can still commit during this time? :) Furthermore,
I think that tens of hanging 'svnadmin' processes are a problem. How many
commits can happen during a backup of a large repository under a highload
situation? 500? 750? More? It certainly depends on the environment, but
I wouldn't be pleased to see half a thousand svnadmin processes just hanging
around.

[...]

> Back to the original issue, I think this is what we should do:
>
> * Revert r1686232 (but keep the test case added in r1686239) because it
> is mainly ineffective.

Perhaps a better reason for reverting r1686232 would be that it doesn't really
solve the problem, at least on Windows? The test itself isn't a regression
test, as it passes without the fix on at least two platforms.

> * 2-line change to the hotcopy code to disable the pack lock
> * Release that for 1.9.0

I believe that we disagree on this part, as I think that it's not really that
important how big the change is, but rather what it does. Reverting a change
is atomic, reverting a part of the change is not. Furthermore, the hotcopy
bootstrapping code is known to be stable in 1.8.x, and disabling just the
locking code means that every other related change was done for no reason.
We don't really know if there are any other issues with what we did; it would
be fine to keep it for a purpose, but given that we are about to stop locking
over db/pack-lock, I don't see a reason to keep the supplementary changes.

> * Make 'svnadmin upgrade' create missing lock files (not a strict
> pre-condition but something we should always have done)
> * Add a "with_optional_pack_lock" function that does not require write
> access to the repository. Where available, it will acquire a shared lock
> on the pack-lock file. Inter-thread synchronization needs to be switched
> to reader-writer locks.
> * Use the new function to prevent packing while hotcopying non-packed shards.
> * Release this with 1.9.1
>
> Are you o.k. with that plan?

Frankly, this sounds like too much for a patch release — especially, since the
hotcopy-pack-ENOENT issue is not a regression. I propose an alternative plan:

1. Revert the changeset that added locking over db/pack-lock in the hotcopy
   source (r1589284) and all supplementary changes that are no longer
   necessary.

2. Nominate and backport them to 1.9.x (this step resolves the regression).

3. Fix the problem with a race between hotcopy and pack causing ENOENTs,
   while keeping 'svnadmin hotcopy' a read-only and lockless operation for
   the source repository, as it always was. I believe that we can add a retry
   logic into hotcopy_revisions(). Doing so means that in case of a race,
   hotcopy сould have to be restarted from the very first shard — in order
   not to have {unpacked-unpacked-packed-packed} shard sequences in the
   destination, but I believe that we can work it out.

4. The latter fix is going to eventually become a part of Subversion 1.10.

This plan, to my mind, has the following benefits:

- We minimize the potential destabilization of /branches/1.9.x by restoring
  necessary parts of the code to their 1.8.x state. Furthermore, we do not
  keep the code that was once required to support taking the db/pack-lock file.

- We only backport a fix for a regression that happened between 1.8.x and 1.9.x
  into /branches/1.9.x. We don't try to push potentially destabilizing changes
  (shared lock on the db/pack-lock file?) into a stable branch.

- The fix for the ENOENT problem works for sharded repositories irrespectively
  of their format and doesn't turn an initially read-only operation into an
  operation that sometimes requires locking over the source. It doesn't
  affect multiple parallel hotcopies and doesn't somehow block repository
  packing.

How does this sound?

Regards,
Evgeny Kotkov
Received on 2015-06-19 20:37:55 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.