You mean svnadmin pack doesn't just work within the FSFS write lock? That
would seem logical to me... fits with svnadmin upgrade, eg.
--dave
On Jan 5, 2009 12:22 AM, "Daniel Shahaf" <d.s_at_daniel.shahaf.name> wrote:
Problem:
Race condition in 'svnadmin pack' may cause corruption (data loss).
Details:
See pack_shard() in fs_fs.c. Note the two svn_io_remove_dir2() calls.
Suppose two 'svnadmin pack' run in parallel, and that one of them
reaches the first call (which removes the pack dir) just after the first
of them reaches the second call (which removes the non-packed shard dir).
Result:
The whole shard (that the two 'svnadmin pack' worked on) is physically
deleted from the (OS) filesystem. Effectively, it does:
% rm -rf db/revs/5/
% rm -rf db/revs/5.pack/
and then (one of the svnadmin's) throws an error (because it tries to
pack the shard that had just been removed).
Proof:
I can reproduce this (on a testing shard_size=4 repository) by placing
breakpoints in the function around the remove_dir2() calls.
(s/place breakpoints/call svn_cmdline_prompt_user2/)
Fix:
Hyrum, who identified the problem on IRC, also suggested the solution of
adding a lock file to 'svnadmin pack'. That looks doable to me, although
it should require passing an FS object (svn_fs_t) to svn_fs_fs__pack (so
there is a mutex to pass to with_some_lock()).
I intend to look into a patch along these lines, but posting here for
comments on the approach.
Daniel
(For example, another way might be to abort if '5.pack' already exists
(rather than remove it and proceed). But would that work?)
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1004476
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1004537
Received on 2009-01-05 09:57:42 CET