[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

packing race condition

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Mon, 5 Jan 2009 09:38:06 +0200 (Jerusalem Standard Time)

Problem:
Race condition in 'svnadmin pack' may cause corruption (data loss).

Details:
See pack_shard() in fs_fs.c. Note the two svn_io_remove_dir2() calls.
Suppose two 'svnadmin pack' run in parallel, and that one of them
reaches the first call (which removes the pack dir) just after the first
of them reaches the second call (which removes the non-packed shard dir).

Result:
The whole shard (that the two 'svnadmin pack' worked on) is physically
deleted from the (OS) filesystem. Effectively, it does:

    % rm -rf db/revs/5/
    % rm -rf db/revs/5.pack/

and then (one of the svnadmin's) throws an error (because it tries to
pack the shard that had just been removed).

Proof:
I can reproduce this (on a testing shard_size=4 repository) by placing
breakpoints in the function around the remove_dir2() calls.

(s/place breakpoints/call svn_cmdline_prompt_user2/)

Fix:
Hyrum, who identified the problem on IRC, also suggested the solution of
adding a lock file to 'svnadmin pack'. That looks doable to me, although
it should require passing an FS object (svn_fs_t) to svn_fs_fs__pack (so
there is a mutex to pass to with_some_lock()).

I intend to look into a patch along these lines, but posting here for
comments on the approach.

Daniel

(For example, another way might be to abort if '5.pack' already exists
(rather than remove it and proceed). But would that work?)

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1004476
Received on 2009-01-05 09:21:54 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.