Re: svnsync crashes on a huge commit

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Wed, 17 Jul 2013 16:01:25 +0200

On Wed, Jul 17, 2013 at 1:49 PM, Lieven Govaerts <lgo_at_apache.org> wrote:

> (bringing this to dev)
>
> Devs,
>
> Anatoly sent me some more info in this issue and log files of svnsync
> 1.8.0+serf 1.2.1 with logging enabled.
>
> He's running svnsync from a https repository (ra_serf) to a local
> repository (ra_local). This on an Ubuntu 12.08 VM with 64-bit binary.
> In the log files I received, svnsync doesn't abort but stops with
> this error (after hours of syncing):
>
> [2013-07-12T01:37:35.650011-07] [l:192.168.222.132:53349
> r:10.14.3.25:443] outgoing.c: cleanup - closed socket, status 9
> svnsync: E000005: Can't open file
> '/media/windowsshare/db/transactions/391384-8e0c.txn/next-ids':
> Input/output error
> ... (cleanup and end program)
>
> As far as I can see in the logs in this particular run there's nothing
> wrong on the receiving end in ra_serf, but there's a problem during
> the commit action.
>
> I suppose (to be confirmed) that his target repository is stored on an
> nfs or smb share.
>
> Svn has a retry feature in fsfs to retry opening files after receiving
> an EIO error, see libsvn_fs_fs/fs_fs.c RECOVERABLE_RETRY_COUNT, but I
> don't see this being used in read_next_ids (fs_fs.c) when opening the
> next-ids file.
>
> Can anyone more knowledgeable about fsfs confirm that this is a
> possible explanation for this issue?
>

We only use the retry method when we access "contested"
files, i.e. those that might be written to by other threads or
processes at the same time we try to open them.

read_next_ids accesses a txn-local file which should never
be accessed by multiple processes / threads at the same
time. It also gets closed explicitly, so there should be
no pool / file handle lifetime issue.

All that said, it may still be necessary, or at least useful,
to retry opening any FSFS file (in case of network glitches
etc.). Because opening these files should never fail as part
of normal operation, retrying when it does should do no
extra harm.
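
To illustrate the idea, here is a minimal sketch of retrying an
open on EIO. It is plain POSIX C, not the APR-based code in
fs_fs.c, and it is not how Subversion implements its retries.
The names open_with_retry, open_fn_t, posix_open, flaky_open and
OPEN_RETRY_COUNT are made up for this example, and the limit of
10 is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical retry limit for this sketch (arbitrary value). */
#define OPEN_RETRY_COUNT 10

/* Signature of the function that actually opens the file, so that a
   failing opener can be substituted when testing. */
typedef int (*open_fn_t)(const char *path, int flags);

/* Default opener: a thin wrapper around POSIX open(). */
static int posix_open(const char *path, int flags)
{
  return open(path, flags);
}

/* Try to open PATH up to MAX_ATTEMPTS times, retrying only when the
   failure is EIO.  Any other error is returned at once.  Returns the
   file descriptor, or -1 with errno set by the last attempt.  If
   ATTEMPTS_OUT is not NULL it receives the number of attempts made. */
static int open_with_retry(open_fn_t opener, const char *path, int flags,
                           int max_attempts, int *attempts_out)
{
  int fd = -1;
  int attempts = 0;

  assert(opener != NULL && max_attempts > 0);

  while (attempts < max_attempts)
    {
      attempts++;
      fd = opener(path, flags);
      if (fd >= 0 || errno != EIO)
        break;
      /* A real implementation might sleep briefly before retrying. */
    }

  if (attempts_out)
    *attempts_out = attempts;
  return fd;
}

/* Test double (hypothetical): fails with EIO on the first two calls,
   then delegates to posix_open().  It stands in for a transient
   glitch on a network share. */
static int flaky_calls = 0;
static int flaky_open(const char *path, int flags)
{
  if (++flaky_calls <= 2)
    {
      errno = EIO;
      return -1;
    }
  return posix_open(path, flags);
}
```

With flaky_open as the opener, the third attempt succeeds. A
non-EIO error such as ENOENT is reported after a single attempt,
so the normal failure path costs nothing extra.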

Bottom line: Not retrying is not a bug here but retrying might
have prevented the failure.

-- Stefan^2.
Received on 2013-07-17 16:02:05 CEST
