RE: Unable to hotcopy to a NAS shared directory: E720002

From: Bert Huijben <bert_at_qqmail.nl>
Date: Wed, 21 Jan 2015 20:43:22 +0100

> -----Original Message-----
> From: Philip Martin [mailto:philip.martin_at_wandisco.com]
> Sent: woensdag 21 januari 2015 20:17
> To: Evgeny Kotkov
> Cc: Cory Riddell; Subversion Development
> Subject: Re: Unable to hotcopy to a NAS shared directory: E720002
>
> Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com> writes:
>
> > The reason why this error is propagated up the stack is that we only examine
> > the 'ignore_enoent' argument after the first apr_file_remove() call. This is
> > racy: if we get a EACCES during the first attempt to remove a file, and the
> > file is simultaneously removed from the disk, the next attempt to remove it
> > would fail with a ENOENT, even with 'ignore_enoent'. I think we should
> > fix this by suppressing ENOENTs from every apr_file_remove() call, not
> > just the first one.
>
> Sounds plausible.
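
For illustration, the proposed behaviour could be sketched in plain C as below. This is only a sketch under assumptions, not the actual svn_io_remove_file2() code: remove_with_retry(), remove_fn_t, is_missing() and fake_remove_racy() are hypothetical names, the remove operation is passed in as a callback returning 0 or an errno value, and the backoff sleep and read-only handling are left out. The point is that the ENOENT/ENOTDIR check applies to the result of the last attempt, not just the first.

```c
#include <errno.h>

/* Hypothetical callback type: returns 0 on success, else an errno value. */
typedef int (*remove_fn_t)(const char *path);

/* Treat ENOENT and ENOTDIR alike as "the file is already gone". */
static int is_missing(int err)
{
  return err == ENOENT || err == ENOTDIR;
}

/* Hypothetical helper: retry on EACCES, then apply 'ignore_enoent' to
   whichever attempt came last, so a file that vanishes between two
   attempts does not turn into an error. */
static int remove_with_retry(remove_fn_t remove_fn, const char *path,
                             int ignore_enoent, int max_retries)
{
  int err = remove_fn(path);
  int retries;

  /* A real implementation would sleep and clear the read-only
     attribute here; both are omitted in this sketch. */
  for (retries = 0; retries < max_retries && err == EACCES; ++retries)
    err = remove_fn(path);

  if (err != 0 && ignore_enoent && is_missing(err))
    return 0;

  return err;
}

/* Test double simulating the race: the first call reports EACCES,
   every later call reports ENOENT (someone else removed the file). */
static int fake_calls;

static int fake_remove_racy(const char *path)
{
  (void)path;
  return fake_calls++ == 0 ? EACCES : ENOENT;
}
```

With the fake above, a call with ignore_enoent set returns 0 after the second attempt, while a call without it still reports ENOENT. A real caller would pass a thin wrapper around apr_file_remove() instead of the fake.
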
>
> Windows code is tricky. When svn_io_remove_file2() gets EACCES it calls

For something to return access denied on Windows it must exist.

> svn_io_set_file_read_write() passing ignore_enoent. That function has
> different handling of ignore_enoent as it only checks ENOENT while
> svn_io_remove_file2() checks both ENOENT and ENOTDIR.

> svn_io_set_file_read_write() also doesn't have a WIN32_RETRY_LOOP. Are
> those differences intentional?

File attributes are typically not involved with locking of the files.

I prefer *not* to loop when in doubt, as bad loops can cause much bigger problems than a forgotten loop.

A loop that just waits for something that isn't going to fix itself is just a 12 second delay... Wrap yet another delay loop around that in its caller and you are waiting for minutes. Another loop around that and it will be days.
(Note that there are some retries in apr!)
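
To put rough numbers on that compounding, here is a small C sketch. The backoff parameters (100 retries, a 1 ms first sleep, doubling up to a 128 ms cap) are assumptions, chosen because they add up to roughly the 12 seconds mentioned above; the function names are hypothetical and this is not the actual retry macro.

```c
/* Assumed backoff parameters; they add up to about 12 seconds. */
enum {
  MAX_RETRIES    = 100,
  FIRST_SLEEP_US = 1000,    /* 1 ms */
  SLEEP_CAP_US   = 128000   /* 128 ms */
};

/* Worst-case time (microseconds) spent in one retry loop when the
   operation never succeeds and every attempt costs 'attempt_us'. */
static long long worst_case_us(long long attempt_us)
{
  long long total = attempt_us;      /* the initial attempt */
  long long sleep_us = FIRST_SLEEP_US;
  int i;

  for (i = 0; i < MAX_RETRIES; ++i)
    {
      total += sleep_us;             /* back off ... */
      if (sleep_us < SLEEP_CAP_US)
        sleep_us *= 2;
      total += attempt_us;           /* ... then try again */
    }
  return total;
}

/* Worst case when 'depth' such loops are wrapped around each other:
   each attempt of an outer loop pays the full cost of the inner one. */
static long long nested_worst_case_us(int depth)
{
  long long cost = 0;                /* innermost operation fails instantly */
  int level;

  for (level = 0; level < depth; ++level)
    cost = worst_case_us(cost);
  return cost;
}
```

Under these assumptions nested_worst_case_us(1) is about 12 seconds, nested_worst_case_us(2) about 20 minutes, and nested_worst_case_us(3) about 34 hours.
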

We had quite a few bugs in previous versions where retries waiting on the wrong error conditions caused major lockups.

The problem here is that we have a NAS that shows itself as a Windows device, but behaves differently.

A typical Windows test run *never* triggers a retry loop for IO errors... nor should it.

The I/O retry loops are workarounds for externally caused problems: virus scanners, etc.

In this case: are we really saying that hotcopy should work to a network drive?

Even if it doesn't support the proper locking primitives?
(We certainly recommend against running servers on such a setup.)

Perhaps the proper recommendation is: hotcopy to a local drive first, and then copy to network storage.

Bert
>
> --
> Philip Martin | Subversion Committer
> WANdisco // *Non-Stop Data*
Received on 2015-01-21 20:44:43 CET
