On 16.06.2011 12:50, Philip Martin wrote:
> Philip Martin<philip.martin_at_wandisco.com> writes:
>> That's using the sync option on the NFS server. Using async
>> Checkout using 1.6:
>> Elapsed: 73s CPU: 16s
>> Checkout using 1.7
>> Elapsed: 180s CPU: 26s
> By comparison the same checkout to a local disk takes about 5s elapsed
> for both 1.6 and 1.7.
> I tried an experiment with the update editor used by checkout. At
> present it inserts a not-present NODES row for each file in add_file()
> and then replaces it with a normal NODES row in close_file(). I removed
> the code that inserts the not-present row; the checkout still works
> provided it runs to completion. This change removes one transaction
> per-file from the checkout, and reduces the elapsed time by 27s or 15%.
This matches exactly what I discovered 3 weeks ago
but haven't yet found the time to investigate in detail.
So, take the following with a grain of salt.
My hypothesis is that we need only a single db transaction
(plus maybe one for managing the pristine store). Without
changing the editor logic, a file c/o into an empty directory
should look like this:
(1) Receive content and stream to pristine temp
(2) Move to pristine store
(3) Copy & translate to w/c temp
(4) Set flags and time stamp
(5) Add row to NODES
(6) Move from w/c temp to w/c final location
(preserving flags and time stamp)
The above should be a valid workflow that can be interrupted
at any point without corrupting the w/c:
(before 2 is finished) w/c is locked, temps to be cleared upon cleanup
(after 2 is finished) orphaned pristine entry; I don't know whether we
are strict about that today. It will most likely be used after the next
w/c update
(before 5 is finished) w/c is locked, temps to be cleared upon cleanup
(after 5 is finished) w/c looks like the file had been added but got
deleted manually; update should simply "restore" it
(6) is supposed to be atomic and non-modifying
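
To make the ordering of steps (1) to (6) concrete, here is a minimal
Python sketch. It is not Subversion code: the ".meta" directory layout,
the simplified "nodes" table and the names create_schema() and
checkout_file() are all assumptions made for illustration, and the
translation in step (3) is reduced to a plain copy. The point is only
that the database is touched exactly once per file, in step (5).

```python
import hashlib
import os
import shutil
import sqlite3
import tempfile


def create_schema(db):
    # Hypothetical, heavily simplified stand-in for the NODES table.
    db.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "local_relpath TEXT PRIMARY KEY, checksum TEXT, presence TEXT)"
    )


def checkout_file(wc_root, db, relpath, chunks, mtime=None):
    """Illustrative walk through steps (1)-(6); names are hypothetical."""
    tmp_dir = os.path.join(wc_root, ".meta", "tmp")
    pristine_dir = os.path.join(wc_root, ".meta", "pristine")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(pristine_dir, exist_ok=True)

    # (1) Receive content and stream it to a pristine temp file.
    sha1 = hashlib.sha1()
    fd, pristine_tmp = tempfile.mkstemp(dir=tmp_dir)
    with os.fdopen(fd, "wb") as out:
        for chunk in chunks:
            sha1.update(chunk)
            out.write(chunk)
    checksum = sha1.hexdigest()

    # (2) Move into the pristine store (a rename on the same filesystem).
    pristine_path = os.path.join(pristine_dir, checksum)
    os.replace(pristine_tmp, pristine_path)

    # (3) Copy to a w/c temp file; real code would translate EOLs and
    #     keywords here instead of copying verbatim.
    fd, wc_tmp = tempfile.mkstemp(dir=tmp_dir)
    os.close(fd)
    shutil.copyfile(pristine_path, wc_tmp)

    # (4) Set flags and time stamp on the temp file.
    os.chmod(wc_tmp, 0o644)
    if mtime is not None:
        os.utime(wc_tmp, (mtime, mtime))

    # (5) The only db transaction for this file: add the row.
    with db:  # commits on success, rolls back on an exception
        db.execute(
            "INSERT INTO nodes (local_relpath, checksum, presence) "
            "VALUES (?, ?, 'normal')",
            (relpath, checksum),
        )

    # (6) Move to the final location; a rename keeps mode and mtime.
    final_path = os.path.join(wc_root, relpath)
    os.makedirs(os.path.dirname(final_path), exist_ok=True)
    os.replace(wc_tmp, final_path)
    return final_path
```

If the process dies before step (5) commits, only files under the temp
and pristine directories are left behind; if it dies between (5) and
(6), the row exists but the working file does not, which is the
"deleted manually" case.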
In the editor, one of the transactions is commented with
"mark the parent as incomplete". That mark seems to be
unnecessary: In case of an interruption, the w/c will need
to be cleaned up as described above. After that, it looks
like either the file had not been sent at all or the user
removed it manually. There is no intermediate state that
needs to be tracked here.
> So to get better performance on network disks we have to remove or
> combine transactions.
Eliminating transactions should speed up c/o for any
type of "disk".
Received on 2011-06-18 11:19:28 CEST