Re: Moves in FSFS

From: Branko Čibej <brane_at_wandisco.com>
Date: Fri, 13 Sep 2013 12:41:43 +0200

On 13.09.2013 11:32, Philip Martin wrote:
> Branko Čibej <brane_at_wandisco.com> writes:
>
>> That said, I still do not understand why a different ID would be needed
>> before the copy-on-write happens. Is it because the client doesn't have
>> the full history available? If that's the case, I suggest we explore
>> this on a case-by-case basis, including determining how the initial
>> state of a working copy for each case can actually occur.
> Is the FS only going to be able to provide the moves that apply between
> two consecutive revisions? Or is it able to provide the combined moves
> between arbitrary revisions?
>
> One way to provide the moves between arbitrary revisions is to iterate
> through the intervening revisions accumulating and combining the moves.
> That has the potential to be slow.
>
> Another way to provide the moves between arbitrary revisions is to have
> an id to path map per revision which allows the FS to find the path
> associated with a given id. However with lazy-copy this map is harder
> to implement.
>
> There is another aspect to the lazy-copy which is when does the new
> copy-id get assigned to the lazy children. If we commit
>
> move A/f A/g
>
> then move does not allocate a new copy-id and A/f has the same copy-id
> as A/g. I think we intend this to be true if the commit combines a move
> and a modification to the node. Now commit:
>
> copy A B
>
> here B gets a new copy-id and lazily copied children of B still have the
> old copy-id. Now what about this commit:
>
> move B/g B/h
>
> Does move preserve the copy-id so that B/h is still a reference to A/g?

A move through the copied parent has to be interpreted as a write to the
subtree, which means that the copy-on-write semantics kick in. The move
then breaks down into:

    make-mutable B/g    <-- lazy copy, assigns a new copy-id
    move B/g B/h        <-- move semantics, B/h keeps same copy-id

You'll note that "make-mutable" is an implementation detail of the
top-down DAG FS model, and it already does what I described above. This
is not some new code we'd have to write to implement moves this way; we
just have to obey existing rules, i.e., before operating on a path
within an FS transaction, the path must first be made mutable. In other
words, the FS implementation already works the way I described,
regardless of whether the actual operation is "move" or something else.
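
To make the two steps concrete, here is a toy sketch in Python. It is
not FSFS code: Node, copy_dir, make_mutable and move are hypothetical
names, the tree is only two levels deep, and copy-id allocation is
simplified to "a lazily shared child gets a fresh copy-id the first time
it is made mutable".

```python
import itertools

_ids = itertools.count(1)


def new_copy_id():
    """Hand out copy-ids c1, c2, ... (toy allocator, an assumption)."""
    return f"c{next(_ids)}"


class Node:
    def __init__(self, copy_id):
        self.copy_id = copy_id
        self.children = {}   # name -> Node
        self.lazy = set()    # names of children still shared with a copy source


def copy_dir(root, src, dst):
    """Lazy copy: only the new directory gets a copy-id; children are shared."""
    target = Node(new_copy_id())
    target.children = dict(root.children[src].children)
    target.lazy = set(target.children)
    root.children[dst] = target


def make_mutable(root, path):
    """Copy-on-write: clone a lazily shared child and give it a new copy-id."""
    parent_name, name = path.split("/")
    parent = root.children[parent_name]
    node = parent.children[name]
    if name in parent.lazy:
        clone = Node(new_copy_id())
        clone.children = dict(node.children)
        clone.lazy = set(clone.children)
        parent.children[name] = clone
        parent.lazy.discard(name)
        node = clone
    return node


def move(root, src, dst):
    """A move is a write: make the source mutable, then rename it.
    The rename itself never changes the copy-id."""
    node = make_mutable(root, src)
    src_parent, src_name = src.split("/")
    dst_parent, dst_name = dst.split("/")
    del root.children[src_parent].children[src_name]
    root.children[dst_parent].children[dst_name] = node


# Replay the three commits from the quoted example.
root = Node("c0")
root.children["A"] = Node("c0")
root.children["A"].children["f"] = Node("c0")

move(root, "A/f", "A/g")    # plain move: A/g keeps copy-id c0
copy_dir(root, "A", "B")    # B gets c1; B/g is still the shared c0 node
move(root, "B/g", "B/h")    # make-mutable gives B/g c2, the move keeps it

assert root.children["A"].children["g"].copy_id == "c0"
assert root.children["B"].children["h"].copy_id == "c2"
```

In this sketch move() never allocates a copy-id itself; the new one
comes entirely from make_mutable(), which any other write to B/g would
have triggered as well.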

> If the commit was a combined move/modification then B/h would have to
> get a new copy-id.

It has to get one in any case, as explained above. Operations that
affect the state of a node, when performed on a path that contains a
copied parent, must produce the same result regardless of whether the
filesystem implements lazy copying or not. For the purpose of this
model, the node's path is part of its state; although of course the path
is not in fact a unique property of the node.

But it's not necessary to assign new IDs when the node's state is
/read/; i.e., we do not need copy-on-read in order for this model to
work. The only consideration is that all operations must work correctly
with or without lazy copying. It's OK IMO to let the lazy-copy detail
leak outside the FS implementation, as long as we do not make it a
required feature.
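
As a rough illustration of that requirement, the following Python
sketch runs the same operations against a toy tree with and without
lazy copying and checks that the visible result is identical. Tree,
File and run are hypothetical names invented for this example, and the
model ignores copy-ids and revisions entirely; it only shows that reads
never clone and that writes behave the same either way.

```python
class File:
    def __init__(self, text):
        self.text = text


class Tree:
    def __init__(self, lazy):
        self.lazy = lazy
        self.dirs = {}        # dir name -> {file name -> File}
        self.shared = set()   # (dir, name) entries that share a File object

    def add(self, path, text):
        d, n = path.split("/")
        self.dirs.setdefault(d, {})[n] = File(text)

    def copy(self, src, dst):
        if self.lazy:
            # Share the File objects; remember both sides as shared.
            self.dirs[dst] = dict(self.dirs[src])
            for n in self.dirs[dst]:
                self.shared |= {(src, n), (dst, n)}
        else:
            # Eager copy: clone every child right away.
            self.dirs[dst] = {n: File(f.text)
                              for n, f in self.dirs[src].items()}

    def read(self, path):
        d, n = path.split("/")
        return self.dirs[d][n].text   # reading never clones anything

    def _make_mutable(self, path):
        d, n = path.split("/")
        if (d, n) in self.shared:
            self.dirs[d][n] = File(self.dirs[d][n].text)
            self.shared.discard((d, n))
        return self.dirs[d][n]

    def move(self, src, dst):
        node = self._make_mutable(src)   # a move counts as a write
        sd, sn = src.split("/")
        dd, dn = dst.split("/")
        del self.dirs[sd][sn]
        self.dirs[dd][dn] = node

    def write(self, path, text):
        self._make_mutable(path).text = text

    def snapshot(self):
        return {f"{d}/{n}": f.text
                for d, files in self.dirs.items()
                for n, f in files.items()}


def run(lazy):
    t = Tree(lazy)
    t.add("A/g", "hello")
    t.copy("A", "B")
    t.read("B/g")              # no copy-on-read in either mode
    t.move("B/g", "B/h")
    t.write("B/h", "changed")
    return t.snapshot()


# The observable tree must not depend on the copying strategy.
assert run(lazy=True) == run(lazy=False)
```

The "shared" set is the only place where laziness is visible, which is
the sense in which it stays an optional implementation detail here.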

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com

Received on 2013-09-13 18:01:07 CEST
