On 08.03.2012 10:26, Philip Martin wrote:
> Stefan Fuhrmann<stefanfuhrmann_at_alice-dsl.de> writes:
>
>> * undo the above-mentioned revs on /trunk
> Good. I was starting to have concerns about other races in the current
> code. For example:
>
> - first process reads generation file N
> - second process writes new revprops
> - first process reads revprops
> - second process writes generation file N+1
>
> This means the cache doesn't have revprops for N but for some n>=N. I
> suppose the system might work with this sort of cache but I'm not sure
> it has been considered.
In fact, this has been considered. The definition of
"revprop generation" is somewhat blurred: it is actually
a lower bound on the updates that an fs_t sees, rather
than a full snapshot of all data.
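To illustrate the "lower bound" reading, here is a minimal Python sketch that replays the interleaving quoted above. `Store`, `race_demo` and `cached_lookup` are hypothetical names made up for this mail, not FSFS code, and the generation file and revprop file are modelled as plain attributes.

```python
class Store:
    """Hypothetical stand-in for the generation file plus one revprop file."""
    def __init__(self):
        self.generation = 0
        self.revprops = {"svn:log": "old message"}


def race_demo():
    """Replay the quoted interleaving step by step (single-threaded)."""
    store = Store()
    gen = store.generation                       # first process reads generation N
    store.revprops["svn:log"] = "new message"    # second process writes new revprops
    cache = {"gen": gen,                         # first process reads revprops and
             "props": dict(store.revprops)}      # caches them under generation N
    store.generation += 1                        # second process writes generation N+1
    return store, cache


def cached_lookup(cache, store):
    """Return cached revprops, re-reading whenever the generation moved on."""
    gen = store.generation
    if cache.get("gen") != gen:
        cache["gen"] = gen                       # tag with the generation seen first
        cache["props"] = dict(store.revprops)    # contents may be newer than the tag
    return cache["props"]


store, cache = race_demo()
print(cache["gen"], cache["props"]["svn:log"])   # entry tagged N already holds new data
print(cached_lookup(cache, store)["svn:log"])    # bump to N+1 forces a harmless re-read
```

The entry tagged N is never older than N; it may be newer, and the later bump to N+1 only costs one extra read.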
> I suppose the reader could loop, rereading the
> generation file after reading the revprops file until it sees the same
> generation before and after revprops.
>
> Another example:
>
> - first process reads/caches revprops for generation N
> - second process writes new revprops and gets killed before updating
> the generation file
> - third process reads/caches revprops and gets new revprops plus
> generation N
>
> This means two readers that have different revprops for generation N.
> Both of them will believe they have up-to-date values if they reread the
> generation file.
>
Yes, crashing servers are bad. I will implement a design
that will take care of this on the new branch. It will detect
and fix incomplete revprop + generation updates after
a pre-defined period of time, e.g. 10s:
* new definition: generations must be even-numbered
  (an odd value marks a write in progress)
* writer stores timeout value (e.g. now() + 10s)
* writer increments generation -> odd number
  (if the result is an even number, there are concurrent
  writes or one process crashed; increment until we
  reach an odd generation)
* writer replaces revprop file
* writer increments the generation until it is even again
* reader gets current generation from shm
  * if even -> proceed, a write may or may not be in progress
  * if odd -> a writer *might* have been stalled / aborted
    * if timeout > now() -> proceed with (gen-1) for lookups,
      the writer may still run
    * if timeout expired -> increase the generation until it is even
      (causes everybody to re-read revprops; if the writer is still
      alive, it will increase the value further)
So, in case of an aborted writer process, the other
processes behave like proxies that see outdated data for
a short period of time only.
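
The writer and reader steps above could be sketched roughly as follows. This is only a single-process Python model under stated assumptions: `SharedState` is a hypothetical stand-in for the shm segment, `begin_write`, `end_write` and `reader_generation` are made-up names, and a real implementation would need atomic increments on the shared generation, which this sketch does not attempt.

```python
import time

TIMEOUT_SECS = 10.0  # the "e.g. 10s" from above; the actual value is an open question


class SharedState:
    """Hypothetical stand-in for the shared-memory segment."""
    def __init__(self):
        self.generation = 0   # even = stable, odd = write in progress
        self.timeout = 0.0    # deadline stored by the most recent writer


def begin_write(shm, now=None):
    """Writer: store the timeout, then move the generation to an odd value."""
    now = time.time() if now is None else now
    shm.timeout = now + TIMEOUT_SECS
    shm.generation += 1
    while shm.generation % 2 == 0:    # concurrent writer or crashed process
        shm.generation += 1


def end_write(shm):
    """Writer: after replacing the revprop file, make the generation even again."""
    while shm.generation % 2 == 1:
        shm.generation += 1


def reader_generation(shm, now=None):
    """Reader: return the generation to use for cache lookups."""
    now = time.time() if now is None else now
    gen = shm.generation
    if gen % 2 == 0:
        return gen                    # proceed; a write may or may not be in progress
    if shm.timeout > now:
        return gen - 1                # writer may still be running, use old generation
    while shm.generation % 2 == 1:    # timeout expired: repair on the writer's behalf
        shm.generation += 1
    return shm.generation


shm = SharedState()
begin_write(shm, now=100.0)                # generation becomes 1; writer then "crashes"
print(reader_generation(shm, now=105.0))   # within the timeout: still looks up under 0
print(reader_generation(shm, now=111.0))   # after the timeout: bumped to 2
```

Passing `now` explicitly is just a convenience for replaying a crash scenario; callers would normally leave it out.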
The critical parameter here is the timeout. It must be large
enough that no "move-into-place" operation could be stalled
longer than that (Q: how is that guaranteed to be atomic /
self-healing in the first place?). Otherwise, a crash between
move-into-place and bumping the generation number
would still cause an undetected change.
-- Stefan^2.
Received on 2012-03-09 11:20:10 CET