
Re: Revprop caching 'n stuff

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Fri, 09 Mar 2012 16:12:55 +0000

Stefan Fuhrmann <eqfox_at_web.de> writes:

> * new definition: generations must be even numbered
> * writer stores timeout value (e.g. now() + 10s)
> * writer increments generation -> odd number
>   (if the result is an even number, there are concurrent
>   writes or one process crashed; increment until we
>   reach an odd generation)
> * writer replaces revprop file
> * writer increments the generation until it is even again
>
> * reader gets current generation from shm
>   * if even -> proceed, a write may or may not be in progress
>   * if odd -> a writer *might* have been stalled / aborted
>     * if timeout > now() -> proceed with (gen-1) for lookups,
>       the writer may still run
>     * timeout expired -> increase the generation until it is even
>       (causes everybody to re-read revprops, if writer is still
>       alive, it will increase the value further)
>
> So, in case of an aborted writer process, the other
> processes behave like proxies that see outdated data for
> a short period of time only.
>
> The critical parameter here is the timeout. It must be large
> enough that no "move-into-place" operation could be stalled
> longer than that (Q: how is that guaranteed to be atomic /
> self-healing in the first place?). Otherwise, a crash between
> move-into-place and bumping the generation number
> would still cause an undetected change.
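
The quoted writer/reader steps can be read as the following minimal sketch. It is an in-process simulation only, not Subversion code: the class name SharedGeneration, its method names, the injectable clock, and the use of a threading.Lock in place of atomic shared-memory operations are all assumptions made for illustration.

```python
import threading
import time


class SharedGeneration:
    """Hypothetical in-process stand-in for the shared-memory slot."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()  # assumption: stands in for atomic shm ops
        self._clock = clock
        self.generation = 0            # even: no write known to be in flight
        self.timeout = 0.0             # deadline stored by the latest writer

    def begin_write(self, grace=10.0):
        """Store the timeout, then increment until the generation is odd."""
        with self._lock:
            self.timeout = self._clock() + grace
            self.generation += 1
            # An even result means a concurrent or crashed writer; keep going.
            while self.generation % 2 == 0:
                self.generation += 1
            return self.generation

    def end_write(self):
        """After the revprop file was replaced, increment until even again."""
        with self._lock:
            self.generation += 1
            while self.generation % 2 == 1:
                self.generation += 1
            return self.generation

    def read_generation(self):
        """Return the generation a reader should use for cache lookups."""
        with self._lock:
            gen = self.generation
            if gen % 2 == 0:
                return gen              # proceed as usual
            if self.timeout > self._clock():
                return gen - 1          # writer may still run: use old entries
            # Timeout expired: bump to even so every reader re-reads revprops.
            while self.generation % 2 == 1:
                self.generation += 1
            return self.generation
```

A writer would call begin_write(), move the new revprop file into place, and then call end_write(). If it dies in between, the first reader to arrive after the timeout moves the generation to the next even value, so cached entries keyed by the older generation stop being used.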

I'm not sure exactly what you are trying to implement.

Concurrent writes? Are you planning to remove the existing revprops
write lock? That would require a repository format bump. I think it is
also incompatible with having multiple machines write to the same
repository.

I think you plan to address the problem of how/when to detect that the
read cache is out-of-date after a write by having the readers check the
shared memory on every read. We could achieve the same by checking the
generation file itself, at the cost of disk IO. What sort of gain does
shared memory offer over the disk approach? Perhaps it would be
better to implement the non-shared-memory option first? That would also
allow multiple machines to access the repository.
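
A rough sketch of the disk-based alternative, under stated assumptions: the helper names (read_generation, bump_generation, RevpropCache), the one-integer file format, and the loader callback are hypothetical and do not describe the actual FSFS on-disk layout. The writer is assumed to already hold the existing revprops write lock when it bumps the file.

```python
import os
import tempfile


def read_generation(path):
    """Return the integer stored in the generation file, or 0 if absent."""
    try:
        with open(path, "r", encoding="ascii") as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return 0


def bump_generation(path):
    """Write generation+1 via a temp file and an atomic rename.

    Assumption: the caller holds the revprop write lock, so there is
    only one writer at a time.
    """
    new_gen = read_generation(path) + 1
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w", encoding="ascii") as f:
        f.write(f"{new_gen}\n")
    os.replace(tmp, path)
    return new_gen


class RevpropCache:
    """Per-process cache that is dropped whenever the generation changes."""

    def __init__(self, gen_path, load_revprops):
        self._gen_path = gen_path
        self._load = load_revprops  # hypothetical callable: revision -> dict
        self._gen = None
        self._cache = {}

    def get(self, revision):
        gen = read_generation(self._gen_path)  # one small disk read per lookup
        if gen != self._gen:
            self._cache.clear()                # a write happened: start over
            self._gen = gen
        if revision not in self._cache:
            self._cache[revision] = self._load(revision)
        return self._cache[revision]
```

Every lookup pays for one small file read, which is the disk IO cost mentioned above. In exchange, nothing here depends on memory shared between processes on a single host.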

Received on 2012-03-09 17:13:33 CET

This is an archived mail posted to the Subversion Dev mailing list.