Re: FSFS format 6

From: Mark Mielke <mark_at_mark.mielke.cc>
Date: Sun, 20 Feb 2011 12:35:25 -0500

On 02/20/2011 03:50 AM, Ivan Zhakov wrote:
> On Wed, Dec 29, 2010 at 22:37, Stefan Fuhrmann<eqfox_at_web.de> wrote:
>> The fopen() calls should be eliminated by the
>> file handle cache. IOW, they should already be
>> addressed on the performance branch. Please
>> let me know if that is not the case.
> My belief that file handles cache should be implemented at OS level
> and I pretty sure that it's implemented. And right way to eliminate
> number of duplicate fopen()/reads() is improving our FS API.
> I didn't reviewed how file handles cache is implemented in
> fs-performance branch, but I'm nearly to -1 against implementing cache
> of open file handles in Subversion.

What OS implements file handle caching? The OS file system layer for
most operating systems does implement caching - but open()/close() can
easily invalidate some or all of this cache due to required POSIX
behaviour, especially if the backend storage is remote and shared
between multiple clients such as would be the case over NFS. This is
required to implement consistency across clients. The local operating
system cannot arbitrarily cache everything, and anything it does decide
to cache could become stale at any moment unless some other mechanism,
such as file locking, is in use.

Of particular concern to me is how slow Subversion gets over NFS, and
this thread grabbed my attention as a result. Over NFS, Subversion
operations can take many times longer (20 seconds -> 20 minutes). I
suspect people are testing on the assumption that a "local file
system" will be in use. Do people working on the fs-performance branch
test against NFS?
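
One rough way to check how much the repeated open()/close() calls cost
on a given file system is sketched below in Python. It is only an
illustration, not anything taken from Subversion or the fs-performance
branch: the function names, offsets and read size are hypothetical. It
reads the same chunks once with a fresh open() per read and once
through a single handle, and returns the elapsed time for each.

```python
import time


def read_with_reopen(path, offsets, size):
    """Open, seek, read and close the file once per offset."""
    start = time.perf_counter()
    chunks = []
    for offset in offsets:
        with open(path, "rb") as f:
            f.seek(offset)
            chunks.append(f.read(size))
    return chunks, time.perf_counter() - start


def read_with_one_handle(path, offsets, size):
    """Open the file once and serve every read from that handle."""
    start = time.perf_counter()
    chunks = []
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            chunks.append(f.read(size))
    return chunks, time.perf_counter() - start
```

Running both functions against the same file on a local disk and then
on an NFS mount would show whether the per-open cost differs between
the two. NFS clients typically revalidate a file with the server on
open(), so the gap may be much wider there, but the numbers depend on
the server, the mount options and the network.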

I don't know... just dropping in... feel free to set me straight. :-)

That said, I'm also (in principle) against implementing a cache of open
file handles. I prefer architectures in which the application makes a
deliberate choice about what to cache, and caches intermediate data in
the processed form that is most useful to it, rather than a transparent
caching layer that guesses at what is safe. The OS file system layer is
exactly such a layer - any caching it does is transparent to the
application and a guess. Guesses are dangerous, which is exactly why
the OS file system layer cannot do as much caching unless it has
100% control of the file system (= local file system).
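
As a minimal sketch of the distinction, the Python below caches the
parsed result of a file rather than the open handle. Everything in it
is hypothetical (the class name, the "key: value" file format, and the
choice of mtime plus size as the validity check); it is not how
Subversion or the fs-performance branch does it.

```python
import os


def parse_headers(data):
    """Hypothetical parser: turn 'key: value' lines into a dict."""
    result = {}
    for line in data.decode("utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


class ParsedFileCache:
    """Cache parsed file contents, keyed by path.

    An entry is reused only while the file's (mtime, size) signature
    is unchanged. That check is an assumption chosen for this sketch.
    """

    def __init__(self, parse):
        self._parse = parse
        self._entries = {}    # path -> (signature, parsed value)
        self.parse_calls = 0  # counts real reads, to make hits visible

    def get(self, path):
        st = os.stat(path)
        signature = (st.st_mtime_ns, st.st_size)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == signature:
            return entry[1]  # hit: no open(), no read, no parse
        with open(path, "rb") as f:
            value = self._parse(f.read())
        self.parse_calls += 1
        self._entries[path] = (signature, value)
        return value
```

The application decides what is cached and when an entry is still
valid, instead of leaving that to a transparent layer. Note that the
stat() check is itself only as good as the file system underneath it:
over NFS the attributes may come from the client's attribute cache, so
a real implementation would need a validity rule that fits its data.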


Mark Mielke <mark_at_mielke.cc>
Received on 2011-02-20 18:36:29 CET

This is an archived mail posted to the Subversion Dev mailing list.