On Tue, May 11, 2010 at 07:43:33AM -0400, Mark Phippard wrote:
> On Tue, May 11, 2010 at 7:27 AM, Stefan Sperling <stsp_at_elego.de> wrote:
> > On Tue, May 11, 2010 at 01:36:26AM +0200, Johan Corveleyn wrote:
> >> As I understand your set of patches, you're mainly focusing on saving
> >> cpu cycles, and not on avoiding I/O where possible (unless I'm missing
> >> something). Maybe some of the low- or high-level algorithms in the
> >> back-end can be reworked a bit to reduce the amount of I/O? Or maybe
> >> some clever caching can avoid some file accesses?
> > In general, I think trying to work around I/O slowness by loading
> > stuff into RAM (caching) is a bad idea. You're just taking away memory
> > from the OS buffer cache if you do this. A good buffer cache in the OS
> > should make open/close/seek fast. (So don't run a Windows server if
> > you can avoid it.)
> > The only point where it's worth thinking about optimizing I/O
> > access is when you get to clustered, distributed storage, because
> > at that point every I/O request is translated into a network packet.
> You had me until that last part. I think we should ALWAYS be thinking
> about optimizing I/O. I have little doubt that is where the biggest
> performance bottlenecks live (other than network of course). I agree
> that making a big cache is probably not the best way to go, but I
> think we should always be looking for optimizations where we avoid
> repeated open/closes that are not necessary.
That's true. Avoiding repeated open/close of the same file
is a good optimisation. Even with a good buffer cache it will
make a difference.
So s/The only point/One point/ :)
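
To illustrate what "avoiding repeated open/close of the same file" can
look like, here is a minimal sketch in Python. It is not Subversion
code: the names HandleCache, read_at and demo are hypothetical and
exist only for this example. The idea is to keep a handle open across
several reads and seek within it, rather than paying for one
open()/close() pair per read. The file contents still come from the OS
buffer cache; only the handle is kept, so this does not copy data into
a private in-memory cache.

```python
import os
import tempfile


class HandleCache:
    """Keep file handles open so repeated reads skip open()/close().

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self._handles = {}   # path -> open binary file object
        self.open_calls = 0  # how many times open() was really called

    def read_at(self, path, offset, length):
        """Read `length` bytes at `offset`, reusing an open handle if any."""
        f = self._handles.get(path)
        if f is None:
            f = open(path, "rb")
            self._handles[path] = f
            self.open_calls += 1
        f.seek(offset)
        return f.read(length)

    def close(self):
        """Close every cached handle."""
        for f in self._handles.values():
            f.close()
        self._handles.clear()


def demo():
    """Do three reads from one file and report how many opens were needed."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"0123456789" * 10)
        cache = HandleCache()
        chunks = [cache.read_at(path, off, 4) for off in (0, 10, 25)]
        opens = cache.open_calls
        cache.close()
        return chunks, opens
    finally:
        os.remove(path)
```

Calling demo() performs three reads with a single open(). A real
implementation would also have to bound the number of open handles and
cope with files that are replaced on disk while a handle is cached;
the sketch ignores both.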
> I think it is extremely common that our customers have their
> repositories on NFS-mounted or SAN storage. While these often have
> fast disk subsystems there is still a noticeable penalty for file
> opens. Have you looked at Blair's wiki before?
Thanks, that was an interesting read.
Of course, network filesystems like NFS have the same network
overhead penalty (except that caching on the local client is
probably a bit easier than with truly distributed storage,
but that's a minor detail).
Received on 2010-05-11 13:57:30 CEST