On Tue, May 11, 2010 at 7:27 AM, Stefan Sperling <stsp_at_elego.de> wrote:
> On Tue, May 11, 2010 at 01:36:26AM +0200, Johan Corveleyn wrote:
>> As I understand your set of patches, you're mainly focusing on saving
>> cpu cycles, and not on avoiding I/O where possible (unless I'm missing
>> something). Maybe some of the low- or high-level algorithms in the
>> back-end can be reworked a bit to reduce the amount of I/O? Or maybe
>> some clever caching can avoid some file accesses?
> In general, I think trying to work around I/O slowness by loading
> stuff into RAM (caching) is a bad idea. You're just taking away memory
> from the OS buffer cache if you do this. A good buffer cache in the OS
> should make open/close/seek fast. (So don't run a Windows server if
> you can avoid it.)
> The only point where it's worth thinking about optimizing I/O
> access is when you get to clustered, distributed storage, because
> at that point every I/O request is translated into a network packet.
You had me until that last part. I think we should ALWAYS be thinking
about optimizing I/O. I have little doubt that is where the biggest
performance bottlenecks live (other than the network, of course). I agree
that making a big cache is probably not the best way to go, but I
think we should always be looking for optimizations that avoid
unnecessary repeated opens and closes.
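To illustrate the kind of pattern I mean, here is a rough sketch in
Python. It is not Subversion code; the function names, the offsets, and
the fixed record size are all made up for the example. Both functions
return the same data, but the first pays for one open/close per record
while the second opens the file once and seeks.

```python
def read_records_reopening(path, offsets, size):
    """Hypothetical example: open and close the file once per record."""
    records = []
    for off in offsets:
        # One open/close pair for every record read.
        with open(path, "rb") as f:
            f.seek(off)
            records.append(f.read(size))
    return records


def read_records_single_open(path, offsets, size):
    """Hypothetical example: open the file once, then seek per record."""
    records = []
    with open(path, "rb") as f:
        for off in offsets:
            # Same reads as above, but only one open/close in total.
            f.seek(off)
            records.append(f.read(size))
    return records
```

On a local disk with a warm buffer cache the difference may be small;
the point is only that the second form issues far fewer opens.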
I think it is extremely common that our customers have their
repositories on NFS-mounted or SAN storage. While these often have
fast disk subsystems, there is still a noticeable penalty for file
opens. Have you looked at Blair's wiki before?
Received on 2010-05-11 13:44:04 CEST