On Thu, Nov 6, 2008 at 8:04 AM, Mark Phippard <markphip_at_gmail.com> wrote:
> On Thu, Nov 6, 2008 at 7:55 AM, Ben Collins-Sussman
> <sussman_at_red-bean.com> wrote:
>> On Thu, Nov 6, 2008 at 1:35 AM, David Glasser <glasser_at_davidglasser.net> wrote:
>>
>>> There's a lock around appending to the proto-rev file, yes. And it
>>> does write a bunch of little files for noderevs, directory listings,
>>> and props, and gloms them on at finalization time, yes. But there are
>>> no locks around editing the little files themselves; so for example,
>>> if concurrent processes make two files in a directory at the same
>>> time, there can be a race condition and only one will end up in the
>>> listing. This situation isn't possible if you only access a
>>> transaction from a single process (say, via the commit editor).
>>
>> But if all writes go through a single process, that sort of defeats
>> the goal of saturating the bandwidth with parallel PUTs. Maybe it
>> would be a worthy goal to make FSFS safe? I know it's something we'll
>> have to do for libsvn_fs_bigtable.
>
> Do we know that "saturating the pipe" will give the best performance?
> We (CollabNet) are frequently hearing complaints of WAN performance
> lately and it has been suggested that a single request would perform
> better in that environment because:
>
> a) the pipe is not that big
> b) the latency on turnarounds is the biggest killer of performance

I'd have to defer this question to the serf experts. There's an
unspoken assumption that saturating the pipe with parallel (and/or
pipelined) requests is always a speed gain, but I actually don't know
of evidence to back this up. Maybe it's out there, though.

In the case of doing a checkout/update, my understanding is that
ra_serf's parallelism is 'slightly' faster than ra_neon's single
request/response. The response I've always gotten back from
gstein/jerenkrantz about this is that ra_serf is going to really shine
when caching proxies come into play. This makes sense to me: even if
there's no obvious, immediate benefit to most users doing checkouts
over serf, the design is a bit of an investment in the future, and
should be especially beneficial to corporations with caching
infrastructure. It's not clear to me that parallel PUTs have the same
promise, though -- caching proxies don't help with that.

My instinct is to ask somebody to measure parallel PUTs vs.
single-request, so we have hard data to examine. But nobody's
implemented the parallel PUTs yet!
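
Until real measurements exist, a back-of-envelope model at least shows
why the answer isn't obvious. The sketch below is not based on any
Subversion, serf, or neon code. All function names and numbers are
made up for illustration, and it assumes a fixed round-trip time per
request, a pipe shared evenly by all connections, and no TCP slow
start, connection setup, or server-side cost.

```python
import math


def serial_puts(n_files, total_bytes, rtt_s, bandwidth_bps):
    """One PUT at a time: every request pays a full round trip."""
    return n_files * rtt_s + total_bytes / bandwidth_bps


def parallel_puts(n_files, total_bytes, rtt_s, bandwidth_bps, connections):
    """PUTs spread over several connections (hypothetical model).

    Round trips overlap across connections, but the shared pipe still
    has to carry every byte, so the transfer term does not shrink.
    """
    rounds = math.ceil(n_files / connections)
    return rounds * rtt_s + total_bytes / bandwidth_bps


def single_request(total_bytes, rtt_s, bandwidth_bps):
    """Everything in one request body: one round trip in total."""
    return rtt_s + total_bytes / bandwidth_bps


if __name__ == "__main__":
    # Assumed WAN-like numbers: 200 files, 2 MB total, 200 ms RTT,
    # 1 Mbit/s link expressed as 125,000 bytes per second.
    args = dict(total_bytes=2_000_000, rtt_s=0.2, bandwidth_bps=125_000)
    print("serial  :", serial_puts(200, **args))
    print("parallel:", parallel_puts(200, connections=4, **args))
    print("single  :", single_request(**args))
```

Under these assumptions the parallel case only wins back round trips,
never bandwidth, which is consistent with Mark's points (a) and (b).
Whether real networks and servers behave this way is exactly what a
measurement would have to show.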
Received on 2008-11-06 16:19:05 CET