[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Speed up client by re-using RA session connections

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Wed, 14 Sep 2011 19:04:15 +0100

Improvements already.

This current design is simplistic in that the "end of usage" of a given
session is implicit. It would be better if (inspired by Hyrum's comment
on IRC) we would "check out" a session from the cache and then "return"
that session to the cache when we've finished using it. That design
could then be extended to be thread-safe later, allowing a single cache
to be shared by multiple threads. The svn_client__cached_ra_session()
function would be made thread-safe and would allow a given cached
session to be "checked out" by only one thread at a time. It would open
a new connection to the same repo if there was no currently *available*
connection to that repo.

That "sharable" cache would be better for parallelizing clients.

In the "status" API, in this patch, it appears that two separate
connections are wanted at the same time. That may not be necessary
there, and if the code can be rearranged to do the "get locks" before or
after the main work on a single connection then great, that's an
independent improvement; but for now let's ignore that possibility and
assume that in some cases like this we do need two parallel sessions.
The way I've handled this is to make the *caller* pass in two separate
caches, but that's ugly, exposing an implementation detail. A better
way would be for the code to "check out" two separate sessions from the
same cache.

Another thought: should part (or all) of the support for this kind of
cache be provided by the svn_ra API rather than libsvn_client? Maybe
with libsvn_client wrapping it in a libsvn_client presentation API.

- Julian

On Wed, 2011-09-14 at 17:59 +0100, Julian Foad wrote:
> Daniel Shahaf wrote:
> > Julian Foad wrote on Wed, Sep 14, 2011 at 16:07:19 +0100:
> > > Enable libsvn_client APIs to re-use a previous RA session instead of always
> > > opening a new connection.
> > >
> > > Basically:
> > >
> > > - Declare an opaque "RA session cache" object in the public API. It holds
> > > one open RA session per repository, for any number of repositories.
> >
> > Why per repository?
>
> Because the caller of a libsvn_client API generally provides a
> "path_or_url" without knowing which repository it refers to. If the
> client is calling e.g. "update a/", "update b/", "update c/", it doesn't
> know in advance whether those paths refer to different repositories.
> Suppose only "a" and "c" are in the same repo; the client can't pass
> just a single cached session if we want to re-use the "a" session for
> "c".
>
> > What about a client that runs 'svn update' on several
> > wc's of the same URL concurrently?
>
> I haven't attempted to support concurrent calls. The cache is for
> sequential use only (one user of the cache at a time). If a client is
> doing parallel updates and want to use this kind of cache it must have a
> separate cache per thread. I think.
>
> > > - Implement a private API for libsvn_client functions to use instead of
> > > simply opening a new session. It re-uses a session from the cache, if
> > > present, or opens a new connection if necessary.
> >
> > As far as I can see, the "get a session from the cache" code assumes
> > that the cache is used by a single thread. (Consider what happens if
> > one thread retrieves or uses a cached session while another retrieves
> > the same session from the cache.) Will the cache be thread-safe?
>
> See above.
>
> - Julian
>
>
Received on 2011-09-14 20:04:56 CEST

This is an archived mail posted to the Subversion Dev mailing list.