[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: svn commit: r36203 - trunk/subversion/libsvn_ra_serf

From: Bert Huijben <rhuijben_at_sharpsvn.net>
Date: Sat, 28 Feb 2009 00:34:31 +0100

> -----Original Message-----
> From: Ivan Zhakov [mailto:ivan_at_visualsvn.com]
> Sent: Saturday, February 28, 2009 12:07 AM
> To: dev_at_subversion.tigris.org; Bert Huijben
> Subject: Re: svn commit: r36203 - trunk/subversion/libsvn_ra_serf
>
> On Sat, Feb 28, 2009 at 1:57 AM, Bert Huijben <rhuijben_at_sharpsvn.net>
> wrote:
> > Modified: trunk/subversion/libsvn_ra_serf/commit.c
> > URL:
> http://svn.collab.net/viewvc/svn/trunk/subversion/libsvn_ra_serf/commit
> .c?pathrev=36203&r1=36202&r2=36203
> >
> =======================================================================
> =======
> > --- trunk/subversion/libsvn_ra_serf/commit.c Fri Feb 27 13:50:27
> 2009 (r36202)
> > +++ trunk/subversion/libsvn_ra_serf/commit.c Fri Feb 27 14:57:15
> 2009 (r36203)
> > @@ -282,6 +282,9 @@ handle_checkout(serf_request_t *request,
> > apr_uri_parse(pool, location, &uri);
> >
> > ctx->resource_url = svn_uri_canonicalize(uri.path, ctx->pool);
> > +
> > + if (ctx->resource_url == uri.path)
> > + ctx->resource_url = apr_pstrdup(ctx->pool, ctx-
> >resource_url);
> Hi Bert,
>
> Sorry but I don't like this fix. It looks like premature optimization
> which is always bad thing.
>
> We should follow contract of svn_uri_canonicalize function that could
> theoretically return pointer to other than first character in provided
> buffer.
>
> So it's much better just add apr_pstrdup() in all cases from point of
> readability and preventing memory errors.

Feel free to fix this yourself after you spend a few days profiling. (You can also fix the about a dozen of other locations in our code that use the same construct). We try to do most memory consuming operations in scratch pools and this context pool is certainly not short lived.

I'd personally suggest fixing the documentation of the svn_(uri/dirent)_canonicalize function to say that it will always copy or return a static value (for ""). The current canonicalization rules (including case fixing) make it expensive not to copy while canonicalizing.

The optimization of our code should be in reducing these expensive canonicalization calls and just assuming everything passed is canonical, not in doing the operation twice to reduce a copy.

The unnecessary canonicalization we did before this series of patches (started by lgo) was about 10-20% of the CPU time spent before starting the network IO for an update.. (Mostly canonicalizing and joining urls to find out if a path is switched)

Too bad that on Windows most wallclock time is spend on creating lock files :(.

The canonicalize on input changes also reduced our memory usage in the first phase of updating by about 30%. (Nice side effect that helps the processor cache make the url calculations even faster).
 
Thanks,

        Bert

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1241436
Received on 2009-02-28 00:35:24 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.