[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: [Client developers HEADS-UP] Penalty of APR_FILEPATH_TRUENAME (Was RE: svn commit: r35733 - trunk/subversion/bindings/javahl/native)

From: Bert Huijben <rhuijben_at_sharpsvn.net>
Date: Sun, 8 Feb 2009 13:20:33 +0100

> -----Original Message-----
> From: Mark Phippard [mailto:markphip_at_gmail.com]
> Sent: Saturday, February 07, 2009 5:31 PM
> To: Bert Huijben
> Cc: dev_at_subversion.tigris.org
> Subject: Re: [Client developers HEADS-UP] Penalty of
> APR_FILEPATH_TRUENAME (Was RE: svn commit: r35733 -
> trunk/subversion/bindings/javahl/native)
>
> On Sat, Feb 7, 2009 at 8:36 AM, Bert Huijben <rhuijben_at_sharpsvn.net>
> wrote:
> >> Log:
> >> * subversion/bindings/javahl/native/JNIUtil.cpp
> >> (JNIUtil::preprocessPath): Remove a gigantic Windows only
> performance
> >> penalty by using the always correct svn_path_internal_style()
> >> instead
> >> of the hidden breaking apr_filepath_merge with
> >> APR_FILEPATH_TRUENAME
> >> that only did validation on Windows.
> >
> > Hi,
> >
> > Until recently some parts of Subversion used apr_filepath_merge with
> the
> > APR_FILEPATH_TRUENAME flag all over the place. (It currently only
> uses it
> > for paths passed to svn via the commandline where it is actually
> usefull for
> > catching user typos).
> >
> > Using this method is VERY VERY expensive on Windows. Our (and your)
> code
> > should canonicalize paths only once. Exactly when passed from an
> untrusted
> > source and after that assume that paths are canonical (or at least
> correctly
> > cased).
> >
> > (Apr ignores this flag on all OSs except Windows, so as Linux
> developer you
> > would never know.)
> >
> >
> > I would recommend that other subversion client developers review
> their code
> > to make sure they don't accidentally call this same method without
> realizing
> > that this is expensive and you can get a fILE back when you gave the
> method
> > a File.
> >
> >
> > In 99.9% of all cases this normalization does nothing, except for
> costing a
> > lot of time; in that other 0.1 percent (where the file on disk has
> another
> > casing than subversion thinks it has), it makes it impossible for you
> to
> > call subversion commands on the administratively managed file, as it
> would
> > automatically 'repair' your 'File' to 'fILE'.
> >
> > Thanks for your attention,
>
> I am in favor of the change. I'll just have to keep my eyes open for
> whether this causes any problems for us in Subclipse on Windows.

If subclipse currently works fine on OS/X (I assume it does), it will
function the same on Windows.
(It just removes the platform difference)

> Did you do any kind of searching to get an idea where this function is
> used? For example, it is not something that is exposed to us on the

It is used about everywhere where a path is passed from the Java side to the
C/C++ side, to normalize the paths at the C++ side. (Some newer code ignores
this function.. That code should probably be fixed to make sure the same
code path is followed)

> Java side, it is used internally in the C++ code. I suspect it is
> only used when methods are called with paths in them, so performance
> improvement might not be that noticeable. If, for example, we are

Passing a path can go from costing about a 1/10th of a second to about one
microsecond. This is something you'll notice :)

(Just not on that update code path we were talking about on the users list
:( )

r35731 and r35732 should have more impact there. (Should be about 5% on
Windows; probably more on normal PCs). But fixing generic issues don't help
bridge the difference between Linux and Windows :)
Not that I mind.. Going fast enough on all OSs would be just fine :)

> doing a merge or update the overwhelming majority of the code is
> happening inside the SVN libraries, there is not a lot of back and
> forth from Java while these run. So the kind of performance speed-ups
> you did before, where it is being used within the libraries, is likely
> to have a much bigger impact than this. But like I said, every bit
> helps and there might still be cases where we are doing things like
> checking Status on a large working copy where this will make a nice
> difference.

Don't underestimate the performance impact of the notify handler! (Comparing
AnkhSVN/SharpSvn to TortoiseSVN, this was in most cases the explanation why
AnkhSVN 2.0 was faster (and AnkhSVN 1.X was slower) than TortoiseSVN). If
the gui repaints in the notify handler, the TCP/IP layer slows down all
network IO because the client can't handle the data fast enough.
(The slow scroll effects in XP/Vista can be a killer here, unless disabled
by the code)

Moving the long-running subversion actions to a background thread and doing
the UI updates asynchronous can really make a difference.

Responding to the thread on users@:
I'm looking whether implementing the svn_stream_t Windows specifically might
make a difference. (apr_file_open allocates more than a kilobyte of data for
asynchronous IO, thread synchronization, etc., never used by Subversion)

        Bert
>
> Thanks
>
> --
> Thanks
>
> Mark Phippard
> http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1123522
Received on 2009-02-08 13:21:09 CET

This is an archived mail posted to the Subversion Dev mailing list.