[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: numerous small 512 byte working copy reads and writes

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: 2007-12-18 22:23:57 CET

On Dec 18, 2007 4:54 PM, <kmradke@rockwellcollins.com> wrote:
> "Norbert Unterberg" <nunterberg@gmail.com> wrote on 12/18/2007 06:27:06
> AM:
>
> > On Dec 18, 2007 12:05 PM, Erik Huelsmann <ehuels@gmail.com> wrote:
> > >
> > > On 12/18/07, Norbert Unterberg <nunterberg@gmail.com> wrote:
> > > > You are on the wrong track here.The problem is not what BUFSIZ is
> > > > defined on the different platforms but how APR uses it.
> > > > BUFSIZ is the size of the buffer that can be set for buffered
> streams
> > > > with the setbuf() call. Nothing more. MSDN just says: "BUFSIZ is the
> > > > required user-allocated buffer for the setvbuf routine." BUFSIZ is
> NOT
> > > > a suggestion for a buffer size when dealing with large binary files.
> > > > So the bug is in APR where BUFSIZ is used for something it was
> > not designed for.
> > > >
> > > > This has already been discussed for years on this list, but
> > nothing has changed:
> > > >
> http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=82087
> > > >
> http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=113549
> > > >
> > > > So it is clearly an apr and not a windows issue.
> > >
> > > Providing a patch might help:
> >
> > Thank you for the invitation, but no thank you. Diagnosing if someone
> > is ill and actually performing the operation are two different things,
> > and I currently do not want to jump over the high hurdles of actually
> > creating a politically and stylistically correct patch to an APR
> > library function!
> >
> > > They define a buffer size of their own
> > > (APR_FILE_BUFSIZE, defined to 4096 on Win32) which should probably be
> > > used for binary file copies since it's used for buffered files as
> > > well.
> >
> > As already suggested three years ago, why not call the native
> > CopyFile() function on Windows which will likely have best possible
> > performance without filling with different buffer sizes?

Well, that may work for *copying* a file, but the same source file
(copy.c) also provides *appending* to files, leaving the same issues
to users of the APR api, thereby not having solved anything. I think
it would be better to address the problem integrally (for example by
choosing an applicable static buffer size, or better yet, by
identifying a good means of dynamically determining the right size).

> Nice to know I've stumbled across a 3 year old problem. I can't believe
> file copying is implemented so potentially inefficiently.

So it seems, yes. But implying that since the problem was discussed
here and nothing changed means that nobody cares (enough to fix it)
won't fly: APR is a separate project and we need to make them aware of
the problem. Including a proposed fix usually helps resolving the
issue.

> I just checked apr-1.2.12, and it is still using BUFSIZ in the
> apr_file_transfer_contents() routine.

I found the same, yes.

> I may submit a patch to the APR people to see what they think. However,
> I really wonder if Subversion should just be doing it's own thing here
> until/when APR is improved. Subversion does a lot of file copying and
> this could be a big working copy performance improvement on slower
> access devices.

Nope. Subversion shouldn't do it's own thing. We use APR as our
portability layer, so do others. As soon as we find problems, we
should take it up with APR. If they fix it, everybody benefits. If
they don't *then* we get to consider to do it ourselves.

> Since I have a unix build environment (not windows where I observed this
> behavior), I may do some adhoc testing over nfs to see if/how it really
> affects working copy performance.

Of course a good deal of performance testing helps arguing the case
with the APR folks, so please do and please share the results from
your efforts!

bye,

Erik.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Tue Dec 18 22:24:08 2007

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.