
Re: [PATCH] Improve single byte read stream performance

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Mon, 8 Mar 2010 22:47:32 +0100

On Sunday 07 March 2010 21:30:35 you wrote:
> > While profiling TSVN's SVN proper access code,
> > I found that about 40% of the runtime was spent in
> > svn_config_get_config(). The reason was the config
> > parser fetching individual bytes from the input stream.
> >
> > However, translated_stream_read() obviously imposes
> > a significant overhead per call for smaller chunk sizes.
> > This patch quadruples the translated_stream_read()
> > throughput for single bytes cutting svn_config_get_config()
> > runtime by 50%.
>
> Hi Stefan,
>
> Thanks for your patch. It looks like a simple optimization that can help.

And the penalty for the old path is almost zero.
In fact, it could actually be zero on out-of-order (OOO) CPUs.

> Where are those other slowdowns you see?

I didn't explicitly profile other functionality; I
just took a quick look at who else calls
_svn_stream_read with *len==1:

svndiff.c: read_one_byte, in turn called from
  several other functions
load.c: read_key_or_val
hash.c: hash_read (multiple places)
stream.c: svn_stream_readline, in turn called from
  several other functions in blame.c, fs_fs.c, load.c etc.

The actual list may be longer as these are only
those calls where len is set immediately before
the read call.

> If we want to speed up the configuration file parsing we could also
> optimize the parser in config_file.c to buffer at the character level.
> (This seems a more logical place to me than to optimize all specific stream
> types for single byte access).

We could do that. However, that would just be
another buffering layer - something that the
translated_stream is already supposed to provide.

I have no idea how many different stream types
there are in SVN and what type is used when.
However, from the above list of callers, it seems
justified to make all of them handle 1-byte requests
efficiently. That does not rule out further optimizations
higher up the call stack.
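
To illustrate the idea, here is a minimal sketch of a single-byte fast
path in a buffered stream's read function. It is not the patch itself and
not the real translated_stream code: demo_stream_t, demo_stream_init,
demo_stream_refill and demo_stream_read are hypothetical names, and an
in-memory string stands in for the underlying stream. Only the
"*len in, *len out" calling convention is borrowed from the read call
discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffered stream, for illustration only. */
typedef struct {
    const char *src;   /* backing data, stands in for the inner stream */
    size_t src_len;
    size_t src_pos;
    char buf[16];      /* deliberately small so refills happen often */
    size_t buf_pos;    /* next unread byte in buf */
    size_t buf_len;    /* number of valid bytes in buf */
} demo_stream_t;

static void demo_stream_init(demo_stream_t *s, const char *data, size_t len)
{
    assert(s != NULL);
    s->src = data;
    s->src_len = len;
    s->src_pos = 0;
    s->buf_pos = 0;
    s->buf_len = 0;
}

/* Refill buf from the backing data; returns the bytes now available. */
static size_t demo_stream_refill(demo_stream_t *s)
{
    size_t n = s->src_len - s->src_pos;
    if (n > sizeof(s->buf))
        n = sizeof(s->buf);
    memcpy(s->buf, s->src + s->src_pos, n);
    s->src_pos += n;
    s->buf_pos = 0;
    s->buf_len = n;
    return n;
}

/* Read up to *len bytes into out; on return *len is the count read. */
static void demo_stream_read(demo_stream_t *s, char *out, size_t *len)
{
    size_t want = *len;
    size_t got = 0;

    /* Fast path: one byte requested and one is already buffered.
       A single compare plus a copy; *len stays 1. */
    if (want == 1 && s->buf_pos < s->buf_len) {
        out[0] = s->buf[s->buf_pos++];
        return;
    }

    /* General path: copy in chunks, refilling the buffer as needed. */
    while (got < want) {
        size_t avail = s->buf_len - s->buf_pos;
        size_t n;
        if (avail == 0) {
            avail = demo_stream_refill(s);
            if (avail == 0)
                break;  /* end of data */
        }
        n = (want - got < avail) ? want - got : avail;
        memcpy(out + got, s->buf + s->buf_pos, n);
        s->buf_pos += n;
        got += n;
    }
    *len = got;
}
```

The cost added to the general path is the one extra comparison at the
top, which is the "almost zero" penalty mentioned earlier; a caller that
sets len to 1 before every read skips the loop setup entirely whenever
the buffer is non-empty.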

> It might even be easier to rewrite some parts of this config parser using
> the line based API we also use for reading diff files.

Can't comment on that as I don't know the code.

-- Stefan^2.
Received on 2010-03-08 22:48:33 CET

This is an archived mail posted to the Subversion Dev mailing list.
