
Re: [PATCH] duplicate keys for the 'strings' table

From: <cmpilato_at_collab.net>
Date: 2002-02-27 18:38:51 CET

Greg Stein <gstein@lyra.org> writes:

> However, I don't have any pre-Greg-patch numbers (using Mike's latest 4 meg
> buffering). Nor do I have numbers pre-buffer or pre-streaming. All of that
> data would be really good to have, to see where we started and where we're
> going. I'd also like to see if this dup key stuff has improved performance,
> or just reduced our log file spamming.

*smoooooooooooooch*

You might not have pre-Greg-patch numbers, but I applied your patch
locally, compiled, installed, etc. and re-ran my tests from
yesterday. Your solution is GOLDEN from my perspective. Not quite as
fast as buffering the whole thing in memory (fuzzy estimate here, I
didn't actually time things yesterday -- you know, it's like "how far
into this song can I play on the guitar before the operation is
complete" -- that kind of timing), but muuuuuuch faster than the first
draft of the streamy FS writing code. I think it's probably even
faster than the 4-meg buffering, and definitely wins on the log
message count overall:

   old way -> 25 megs
   first streaminess attempt -> 1.74 gigs (and only 60% finished when
                                I ran out of disk space)
   2 meg buffering -> 260 megs
   4 meg buffering -> 150 megs
   your way -> 25 megs
   
> The patch isn't quite ready for committing: I need to update the doc for
> svn_fs__string_read(). It was already out of date, and with this change,
> I've also introduced the "may return less than you asked for" semantic of
> most of our other reading functions.
>
> The hard-coding of 500k in tree.c should also go (I was lazy and didn't want
> to recompile everything by changing the constant in svn_fs.h :-). Note that
> cmpilato and I think that constant should move into tree.c anyways.
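
For anyone following along, here is a minimal sketch of what the "may return less than you asked for" semantic means for callers. Everything in it is hypothetical: demo_string_t, demo_string_read() and demo_read_full() are made-up names, and this is NOT the real svn_fs__string_read() signature. The assumption is simply that one read never crosses a record boundary, so the caller has to loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a multi-record string.  This is NOT the
   real svn_fs__string_read() interface, just enough to show short reads. */
typedef struct {
  const char *data;    /* full contents of the string */
  size_t len;          /* total length in bytes */
  size_t record_size;  /* assumed: one read never crosses a record boundary */
} demo_string_t;

/* Read up to *len bytes starting at OFFSET into BUF.  On return, *len
   holds the number of bytes actually read, which may be less than was
   requested; 0 means end of string. */
static void demo_string_read(const demo_string_t *s, size_t offset,
                             char *buf, size_t *len)
{
  size_t avail, in_record;

  assert(s->record_size > 0);
  if (offset >= s->len) {
    *len = 0;
    return;
  }
  avail = s->len - offset;
  in_record = s->record_size - (offset % s->record_size);
  if (*len > avail)
    *len = avail;
  if (*len > in_record)
    *len = in_record;
  memcpy(buf, s->data + offset, *len);
}

/* Caller-side loop: keep reading until WANT bytes have arrived or the
   string is exhausted.  Returns the total number of bytes read. */
static size_t demo_read_full(const demo_string_t *s, size_t offset,
                             char *buf, size_t want)
{
  size_t total = 0;

  while (total < want) {
    size_t n = want - total;
    demo_string_read(s, offset + total, buf + total, &n);
    if (n == 0)
      break;  /* end of string */
    total += n;
  }
  return total;
}
```

The point is only that a caller treating a single read as "all or nothing" would silently truncate data once strings span multiple records.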

I just reverted my tree.c to revision 1279 to see what happens when no
buffering is around, and we're writing directly into your multi-record
strings. Time difference was minimal (again, fuzzy feeling) and
again, logs were 25 megs.

I would suggest rolling back tree.c to 1279 and removing the #define
from svn_fs.h (effectively removing the buffering code altogether),
so that the filesystem is never trying to guess at best-case
buffering behavior (which it can't possibly know). If the clients are
writing directly into the strings table, they can (theoretically)
choose to send differently sized chunks of data into the window
consumer returned by svn_fs_apply_textdelta(), and therefore have full
control over their own performance in this area!
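
To illustrate the idea (a rough sketch only, not the actual svn_fs_apply_textdelta() interface): if the filesystem does no buffering of its own, the number of writes is decided entirely by the chunk size the caller picks. The names demo_consumer_t, demo_stats_t, demo_count_consumer() and demo_send_chunked() are all hypothetical, and the consumer here just counts calls instead of touching any strings table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical consumer callback: stands in for whatever receives the
   data on the filesystem side.  Not a real Subversion type. */
typedef void (*demo_consumer_t)(void *baton, const char *data, size_t len);

/* Counts how many writes the consumer saw and how many bytes in total. */
typedef struct {
  size_t writes;
  size_t bytes;
} demo_stats_t;

static void demo_count_consumer(void *baton, const char *data, size_t len)
{
  demo_stats_t *st = (demo_stats_t *)baton;

  (void)data;  /* a real consumer would store the bytes somewhere */
  st->writes++;
  st->bytes += len;
}

/* Push LEN bytes of DATA into CONSUMER in pieces of at most CHUNK_SIZE
   bytes.  The caller, not the filesystem, decides the chunk size. */
static void demo_send_chunked(const char *data, size_t len, size_t chunk_size,
                              demo_consumer_t consumer, void *baton)
{
  size_t off = 0;

  assert(chunk_size > 0);
  while (off < len) {
    size_t n = len - off;
    if (n > chunk_size)
      n = chunk_size;
    consumer(baton, data + off, n);
    off += n;
  }
}
```

Sending the same payload with a larger chunk size reaches the consumer in fewer, bigger writes, which is exactly the knob a client would get to turn.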

Received on Sat Oct 21 14:37:10 2006

This is an archived mail posted to the Subversion Dev mailing list.