Re: Subversion Performance Benchmark

From: Kevin Pilch-Bisson <kevin_at_pilch-bisson.net>
Date: 2002-09-24 19:31:28 CEST

Quoting Daniel Berlin <dberlin@dberlin.org>:

>
> On Tuesday, September 24, 2002, at 01:17 PM, cmpilato@collab.net wrote:
>
> > Daniel Berlin <dberlin@dberlin.org> writes:
> >
> >> But, given what you say, and that vdelta *was* the biggest time user,
> >> one of the following must hold:
> >>
> >> 1. You are wrong, and we are supposed to be storing everything
> >> non-fulltext (deltas against itself)
> >
> > Highly unlikely. :-)
> >
> >> 2. Subversion is buggy, and it is storing non-fulltext when it
> >> shouldn't be
> >
> > Also not the case. Do an initial import of something, and use db_dump
> > to examine the strings table. On my machine, there is nary a single
> > chunk of svndiff data.
> >
> >> 3. Subversion is buggy, and it is doing deltas and throwing them away.
> >
> > *Ding!* Subversion is using svndiff to *transmit* the text. Those
> > diff packages usually break down to:
>
> I know what they look like *way* too well.
> Remember, I did svndiff version 1?
> :)
>
> >
> > 'S'
> > 'V'
> > 'N'
> > \0
> > <flag that means 'here comes some new data'>
> > <here's the whole file as fulltext>
> >
> > but the fact remains that we are attempting to svndiff-ify a "delta"
> > that is effectively an "add" of the entire file.
> >
>
> But this doesn't jibe with the profile.
> vdelta is certainly doing compression: it's calling find_match_len,
> etc. That's what is taking so much time.
> It's coming up with a *real* delta of the file against *something*
> (likely an empty string).
>
> Thus, an svndiff that is *not* just the whole file as fulltext with an
> "add new data" op is being generated, which is slow and wrong (it
> *should* be generating the svndiff you've given above).
>
> Whether it is just throwing it away, or storing it in the repo, I'm not
> sure.
> If it's storing it in the repo, it means something *else* is broken
> (whatever thought it should generate a compressed delta here).
> If it's not storing it, and something *is* storing the svndiff you give
> above, we need to find what generates the compressed delta for no
> reason and make it stop.
>
> > And the repository is unpacking that stuff and storing the original
> > fulltext.
>
The problem is svn shooting itself in the foot by trying to be too smart. It's
using svndiff to compress the data for TRANSMISSION, to reduce network
bandwidth, even though this is a local operation. So we're actually
svndiffing the data and passing it into a function call, which promptly
uncompresses it and stores the fulltext in the DB.

At least that's my understanding of what Mike was saying.
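
To make the round trip concrete, here is a rough sketch in Python rather than
Subversion's C. It is a toy, not the real svndiff encoding: the one-byte flag,
the 4-byte length field, and the names encode_as_add, decode, and local_import
are all made up for illustration. It only shows the shape of the wasted work
when both ends live in the same process.

```python
import struct

MAGIC = b"SVN\x00"       # the 'S' 'V' 'N' \0 header from the quoted breakdown
OP_NEW_DATA = 0x80       # hypothetical flag: "here comes some new data"


def encode_as_add(fulltext: bytes) -> bytes:
    """Wrap a whole file as one 'new data' op (toy format, not real svndiff)."""
    header = MAGIC + bytes([OP_NEW_DATA]) + struct.pack(">I", len(fulltext))
    return header + fulltext


def decode(delta: bytes) -> bytes:
    """Undo encode_as_add and hand back the original fulltext."""
    if delta[:4] != MAGIC:
        raise ValueError("bad magic")
    if delta[4] != OP_NEW_DATA:
        raise ValueError("unsupported op")
    (length,) = struct.unpack(">I", delta[5:9])
    data = delta[9:9 + length]
    if len(data) != length:
        raise ValueError("truncated delta")
    return data


def local_import(fulltext: bytes, store: dict, key: str) -> None:
    """Local add: encode for a 'wire' that does not exist, then decode at once."""
    delta = encode_as_add(fulltext)   # work done for transmission
    store[key] = decode(delta)        # immediately undone before storing


store = {}
local_import(b"hello, world\n", store, "trunk/README")
```

Even in this best case the encode/decode pair buys nothing locally; if the
encoder also runs a real matcher over the data, as the profile suggests, that
cost is pure overhead.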

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson
kevin@pilch-bisson.net
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Received on Tue Sep 24 19:32:22 2002

This is an archived mail posted to the Subversion Dev mailing list.