
Re: Speeding up blame

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2004-05-28 23:49:25 CEST

On Fri, 28 May 2004, Mark Benedetto King wrote:

> > > Right now, svn_client_blame() uses svn_diff_file_diff(),
> > > but there is also a generalized interface that could be applied
> > > to streams. Using that interface, it should be possible to
> > > avoid constructing any of the fulltexts at all, since the
> > > stream of revision N+1 can be computed from the stream of
> > > revision N and the delta from N to N+1.
> > >
> > How do you apply this to a forward-only svn_stream_t? It needs to be able
> > to compare arbitrary tokens, it seems.
>
> Judging from the svn_diff_fns_t interface, the stream needs only to be
> restartable, not random-accessible.
>
What about token_compare? In svn_diff_file.c, it seeks and reads the token
back if it isn't available in memory. And even if the stream only needs to
be restartable, how do you rewind an svn_stream_t? To do that, you would
need to keep the whole contents.
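
To illustrate the concern, here is a minimal sketch in Python rather than C. It does not use any Subversion API; the class name RestartableStream and its methods are hypothetical. It assumes a forward-only source of tokens and shows that making it rewindable means holding on to everything read so far.

```python
class RestartableStream:
    """Hypothetical wrapper that makes a forward-only token source rewindable.

    The only way to rewind without seeking the underlying source is to
    remember every token already read, so memory grows with the contents.
    """

    def __init__(self, source):
        self._source = iter(source)  # forward-only: no seek, no reset
        self._buffer = []            # every token read so far
        self._pos = 0                # current read position in the buffer

    def read_token(self):
        """Return the next token, or None at end of stream."""
        if self._pos == len(self._buffer):
            token = next(self._source, None)
            if token is None:
                return None
            self._buffer.append(token)
        token = self._buffer[self._pos]
        self._pos += 1
        return token

    def rewind(self):
        """Restart from the beginning; served entirely from the buffer."""
        self._pos = 0

    def buffered_tokens(self):
        return len(self._buffer)
```

After one full pass, buffered_tokens() equals the number of tokens in the stream, which is the "keep the whole contents" cost in miniature.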

> >
> >
> > In ra_local, we don't need deltification at all. We just use
> > svn_fs_file_contents for each interesting revision.
> >
>
> I'm not sure what this will gain us; the deltas will still need
> to be combined somewhere.
>
If I understand the FS code correctly, to compute deltas between arbitrary
revisions, it gets a stream for each revision's contents. Each of these
streams combines deltas to produce a fulltext. Then a new delta is computed
from those fulltexts. (Someone will probably correct me here...)

So, when we use file:// with the delta-based approach, we would:
- combine deltas to reproduce each fulltext
- compute a new delta between the fulltexts of consecutive revisions
- take the previous fulltext (saved in a temporary file) and that delta,
  and calculate the next fulltext.

If we expose apr_file_t handles in the RA function, we will instead:
- combine deltas to produce each fulltext once, writing it to a temporary
  file.
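
The two routes above can be compared with a toy model. This is only a sketch in Python under stated assumptions: the delta format is a made-up line-based one (not svndiff), the repository is a plain list of deltas, and every function name here is hypothetical. No Subversion or APR call is used.

```python
import tempfile
from difflib import SequenceMatcher


def compute_delta(source, target):
    """Toy line-based delta: 'copy' ranges from source plus 'new' lines."""
    ops = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("new", target[j1:j2]))
        # a "delete" opcode contributes nothing to the target
    return ops


def apply_delta(source, delta):
    """Rebuild the target from the source text and a toy delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(source[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def make_repo(fulltexts):
    """Store each revision as a delta against its predecessor."""
    stored, prev = [], []
    for text in fulltexts:
        stored.append(compute_delta(prev, text))
        prev = text
    return stored


def reconstruct(stored, rev):
    """Combine stored deltas 0..rev to reproduce one fulltext."""
    text = []
    for delta in stored[: rev + 1]:
        text = apply_delta(text, delta)
    return text


def long_route(stored, revs):
    """Fulltext, then a fresh delta, then re-apply it to the previous text."""
    results, sender_prev, receiver_prev = [], [], []
    for rev in revs:
        full = reconstruct(stored, rev)                    # combine deltas
        delta = compute_delta(sender_prev, full)           # recompute a delta
        receiver_prev = apply_delta(receiver_prev, delta)  # apply it again
        sender_prev = full
        results.append(receiver_prev)
    return results


def short_route(stored, revs):
    """Fulltext written straight to a temporary file, once per revision."""
    paths = []
    for rev in revs:
        full = reconstruct(stored, rev)  # combine deltas, nothing more
        with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
            tmp.writelines(full)
        paths.append(tmp.name)
    return paths
```

Both routes end up with the same fulltexts for the interesting revisions; the short route simply skips the recompute-and-reapply steps, which is the saving in question for the local case.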

In the network-based layers, we would still take the longer route, because
there we optimize for network I/O, not CPU cycles.

So, the point of this is to give ra_local less work to do. If that is
considered ugly, I can expose deltas in the RA layer instead, letting the
blame command handle the temp files as it does today.

Regards,
//Peter

Received on Fri May 28 23:39:04 2004

This is an archived mail posted to the Subversion Dev mailing list.
