
Re: log --limit not as good as it should be?

From: David James <james_at_cs.toronto.edu>
Date: 2007-05-09 19:55:15 CEST

On 5/9/07, Daniel Berlin <dberlin@dberlin.org> wrote:
> On 5/9/07, David James <james@cs.toronto.edu> wrote:
> > On 5/9/07, Daniel Berlin <dberlin@dberlin.org> wrote:
> > > [... snip ...]
> > > Yeah, it may slow down if you log obscure directories with almost no
> > > changes. But I don't believe this is the common case.
> >
> > How dramatic is the slowdown from your optimization if you run 'svn
> > log' on a file or directory which has few changes? From reading your
> > email, it seems like the slowdown will be very minor, but correct me
> > if I am wrong.
>
> Minor, because discover_changed_paths will see there is nothing there
> *really* quickly (should be O(1), I think).
>
> It's just that you are going to have, say, 10-20 more O(1) calls that
> take a few milliseconds each.
>
> >
> > I do often run 'svn log --limit N' on an individual file or directory
> > to read about the last N changes to a particular file or directory. I
> > also find it handy to run "svn log -r1:HEAD --limit 1 --stop-on-copy"
> > to find the last revision in which a file was copied, or "svn log
> > -r1:HEAD --limit 1" to find the revision in which a file was created.
> > So far I haven't noticed any performance problems with these
> > operations, but if your change will have a dramatic effect on these
> > cases you might want to think about that.
>
> If you haven't had performance problems, you haven't tried it on a
> large enough repo. My suggestion should make those commands only very
> slightly slower. Let's say you had a million revisions, and we picked
> a batch size of 20000.
>
> 1. If the file was changed 4 times, all in the last 20k revisions, we
>    will have 49 useless calls (1-20k, 20k-40k, and so on), each taking
>    O(1) time.
> 2. If the file was changed 4 times, spread evenly among the batches,
>    we do the same amount of work we used to.
> 3. If the file was changed thousands of times, spread evenly among the
>    batches, we do a lot less work than we used to.
>
> I expect some combination of cases 2 and 3 is the common case. Even
> for case 1, it shouldn't hurt performance much.

Great! In that case, +1 on the optimization. It sounds like it will be
very helpful when users need to retrieve revisions in reverse order
from a big repository.

Cheers,

David

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed May 9 19:55:23 2007

This is an archived mail posted to the Subversion Dev mailing list.
