[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: misaligned blame output if repo has >1m revisions

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Mon, 19 Apr 2010 14:41:36 +0100

Johan Corveleyn wrote:
> On Sun, Apr 18, 2010 at 11:34 PM, Branko ─îibej <brane_at_xbc.nu> wrote:
> > Johan Corveleyn wrote:
> >> So, I'd like to do 3), and add just 1 (one) extra character for
> >> revnum. With 1 extra character, we'd have up to 9,999,999 for revnum.
> >> Even for http://websvn.kde.org/trunk/, starting at 1113881 and adding
> >> 100k revisions every 8 months, that would give us 59 years until we
> >> hit that (let's say it accelerates somewhat, so 30 years might be a
> >> better estimate, but that's still plenty of time).
> >
> > And still not solve anything for anybody.
> How does this not solve the problem that blame output (from the
> command line) is misaligned if a revision number >= 1000000 appears in
> it? That was the original issue that was raised.

Brane might have been exaggerating, but since it's quite easy to do a
more permanent solution, let's do it.

> Instead why not do the following:
> >
> > * Determining the max length for the revision number is a petty much
> > constant-time operation; you either get it explicitly from blame
> > options (-rN:M) or implicitly from HEAD.
> That doesn't seem very useful to me. The max revision number from the
> range over which to blame doesn't tell me anything about the maximum
> revision number that will be present in the blame output (which is
> more likely the "last changed rev"). Note: I think in >95% of all
> blames people will not specify an explicit range (I certainly don't),
> so this will be HEAD (or BASE, I'm not sure).

The rev number of the end of the range does tell us something useful: it
tells us the maximum possible rev number in the result, and therefore
the maximum possible required field width. The result might not have
any rev numbers as long as that, but usually it will unless the head rev
number (or specified end rev number) is just past a power-of-ten
boundary. In most usages I imagine it won't matter a jot if the field
is longer than necessary; indeed, it could occasionally be a minor
cosmetic advantage that the same column width will be chosen for each of
multiple files. (I am picturing two or more windows, one above the
other, in a text editor.)

The significant benefit is that no server contact is required to
calculate it this way.

> Writing this makes me think maybe the correct solution is:
> - Determine the "last changed rev" of the file (in an extra call to
> the repository).
> - Set column width to length(max("last changed rev", "end revnum")),
> where "end revnum" is the end revision from the range over which the
> blame was asked (M or HEAD in your example above).

It is possible (and better) to determine the last changed rev at or
before the specified "end revnum".

> - Then start gathering/processing the blame info.
> Is that correct?

That would be OK. Personally I prefer the idea of just using "end

> > * Just truncate the svn:author for now. If we can do that in "svn
> > ls" there's no reason not to do it in "svn blame". Perhaps someday
> > the repository API will provide an easy way to index and access
> > all revprop values without having to walk all of history. But for
> > now, just punt. Adding another column to that field is not really
> > any kind of solution.
> Agreed, truncating seems ok for author.

+1 from me too.

- Julian

> > And last but not least, leave the actuall formatting to the reporter.
> > GUIs can do variable-width columns much more easily than text-based
> > interfaces.
> Of course ...
> Cheers,
Received on 2010-04-19 15:42:15 CEST

This is an archived mail posted to the Subversion Dev mailing list.