[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: misaligned blame output if repo has >1m revisions

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Mon, 19 Apr 2010 15:03:07 +0200

On Sun, Apr 18, 2010 at 11:34 PM, Branko Čibej <brane_at_xbc.nu> wrote:
> Johan Corveleyn wrote:
>> Hi all,
>>
>> Sorry it took a while to respond (been away for a while). I've been
>> thinking a little about this, looking at the code ...
>>
>> I see 3 options:
>>
>> 1) Determine the max length of revnum (and author) in blame.c, either
>> while building the blame chunks or in an extra loop before sending the
>> blame info to the blame receiver (and adding the max length(s) as
>> extra parameters to the receiver).
>>
>> 2) Determine the max length of revnum (and author) in blame_cmd.c,
>> i.e. the command line command. To do this, it would have to buffer the
>> blame info that it receives in the blame_receiver callback,
>> determining the maximum length(s) in the meantime.
>>
>> 3) Add some fixed amount of extra characters to revnum and/or author.
>>
>>
>> I don't like 1) and 2). With 1) we're introducing overhead in blame.c
>> (and messing with the blame_receiver API) for something that's only
>> needed for command line use. I suppose all the GUI clients don't care
>> about these lengths, so they shouldn't be impacted by something that's
>> only needed for cmd line.
>>
>> And 2), it really seems a bad idea to introduce buffering of the
>> entire blame info (hence the entire file that's being blamed), just to
>> fix the alignment (memory footprint, not as streamy anymore, ...).
>>
>> So, I'd like to do 3), and add just 1 (one) extra character for
>> revnum. With 1 extra character, we'd have up to 9,999,999 for revnum.
>> Even for http://websvn.kde.org/trunk/, starting at 1113881 and adding
>> 100k revisions every 8 months, that would give us 59 years until we
>> hit that (let's say it accelerates somewhat, so 30 years might be a
>> better estimate, but that's still plenty of time).
>>
>
> And still not solve anything for anybody.

How does this not solve the problem that blame output (from the
command line) is misaligned if a revision number >= 1000000 appears in
it? That was the original issue that was raised.

> Instead why not do the following:
>
>    * Determining the max length for the revision number is a petty much
>      constant-time operation; you either get it explicitly from blame
>      options (-rN:M) or implicitly from HEAD.

That doesn't seem very useful to me. The max revision number from the
range over which to blame doesn't tell me anything about the maximum
revision number that will be present in the blame output (which is
more likely the "last changed rev"). Note: I think in >95% of all
blames people will not specify an explicit range (I certainly don't),
so this will be HEAD (or BASE, I'm not sure).

Writing this makes me think maybe the correct solution is:
- Determine the "last changed rev" of the file (in an extra call to
the repository).
- Set column width to length(max("last changed rev", "end revnum")),
where "end revnum" is the end revision from the range over which the
blame was asked (M or HEAD in your example above).
- Then start gathering/processing the blame info.

Is that correct?

>    * Just truncate the svn:author for now. If we can do that in "svn
>      ls" there's no reason not to do it in "svn blame". Perhaps someday
>      the repository API will provide an easy way to index and access
>      all revprop values without having to walk all of history. But for
>      now, just punt. Adding another column to that field is not really
>      any kind of solution.

Agreed, truncating seems ok for author.

>
> And last but not least, leave the actuall formatting to the reporter.
> GUIs can do variable-width columns much more easily than text-based
> interfaces.

Of course ...

Cheers,

-- 
Johan
Received on 2010-04-19 15:03:37 CEST

This is an archived mail posted to the Subversion Dev mailing list.