On Mon, May 30, 2011 at 8:50 PM, Morten Kloster <morklo_at_gmail.com> wrote:
> Johan, any progress on reviewing the code on your part? Things are
> a bit simpler now with the idx patch in: Since that patch settled the
> file (re)order issue, this patch now produces fully identical results to
> HEAD.
Thanks for the updated patch, it applies cleanly to HEAD of trunk indeed.
I've done some tests, and they look good. The result of my 2200-rev
blame is identical before or after this patch (the idx patch caused
some minor changes, but at first sight nothing drastic -- seemingly
unimportant lines, which were switched between revs they were
attributed to).
With my usual blame testcase, I use the -x-b option (ignore changes in
amount of whitespace). This helps the blame speed tremendously in this
case, because some of the revisions in the history have been
spaces->tabs and vice-versa. With this test, I could see the slight
overhead of the token counting of your patch (2%-3%), because there
are not that much "one-sided" lines. So it's slightly slower than
before.
However, when running the 2200-rev blame without ignore-options, I can
see a *huge* improvement thanks to your patch. This test used to take
over an hour. Now it's finished in 1m30s, as fast as the test with
-x-b. For the revs that change spaces->tabs all lines are "one-sided".
For real work, I prefer to use the -x-b option (I'm not interested in
who changed whitespace), but it's interesting to see the power of this
patch.
Anyway, I'm currently running the entire test-suite on your patch. So
far no problem. But before committing it I'd still want to do two
things:
- Take a closer look at measuring the overhead of the token counting.
Maybe you can also provide some numbers here? I think a good test for
measuring this in practice is:
1. take a very large file
2. change a line in the beginning and at the end
(eliminates prefix/suffix scanning, making sure everything goes to LCS)
3. diff those two
- Take a closer look at the code. I've skimmed through it, and it
looked good. But I need to go for a second pass, but right now (almost
3 am) I really need to get some sleep first :-).
--
Johan
Received on 2011-05-31 02:54:38 CEST