Molle Bestefich <firstname.lastname@example.org> writes:
> Jens Scheidtmann wrote:
> > The binary diff is best thought of as:
> > - Built up second string from longest substrings, which can be found in first
> > - color those parts alike.
> Doesn't always work though. See screenshot.
There is a threshold in the algorithm (see below).
> (The bizarre colors from m_BinDiffColors actually looks OK in this
> case - I wonder ;-).)
> [snipped explanation]
> Your explanation is better. Thanks.
> > If parts of the second string cannot be found in first string, these
> > will have the same color as the string in the right display (yellow).
> Screenshot is relevant.
Well, it seems I confused red and yellow.
In addition, it's a greedy algorithm, meaning that it always tries to
find the longest match first (green). Once these two spaces are
accounted for by the green match, the algorithm starts looking for the
longest match for "123.abc" (no look-back). The longest match it finds
is the "123" at the end, but that is shorter than the threshold of
4(?) characters, so it is not colored as a match. (The threshold is
there to avoid complete color soup.)
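
To make the greedy behaviour concrete, here is a minimal Python sketch.
It is not the TortoiseMerge code: the function names, the left-to-right
scan and the default threshold of 4 are assumptions made only for
illustration.

```python
def longest_match_at(first, second, pos):
    """Length of the longest substring of `first` that matches
    `second` starting at index `pos`."""
    best = 0
    for start in range(len(first)):
        length = 0
        while (start + length < len(first)
               and pos + length < len(second)
               and first[start + length] == second[pos + length]):
            length += 1
        best = max(best, length)
    return best


def greedy_segments(first, second, threshold=4):
    """Split `second` into (text, matched) pieces.

    Scans left to right without look-back. A piece counts as matched
    only if it occurs in `first` and is at least `threshold` long
    (hypothetical default); everything else is reported as unmatched.
    """
    segments = []
    pos = 0
    unmatched_start = 0
    while pos < len(second):
        length = longest_match_at(first, second, pos)
        if length >= threshold:
            # Flush any unmatched text collected before this match.
            if unmatched_start < pos:
                segments.append((second[unmatched_start:pos], False))
            segments.append((second[pos:pos + length], True))
            pos += length
            unmatched_start = pos
        else:
            # Too short to be worth its own color: skip one character.
            pos += 1
    if unmatched_start < len(second):
        segments.append((second[unmatched_start:], False))
    return segments
```

With first = "abcdef 123" and second = "123 abcdef", the "123" is found
in the first string but stays unmatched at threshold 4, and only becomes
a colored match once the threshold is lowered to 3.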
> > So you want to have them as visually distinct as possible.
> Hell no :-)! I'd like them to be different shades, to aid in visually
> discerning matched strings. But I'd like to have different meanings
> for different colours. Keeping the explanation as simple as "red
> means removed, yellow means added, green is the same (might have been
> moved around)" appeals to me.
Yes, good point, but ...
> Can you provide any good examples where it is vitally important to
> have extremely contrasting colors?
... I often have to deal with very long lines in source code, where
only a few characters or words have been (ex)changed. In this case
colors that give you a black eye are really handy: you just scroll
within TMerge from left to right, and the colors force you to look
at the place where something has changed. Using different shades of
green would make this more difficult. For me, the information about
what's new and what's been left out is not so important, as this is
IMHO an infrequent case for changes within lines (you drop some lines
here and insert some more lines there, but that's accounted for
differently).
So the color differences should at least be easy to spot.
And yes, maybe the colors have much too much contrast.
Best would be (if you want to have them similar) to place a reference
point in CIE L*a*b*, where Euclidean distance is roughly proportional
to perceived color difference, and place 5 - 11 colors (an odd number)
around that reference point. Then enumerate these points so that the
sequence has
maximum color differences (for 5 points; the ASCII sketch that
followed here is garbled, only the fragment "4 . 3" survives).
(You should do that in three dimensions, taking luminance into account
;-).) BTW, if you have a look at
http://en.wikipedia.org/wiki/Lab_color_space you'll see that green is
not a good choice, because the eye does not differentiate bright green
colors well. (See http://www.cs.rit.edu/~ncs/color/t_convert.html for
conversion formulas to RGB.)
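
One way to read the "colors around a reference point" idea, as a Python
sketch. Several things here are assumptions and not taken from
TortoiseMerge: the colors sit on a circle in the a*b* plane at constant
L* (so luminance is not varied), the conversion uses the commonly
published D65 / sRGB constants, and the visiting order simply jumps
(n-1)/2 positions each time, which for an odd n makes every consecutive
pair as far apart as two points on the circle can be. All function names
are hypothetical.

```python
import math

D65_WHITE = (0.95047, 1.0, 1.08883)  # reference white, assumed D65


def lab_to_srgb(L, a, b):
    """Convert CIE L*a*b* to 8-bit sRGB, clipping out-of-gamut values."""
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0
    delta = 6.0 / 29.0

    def f_inv(t):
        return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)

    x, y, z = (w * f_inv(f) for w, f in zip(D65_WHITE, (fx, fy, fz)))
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def encode(c):
        c = min(1.0, max(0.0, c))  # clip to the displayable range
        c = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
        return int(round(255 * c))

    return tuple(encode(c) for c in linear)


def lab_ring(reference, radius, count):
    """`count` Lab colors evenly spaced on a circle around `reference`."""
    L0, a0, b0 = reference
    return [
        (L0,
         a0 + radius * math.cos(2 * math.pi * i / count),
         b0 + radius * math.sin(2 * math.pi * i / count))
        for i in range(count)
    ]


def max_contrast_order(count):
    """Visiting order in which consecutive ring colors are far apart."""
    if count % 2 == 0:
        raise ValueError("count must be odd")
    step = count // 2  # (count - 1) / 2 positions per jump
    return [(i * step) % count for i in range(count)]


def delta_e(c1, c2):
    """Plain Euclidean distance in Lab (the CIE76 color difference)."""
    return math.dist(c1, c2)
```

For five colors the order comes out as 0, 2, 4, 1, 3, i.e. a star
pattern across the ring rather than a walk around it. The radius
controls how much contrast you get, so it can be turned down if the
colors are too loud.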
But maybe the problem does not lie in the colors but in the algorithm:
The algorithm has no notion of "relevant changes"; it treats
all substrings the same. Maybe something like a visualization of
minimum edit distance (Levenshtein distance) would be more
appropriate, but I haven't found good references on that yet (only on
the calculation of minimum edit distance).
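
For reference, a minimal Python sketch of what such a visualization
could be built on: the textbook Levenshtein table plus a backtrace that
recovers one cheapest sequence of edits. This is only an illustration,
not something TortoiseMerge does; the function name and the operation
labels are made up here. The labels would map onto the simple scheme
suggested above: "delete" is removed, "insert" is added, "keep" is the
same.

```python
def edit_script(first, second):
    """Return (distance, ops) turning `first` into `second`.

    Each op is (name, char_from_first, char_from_second), where name
    is "keep", "replace", "delete" or "insert" and the character that
    does not apply is None.
    """
    rows, cols = len(first) + 1, len(second) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete everything from `first`
    for j in range(cols):
        dist[0][j] = j  # insert everything from `second`
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if first[i - 1] == second[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete
                             dist[i][j - 1] + 1,         # insert
                             dist[i - 1][j - 1] + cost)  # keep / replace

    # Walk back from the bottom-right corner to recover the edits.
    ops = []
    i, j = len(first), len(second)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and first[i - 1] == second[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            ops.append(("keep", first[i - 1], second[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", first[i - 1], second[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", first[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, second[j - 1]))
            j -= 1
    ops.reverse()
    return dist[-1][-1], ops
```

Unlike the greedy substring matching, this always yields the smallest
number of single-character edits, but it never reports a moved block as
"the same, just somewhere else".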
> Many of the original colors are, by the way, for some reason *very*
> close. Compare:
> RGB(0x33, 0x00, 0x99)
> RGB(0x41, 0x00, 0x99)
> Can anybody provide a good explanation for why this is?
Received on Thu Mar 17 13:37:51 2005