On 18.01.2011 12:56, Johan Corveleyn wrote:
> On Mon, Jan 17, 2011 at 3:12 AM, Johan Corveleyn<jcorvel_at_gmail.com> wrote:
>> On Sat, Jan 8, 2011 at 6:50 AM, Stefan Fuhrmann
>> <stefanfuhrmann_at_alice-dsl.de> wrote:
> [ ... snip ... ]
>
>>> But I think, the stack variable is certainly helpful
>>> and easy to do.
> Ok, I've done this (locally, still have to clean up a little and then
> commit). It gives a nice 10% speedup of prefix/suffix scanning, which
> is great.
>
> I have a couple of small questions though, about other small
> optimizations you did:
>
>> Index: subversion/libsvn_diff/diff_file.c
>> ===================================================================
>> --- subversion/libsvn_diff/diff_file.c (revision 1054044)
>> +++ subversion/libsvn_diff/diff_file.c (working copy)
> ...
>> @@ -258,10 +259,10 @@
>> \
>> for (i = 0; i< files_len; i++) \
>> { \
>> - if (all_files[i]->curp< all_files[i]->endp - 1) \
>> - all_files[i]->curp++; \
>> + if (all_files[i].curp + 1< all_files[i].endp) \
>> + all_files[i].curp++; \
> Just curious: why did you change that condition (from 'X< Y - 1' to
> 'X + 1< Y')?
>
> You mention in your log message: "minor re-arragements in arithmetic
> expressions to maximize reuse of results (e.g. in
> INCREMENT_POINTERS)".
>
> Why does this help, and what's the potential impact?
There is a "hidden" common sub-expression.
Variant 1 in pseudo-code:
lhs = all_files[i].curp;
temp1 = all_files[i].endp;
rhs = temp1 - 1;
if (lhs < rhs)
{
temp2 = lhs + 1;
all_files[i].curp = temp2;
}
Variant 2:
lhs = all_files[i].curp;
temp = lhs + 1;
rhs = all_files[i].endp;
if (lhs < rhs)
all_files[i].curp = temp;
The problem is that the compiler cannot figure that out
on its own because (x+1) and (y-1) don't overflow / wrap
around for the same values. In particular 0 < (0-1) is true
and (0+1) < 0 is false in unsigned arithmetic.
>
> Second question:
>
>> +find_identical_prefix_slow(svn_boolean_t *reached_one_eof,
>> + apr_off_t *prefix_lines,
>> + struct file_info file[], apr_size_t file_len,
>> + apr_pool_t *pool)
>> +{
>> + svn_boolean_t had_cr = FALSE;
>> svn_boolean_t is_match, reached_all_eof;
>> apr_size_t i;
>> + apr_off_t lines = 0;
>>
>> *prefix_lines = 0;
>> for (i = 1, is_match = TRUE; i< file_len; i++)
>> - is_match = is_match&& *file[0]->curp == *file[i]->curp;
>> + is_match = is_match&& *file[0].curp == *file[i].curp;
>> +
>> while (is_match)
>> {
>> /* ### TODO: see if we can take advantage of
>> diff options like ignore_eol_style or ignore_space. */
>> /* check for eol, and count */
>> - if (*file[0]->curp == '\r')
>> + if (*file[0].curp == '\r')
>> {
>> - (*prefix_lines)++;
>> + lines++;
> Here you added a local variable 'lines', which is incremented in the
> function, and copied to *prefix_lines at the end (you do the same in
> fine_identical_prefix_fast). The original code just sets and
> increments *prefix_lines directly. So, out of curiousness: why does
> this help, and how much?
Incrementing the temporary variable requires one
indirection less than the original code and uses only
one instead of two registers (in typical cases).
The 'how much?' part depends on compiler / optimizer
and even more on the target platform (x86 has very
few registers to keep data in fly; x64 has about twice
as many).
So, both changes are more or less general coding
patterns that will improve performance somewhat
without compromising readability.
-- Stefan^2.
Received on 2011-01-19 01:20:16 CET