Re: [RFC] diff-optimizations-bytes branch: avoiding function call overhead (?)

From: Stefan Fuhrmann <eqfox_at_web.de>
Date: Wed, 15 Dec 2010 02:30:12 +0100

On 14.12.2010 23:35, Johan Corveleyn wrote:
> Hi all,
Hi Johan ;)
> On the diff-optimizations-bytes branch, in diff_file.c, there are two
> functions which are called for every byte of the identical prefix and
> suffix: increment_pointers and decrement_pointers. These functions are
> actually equivalents of curp++ or curp--, reading the next/previous
> byte, but taking into account the chunked-ness of the data (file data
> is read in memory in chunks).
>
> As an experiment I changed these functions into macro's, eliminating
> the function calls. This makes the diff algorithm another 10% - 15%
> faster (granted, this was measured with my "extreme" testcase of a 1,5
> Mb file (60000 lines), of which most lines are identical
> prefix/suffix). However, having an entire function like that
> implemented as a macro isn't very pretty (see below for an example).
I'm kind of surprised that the calling overhead is
actually noticeable for larger files, i.e. larger values
of file_len it should loop many times.
> Some considerations:
> - Maybe I can use APR_INLINE, with similar results?
>
> - Maybe I can put just the "critical section" into a macro (the curp++
> / curp-- part), and call a function when a chunk boundary is
> encountered (~ once every 131072 iterations (chunks are 128 Kb
> large)), to read in the new chunk, advancing variables, ...
Prefer inlining because the compiler is free to ignore it.
Depending on the target architecture and the compiler,
it may be beneficial to "narrow" the scope of optimization:
In large functions, the optimizer may guess wrongly about
the hot spots.

> - Maybe it's not worth it?
Since inlining is for free from the maintenance POV,
any gain should justify it.
> Thoughts?
Two things you might try:
* introduce a local variable for afile[i]
* replace the for() loop with two nested ones, keeping
calls to other functions out of the hot spot:

for (i=0; i < file_len;)
{
   /* hot spot: */
   for(; i < file_len; i++)
   {
     curFile = afile[i];
     if (curFile->chunk==-1)
       curFile->chunk = 0;
     else if (curFile->curp != curFile->endp -1)
       curFile->curp++;
     else
       break;
   }

   if (i < file_len)
   {
     /* the complex, rarely used stuff goes here */
   }
}

-- Stefan^2.
> Just for kicks, here is an example of increment_pointers as a macro:
> [[[
> #define increment_pointers(afile, file_len, pool) { \
> int i; \
> \
> for (i = 0; i< file_len; i++) \
> if (afile[i]->chunk == -1) /* indicates before beginning of file */ \
> { \
> afile[i]->chunk = 0; /* point to beginning of file again */ \
> } \
> else if (afile[i]->curp == afile[i]->endp - 1) \
> { \
> apr_off_t last_chunk = offset_to_chunk(afile[i]->size); \
> if (afile[i]->chunk == last_chunk) \
> { \
> afile[i]->curp++; /* curp == endp signals end of file */ \
> } \
> else \
> { \
> apr_off_t length; \
> afile[i]->chunk++; \
> length = afile[i]->chunk == last_chunk ? \
> offset_in_chunk(afile[i]->size) : CHUNK_SIZE; \
> SVN_ERR(read_chunk(afile[i]->file, afile[i]->path,
> afile[i]->buffer,\
> length, chunk_to_offset(afile[i]->chunk), \
> pool)); \
> afile[i]->endp = afile[i]->buffer + length; \
> afile[i]->curp = afile[i]->buffer; \
> } \
> } \
> else \
> { \
> afile[i]->curp++; \
> } \
> }
> ]]]
>
> Cheers,
Received on 2010-12-15 02:30:38 CET

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]