On Wed, Dec 15, 2010 at 10:58 AM, Stefan Fuhrmann <eqfox_at_web.de> wrote:
> On 15.12.2010 02:30, Stefan Fuhrmann wrote:
>>
>> On 14.12.2010 23:35, Johan Corveleyn wrote:
>>
>>> Thoughts?
>>
>> Two things you might try:
>> * introduce a local variable for afile[i]
>> * replace the for() loop with two nested ones, keeping
>> calls to other functions out of the hot spot:
>>
>> for (i=0; i < file_len;)
>
> That should read:
> for (i=0; i < file_len; i++)
>>
>> {
>>   /* hot spot: */
>>   for(; i < file_len; i++)
>>   {
>>     curFile = afile[i];
>>     if (curFile->chunk==-1)
>>       curFile->chunk = 0;
>>     else if (curFile->curp != curFile->endp -1)
>>       curFile->curp++;
>>     else
>>       break;
>>   }
>>
>>   if (i < file_len)
>>   {
>>     /* the complex, rarely used stuff goes here */
>>   }
>> }
Ok, I tried this, but it didn't really help: it's still only about as
fast as before. The macro version, on the other hand, is about 10%
faster (I made a cleaner macro version, macro'ing only the hot-spot
code, with a function call for the complex, rarely used stuff).
For the inline version, I tried several variations (always with
APR_INLINE in the function signature, and with a local variable for
file[i]): with or without the nested for loop, with or without the
complex stuff factored out into a separate function, ... It made no
difference.
Maybe I'm still doing something wrong that keeps the compiler from
inlining it (I'm not really a compiler expert; maybe I have to add
some compiler options or something, ...). OTOH, I now have a macro
implementation which is quite readable (and which uses the do ...
while(0) construct - thanks Peter), and which does the trick. So maybe
I should simply stick with that option ...
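(For the record, if it really is the compiler refusing to inline, I
suppose a compiler-specific attribute could force its hand. Something
like the untested sketch below, where FORCE_INLINE is just a name I
made up, could replace the plain APR_INLINE in the signature:
[[[
/* Untested sketch: force inlining with compiler-specific keywords.
 * FORCE_INLINE is a made-up name, not something from APR. */
#if defined(__GNUC__)
#define FORCE_INLINE static __inline__ __attribute__((always_inline))
#elif defined(_MSC_VER)
#define FORCE_INLINE static __forceinline
#else
#define FORCE_INLINE static APR_INLINE
#endif

FORCE_INLINE svn_error_t *
increment_pointers(struct file_info *file[], int file_len, apr_pool_t *pool);
]]]
I haven't tried this, though, so take it with a grain of salt.)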
FYI, attached are some patches for the different variations (to be
applied against HEAD of the diff-optimizations-bytes branch), and the
two main versions are included below in full for easy
review/discussion/context. At the bottom, some of my measurements. As
always, I welcome any feedback.
New macro version (increment only, decrement is similar):
[[[
/* For all files in the FILE array, increment the curp pointer. If a file
* points before the beginning of file, let it point at the first byte again.
* If the end of the current chunk is reached, read the next chunk in the
* buffer and point curp to the start of the chunk. If EOF is reached, set
* curp equal to endp to indicate EOF. */
#define increment_pointers(all_files, files_len, pool)                      \
  do {                                                                      \
    int i;                                                                  \
                                                                            \
    for (i = 0; i < files_len; i++)                                         \
      {                                                                     \
        if (all_files[i]->chunk == -1) /* indicates before beginning of file */\
          all_files[i]->chunk = 0;    /* point to beginning of file again */\
        else if (all_files[i]->curp != all_files[i]->endp - 1)              \
          all_files[i]->curp++;                                             \
        else                                                                \
          SVN_ERR(increment_chunk(all_files[i], pool));                     \
      }                                                                     \
  } while (0)

static svn_error_t *
increment_chunk(struct file_info *file, apr_pool_t *pool)
{
  apr_off_t length;
  apr_off_t last_chunk = offset_to_chunk(file->size);

  /* Unless this was the last chunk, we need to read a new chunk. */
  if (file->chunk == last_chunk)
    {
      file->curp++; /* curp == endp signals end of file */
    }
  else
    {
      file->chunk++;
      length = file->chunk == last_chunk ?
               offset_in_chunk(file->size) : CHUNK_SIZE;
      SVN_ERR(read_chunk(file->file, file->path, file->buffer,
                         length, chunk_to_offset(file->chunk),
                         pool));
      file->endp = file->buffer + length;
      file->curp = file->buffer;
    }

  return SVN_NO_ERROR;
}
]]]
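(A side note on the do ... while(0) construct, since it was new to me:
it makes the multi-statement macro body behave as a single statement,
so it composes safely with if/else at the call site. A contrived
illustration (the condition is just for show; decrement_pointers is
the similar decrement variant mentioned above):
[[[
/* Because of do { ... } while (0), this parses as intended; with a bare
 * { ... } block the ';' after the macro call would end the if statement
 * and orphan the else. */
if (some_condition)
  increment_pointers(file, file_len, pool);
else
  decrement_pointers(file, file_len, pool);
]]]
Note also that because SVN_ERR expands to a return on error, the macro
can only be used inside a function that itself returns svn_error_t *.)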
APR_INLINE version (with nested for loop, and separate function):
[[[
/* For all files in the FILE array, increment the curp pointer. If a file
* points before the beginning of file, let it point at the first byte again.
* If the end of the current chunk is reached, read the next chunk in the
* buffer and point curp to the start of the chunk. If EOF is reached, set
* curp equal to endp to indicate EOF. */
static APR_INLINE svn_error_t *
increment_pointers(struct file_info *file[], int file_len, apr_pool_t *pool)
{
  struct file_info *curFile;
  int i;

  for (i = 0; i < file_len; i++)
    {
      /* hot spot: */
      for (; i < file_len; i++)
        {
          curFile = file[i];
          if (curFile->chunk == -1)  /* indicates before beginning of file */
            curFile->chunk = 0;      /* point to beginning of file again */
          else if (curFile->curp != curFile->endp - 1)
            curFile->curp++;
          else
            break;
        }

      if (i < file_len)
        {
          SVN_ERR(increment_chunk(curFile, pool));
        }
    }

  return SVN_NO_ERROR;
}

static svn_error_t *
increment_chunk(struct file_info *file, apr_pool_t *pool)
{
  apr_off_t length;
  apr_off_t last_chunk = offset_to_chunk(file->size);

  /* Unless this was the last chunk, we need to read a new chunk. */
  if (file->chunk == last_chunk)
    {
      file->curp++; /* curp == endp signals end of file */
    }
  else
    {
      file->chunk++;
      length = file->chunk == last_chunk ?
               offset_in_chunk(file->size) : CHUNK_SIZE;
      SVN_ERR(read_chunk(file->file, file->path, file->buffer,
                         length, chunk_to_offset(file->chunk),
                         pool));
      file->endp = file->buffer + length;
      file->curp = file->buffer;
    }

  return SVN_NO_ERROR;
}
]]]
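One practical difference between the two versions at the call site:
the function version returns an svn_error_t *, so the caller wraps it
in SVN_ERR, while the macro version expands its own SVN_ERR directly
into the caller's body. A quick sketch (the caller name is made up):
[[[
/* Sketch only (hypothetical caller name); the enclosing function must
 * return svn_error_t * in both cases. */
static svn_error_t *
advance_files(struct file_info *file[], int file_len, apr_pool_t *pool)
{
  /* With the function version, the caller propagates errors itself: */
  SVN_ERR(increment_pointers(file, file_len, pool));

  /* With the macro version, the expansion contains its own SVN_ERR,
     so it is invoked as a bare statement:
       increment_pointers(file, file_len, pool);                      */

  return SVN_NO_ERROR;
}
]]]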
Finally, some measurements (with the help of some instrumentation code
in blame.c; see the patch in attachment). This is the timing of "svn
blame -x-b" of a ~1.5 MB file (61,000 lines) with 2272 revisions (each
time the average of the 2nd and 3rd run):
                                      Total blame time (s)   diff time (s)
Normal function (original version):            118                 80
Inline function (w/ nested for):               118                 79
Macro version:                                 115                 72
Ok, it's nothing huge, but it makes me feel a little bit better :-).
Cheers,
--
Johan