On Tue, Jan 25, 2011 at 1:37 AM, Stefan Fuhrmann <eqfox_at_web.de> wrote:
[ ... snip ...]
> And, as promised, here some ideas how to get more
> speed from the generic code. Your latest commit:
>
> +#if SVN_UNALIGNED_ACCESS_IS_OK
> +
> + /* Skip quickly over the stuff between EOLs. */
> + for (i = 0, can_read_word = TRUE; i< file_len; i++)
> + can_read_word = can_read_word
> +&& (file[i].curp + sizeof(apr_size_t)< file[i].endp);
> + while (can_read_word)
> + {
> + for (i = 1, is_match = TRUE; i< file_len; i++)
> + is_match = is_match
> +&& ( *(const apr_size_t *)file[0].curp
> + == *(const apr_size_t *)file[i].curp);
> +
> + if (!is_match || contains_eol(*(const apr_size_t *)file[0].curp))
> + break;
> +
> + for (i = 0; i< file_len; i++)
> + file[i].curp += sizeof(apr_size_t);
> + for (i = 0, can_read_word = TRUE; i< file_len; i++)
> + can_read_word = can_read_word
> +&& (file[i].curp + sizeof(apr_size_t)< file[i].endp);
> + }
> +
> +#endif
>
> could be changed to something like the following.
> Please note that I haven't tested any of this:
Thanks. There was one error in your suggestion, which I found out
after testing. See below.
> /* Determine how far we may advance with chunky ops without reaching
> * endp for any of the files.
> * Signedness is important here if curp gets close to endp.
> */
> apr_ssize_t max_delta = file[0].endp - file[0].curp - sizeof(apr_size_t);
> for (i = 1; i< file_len; i++)
> {
> apr_ssize_t delta = file[i].endp - file[i].curp - sizeof(apr_size_t);
> if (delta< max_delta)
> max_delta = delta;
> }
>
> /* the former while() loop */
> is_match = TRUE;
> for (delta = 0; delta< max_delta&& is_match; delta += sizeof(apr_size_t))
> {
> apr_size_t chunk = *(const apr_size_t *)(file[0].curp + delta);
> if (contains_eol(chunk))
> break;
>
> for (i = 1; i< file_len; i++)
> if (chunk != *(const apr_size_t *)(file[i].curp + delta))
> {
> is_match = FALSE;
Here, I inserted:
delta -= sizeof(apr_size_t);
because otherwise, delta will be increased too far (it will still be
increased by the counting expression of the outer for-loop (after
which it will stop because of !is_match)). Maybe there is a
cleaner/clearer way to break out of the outer for-loop here, without
incrementing delta again, but for now, I've committed it with this
change (r1063565).
> break;
> }
> }
>
> /* We either found a mismatch or an EOL at or shortly behind curp+delta
> * or we cannot proceed with chunky ops without exceeding endp.
> * In any way, everything up to curp + delta is equal and not an EOL.
> */
> for (i = 0; i< file_len; i++)
> file[i].curp += delta;
Thanks. This gives on my machine/example another 15-20% performance
increase (datasources_open time going down from ~21 s to ~17 s). We
should probably do the same for suffix scanning, but I'm too tired
right now :-) (and suffix scanning is more difficult to grok, so not a
good idea to do at 3 am).
Cheers,
--
Johan
Received on 2011-01-26 03:10:46 CET