[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Speed-up of libsvn_diff using token counts

From: Morten Kloster <morklo_at_gmail.com>
Date: Mon, 6 Jun 2011 19:38:44 +0200

On Mon, Jun 6, 2011 at 3:17 AM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> On Wed, Jun 1, 2011 at 5:56 PM, Morten Kloster <morklo_at_gmail.com> wrote:
[]
> Hi Morten,
>
> Sorry, it took me a little while longer than expected, but I finally
> got around to it.
>
> I did some more tests, and upon further investigation, I too can't
> really find a significant performance decrease because of the token
> counting. So that looks good.
>
> I've reviewed it all, and it looks fine. I just have one small
> question about the changes in lcs.c:
>
> [[[
> @@ -218,11 +228,17 @@ prepend_lcs(svn_diff__lcs_t *lcs, apr_off_t lines,
>  svn_diff__lcs_t *
>  svn_diff__lcs(svn_diff__position_t *position_list1, /* pointer to
> tail (ring) */
>               svn_diff__position_t *position_list2, /* pointer to
> tail (ring) */
> +              svn_diff__token_index_t *token_counts_list1, /* array
> of counts */
> +              svn_diff__token_index_t *token_counts_list2, /* array
> of counts */
> +              svn_diff__token_index_t num_tokens,
>               apr_off_t prefix_lines,
>               apr_off_t suffix_lines,
>               apr_pool_t *pool)
>  {
>   apr_off_t length[2];
> +  svn_diff__token_index_t *token_counts[2];
> +  svn_diff__token_index_t unique_count[2];
> +  svn_diff__token_index_t token_index;
>
> ...
>
> @@ -281,16 +310,17 @@ svn_diff__lcs(svn_diff__position_t *position_list1
>   sentinel_position[0].next = position_list1->next;
>   position_list1->next = &sentinel_position[0];
>   sentinel_position[0].offset = position_list1->offset + 1;
> +  token_counts[0] = token_counts_list1;
>
>   sentinel_position[1].next = position_list2->next;
>   position_list2->next = &sentinel_position[1];
>   sentinel_position[1].offset = position_list2->offset + 1;
> +  token_counts[1] = token_counts_list2;
>
> ...
>
> @@ -310,12 +340,12 @@ svn_diff__lcs(svn_diff__position_t *position_list1
>       /* For k < 0, insertions are free */
>       for (k = (d < 0 ? d : 0) - p; k < 0; k++)
>         {
> -          svn_diff__snake(fp + k, &lcs_freelist, pool);
> +          svn_diff__snake(fp + k, token_counts, &lcs_freelist, pool);
>         }
>          /* for k > 0, deletions are free */
>       for (k = (d > 0 ? d : 0) + p; k >= 0; k--)
>         {
> -          svn_diff__snake(fp + k, &lcs_freelist, pool);
> +          svn_diff__snake(fp + k, token_counts, &lcs_freelist, pool);
>         }
>
>       p++;
> ]]]
>
> Why do you copy the token_counts_list parameters into the array
> token_counts[]? Is that to simplify the passing of data to __snake
> (keeping number of arguments to a minimum)?
>
> Can't you just pass the token_counts_list parameters 'as is' to
> __snake as two extra arguments, and lose the token_counts[] variable?
>
> But maybe you have a reason for this, because of compiler optimization
> or inlining or something like that?
>
> Apart from that I have no more remarks, so I think I'll be able to
> commit this soon.
>
> Thanks for your work.
> --
> Johan
>

I made the token_counts array when idx was still in, and then its
values depended on the value of idx. Now it it no longer required as
such, although I find it somewhat tidier to use arrays for "everything"
in svn_diff__snake. I get no easily detectable difference in speed on
my system from removing it, but if you want to remove it, feel free.

Thanks for the review. :-)

Morten
Received on 2011-06-06 19:39:19 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.