[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: More diff lib stuff

From: mark benedetto king <mbk_at_earthquake.boredom.org>
Date: 2003-01-12 17:38:38 CET

 Sat, Jan 11, 2003 at 12:06:36AM +0100, Sander Striker wrote:
> Ofcourse, while working on this I stumbled over the
> current implementation of diff... *blush* I remember
> now that to get something working I used md5 hashes
> of each line... No wonder we perform bad on 6MB files
> (still only 3.5 seconds with a _lot_ of differences
> and matches).
>

Would Adler-32 be more appropriate?

/*
 * An Adler-32 implementation per RFC1950.
 *
 * "The Adler-32 algorithm is much faster than the CRC32 algorithm yet
 * still provides an extremely low probability of undetected errors"
 */

#include <stdio.h>
#include <sys/types.h>

/*
 * 65521 is the largest prime less than 65536.
 * "That 65521 is prime is important to avoid a possible large class of
 * two-byte errors that leave the check unchanged."
 */
#define MOD_BASE 65521

/*
 * "The modulo on unsigned long accumulators can be delayed for 5552 bytes,
 * so the modulo operation time is negligible."
 */
#define MOD_BLOCK_SIZE 5552

u_int32_t
adler32_update(u_int32_t checksum,
               const unsigned char *input,
               size_t length)
{
  u_int32_t s1 = checksum & 0xFFFF;
  u_int32_t s2 = checksum >> 16;

  size_t blocks = length / MOD_BLOCK_SIZE;
  
  length %= MOD_BLOCK_SIZE;

  while (blocks--)
    {
      int count = MOD_BLOCK_SIZE;
      while (count--)
        {
          u_int32_t b = *input++;
          s1 += b;
          s2 += s1;
        }
      s1 %= MOD_BASE;
      s2 %= MOD_BASE;
    }

  while (length--)
    {
      u_int32_t b = *input++;
      s1 += b;
      s2 += s1;
    }

  return ((s2 % MOD_BASE) << 16) |
          (s1 % MOD_BASE);
}

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sun Jan 12 17:39:25 2003

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.