[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Fix corner case bug in libsvn_diff

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2006-02-08 11:52:38 CET

On Tue, Feb 07, 2006 at 11:28:41PM +0100, Peter N. Lundblad wrote:
> Hi,
>
> While testing the new diff stuff (ignore whitespace etc.), I changed the
> token read chunk size from 128K to 4, and found an old corner case bug.
> The problem is that the code will return a spurious empty token if the
> file is exactly n*CHUNK_SIZE bytes long (n>0). This confuses the diff
> algorithm, which thinks that the file has an extra line. Attached is a
> fix for this, which seems a little clumsy. If anyone has a better
> solution, feel free to suggest/commit.
>
>
> file_baton->curp[idx] = eol;
> - *token = file_token;
> + /* If the file length is exactly a multiple of CHUNK_SIZE, we will end up
> + * with a spurious empty token. Avoid returning it. */
> + if (file_token->length > 0)
> + *token = file_token;
>
> return SVN_NO_ERROR;
> }

I'm not sure, but isn't this what the section at the top of the function
is trying to guard against:

  curp = file_baton->curp[idx];
  endp = file_baton->endp[idx];

  last_chunk = offset_to_chunk(file_baton->size[idx]);

  if (curp == endp
      && last_chunk == file_baton->chunk[idx])
    {
      return SVN_NO_ERROR;
    }

? There's also some checking against last_chunk later on, apparently for
files that are a multiple of the chunksize with no trailing EOL marker.
Isn't the problem here actually that we shouldn't use offset_to_chunk()
with a file _size_ directly? (a file of 128k will only have one chunk,
so last_chunk should be zero until the file is 128k+1 bytes)

I think the more correct fix would be just to change the calculation of
last_chunk to something like:

  last_chunk = offset_to_chunk(file_baton->size[idx] - 1);

Untested, and I'm also assuming we don't have to deal with empty
(zero-length) files at this stage, since much of file_baton seems to
be unintialised in that case (see svn_diff__file_datasource_open()),
so we can't (for example) compare curp and endp meaningfully.

Regards,
Malcolm

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Feb 8 11:53:28 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.