[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Fix corner case bug in libsvn_diff

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2006-02-08 23:00:59 CET

On Wed, 8 Feb 2006, Malcolm Rowe wrote:

> On Tue, Feb 07, 2006 at 11:28:41PM +0100, Peter N. Lundblad wrote:
> > file_baton->curp[idx] = eol;
> > - *token = file_token;
> > + /* If the file length is exactly a multiple of CHUNK_SIZE, we will end up
> > + * with a spurious empty token. Avoid returning it. */
> > + if (file_token->length > 0)
> > + *token = file_token;
> >
> > return SVN_NO_ERROR;
> > }
>
> I'm not sure, but isn't this what the section at the top of the function
> is trying to guard against:
>
> curp = file_baton->curp[idx];
> endp = file_baton->endp[idx];
>
> last_chunk = offset_to_chunk(file_baton->size[idx]);
>
> if (curp == endp
> && last_chunk == file_baton->chunk[idx])
> {
> return SVN_NO_ERROR;
> }
>
That's what makes this function terminate early if there is no data,
regardless of whether it ended on a chunk boundry or not.

> ? There's also some checking against last_chunk later on, apparently for
> files that are a multiple of the chunksize with no trailing EOL marker.

That check is for files with no eol at eof, regardless of the file size.

> Isn't the problem here actually that we shouldn't use offset_to_chunk()
> with a file _size_ directly? (a file of 128k will only have one chunk,
> so last_chunk should be zero until the file is 128k+1 bytes)
>
> I think the more correct fix would be just to change the calculation of
> last_chunk to something like:
>
> last_chunk = offset_to_chunk(file_baton->size[idx] - 1);
>
> Untested, and I'm also assuming we don't have to deal with empty

It would be correct if it worked:-) I tried that first, but then the last
chunk is being missed.

> (zero-length) files at this stage, since much of file_baton seems to
> be unintialised in that case (see svn_diff__file_datasource_open()),
> so we can't (for example) compare curp and endp meaningfully.
>
This code isn't entirely straight. We *have* to handle 0 byte files. NOte
that the baton fields are not uninitialized; they are set to zero in the
svn_diff_file_diff{,34} functions.

This code have served us well for years, but honestly it could need some
more documentation in some places.

Malcolm, as I said, I think my fix looks a little hacky, so if you have
the time and energy, feel free to find a more proper fix. A tip, if you
do try, is to set the chunk size to a small value like I did to really
make sure it works.

Reards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Feb 8 23:01:26 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.