[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Making EXPANDED_SIZE reliable in FSFS [was: svn commit: r1673204 - /subversion/trunk/subversion/libsvn_fs_fs/dag.c]

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Tue, 14 Apr 2015 01:04:51 +0200

On Mon, Apr 13, 2015 at 11:20 PM, Daniel Shahaf <d.s_at_daniel.shahaf.name>
wrote:

> rhuijben_at_apache.org wrote on Mon, Apr 13, 2015 at 14:28:39 -0000:
> > - *has_props = (noderev->prop_rep->expanded_size > 4);
> > + *has_props = (noderev->prop_rep->expanded_size > 4
> > + || (noderev->prop_rep->expanded_size == 0
> > + && noderev->prop_rep->size > 4));
>
> Having to remember, on every use of EXPANDED_SIZE, to check if it's zero
> is madness.

The pain has been increasing and is now at a level
where we should do something about it.

> Is there a good reason not to make the deserializer
> function handle this? If needed, by inventing a getter or a new struct
> member that doesn't have this quirk?
>

The main problem is that we don't know whether SIZE is
correct, either, without looking at the actual representation:
It could be a self-delta of an empty fulltext. So, most of
these tests are contextual, we e.g. know that these are
property reps. Or we can allow for false positives or false
negatives etc.

One way to get around this problem is to basically use
the approach of svn_fs_fs__file_length(), where we compare
the contents checksum with with that of an empty rep.
We might also restrict that check to "meaningful" values
of "SIZE", i.e. those txdelta streams that produce an
empty output.

Assuming we find a truly reliable test (e.g. say txdelta is
always 4 bytes and we can enumerate all MD5s), we could
add it to the parser function in low_level.c.

Thoughts?

-- Stefan^2.
Received on 2015-04-14 01:05:20 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.