[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [VOTE] Merging 1.9-cache-improvements to /trunk

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Sat, 16 May 2015 06:48:26 +0200

On Fri, May 15, 2015 at 7:25 PM, Philip Martin <philip.martin_at_wandisco.com>

> Philip Martin <philip.martin_at_wandisco.com> writes:
> > Philip Martin <philip.martin_at_wandisco.com> writes:
> >
> >> Prompted by the warnings I think there are some issues to fix. For
> >> APR_HASH_KEY_STRING keys there is no protection against abnormally long
> >> keys. combined_long_key() will allocate strlen() memory even if it is
> >> many GB. The item will not get cached if key+data length is more than
> >> 4GB but the memory for the key, which could be more than 4GB, will be
> >> permanently allocated in the cache. There is also a problem with
> >> overflow in membuffer_cache_set_internal() when calculating key+data
> >> length, although in practice a key large enough to trigger this will
> >> probably fail memory allocation first.
> >
> > Another issue:
> >
> > static svn_boolean_t
> > entry_keys_match(const entry_key_t *lhs,
> > const entry_key_t *rhs)
> > {
> > return (lhs->fingerprint[0] == rhs->fingerprint[0])
> > && (lhs->fingerprint[1] == rhs->fingerprint[1])
> > && (lhs->key_len == rhs->key_len);
> > }
> >
> > I think the key_len comparison is wrong and should be removed. If two
> > keys have fingerprints that collide it does not mean the key lengths are
> > the same. When attempting to use two keys with colliding fingerprints
> > the behaviour when the key lengths vary should be same as when the key
> > lengths are the same and the key data varies.

Not necessary. It is true that we could decide to *only* use
the fingerprint as discriminator. But the implementation here
includes the key length; it reduces the number of conflicts
that result in entries that evict each other.

Most fixed-length keys are 16 bytes long and for them, the
fingerprint is almost identical to the key and conflicts are
extremely unlikely. So, varible-length keys are the one that
may have fingerprint conflicts now due to the weakness of
FNV1. For them, however, the key_len (often derived from
paths) has a good chance of being different.

Note that if the key_len differs, it *cannot* be the same key.
Hence, including the key_len does not create false negatives.

> Another issue: find_entry() now calls drop_entry() in more cases and can
> now call it when find_empty==FALSE during read operations. On Unix when
> using the read-write lock this means the cache gets modified while only
> holding a read lock, not a write lock, and that can corrupt the cache.
> If we use read-write locks then when read detects a fingerprint
> collision it cannot modify the cache to clear the collision, it will
> have to return NULL and leave the entry in the cache.

You are absolutely right. Fixed in r1679681.

-- Stefan^2.
Received on 2015-05-16 06:48:39 CEST

This is an archived mail posted to the Subversion Dev mailing list.