Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de> writes:
> Philip Martin wrote:
>> Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de> writes:
>>> svn_cache_t lookup returns a copy of the value found.
>>> For objects of variable size, this can be expensive. If you
>>> checkout / export all N files of a specific directory, for
>>> instance, every single path gets verified. The directory
>>> information is cached for FSFS but the copy operation
>>> will still be O(N).
>>> As a result, the whole check takes O(N^2). The good
>> That sounds bad. Do you have any measurements to show how much
>> improvement your patch gives?
> Well, choose your N ;) For the low mean-square number of files
> per folder found in the TSVN and SVN repositories, the gains were
> between 3 and 4 percent of total "svn export" runtime. If you want,
> I could run a test with 10k files in a single folder (probably a
> realistic upper limit).
Sounds promising. In this case I think server performance is the more
interesting question. If you can be bothered, run a server, do a
checkout of a 1k or a 10k directory (probably many times), and look at
the CPU or memory usage of the server. But if you are getting 3% on
the client then the server is probably getting a bigger gain.
>>> - SVN_ERR(svn_cache__get((void **) &manifest, &is_cached,
>>> - ffd->packed_offset_cache, &shard, pool));
>>> + SVN_ERR(svn_cache__get_pinned((void **) &manifest, &is_cached,
>>> + ffd->packed_offset_cache, &shard,
>>> if (is_cached)
>>> *rev_offset = APR_ARRAY_IDX(manifest, rev %
>> An error return between the pinned lookup and the unpin would now
>> be a bug, because it would cause the unpin to be skipped. This is
>> an interface that is easy to get wrong; how about a callback
>> interface similar to svn_wc__call_with_write_lock?
> The implementation is supposed to neither pin nor lock upon
> lookup failure. Perhaps I should make that clear in the
> API documentation.
What I meant was that with code that pins an item, uses it, and then
unpins it, it's easy for a future change to introduce an error return
between the pin and the unpin, and that is a bug. We have had bugs
like that in the past: errors that leak, working copy locks that leak,
and now we would have pins that leak. If you write an interface like
svn_wc__call_with_write_lock then it's much harder to leak.
>> So pinning leaves the cache locked and failing to unpin is essentially
>> an unrecoverable error.
> Yes, even if the locking were not a problem. Pinning
> is equivalent to locking a resource (the entry, in this case).
> Memory, for instance, will not be reclaimed until you destroy
> the whole cache. This is the very purpose of this functionality.
>> I think your claim that the interface allows more than one item to be
>> pinned is also suspect. How would the locking work? Could two
>> threads each pin an item?
> The current implementation is more simplistic than what
> the interface allows. Locking the cache is a by-product
> of that simplistic implementation.
>> Do you really need to leave the cache locked? Would it be enough to
>> arrange for the item not to be evicted from the LRU cache? Move the
>> pinned item from the LRU cache to a pinned cache, and move it back
>> when unpinned?
> An alternative implementation would associate every
> cache entry with a pin count. While the latter is >0,
> the respective entry will not be evicted from cache.
> The basic meaning of "pin" is: 'keep this reference
> valid as long as it is pinned'.
> The downside is that you may be tempted to pin many
> items or to pin them for a longer time, possibly forgetting
> to release these items in certain edge-cases.
An svn_wc__call_with_write_lock interface mostly solves the forgetting
to unpin problem.
> That is the reason why I chose to pin at most one item
> at any given time. Mistakes in cache usage would become
> obvious pretty quickly. On the downside, multi-threaded
> code needs additional coordination to make that "only 1"
> limit feasible. Thus, the lock.
> If your assessment is that "misusing" the lock is more
> risky than the risk of a temporary memory leak, I can
> change the implementation this weekend and post a
> new patch.
Leaving the cache locked feels wrong to me, but that really is an
implementation detail. BTW, there is at least one memory bug
associated with the cache: something to do with the way APR hash
tables reuse memory.
Received on 2010-04-16 12:42:19 CEST