[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Remove entries cache in adm_access

From: Hyrum K. Wright <hyrum_at_hyrumwright.org>
Date: Fri, 31 Jul 2009 10:31:23 -0500

On Jul 31, 2009, at 1:49 AM, Greg Stein wrote:

> Any write operation within wc_db.c should be calling
> wc_db.c::flush_entries().
>
> As long as you don't mix/match usage style (wc_db vs entries), then
> that cached entries will remain NULL, or will contain "current"
> properly.

Except for right now in the dev process, where we are intentionally
mixing and matching as we move to wc_db.

> ISTR that you disabled the cache once, and the cmdline performed
> *very* unacceptably w.r.t memory. We could expect the same thing to
> happen with any old-API customer. I don't think we can allow ourselves
> to do that. (granted that there are lots of APIs that are access/path
> pairs, and those will not populate the entries cache, even for old API
> users)
>
> Re: the bindings. IMO, you shouldn't be so quick to dismiss them.
> They're part of the product. That said, you aren't set up to
> build/test Windows, and bindings may be the same for you, but I think
> you ought to treat both as "I'll try not to break it, and will support
> those who *can* build/test them". (maybe that's what you said, but it
> came off much more dismissively...)

Yeah, I could have been a bit less terse.

I'm just tired of hearing "the binding are broken! the bindings are
broken!" but nobody actually doing anything about it. I don't use the
bindings; I'm not familiar with developing them or keeping them up-to-
date. I try not to break the bindings, and I'm happy to answer
questions when people are working on them. But, I don't see why
development in our core libs should be hampered by a lack of developer
efforts on the bindings.

Now, all that being said, I'm reminded why keeping the cache is a Good
Thing, so I'll not bother removing it, just flushing it whenever we do
a write in wc_db.

-Hyrum

> On Fri, Jul 31, 2009 at 04:48, Hyrum K.
> Wright<hyrum_wright_at_mail.utexas.edu> wrote:
>> Greg,
>> The doc string for svn_wc_entries() reads thusly:
>>
>> /** Parse the `entries' file for @a adm_access and return a hash @a
>> entries,
>> * whose keys are (<tt>const char *</tt>) entry names and values are
>> * (<tt>svn_wc_entry_t *</tt>). The hash @a entries, and its keys
>> and
>> * values, are allocated from the pool used to open the @a adm_access
>> * baton (that's how the entries caching works). @a pool is used for
>> * transient allocations.
>> ...
>>
>> We've discussed before that removing the entries cache in
>> adm_access will
>> lead to memory explosion, as the entries are re-read from disk into
>> the
>> access pool, without ever freeing memory used for stale entries
>> hashes.
>> (Well, it will lead to *faster* memory explosion, since we still
>> flush the
>> cache under certain circumstances.)
>>
>> However, as we start touching the sqlite db directly, the cache
>> becomes
>> stale quite quickly, since the entries hash isn't modified at the
>> same time
>> the the db is, has happens with svn_wc__entries_modify(). Since
>> we're not
>> using adm_access anymore, we've no way of knowing when to flush the
>> cache,
>> which means we'll need to flush it aggressively--so aggressively
>> that it
>> probably isn't worth keeping the cache around.
>>
>> As for API consumers which are using entries and now have sub-
>> optimal memory
>> consumption, I say "tough rocks, upgrade to the new APIs". That
>> might be
>> harsh, but I think it's the reality as keeping the cache in sync is
>> much
>> more trouble than it's worth.
>>
>> Thoughts?
>>
>> -Hyrum
>>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2377447
Received on 2009-07-31 17:31:43 CEST

This is an archived mail posted to the Subversion Dev mailing list.