On 10/29/05, Daniel Berlin <dberlin@dberlin.org> wrote:
> 1. Anybody who doesn't have enough memory to hold 128 dirs of their repo
> is probably in trouble anyway. Assuming 100k of info per dir, that's
> only ..... 12.8 meg of memory, *if they hit all the dirs*, and that
> assumes roughly 1000 files per dir (to generate 100k of info).
> This seems reasonable to me.
This seems like a reasonable tradeoff for the speed gain, although
it's kind of depressing that there's so much overhead for a maximum of
128 entries. I imagine it's the pools that are screwing us here.
> 2. The internal rev of the id makes a perfectly fine hash. We just use
> it as an index into the table, not as the actual id. We still compare
> the ids, of course. The rev was chosen over other possible keys because
> the others are strings, and hashing strings is more expensive.
>
> 3. The number is certainly magic, but it's not easy to make this user
> configurable without adding files to fsfs. I also take the view that
> gcc is probably the size of the average "been using subversion for a
> couple years to store projects" repository. I don't imagine people will
> want the number significantly smaller; however, they may want it bigger.
>
> Maybe we should explore a "config" file to tune these parameters.
>
> (Sorry the diff looks uglier than it should; when you're doing line-by-line
> replacements like this, unidiff tends to look worse than context
> diff :( )
The patch seems fine to me (with dlr's suggestion of moving the hash
calculation to a macro), assuming it produces a noticeable speed
improvement. The whole thing seems like a straightforward extension of
our existing caching.
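
For anyone skimming the thread, here's roughly how I read the scheme: a
direct-mapped cache of 128 slots indexed by the id's revision, with the
full id still compared on lookup, so a slot collision just looks like a
miss. The names below (NUM_DIR_CACHE_ENTRIES, DIR_CACHE_INDEX,
dir_cache_entry_t) are mine, not the symbols from Dan's patch, and I'm
using the public svn_fs_compare_ids() rather than whatever id comparison
the real code does:

    #include <apr_hash.h>
    #include <apr_pools.h>
    #include "svn_fs.h"

    #define NUM_DIR_CACHE_ENTRIES 128

    /* dlr's suggestion: keep the "hash" (really just a slot index)
       in one macro instead of repeating the arithmetic. */
    #define DIR_CACHE_INDEX(rev) ((rev) % NUM_DIR_CACHE_ENTRIES)

    typedef struct dir_cache_entry_t
    {
      const svn_fs_id_t *id;   /* full id, compared on every lookup */
      apr_hash_t *entries;     /* the cached directory entries */
      apr_pool_t *pool;        /* subpool the cached data lives in */
    } dir_cache_entry_t;

    static dir_cache_entry_t dir_cache[NUM_DIR_CACHE_ENTRIES];

    /* Return the cached entries for ID, or NULL on a miss.  REV is the
       revision part of ID; it only picks the slot, so two ids landing
       in the same slot just means the later one evicts the earlier. */
    static apr_hash_t *
    dir_cache_get(const svn_fs_id_t *id, svn_revnum_t rev)
    {
      dir_cache_entry_t *e = &dir_cache[DIR_CACHE_INDEX(rev)];

      if (e->id && svn_fs_compare_ids(e->id, id) == 0)
        return e->entries;
      return NULL;
    }

Presumably making the size tunable, as Dan suggests, would just mean
allocating that table when the fs is opened instead of statically.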
-garrett