Simon Spero <ses@unc.edu> writes:
> Noise sources :
> Original report is for a memory spike from 19Mb -> 44Mb, so
> results on the order of megabytes are possibly significant.
> Hashtable array size is always a power of two; hash node size is ~20
> bytes.
> First metric was to run find . >/tmp/find-netbsd. Total size is
> 8,139,654 Bytes. (wc -c)
> Number of entries: 193,716 (wc -l)
> Average path length: ~42 bytes
> Measurements were made relative to '.' ; paths in memory would be
> relative to the root of the repository. Adding /trunk/ to start of
> each path would use an extra 6 chars per entry (~1.1MB in this case)
Okay, so 193,716 * (42 + 6) == 9,298,368 == appx 9 MB
That's bad, but is it responsible for a 25 MB jump in memory usage?
Is there other data associated with each of these paths, whose size is
proportional to path depth?
> Second metric is to strip out everthing but the last name component:
> ( sed -e 's;^.*/;;' )
> Total size: 1,654,088
> Number of entries: 193,716 (wc -l)
> Average size: 8 bytes
(What was this metric for?)
This gets us 1549728 == appx 1.5 MB, much better relative to the
earlier figure, but somewhat less better relative to the mem jump
we're seeing. Hmmm. I confess I'm not sure what to think here.
> Third metric: size of interned path-name components (sed ... | sort | uniq)
> Total size: 577,432
> Number of unique strings: 51,236
(This was for estimating if we used a tree-style data structure, right?)
So, taking an average size of 8 bytes (ah, maybe that's what your
middle metric was about), we can multiply 51,236 * 8 to get 409,888.
Nice, about half a meg. Of course, we need to add in the usual tree
structure overhead, which is a whole hash-table per unique entry
except for the leaf nodes. I'm not really sure how to estimate that.
It's more than log(51,236), but less than 51,236. Plus we need a
4-byte pointer per entry...
So, is it really looking so much better than 9 MB, in the long run?
I don't mean to be reflexively skeptical, but at least this
back-of-the-envelope estimate doesn't look promising. Maybe I'm
missing something, though?
-Karl
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Dec 13 22:42:24 2004