
Re: crash managing a large FSFS repository

From: Simon Spero <ses_at_unc.edu>
Date: 2004-12-13 18:45:03 CET

Eric Gillespie wrote:

>That's exactly what it is. It was much, much worse until r11701 and
>r11706. However, fs_fs.c:fetch_all_changes still builds a giant hash
>in memory. I wasn't sure what to do about this, and so left it alone.
>I seem to recall asking for suggestions but not getting a response,
>but it's possible i overlooked it as i became busy elsewhere right
>afterwards.

I've been looking at scaling issues with fs_fs, though mostly at ones
related to repository size. This issue is isolated to individual
transactions, so it's simpler to fix and test.

At the moment the code uses memory roughly proportional to the total
length of all paths in the transaction.

One approach to reducing the amount of memory needed would be to use a
data structure that models directories, rather than complete paths.
Each directory node should have its own lookup table, keyed by just the
name of each immediate child relative to that node. Intermediate nodes
created for path components that haven't themselves been seen as
changed paths should be marked as such; if the path is later
encountered explicitly, the mark can be cleared (or vice versa).
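
A minimal sketch of that structure, in Python for brevity rather than
the C/APR that fs_fs.c actually uses. The names Node, record_change and
lookup, and the string change records, are made up for illustration and
are not part of Subversion:

```python
class Node:
    """One path component in a per-transaction change tree (hypothetical)."""
    __slots__ = ("children", "explicit", "change")

    def __init__(self):
        self.children = {}     # immediate child name -> Node
        self.explicit = False  # False: only an ancestor of some changed path
        self.change = None     # change record, set once the path itself is seen


def components(path):
    """Split '/a/b/c' into ['a', 'b', 'c'], ignoring empty components."""
    return [c for c in path.split("/") if c]


def record_change(root, path, change):
    """Walk or create one node per component, then mark the final node."""
    node = root
    for name in components(path):
        node = node.children.setdefault(name, Node())
    node.explicit = True       # clears the "intermediate only" state
    node.change = change
    return node


def lookup(root, path):
    """Return the node for path, or None if no change touched it."""
    node = root
    for name in components(path):
        node = node.children.get(name)
        if node is None:
            return None
    return node


root = Node()
record_change(root, "/trunk/src/main.c", "modify")  # creates trunk, src implicitly
record_change(root, "/trunk/src", "add")            # src is now explicit
```

Here /trunk stays an unmarked intermediate node, while /trunk/src
starts out as one and becomes explicit when its own change arrives.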

This approach requires space roughly proportional to the number of
directories and files in the transaction, rather than total path
length. For big, flat namespaces, this isn't much of a win, but it also
isn't much worse; as the namespace gets deeper, and closer to real
source repositories, the win gets bigger. This approach also makes it
faster to determine parent/child relationships.
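
To make the comparison concrete, here is a rough Python sketch that
counts only the key characters each layout would store. It ignores
per-node and per-entry overhead, so it is an illustration of the trend
rather than a measurement; the function names and the sample paths are
invented for the example:

```python
def flat_key_chars(paths):
    """Characters stored when every complete path is its own hash key."""
    return sum(len(p) for p in paths)


def tree_key_chars(paths):
    """Characters stored when each node keeps only its own name."""
    seen = set()   # used only to count each distinct node once
    total = 0
    for p in paths:
        parts = [c for c in p.split("/") if c]
        for i in range(1, len(parts) + 1):
            prefix = tuple(parts[:i])
            if prefix not in seen:
                seen.add(prefix)
                total += len(parts[i - 1])
    return total


# Deep namespace: three files sharing a long directory prefix.
deep = [
    "/trunk/project/module/a.c",
    "/trunk/project/module/b.c",
    "/trunk/project/module/c.c",
]
# Flat namespace: three files directly under the root.
flat = ["/a", "/b", "/c"]

for label, paths in (("deep", deep), ("flat", flat)):
    print(label, flat_key_chars(paths), tree_key_chars(paths))
```

The shared prefix is stored once in the tree layout but once per path
in the flat layout, which is where the saving for deep trees comes from.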

Simon

Received on Mon Dec 13 18:48:17 2004

This is an archived mail posted to the Subversion Dev mailing list.
