
Re: ugly problem found while trying to test KDE SVN

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2005-02-26 23:44:32 CET

> Doing some strace on the server it became obvious, that the server
> was simply running out of memory. Having had some bad experience
> with cvs, we set a ulimit on our server to disallow (unintended) DoS
> attacks. But 128MB are fine to do a commit, no?

I have looked into this problem after getting some more information
from Stephan over IRC, and here are my findings:

  * The auto-merge code does not use subpools.

  * The auto-merge code asks questions using id_check_ancestor(),
    which once upon a time was a very cheap function, but now involves
    a predecessor walk.

  * In this case, it sounds like we're asking "is A an ancestor of B",
    where the answer is "no", but B is the root directory or at least
    a directory with many many past revs, so the ancestor walk is
    extremely expensive. (I don't yet know exactly what leads to this
    expensive ancestor walk, so I don't know how often it happens, but
    clearly it's happening in this KDE test case.)

  * The DAG code for performing the predecessor walk does not use
    subpools.

  * In summary, the auto-merge code was written a long time ago, and
    no one has wanted to touch it since, and apparently it has scaled
    very poorly ever since we switched to triplet-form IDs.

These findings apply to both back ends, not just FSFS, since this is
code FSFS stole wholesale from BDB and didn't have to touch. It's
possible that the problem is worse in FSFS, in that FSFS might take
longer to do a predecessor walk or might allocate more memory per
predecessor, or both. Or it might be worse in BDB. Hard to say.
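
To make the interaction between the predecessor walk and pool usage
concrete, here is a rough toy model in Python. It is not Subversion
code: the Pool class, the is_ancestor() function, and the dictionary
used to represent history are all hypothetical stand-ins, and each
"allocation" is just a counter standing in for the memory used to read
one node-revision. It only illustrates why a walk that answers "no"
over a long history keeps growing when everything is allocated in one
pool, and why clearing a scratch pool per step keeps the peak flat.

```python
class Pool:
    """Toy stand-in for a memory pool: it only counts live allocations."""

    def __init__(self):
        self.live = 0   # allocations not yet released
        self.peak = 0   # high-water mark of live allocations

    def alloc(self):
        self.live += 1
        self.peak = max(self.peak, self.live)

    def clear(self):
        self.live = 0


def is_ancestor(predecessor, ancestor, node, use_subpool):
    """Walk node's predecessor chain looking for ancestor.

    predecessor maps each node to its predecessor (missing key = none).
    Returns (answer, peak number of live allocations during the walk).
    """
    pool = Pool()
    current = predecessor.get(node)
    while current is not None:
        if use_subpool:
            pool.clear()      # release the previous step's scratch data
        pool.alloc()          # models reading one node-revision
        if current == ancestor:
            return True, pool.peak
        current = predecessor.get(current)
    return False, pool.peak


if __name__ == "__main__":
    # A directory with 50000 past revisions; the queried node is unrelated,
    # so the walk has to visit the entire chain before answering "no".
    history = {rev: rev - 1 for rev in range(1, 50001)}
    print(is_ancestor(history, "unrelated", 50000, use_subpool=False))
    print(is_ancestor(history, "unrelated", 50000, use_subpool=True))
```

Note that the subpool variant only bounds memory. The walk still
visits every predecessor, so the running-time problem remains.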

Fixing this problem is well beyond the threshold of effort I'm
prepared to put in. I'll note that:

  * Fixing the memory leaks should be pretty easy, but I don't think
    we have many (or any?) test cases which cover this code, so
    testing the fixes might be painful.

  * Fixing the fundamental performance issue will require deep
    analysis of an algorithm we've spent years trying very hard not to
    look at.

  * A possible workaround is to provide a repository or FS option to
    disable auto-merges. That means large commits would be prone to
    failure, unless we introduced a second option to write-lock the
    repository starting from the beginning of a commit. (Easier than
    Unfortunately, that would in turn lead to repositories getting
    stuck because someone started a commit and then fell off the
    network or otherwise failed to gracefully terminate it.
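
The trade-off behind the "disable auto-merges" workaround can be
sketched with another toy model. Everything here is an assumption made
for illustration: ToyRepo, the auto_merge flag, and the OutOfDate
exception are hypothetical names, and the "merge" is reduced to a
check that the intervening commits touched different paths, which is
far simpler than the real tree merge. The point is only that with
merging disabled, any commit that lands while another is in progress
makes the slower one fail.

```python
class OutOfDate(Exception):
    """Raised when a commit cannot be applied on top of the youngest rev."""


class ToyRepo:
    def __init__(self, auto_merge=True):
        self.youngest = 0
        self.auto_merge = auto_merge   # hypothetical repository option
        self.changed = {}              # revision -> set of paths it changed

    def commit(self, base_rev, paths):
        """Commit changes to paths made against base_rev; return new rev."""
        paths = set(paths)
        if base_rev != self.youngest:
            if not self.auto_merge:
                # No merging: anything committed since base_rev is fatal.
                raise OutOfDate("repository moved on during the commit")
            # Simplified merge: succeed only if no intervening revision
            # touched a path this commit also touches.
            for rev in range(base_rev + 1, self.youngest + 1):
                if self.changed[rev] & paths:
                    raise OutOfDate("conflicting change in r%d" % rev)
        self.youngest += 1
        self.changed[self.youngest] = paths
        return self.youngest


if __name__ == "__main__":
    strict = ToyRepo(auto_merge=False)
    strict.commit(0, ["trunk/a"])
    try:
        # A long-running commit that started at r0 now fails outright.
        strict.commit(0, ["trunk/b"])
    except OutOfDate as err:
        print("commit failed:", err)
```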

For reference, I'll include the stack trace from Stephan which led me
to the above conclusions. There's no debugging information, but it's
not really needed. All you need to know is that this is what the code
is doing while it's iterating over tens of thousands of rev files and
gobbling up memory.

#4 0x404dd193 in __read_nocancel () from /lib/tls/libpthread.so.0
#5 0x402330c3 in apr_file_read () from /usr/lib/libapr-0.so.0
#6 0x40233372 in apr_file_getc () from /usr/lib/libapr-0.so.0
#7 0x400fa0df in svn_io_file_getc () from /usr/lib/libsvn_subr-1.so.0
#8 0x400fa149 in svn_io_read_length_line () from /usr/lib/libsvn_subr-1.so.0
#9 0x400953b9 in read_header_block () from /usr/lib/libsvn_fs_fs-1.so.0
#10 0x40099940 in svn_fs_fs__get_node_revision () from /usr/lib/libsvn_fs_fs-1.so.0
#11 0x40091961 in get_node_revision () from /usr/lib/libsvn_fs_fs-1.so.0
#12 0x40091ca8 in svn_fs_fs__dag_get_node () from /usr/lib/libsvn_fs_fs-1.so.0
#13 0x40091da2 in svn_fs_fs__dag_walk_predecessors () from /usr/lib/libsvn_fs_fs-1.so.0
#14 0x40091eee in svn_fs_fs__dag_is_ancestor () from /usr/lib/libsvn_fs_fs-1.so.0
#15 0x4009d3b5 in id_check_ancestor () from /usr/lib/libsvn_fs_fs-1.so.0
#16 0x4009dbb6 in merge () from /usr/lib/libsvn_fs_fs-1.so.0
#17 0x4009dfb5 in merge_changes () from /usr/lib/libsvn_fs_fs-1.so.0
#18 0x4009e7a5 in svn_fs_fs__commit_txn () from /usr/lib/libsvn_fs_fs-1.so.0
#19 0x40089cb6 in svn_fs_commit_txn () from /usr/lib/libsvn_fs-1.so.0
#20 0x4007a7f8 in svn_repos_fs_commit_txn () from /usr/lib/libsvn_repos-1.so.0
#21 0x40075fcb in close_edit () from /usr/lib/libsvn_repos-1.so.0
#22 0x40026b49 in svn_client__do_commit () from /usr/lib/libsvn_client-1.so.0
#23 0x4002501f in svn_client_commit () from /usr/lib/libsvn_client-1.so.0
#24 0x0804cb1d in svn_cl__commit ()
#25 0x08050a4b in main ()

Received on Sat Feb 26 23:45:50 2005

This is an archived mail posted to the Subversion Dev mailing list.