[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: fs_fs core dumps in checksum code

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Wed, 04 Aug 2010 13:38:38 +0100

Hi Blair. I had a look but can't see how this can occur.

On Sat, 2010-07-31, Blair Zajac wrote:
> We run long running, persistent processes that expose the svn_fs and svn_repos
> APIs via RPC (using ZeroC's Ice). We've seen some occasional core dumps after
> the processes have been up for two weeks or more. We're running against 1.6.5.
>
> The processes have LRU caches of fs_t and repos_t that they use if there is an
> RPC request onto the same repos or revision.
>
> The core dumps only appear to happen during the commit phase in checksum related
> code. In the first, svn_checksum_dup() is getting a pointer to address 0x48
> which is odd and the other svn_checksum__from_digest() is getting a pointer to
> an out of bounds address.
>
> We haven't had the cycles to investigate them because they happen rarely and we
> have more pressing issues. I'm posting them here just in case anybody else has
> seen these or has an idea.
>
> Regards,
> Blair
>
> #0 svn_checksum_dup (checksum=0x48, pool=0x19730448) at
> subversion/libsvn_subr/checksum.c:190
> #1 0x00002abf50dabf87 in read_representation (contents_p=0x2aaadf514c50,
> fs=0x2aab0af5a730, rep=0x19731478, pool=0x19730448)
> at subversion/libsvn_fs_fs/fs_fs.c:2859

fs_fs.c:2859 is in rep_read_get_baton() which is actually a subroutine
of read_representation(), presumably in-lined by the compiler and
therefore not visible in the debug info.

The pointer 'rep->md5_checksum' is apparently 0x48 at this time.

> #2 0x00002abf50dac762 in svn_fs_fs__rep_contents_dir (entries_p=0x2aaadf514cd0,
> fs=0x2aab0af5a730, noderev=0x19940428, pool=0x19730448)
> at subversion/libsvn_fs_fs/fs_fs.c:3351

(Similarly, fs_fs.c:3351 is in get_dir_contents(), a subroutine of
svn_fs_fs__rep_contents_dir().)

(FWIW, we can see from the location of line 3351 that rep->txn_id is
null.)

> #3 0x00002abf50dadc83 in svn_fs_fs__set_entry (fs=0x2aab0af5a730,
> txn_id=0x15ac3ca8 "683229-ewby", parent_noderev=0x19940428, name=0x1bf5a8d0
> "hash:28",
> id=0x2aab1316b378, kind=svn_node_dir, pool=0x1723ae78) at
> subversion/libsvn_fs_fs/fs_fs.c:4651

(FWIW, at this level the rep of interest has been passed in as
'parent_noderev->data_rep'.)

> #4 0x00002abf50da1dee in set_entry (parent=0x2aab127a9378, name=0x1bf5a8d0
> "hash:28", id=0x2aab1316b378, kind=svn_node_dir,
> txn_id=0x15ac3ca8 "683229-ewby", pool=0x1723ae78) at
> subversion/libsvn_fs_fs/dag.c:346
> #5 0x00002abf50db33a7 in merge (conflict_p=0x2aab0d1b4de8,
> target_path=0x2abf50db7530 "/", target=0x2aab127a9378, source=0x2aab18bde098,
> ancestor=0x27a17c98, txn_id=0x15ac3ca8 "683229-ewby",
> mergeinfo_increment_out=0x0, pool=0x2aaaf76aed28) at
> subversion/libsvn_fs_fs/tree.c:1422
> #6 0x00002abf50db34fa in merge_changes (ancestor_node=0x27a17c98,
> source_node=0x2aab18bde098, txn=<value optimized out>, conflict=0x2aab0d1b4de8,
> pool=0x2aaaf76aed28) at subversion/libsvn_fs_fs/tree.c:1602
> #7 0x00002abf50db3cae in svn_fs_fs__commit_txn (conflict_p=0x0,
> new_rev=0x2aaadf515078, txn=0x15ac3c80, pool=0x2aab14b7afc8)
> at subversion/libsvn_fs_fs/tree.c:1701
> #8 0x00002abf509782cb in svn_repos_fs_commit_txn (conflict_p=0x0,
> repos=0x2aab10c99df8, new_rev=0x2aaadf515078, txn=0x15ac3c80, pool=0x2aab14b7afc8)
> at subversion/libsvn_repos/fs-wrap.c:51
> #9 0x00000000004fd819 in SvnUtil::commit_transaction (repos=0x2aab10c99df8,
> fs=0x2aab0af5a730, txn=0x15ac3c80, client_info=<value optimized out>,
> pool=@0x2aab309374f8) at SvnUtil.cpp:276
> #10 0x00000000004f10e8 in SvnTransactionI::commit (this=0x2aab309374f0,
> clientInfo=@0x2aaadf5153d0, current=<value optimized out>)
> at SvnTransactionI.cpp:964
> #11 0x000000000049ef47 in VnpIce::SvnTransaction::___commit
> (this=0x2aab309374f0, __inS=@0x2aaadf515580, __current=@0x2aaadf515580)
> at vnp_svn_backend.cpp:6796
> #12 0x00002abf50099ea0 in IceInternal::Incoming::invoke (this=0x2aaadf515580,
> servantManager=@0x2aaadf515930) at Incoming.cpp:447
> #13 0x00002abf5006a9ab in Ice::ConnectionI::invokeAll (this=0x23f96b50,
> stream=@0x2aaadf515cb0, invokeNum=1, requestId=4873841, compress=0 '\0',
> servantManager=@0x2aaadf515930, adapter=@0x2aaadf515920) at
> ConnectionI.cpp:2440
> #14 0x00002abf5006eac7 in Ice::ConnectionI::message (this=0x23f96b50,
> stream=@0x2aaadf515cb0, threadPool=<value optimized out>) at ConnectionI.cpp:1110
> #15 0x00002abf5017f92a in IceInternal::ThreadPool::run (this=0x150ee070) at
> ThreadPool.cpp:519
> #16 0x00002abf5018013f in IceInternal::ThreadPool::EventHandlerThread::run
> (this=0x151016d0) at ThreadPool.cpp:759
> #17 0x00002abf50448e02 in startHook (arg=0x151016d0) at Thread.cpp:368
> #18 0x0000003ae6a062f7 in start_thread () from /lib64/libpthread.so.0
> #19 0x00000033b0ad1b6d in clone () from /lib64/libc.so.6
> Current language: auto; currently c

These two back-traces appear to be showing the very same code path.

Again due to in-lining, I presume, fs_fs.c:2859 calls svn_checksum_dup()
but the debugger shows, in this case, only its subroutine
svn_checksum__from_digest().

> #0 0x00002ac15f8bd28c in svn_checksum__from_digest (
> digest=0x646c6f663a706e76 <Address 0x646c6f663a706e76 out of bounds>,
> kind=1949987428, result_pool=<value optimized out>)
> at subversion/libsvn_subr/checksum.c:77

Here, digest is ASCII "dmof:pnv" and kind is hex 743a7264 is ASCII
"t:rd"...

> 77 memcpy((unsigned char *)checksum->digest, digest, DIGESTSIZE(kind));
> #1 0x00002ac15f057f87 in read_representation (contents_p=0x2aaab92d7c50,
> fs=0x2aab80d53590, rep=0x2aabbd57e748, pool=0x2aab1c15a0e8)
> at subversion/libsvn_fs_fs/fs_fs.c:2859

... meaning that 'rep->md5_checksum' points to readable memory but not
to a valid checksum object.

Thus, in both cases, it looks like that pointer was either uninitialized
or subsequently corrupted.

I searched in 1.6.5 code for creation of a representation_t structure
and found nowhere that leaves it uninitialized. At least one place
allocates the structure with apr_pcalloc (zero-initialized) and doesn't
subsequently fill in ->md5_checksum, but that's not what you're seeing.
I searched for creation of node_revision_t structures and found no
places where the data_rep member is uninitialized.

Thus I can only suggest looking for memory corruption.

- Julian

> #2 0x00002ac15f058762 in svn_fs_fs__rep_contents_dir (
> entries_p=0x2aaab92d7cd0, fs=0x2aab80d53590, noderev=0x2aab60107a48,
> pool=0x2aab1c15a0e8) at subversion/libsvn_fs_fs/fs_fs.c:3351
> #3 0x00002ac15f059c83 in svn_fs_fs__set_entry (fs=0x2aab80d53590,
> txn_id=0x2aabccccda58 "532073-bkk4", parent_noderev=0x2aab60107a48,
> name=0x2aab4fa6e0c0 "hash:18", id=0x2aab6921d868, kind=svn_node_dir,
> pool=0x1b49bd78) at subversion/libsvn_fs_fs/fs_fs.c:4651
> #4 0x00002ac15f04ddee in set_entry (parent=0x2aaaf1ff7ee8,
> name=0x2aab4fa6e0c0 "hash:18", id=0x2aab6921d868, kind=svn_node_dir,
> txn_id=0x2aabccccda58 "532073-bkk4", pool=0x1b49bd78)
> at subversion/libsvn_fs_fs/dag.c:346
> #5 0x00002ac15f05f3a7 in merge (conflict_p=0x2aabc568f168,
> target_path=0x2ac15f063530 "/", target=0x2aaaf1ff7ee8,
> source=0x2aab23fea2c8, ancestor=0x2aab482efc18,
> txn_id=0x2aabccccda58 "532073-bkk4", mergeinfo_increment_out=0x0,
> pool=0x2aab41ef8b38) at subversion/libsvn_fs_fs/tree.c:1422
> #6 0x00002ac15f05f4fa in merge_changes (ancestor_node=0x2aab482efc18,
> source_node=0x2aab23fea2c8, txn=<value optimized out>,
> conflict=0x2aabc568f168, pool=0x2aab41ef8b38)
> at subversion/libsvn_fs_fs/tree.c:1602
> #7 0x00002ac15f05fcae in svn_fs_fs__commit_txn (conflict_p=0x0,
> new_rev=0x2aaab92d8078, txn=0x2aabccccda30, pool=0x2aab81433cc8)
> at subversion/libsvn_fs_fs/tree.c:1701
> #8 0x00002ac15ec242cb in svn_repos_fs_commit_txn (conflict_p=0x0,
> repos=0x12f71318, new_rev=0x2aaab92d8078, txn=0x2aabccccda30,
> pool=0x2aab81433cc8) at subversion/libsvn_repos/fs-wrap.c:51
> #9 0x00000000004fd819 in SvnUtil::commit_transaction (repos=0x12f71318,
> fs=0x2aab80d53590, txn=0x2aabccccda30, client_info=<value optimized out>,
> pool=@0x2aab709cf448) at SvnUtil.cpp:276
> #10 0x00000000004f10e8 in SvnTransactionI::commit (this=0x2aab709cf440,
> clientInfo=@0x2aaab92d83d0, current=<value optimized out>)
> at SvnTransactionI.cpp:964
> #11 0x000000000049ef47 in VnpIce::SvnTransaction::___commit (
> this=0x2aab709cf440, __inS=@0x2aaab92d8580, __current=@0x2aaab92d8580)
> at vnp_svn_backend.cpp:6796
> #12 0x00002ac15e345ea0 in IceInternal::Incoming::invoke (this=0x2aaab92d8580,
> servantManager=@0x2aaab92d8930) at Incoming.cpp:447
> #13 0x00002ac15e3169ab in Ice::ConnectionI::invokeAll (this=0x2aabd00c0920,
> stream=@0x2aaab92d8cb0, invokeNum=1, requestId=18708638, compress=0 '\0',
> servantManager=@0x2aaab92d8930, adapter=@0x2aaab92d8920)
> at ConnectionI.cpp:2440
> #14 0x00002ac15e31aac7 in Ice::ConnectionI::message (this=0x2aabd00c0920,
> stream=@0x2aaab92d8cb0, threadPool=<value optimized out>)
> at ConnectionI.cpp:1110
> #15 0x00002ac15e42b92a in IceInternal::ThreadPool::run (this=0x11e58ff0)
> at ThreadPool.cpp:519
> #16 0x00002ac15e42c13f in IceInternal::ThreadPool::EventHandlerThread::run (
> this=0x11e65a50) at ThreadPool.cpp:759
> #17 0x00002ac15e6f4e02 in startHook (arg=0x11e65a50) at Thread.cpp:368
> #18 0x0000003ae6a062f7 in start_thread () from /lib64/libpthread.so.0
> #19 0x00000033b0ad1b6d in clone () from /lib64/libc.so.6
Received on 2010-08-04 14:46:05 CEST

This is an archived mail posted to the Subversion Dev mailing list.