
Re: OOM problems

From: Marius Gedminas <mgedmin_at_b4net.lt>
Date: 2005-11-02 17:24:04 CET

The second half of this email contains more interesting information.

On Wed, Nov 02, 2005 at 05:25:10AM -0500, John Szakmeister wrote:
> On Saturday 22 October 2005 08:12, Marius Gedminas wrote:
> > I scp'ed the repository to my laptop, ran svnadmin recover for good
> > measure, and retried svnadmin dump with Subversion 1.2.0. The same
> > thing happened. svnadmin eats about 300 megs of RAM in 20 seconds, then
> > I kill it.
>
> Be careful when doing such things. BDB is sensitive to platform, OS, and
> version of the library. If any of those things changed (and it appears that
> at least the version of BDB might have changed), then you might have run into
> a side effect that prevented you from dumping the repository.

I'll keep that in mind. Versions of libdb4.2 are pretty close
(4.2.52-18 from Debian on the original server; 4.2.52-19ubuntu4 on my
laptop), but have been compiled with different versions of gcc.

(I'll reiterate that svnadmin dump fails on the server in the same way,
so a different libdb4.2 cannot be the only reason).

> > I have tried dumping random revisions with svnadmin dump -r N
> > --incremental, and looking at their size with wc -l. There are 28
> > revisions out of 600 that I cannot dump without running out of RAM:
> > 6, 7, 17, 25, 32, 39, ..., 229, 276, 458, 595. The two large commits
> > that I suspected (571 and 597) are not among them.
> >
> > I can access all log messages with svnlook log -r N with no problems.
>
> That only touches part of the database. Try 'svn diff -r5:6 url://to/repo'.
> That will pull out the entire changeset for that revision.

svn diff -r5:6 fails in the same way (out of memory).

> > What do I do now?
>
> You have a couple of choices. If you can tar the repo up someplace, and email
> me the link, I can take a closer look at the problem. In the event you can't
> do that (because of intellectual property concerns), then there is something
> you can try (and it might be good to do so first).

There are no IP concerns (this repository serves as a backup of my home
directory), but there are some privacy concerns (it contains things like
instant messaging chat logs). Although I've tried to keep various
passwords and SSH/GPG keys out of it, I'm not entirely sure none have
crept in.

I will think about it. I would prefer acquiring sufficient knowledge of
subversion internals/bsddb to be able to debug the problem myself,
perhaps with some guidance. Do you think that is unrealistic?

> Make a copy of the repository, and then attempt a catastrophic recovery:
> db_recover -c -v -h /path/to/repos/db. I believe there was one occasion
> where I saw a similar behavior, and a catastrophic recovery fixed the
> situation. To be safe, I'd dump and load the repository if the catastrophic
> recovery was successful.

Thank you for the suggestion. Alas, it did not help.

By the way, db4.2_verify reports no errors on any of the database
files. svnadmin verify runs out of memory. I will compile subversion
with debug symbols and try to poke around.

(time passes)

Ok, here's what happens: the do..while loop in rep_read_range never
finishes. rep_key inside it alternates between "2y1" and "4hq".

The rep with key "4hq" is of rep_kind_delta kind, with txn_id =
0x80a0198 "", and contents.delta contains exactly one chunk {version = 0
'\0', offset = 0, string_key = 0x80a01e0 "87b", size = 1485, rep_key =
0x80a01f0 "2y1"}.

The rep with key "2y1" is of rep_kind_delta kind, with txn_id =
0x80a03b0 "", and contents.delta contains exactly one chunk {version = 0
'\0', offset = 0, string_key = 0x80a03f8 "84w", size = 1384, rep_key =
0x80a0408 "4hq"}.

Looks like a loop in a data structure that should not contain loops.
Fun fun fun.

I added a printf to that loop, and hacked up a second array of rep_keys, with
an inner for loop to look for duplicates (since I'm not familiar with apr's
hash tables). Here's the chain of looping rep_keys when I run svnadmin
dump --incremental -r 6 on my repository:

  rep_read_range(rep_key="i")
   `> loading rep "i"
   `> loading rep "1r"
   `> loading rep "7f"
   `> loading rep "120"
   `> loading rep "1fr"
   `> loading rep "1sw"
   `> loading rep "2y1"
   `> loading rep "4hq"
  svnadmin: Looping rep_key '2y1'
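
For illustration, the kind of loop check described above can be sketched
in Python as a toy model. This is not Subversion's code and not the diff
mentioned in this message: the names walk_rep_chain, RepCycleError and
REPS are made up for the sketch, and the links between the earlier reps
are assumed from the load order in the trace (only the "2y1" <-> "4hq"
pair is confirmed by the debugger output). It uses a set of seen keys in
place of the second array with an inner for loop.

```python
# Toy model of a delta-rep chain; NOT Subversion's actual data structures.
# Each entry maps a rep key to the rep_key of the rep it is a delta against.
# None marks a rep that ends the chain (an assumption of this model).


class RepCycleError(Exception):
    """Raised when a rep key is visited twice while walking a chain."""


def walk_rep_chain(reps, start_key):
    """Return the rep keys visited from start_key to the end of the chain.

    Raises RepCycleError as soon as a key repeats, instead of looping forever.
    """
    seen = set()
    chain = []
    key = start_key
    while key is not None:
        if key in seen:
            raise RepCycleError("Looping rep_key '%s'" % key)
        seen.add(key)
        chain.append(key)
        key = reps[key]
    return chain


# Hypothetical table rebuilt from the trace. Only the "2y1" <-> "4hq" links
# come from the debugger output; the rest are assumed from the load order.
REPS = {
    "i": "1r",
    "1r": "7f",
    "7f": "120",
    "120": "1fr",
    "1fr": "1sw",
    "1sw": "2y1",
    "2y1": "4hq",
    "4hq": "2y1",
}

if __name__ == "__main__":
    try:
        walk_rep_chain(REPS, "i")
    except RepCycleError as err:
        print("svnadmin:", err)
```

Starting from "i", the walk visits the same eight keys as the trace and
stops when "2y1" comes up a second time. Starting from "4hq" (the rev 595
case) it stops on "4hq" instead, since that is the first key to repeat.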

It appears that all other broken revisions (in my original email I
listed 6, 7, 17, 25, 32, 39, (skipped a bunch of them in the middle), 229,
276, 458, 595, and I just checked all of these) end up with this cycle.
"2y1" is passed directly as the rep_key argument to rep_read_range when
I try to dump rev 458, and "4hq" is likewise passed when I try to read
rev 595.

According to log messages, revs 458 and 595 only changed svn:ignore
properties. I think (although I cannot prove) that the problem is with
svn:ignore on a single directory.

Dear Subversion developers, would you mind adding such a loop check to
rep_read_range? I can send my uber-hacky diff to pinpoint the place
in the code, if necessary.

Cheers,
Marius Gedminas

-- 
Voodoo Programming:  Things programmers do that they know shouldn't work but
they try anyway, and which sometimes actually work, such as recompiling
everything.
-- Karl Lehenbauer

Received on Wed Nov 2 17:26:43 2005

This is an archived mail posted to the Subversion Users mailing list.