Re: crash managing a large FSFS repository

From: <kfogel_at_collab.net>
Date: 2004-12-13 17:46:27 CET

Chase Phillips <shepard@ameth.org> writes:
> As a follow-up to my thread "svn resource requirements with large
> repositories?" (http://svn.haxx.se/users/archive-2004-11/0180.shtml), I
> was recently able to try out the same procedure with revision 12289 from
> Subversion's trunk. With this revision I experience the same resource
> usage issues that led me to raise this in the first place.
>
> As a refresher, our project is developing a software image that runs on
> top of NetBSD. We need to store NetBSD in a revision-controlled
> environment to track the changes we make to the operating system and
> kernel. I decided to create this new repository locally on the disk in
> FSFS format (our current repo that stores application source is in BDB
> format).
>
> After importing the NetBSD source code and then copying it onto a branch,
> a subsequent checkout leads to a core dump. I've attached one of the
> stack traces from my two attempts to this email (each attempt takes
> upwards of 15 minutes before svn dumps core). The second stack trace
> differs from the first only in memory addresses of variables, though it
> can be sent as well if needed.

It still looks like an out-of-memory error ("abort_on_pool_failure" in
the stack trace), hmmm. Both your client and server were on the same
box, and indeed in the same process, when you reproduced this, right?
I could try to make an educated guess from the stack trace, but it
would be great if we could narrow this down to "server code", or
"client code", or both. (Even when they're in the same process,
they're distinct bodies of code.)

When you say "subsequent checkout", you mean a first-time checkout of
the new branch, right?

What can you tell us about your hardware, memory, etc.? (Not because
they're causing the bug in any sense, just to help us figure out what
we need to reproduce it.)

Can we get our hands on your data?

Does it reproduce with BDB instead of FSFS?

> The Subversion issue tracker holds 4 issues that come close to addressing
> this but for one reason or another don't match up well enough to allow me
> to assume they should be used as the target for this issue.
>
> Issue 602 - http://subversion.tigris.org/issues/show_bug.cgi?id=602
>
> "import of large trees can bloat memory on client side"
>
> The last comment on this bug was made in 2002/11, and the issue appears
> intended to address import efficiency (I've not had this resource issue
> doing an import of the source code).

Agree that this is probably not your bug.

> Issue 1702 - http://subversion.tigris.org/issues/show_bug.cgi?id=1702
>
> "Initial checkout should scale well for large projects"
>
> This issue focuses on checking out a revision from a remote repository.
> In my scenario, I am checking out from a local repository.

Well, the real point is that 1702 is about time performance, not
memory growth. The local vs remote thing is not such a big
difference. Many problems that are first reported in remote
operations are also present in local operations; it just means they
are problems in the core libraries, not the transport layer libraries.

> Issue 2067 - http://subversion.tigris.org/issues/show_bug.cgi?id=2067
>
> "Perf issues with BDB and directories with a large number of items"
>
> This bug mentions similar problems with FSFS, though the bug summary
> refers strictly to BDB. Is it meant to cover only issues with BDB-based
> repositories?

It looks like it's still mainly about BDB, and anyway is mainly about
import and commit scalability, not specifically about checkouts.

> Issue 2137 - http://subversion.tigris.org/issues/show_bug.cgi?id=2137
>
> "'svn update' extremely slow with large repository and FSFS"
>
> The common problems I experience are with checkouts and commits back to
> the local repository. Again, I'm using a late revision of the trunk
> (r12289).

Your earlier description says "a subsequent checkout leads to a core
dump". Here you say you are also having problems with commits. I
suspect the checkout problems are unrelated to the commit ones, so we
should have two separate threads for them. In this reply, I've only
been talking about the checkout problem (because until this moment,
that's the only one I knew about anyway).

> Should one of the above issues be used for tracking this problem? Or
> should I file a new issue, presuming I'm running into a bug in the source
> and not some problem local to my system? Any suggestions for what to try
> next?

I think a new issue would be best. As much reproduction information
as you can give us (numbers of files, sizes of files, names of files)
would be great.
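
One way to gather those numbers is a small script along these lines.
This is only a sketch: the function name summarize_tree, the statistics
it reports, and the choice to skip ".svn" directories are assumptions
made for illustration, not anything Subversion itself provides.

```python
import os
import sys


def summarize_tree(root):
    """Walk `root` and collect basic size statistics for a bug report."""
    stats = {
        "files": 0,
        "dirs": 0,
        "total_bytes": 0,
        "largest_file": ("", 0),   # (path, size in bytes)
        "widest_dir": ("", 0),     # (path, number of direct entries)
    }
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: skip Subversion admin areas so that only the
        # project's own files are counted.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        stats["dirs"] += 1

        entries = len(dirnames) + len(filenames)
        if entries > stats["widest_dir"][1]:
            stats["widest_dir"] = (dirpath, entries)

        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat() does not follow symlinks, so broken links are safe.
            size = os.lstat(path).st_size
            stats["files"] += 1
            stats["total_bytes"] += size
            if size > stats["largest_file"][1]:
                stats["largest_file"] = (path, size)
    return stats


if __name__ == "__main__":
    # Usage: python summarize_tree.py /path/to/source/tree
    for key, value in summarize_tree(sys.argv[1]).items():
        print(f"{key}: {value}")
```

Running it against the imported source tree (not the repository
directory) gives file and directory counts, total size, the largest
file, and the directory with the most entries, which is the sort of
detail that helps someone else build a comparable test case.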

Thanks very much for the report -- and the care you took to make it so
organized & comprehensible.

-Karl

Received on Mon Dec 13 17:48:26 2004
