
Re: uhoh.

From: Greg Stein <gstein_at_lyra.org>
Date: 2001-03-29 01:58:27 CEST

On Wed, Mar 28, 2001 at 05:36:55PM -0600, Karl Fogel wrote:
> Greg Stein <gstein@lyra.org> writes:
> > > Hmm. Now might be a good time to rework the filesystem not to create
> > > per-object subpools, with the corresponding interface changes.
> >
> > +1
> >
> > I'd be happy to see M2 slip a day to get our working set down to a
> > reasonable size.
> (I think I must be missing the wavelength here...)
> What is the "working set"

The "working set" is the memory that a process needs to complete its
actions. Effectively, the peak usage.

It is an age-old term describing "how much [memory] do you need to
accomplish your work?"

If you have 1G of mem/swap in your machine, and your working set is 100M,
then you can only run 10 operations simultaneously. If your working set is
1M, then you can run 1000 simultaneously. Thus, the desire for a small
working set.

Second, a server never really wants to swap. If that happens, then you're
really screwed. You want to design/scale things so that your maximum load
fits into available RAM. So if your available RAM is 256MB, then you can
deal with 256 1MB processes.

> and how does Jim's suggested change get it
> down to a reasonable size?

At the moment, the FS deals with a bunch of subpools internally. As
described in my "pool usage" thread a while back, this can sometimes be
counter-productive to what the caller is trying to do.

> It seems to me that what Jim is suggesting
> makes individual working sets bigger (if I understand the term
> correctly), not smaller; and that he's suggesting it because calling
> styles make it unnecessary for the fs to be so fussy now.
> ... But I'm not sure.
> Somebody please clarify? :-)

It puts more control in the caller's hands about how the FS will use memory.
The caller knows more about whether looping/repetition will occur and can
manage the memory better.

Effectively, it moves control of the memory usage from inside the FS to
outside, where more information about calling patterns is known.

I believe the main thrust is to add pools to FS function arguments, rather
than using internal pools.

However, if the FS is going to keep information across invocations (e.g.
some kind of cache), then it will need an internal pool for that (presumably
a subpool of the top-level FS object).

So... no, Jim's suggestion does not increase working set, unless the caller
keeps passing in the same pool (without clearing it occasionally).

> > Note: the (server) working set when using DAV will probably be quite a bit
> > better. It will be operating against the FS a bit at a time, then cleaning
> > up. We won't have these big editor-based pools that glom everything
> > together. Dunno what'll happen on the client, tho.
> I'm confused.
> In the past, Greg, you've suggested that we not worry too much about
> getting fine-grained editor pool granularity.

Yup. Creating a pool per object, and then using that pool for further
allocations, can lead to lifetime mismatches. For example, let's say that
the FS creates some directory entries, stored in an svn_fs_root subpool. We
fetch the entries and store them away somewhere. Then, we toss the subpool
that held the root and the entries. We've now got some dangling pointers.

Instead, if we always pass in a pool, then the caller can ensure the passed
pool has the same lifetime as the results of the operation.

[ and yes, I believe dir_entries does use a passed pool; the above is
  just an example :-) ]

Creating fine-grained pools can also create overheads relative to what the
caller is attempting to do with memory management. If the caller is only
ever going to create a single FS object, then why does the FS need to shove
everything into a subpool? It isn't going to need to later clear that pool
to make room for a second FS object in the working set.

Yes, creating the pool doesn't hurt much speed-wise, but it goes back to
the confusion-of-lifetimes problem.

> Meantime, Ben just checked in a change that makes the filesystem
> editor use per-object subpools.

The editor uses subpools for two reasons:

1) pools are not passed to the editor functions
2) its semantics imply that it will be called multiple times

In effect, it creates the subpools to deal with the memory management
issues caused by looping, since the API doesn't provide pools for that.

But that is an independent choice from our other APIs. And yes, I could
probably make an argument for including pools in the editor API :-)

> Meantime, the working copy update/checkout editor has been using
> per-object subpools for a long time now. :-)

Again, this is caused by the editor, not by a pool usage design choice.

> I'm not arguing with anything you're saying above; rather, I'm simply
> failing to understand the proposed changes, or even what's motivating
> change here.

Hopefully, I've clarified. Probably not fully answered everything, but
that's what add'l email is for, right? :-)


Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:26 2006

This is an archived mail posted to the Subversion Dev mailing list.