[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: object-model: Wrapping Subversion C-structs in C++

From: Branko Čibej <brane_at_xbc.nu>
Date: Fri, 24 Sep 2010 08:02:32 +0200

 On 24.09.2010 04:05, Hyrum K. Wright wrote:
> On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej <brane_at_xbc.nu> wrote:
>> On 22.09.2010 21:41, Hyrum K. Wright wrote:
>>> [ apologizes for the somewhat stream-of-conscious nature of these mails ]
>>>
>>> On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright
>>> <hyrum_wright_at_mail.utexas.edu> wrote:
>>>> On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
>>>> <hyrum_wright_at_mail.utexas.edu> wrote:
>>>>> For the C++ folks out there, I've got a question about an approach to
>>>>> take on the object-model branch. At issue is how to wrap the various
>>>>> C structures returned to callers, particularly in a backward
>>>>> compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I
>>>>> see it, there are a few options:
>>>>>
>>>>> 1) Just wrap the pointer to the C struct as a member of the wrapper class.
>>>>> Pros: Easy to implement; lightweight constructor.
>>>>> Cons: Getters would need to translate to C++ types; would need to
>>>>> implement a copy constructor which deep copies the C struct; would
>>>>> also introduce pools, since creating and duplicating C structs
>>>>> requires them.
>>>>>
>>>>> 2) Wrap each C struct member individually
>>>>> Pros: C->C++ complexity is constrained to the constructor,
>>>>> everything else is C++ types
>>>>> Cons: Hard to extend for future compatibility
>>>>>
>>>>> 3) Just pass the C-struct pointer around; don't even bother with a class
>>>>> Pros: Dead simple.
>>>>> Cons: Requires more memory management thought by consumers; not
>>>>> C++-y enough; may introduce wrapping difficulties.
>>>>>
>>>>> I'd like to come up with something consistent, which would be used
>>>>> throughout the C++ bindings. I'm also interested in a solution which
>>>>> ensures the C++ bindings can be used as the basis for other
>>>>> object-oriented bindings models (Python, Perl, etc.)
>>>> After lunch, and some thought, it feels like #1 is the best solution.
>>>> This doesn't change the external class interface, which is good, and
>>>> can still provide C++ values to callers who want them. The pool
>>>> issues are a bit messy, but at least the object can manage it's own
>>>> memory (albeit at a significant overhead).
>>> This could get ugly.
>>>
>>> Creating and destroying pools all over the place could get ugly, but
>>> it's necessary evil because all of our object creation / duplication
>>> functions all require a pool. An alternative would be a set of
>>> functions returning the size of the object, and then another which
>>> puts the object in a pre-allocated memory location. (These could
>>> theoretically replace the pool argument version of the API, but that'd
>>> be *a lot* of churn.)
>>>
>>> The approach would let the C++ allocate the memory (of the correct
>>> size) using whatever scheme it wants, and then do a "placement
>>> initialize" using the second API. If we do go this route, I'd
>>> recommend exposing these as private-to-Subversion, at least initially.
>>>
>>> The other option is just pass the C-struct pointer around everywhere,
>>> but then the bindings consumers have to work about this exact same
>>> issue. In other words, it solves it now, but really just pushes the
>>> problem elsewhere.
>>>
>>> -Hyrum
>> Memory management with pools and C++ -- keep away from doing it in
>> per-object ctor/dtor pairs is all I can say. Lifetimes would get so
>> messy and mucked up you could hardly believe it.
>>
>> IMHO the best way to wrap the C structures in C++ is to subclass 'em.
>>
>> class svn_some_thing : public svn_some_thing_t { .... };
>>
>> This way you can pass C++ pointers directly to the C implementation,
>> most methods become just inlined wrappers. Callbacks are a bit more
>> hairy, but not all that much.
> This works for input types, but right now it's the output types (which
> are much more common) that I'm concerned about. Things like
> svn_commit_info_t, or svn_wc_notify_t. For my initial hack at it see:
> http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600

If you're strict about the wrapping, i.e., do not add any data members
or virtual functions to the wrappers, then a simple static_cast of the
returned pointer will solve the type conversion for output parameters.
It makes the wrapper methods a bit more complex, but still easily inlinable.

Two things worry me about your approach:

    * Addidional dereference overhead, and construction overhead ...
      might or might not be important performance-wise, and certainly is
      important bug-wise ...
    * The pool-per-object that you already mentioned, made worse by the
      fact that you implement operator= ... nice syntactic sugar, but
      has serious side effects.

The pool-per-object is especially bothersome because it creates the
false sense that you don't have to worry about pool lifetime ... but you
*do* because every pool has a parent, you've just hidden the fact away
very obscurely. If you /do/ want to give each object a reference to its
containing pool (by no means a bad thing), there are better ways, see below.

Consider too what you'll do with destructors and pool destruction order.
Clearly any destructor for objects such as yours cannot be allowed to
touch the pointer to the object contained in the pool, but what happens
if the parent pool is destroyed before the object that contains the
child pool? (I guess you noticed that pools are decidedly un-C++-like.)

>> As to pools ... I'd once thought that a proper C++ wrapper for a pool
>> would be a reference-counted smart pointer. That turns out to be too
>> much overhead and too much of a good thing, since you *cannot* allow
>> pool lifetime to depend on some reference counting order of execution.
>> Best just use plain pool pointers, or maybe wrap them minimally so that
>> destructors do the pool cleanup reliably.
> Yeah, we've got a Pool class which destroys the underlying APR pool in
> the dtor, so that provides a handy way to clean things up (if it
> weren't for the 8k overhead of the minimum pool size). To the initial
> hack, I've added a smart pointer to prevent multiple redundant copies
> of the underlying C struct, since I expect it to be read-only (we
> could probably add an explicit Clone() method if folks really wanted a
> deep copy). The smart pointer magic is here:
> http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h
>
> There's still a self-managed pool for allocating the C-struct, but
> it's encapsulated enough to be changable should we have a better way
> of duplicating objects in the future.
>
> Thanks for the insight, by the way. These classes and paradigms are
> in no way set in stone, and I welcome discussion. I sense you've got
> a bit more C++ design experience than I do. :)

Mmmm ... wouldn't know about that, and it's been ages since I'd touched
C++. Nevertheless ... the Pool class gives me the shivers. :) At the
very least you forgot to declare the copy constructor private Your code
allows passing Pools by value to functions ... not a good idea, as you
noted yourself in the comment.

I suggest you take a look at auto_ptr and auto_ptr_ref. You need a
similar pair of classes that will deal with pools, then you can strictly
control when you're passing a "handle" to a pool and where you're
holding onto the pool itself. You can't use auto_ptr directly, but you
can use a simple wrapper to put inside auto_ptr. You may even then be
able to derive the real Pool and PoolRef from auto_ptr and auto_ptr_ref,
respectively ... I forget how the inheritance rules for auto_ptr work,
they might be tricky. Certainly wrapping will work, though.

-- Brane
Received on 2010-09-24 08:03:30 CEST

This is an archived mail posted to the Subversion Dev mailing list.