[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

API review - svn_info2_t

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Tue, 07 Jun 2011 12:34:37 +0100

Two things about the new svn_info2_t structure.

svn_info2_t contains the following fields, which I'll group into four
sections:

  repos node coordinates:
    repos_root_URL
    repos_UUID
    rev
    URL

  repos node basic info:
    kind
    size
    last_changed_rev
    last_changed_date
    last_changed_author

  repository path lock:
    lock

  WC info:
    wc_info

1) I think we need to change the "shape" of this structure so that WC
info is not "inside" it.

The idea behind this current structure, as I understand it, is that it
should hold repos info and/or WC info about a node, and should be shared
between libraries, such that libsvn_wc can fill in both parts whereas
libsvn_ra (if it were directly used there, which it isn't currently)
would fill in only the repos part and leave the WC part null.

Currently the repos into fields are immediate fields, and there is a
pointer to a WC info struct, so that effectively it "contains" an
(optional) WC info struct, like this:

  svn_info2_t:
  +---------------------------------------------+
  | |
  | svn_wc_info_t: |
  | +------------------+ |
  | | | |
  | (repos node info) | (wc node info) | |
  | +------------------+ |
  +---------------------------------------------+

The definition of svn_wc_info_t is in libsvn_wc, whereas the definition
of the overall svn_info2_t is global, in svn_types.h, with the WC part
being declared there as a pointer to an "incomplete" structure type.

However, there is a problem with this, which is exposed by asking "how
would we write an svn_info2_dup() function?" Any structure like this
should have a 'dup' function. The function should be global (so
implemented in the lowest layer, libsvn_subr), as any code using an
svn_info2_t might need it. Yet it needs to access libsvn_wc data and
use libsvn_wc functions in order to duplicate the WC part. That creates
a cyclic dependency - a layering violation.

The interesting thing to note is that dependencies between "repos info"
and "WC info" are not symmetrical. The repos layer does not know
anything about WC info, but the WC layer *does* already know repos info
about each node. Therefore the shape we need, in order to cleanly
express info about a WC node, is rather the other way around, something
like this:

  svn_wc_info2_t:
  +-----------------------------------------------+
  | |
  | svn_node_info_t: |
  | +------------------+ |
  | | | |
  | |(repos node info) | (wc node info) |
  | +------------------+ |
  +-----------------------------------------------+

Then to report a WC node we would use svn_wc_info2_t, whereas to report
a repos node (nothing to do with a WC) we would use this putative
'svn_node_info_t' alone.

So why do we currently have the WC info innermost? I think it is
because the only thing we do with the info is pass it to a generic info
receiver function that prints whatever info is there (the WC info and/or
the repos node info), and the repos node info is the common part of what
it receives, so it feels Wrong to declare the receiver as taking a "WC
info" structure and more Right to declare it as taking a "generic info
structure that might include WC info".

Now, I think we have to analyze that last thought more carefully. If we
think about a replaced WC node, for example, we can see that a WC node
is not logically "a repository node plus some WC-specific info"; it
might in fact refer to two different repository nodes, or none at all in
some cases. So the feeling was misleading. The receiver's input should
instead be structured differently. Basically, in pseudo-code it should
be:

  info_receiver(repos-info OR wc-info)

in some concrete form or other.

I'm working on how exactly to res-structure this now. Let me know any
thoughts.

2) An independent thing. Note that the "repos node basic info" fields
in svn_info2_t are exactly what the existing 'svn_dirent_t' structure is
defined to hold:

  /** A general subversion directory entry. */
  typedef struct svn_dirent_t
  {
    /** node kind */
    svn_node_kind_t kind;

    /** length of file text, or 0 for directories */
    svn_filesize_t size;

    /** does the node have props? */
    svn_boolean_t has_props;

    /** last rev in which this node changed */
    svn_revnum_t created_rev;

    /** time of created_rev (mod-time) */
    apr_time_t time;

    /** author of created_rev */
    const char *last_author;

    /* IMPORTANT: If you extend this struct, check svn_dirent_dup(). */
  } svn_dirent_t;

The only differences are different names for the fields and one extra
field, 'has_props'.

I would prefer to re-use the svn_dirent_t structure instead of putting
separate fields in an info structure. (If a given info reporting
function doesn't fill in the 'has_props' field, we will simply say so.)
That would simplify existing code such as libsvn_client's
build_info_from_dirent() by allowing us to pass that single
sub-structure around directly instead of unpacking and re-packing its
fields. With this change, the current svn_info2_t definition would look
like:

  repos node coordinates:
    repos_root_URL
    repos_UUID
    rev
    URL

  repos node basic info (svn_dirent_t):
    node_info

  repository path lock (svn_lock_t):
    lock

  WC info (svn_wc_info_t):
    wc_info

(And then we can observe that the set of repos node coordinates is
another group of fields that we should some time create a structure for.
But that's a separate issue.)

- Julian
Received on 2011-06-07 13:35:16 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.