[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Merging Ctypes Python Branch

From: Eric Gillespie <epg_at_pretzelnet.org>
Date: 2007-08-31 20:34:58 CEST

I've finished reviewing the high-level interface. A lot of these
things are probably on someone's TODO list already.
Nevertheless, I think this needs more work before merging onto
trunk, especially now that we're starting to firm up the 1.5
release schedule.

- Style nits:

  Lots of lines are broken for no good reason, e.g. in

- self._info_func = \
- svn_info_receiver_t(self._info_wrapper)
+ self._info_func = svn_info_receiver_t(self._info_wrapper)

  We as a project don't seem to have a rule about trailing
  whitespace, much to my dismay, but some folks (notably hwright)
  have gone around cleaning it up of late. Some of these files
  have a really bad case of trailing whitespace. Please use
  spacehi.vim or (setq show-trailing-whitespace t) :).

  Docstrings should adhere to the style guide:

  repos._fs.root has spaces around the = in default arguments,
  but nothing (or almost nothing) else does. Python style is no
  spaces here.

- WC._build_path_list

  This sort of magic is generally a bad idea. It isn't at all
  easier for callers to pass in "file" than ["file"], but
  allowing both can lead to hidden bugs. I looked to see if
  other functions taking path lists behave this way, but it seems
  to be limited to wc.py.

  Not all methods use _build_path_list; lock and unlock, for
  example, just take a list.

- WC.pool and WC.iterpool

  We use the name "iterpool" for pools we clear on each iteration
  of a loop. I'm not sure of its usage here. Some methods use
  it, others use self.pool (which is never cleared). Some
  methods that use self.iterpool clear it, some don't. For those
  that clear it, I wonder about the lifetime of objects returned.

  Specifically, I was thinking about commit_info. I see that
  WC.commit does not clear self.iterpool; is that so the returned
  commit_info will still work? It's no good to have these
  methods vary so.

  I'm not sure we can do anything to save quick scripts from pool
  management except allocating everything in one pool that we
  never clear. I've always assumed that's what we do in swig,
  though I'm not sure, as I manage pools myself.

- WC.copy WC.move

  Need to return commit_info, not pass in NULL.

- Rename WC.resolve to WC.resolved (actually, you have both).

- I don't understand this:

# As of ctypes 1.0, ctypes does not support custom error-checking
# functions on callbacks, nor does it support custom datatypes on
# callbacks, so ctypesgen sets the return values for callbacks
# to c_void_p.
# Callback functions, therefore, won't have their errors checked
# automatically. Users who call these functions should wrap
# their function call using SVN_ERR so as to ensure that any
# exceptions are thrown appropriately.

  Does it mean that I must wrap all svn calls I make inside my
  progress_func callback with SVN_ERR()? Or does it mean I must
  wrap all calls I make to an svn function takes a callback in
  SVN_ERR()? Also, this needs to be in user-facing documentation
  somewhere, not buried in a comment down here.

- Why the staticmethod on the callback wrappers? That shouldn't
  be necessary. If you drop that, you'll get self as the first
  argument as normal, and can ignore the baton entirely. Batons
  are just a way to fake closures in C; Python code should never
  have to bother with them.

- cancel_func

  How is this supposed to work? It seems to me that for the
  high-level side, my callback should do something like 'raise
  csvn.CancelError("Caught signal")' and the high-level wrapper
  should catch that and return an appropriate svn_error_t. The
  current high-level cancel wrapper doesn't do anything like
  that, but it also ignores my callback's return value, so I
  can't even do it the low-level way. I don't think these work
  right now.

  setup_auth_baton passes in a bogus cancel_func.

- progress_func

          For details of apr_off_t objects, see the apr_off_t

  Cute ;->. The high-level cb wrapper should translate to normal
  Python longs before calling the real cb.

  I bet other callback type operations, like log and status, have
  a similar problem: receiving paths as some ctypes object rather
  than a str, for example.

- WC.checkout should return result_rev

- RepositoryURI.__init__

  More of the magic with the uri argument, this time with an
  encoded parameter so callers can work around it. As with
  character encoding, the barriers should be at clearly defined
  places (usually at end-uesr interaction), not at random layers.
  Just require a URI-encoded argument.

  I'm not sure this class is a good idea in the first place,
  though. It seems to me that a high-level, Pythonic interface
  should integrate well with standard Python modules and idioms,
  not export the bits of svn and apr that Python already has.
  It's the same reason the high level interface should take
  lists, strs, dicts, and so on as opposed to ctyped apr
  equivalents. In this case, I want to use urlparse.

- RemoteRepository -- well, it's not always remote. I suggest
  just Repository and LocalRepository. Furthermore, the Local
  version is missing some things, such as the import method.
  LocalRepository should be a superset of Repository. I suggest
  making Local a subclass of Repository; it can just prepend
  file:/// and let the superclass methods do all the work.

- Rename RemoteRepository.log's verbose argument to
  discover_changed_paths. Same for _LogMessageReceiver. Also,
  _LogMessageReceiver.receiver probably ought to set
  entry.changed_paths based on whether the changed_paths hash is
  NULL or not.

- Are you planning to use CallbackReceiver to turn all functions
  which call receiver callbacks into generators?

- RemoteRepository.revprop_get should use get_dir or get_file.

- RemoteRepository._log_func_wrapper is duplicated in wc.py

- LocalRepository.__init__

  A separate create constructor would be better than having a
  create argument here.

- LocalRepository vs. RemoteRepository

  - Remote
    proplist, propget
    revprop_list, revprop_get, revprop_set

  - Local
    no node prop methods at all
    set_rev_prop, get_rev_prop, no list

  revprop vs. rev_prop
  noun_verb vs. verb_noun

  Yeah, I know, svn is awful about this, but no reason to
  propagate the mess into the high-level layer. I vote for
  revprop and verb_noun.

- inconsistent assert

  IMO, just drop all the asserts. If someone fails to provide a
  working User instance, for example, that's their problem, and
  they'll get appropriate exceptions. It's no different from
  passing a bool as get_rev_prop's name argument.

  If we're going to go with the asserts, though, we should use
  them everywhere. For example, LocalRepository.txn asserts
  self.user but _get_commit_editor does not.

- _get_commit_editor (and others) have comments that should
  become docstrings.

- _fs and _fs_root can lose the NOTE about being private; the
  leading _ already indicates that.

- _txn_operation.__init__ should use SVN_INVALID_REVNUM instead
  of -1 for copyfrom_rev default argument.

  _txn_operation had me scratching my head for a while, but I
  think I understand now. So, the Txn class is all about
  building up a transaction in client memory and only driving the
  editor when the caller commits? Why?

  I'm not saying it's never a good idea, but it seems odd for
  this to be the default mode of operation, much less the only
  mode. Also, the open method isn't open at all; rename it
  (add_op? queue_op? dunno).

  Finally, it's important to hold open the file batons for which
  we'll send txdelta until the end. The performance impact is

- Txn.upload -- rename to import? "Upload" is so generic as to
  be meaningless; all txn operations are uploading of a sort.
  However, this upload method seems to do almost exactly the same
  thing as svn import, so that seems a better name.

  You also have RemoteRepository.svnimport (redundant "svn" in
  the name), and it looks like the only real difference is that
  upload will replace. That's a nice feature, but why not make
  it optional, with a bool argument 'replace', and make the
  import method of the two Repository classes a simple wrapper
  around Txn import/commit calls.

- Most of the Txn methods do a check_path first, which is really
  going to hurt performance, but is redundant besides. If I try
  to mkdir an existing path, I'll get an error. SVN_ERR_BAD_URL
  is the wrong code anyway; it should be SVN_ERR_FS_NOT_FOUND.
  Regarding "No such file or directory", I think we normally just
  say "path not found".

- Txn check_path and copy use peg rev syntax in their doc
  strings, which is not right; I think they should say "PATH in
  REV" instead, to reduce confusion.

- Txn.copy doesn't need an if for the path kind; both blocks are
  identical anyway.

- Txn.commit -- the entire span when the edit is open needs to be
  wrapped in a try/except/abort, and for all exceptions. And
  since an exception is already in progress, the inner except
  should also trap all exceptions:

@@ -225,13 +226,13 @@
         (editor, editor_baton) =
             self.commit_callback, commit_baton, self.pool)
- child_baton = c_void_p()
+ child_baton = c_void_p()
             self.root.replay(editor[0], self.session, base_rev,
- except SubversionException:
+ except:
- except SubversionException:
+ except:

- SvnDate -- What I want in a high-level interface is easy
  integration with standard Python date/time modules, not

Eric Gillespie <*> epg@pretzelnet.org
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Aug 31 20:32:09 2007

This is an archived mail posted to the Subversion Dev mailing list.