Re: Potential Google Summer of Code 2007 project

From: David James <james_at_cs.toronto.edu>
Date: 2007-03-20 20:30:54 CET

Hi Edin,

It's great to hear you're interested in helping out with the Python
bindings. I'd be happy to mentor any students who are interested in
this task.

Currently, the Subversion Python bindings are implemented using SWIG
(see http://www.swig.org/). SWIG is an automatic wrapper generator
which generates bindings for several languages including Python.

Inside SWIG "interface" files, we provide SWIG with typemaps which
explain how to convert between Python datatypes and Subversion native
datatypes.

These datatype conversions include:
   - apr_array_header_t <-> Python array
   - apr_hash_t <-> Python dictionary
   - svn_string_t <-> Python string
   - svn_stringbuf_t <-> Python string
   - svn_error_t * ==> (If an error occurs, throw a Python exception)

If SWIG were perfect, our SWIG bindings could probably consist of five
short interface files, which explained how to convert those five
datatypes between Python and Subversion, and we would have basic, raw,
Python bindings, which look a lot like the original Subversion/C API,
but are written in Python.

Let's pretend for a few moments that SWIG is perfect, and that we
already have perfect SWIG bindings for all of Subversion's datatypes.
(This is by no means true! But let's pretend for a moment.)

Now that we have raw bindings to Subversion, your job is to create a
friendly object-oriented interface to Subversion which is more
Pythonic in nature.

What would Pythonic bindings look like? Here are a few examples of what
you would be able to do with Pythonic bindings:

   client = SubversionClient("http://svn.collab.net/repos/svn/trunk",
                             username="joecommitter",
                             password="joepassword")

   # Print out the README
   print client.cat("README")

   # Find out who wrote "line 1" of the README file
   lines = client.blame("README")
   print lines[0].author

   # Move a file
   client.move("README", "README2",
               message="Move the readme file to a less obvious location")

   # Do several operations in a single transaction
   files = client.ls()
   txn = client.transaction()

   # Make sure that every file in the "trunk" directory starts with a
   # "hip_" prefix.
   for file in files:
      txn.move(file, "hip_%s" % file)

   # Commit all of the changes at once
   txn.commit(message="Make the Subversion trunk more hip!")

   # Checkout repositories to disk
   wc = client.checkout()

   # Commit changes
   open("hip_README2","w").write("Hello world\n")
   wc.commit(message="Update the README2 file")

Wouldn't it be great if the Python bindings were that easy to use?
Unfortunately, it's not that easy, but it should be. We need a summer
volunteer to fix this problem as a Summer of Code project.

Unfortunately, there are a few obstacles (imperfections in Subversion's
SWIG bindings) which might stand in your way:
   1) SWIG does not automatically wrap arguments to callback
functions. Instead, you must create a C callback function which
manually converts arguments between Python and C using SWIG's APIs.
   2) SWIG does not automatically wrap pointers which are contained
inside arrays or hashes. Instead, you must write a typemap which
manually converts these pointers between Python and C using SWIG's
APIs.

We have already written a large number of conversion functions which
handle (1) and (2), but they're really not much fun to write. Writing
them involves a lot of careful work and error checking.

Once the SWIG typemaps are complete, we also suffer from another
problem: the generated SWIG bindings are undocumented. If you read
through the Subversion include files, you can understand a great deal
about how the Subversion bindings work, but your understanding will
not be complete until you understand the SWIG bindings as well.

When I write functions which use the Python bindings, I often consult
the source code of the SWIG interface files to see how the Python
datatypes will be converted into Subversion datatypes, so as to make
sure that I am providing the correct arguments. We aren't completely
consistent about how we convert datatypes between Python and C, so you
may often have to consult the source code to make sure that a function
behaves as you expect.

That said, let's take another look at your primary task, which is to
write higher-level bindings for Subversion in Python. As written, our
project page suggests that these higher-level bindings for Subversion
must be based upon our lower-level SWIG bindings, but I'd like to
suggest an alternative: ctypes.

As of Python 2.5, the ctypes module is part of the standard library,
so it's possible to access C data structures directly from Python.
Using ctypes, you can access the complete Subversion API via pure
Python, without needing to write any complex SWIG or C wrappers. ctypes
is also available for older Python versions as a separate download.
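
To make "access C data structures directly" a bit more concrete, here
is a minimal sketch of reading an APR array of strings from Python. The
field layout of AprArrayHeader is my assumption of what
apr_array_header_t looks like in apr_tables.h, and the helper name
string_array_to_list is made up for illustration. The array is built by
hand here so that the example runs without APR installed; in real use
the header would come back from a Subversion function.

```python
import ctypes


class AprArrayHeader(ctypes.Structure):
    """Assumed mirror of apr_array_header_t (check against apr_tables.h)."""
    _fields_ = [
        ("pool", ctypes.c_void_p),    # apr_pool_t *, treated as opaque
        ("elt_size", ctypes.c_int),   # size of one element in bytes
        ("nelts", ctypes.c_int),      # number of elements in use
        ("nalloc", ctypes.c_int),     # number of elements allocated
        ("elts", ctypes.c_void_p),    # char *, start of element storage
    ]


def string_array_to_list(header):
    """Convert an array of C strings (const char *) into a Python list."""
    if header.elt_size != ctypes.sizeof(ctypes.c_char_p):
        raise ValueError("array does not hold pointers")
    items = ctypes.cast(header.elts, ctypes.POINTER(ctypes.c_char_p))
    return [items[i] for i in range(header.nelts)]


# Stand-in for an array that Subversion would normally allocate.
storage = (ctypes.c_char_p * 2)(b"trunk/README", b"trunk/INSTALL")
header = AprArrayHeader(
    pool=None,
    elt_size=ctypes.sizeof(ctypes.c_char_p),
    nelts=len(storage),
    nalloc=len(storage),
    elts=ctypes.addressof(storage),
)
paths = string_array_to_list(header)
print(paths)
```

The point is only that the conversion is a few lines of Python rather
than a typemap in a SWIG interface file.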

ctypes is much simpler than SWIG, and supports many key features that
SWIG does not. For example, ctypes directly supports callback
functions, so you do not need to waste time creating verbose wrappers
in C for every callback function that Subversion supports. ctypes also
supports composite datatypes (e.g. an array of strings), so you don't
need to create individual typemaps for every possible composite
datatype. In fact, using ctypes, you don't need to create typemaps at
all -- you can just call the C functions directly, without worrying
about typemaps.
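
As a rough illustration of both points (callbacks and composite
datatypes), the sketch below passes a Python function to the C
library's qsort() and sorts an array of C strings. It uses libc rather
than Subversion so that it runs anywhere; the assumption is a POSIX
system where the C library can be loaded as shown. The names
CompareFunc and compare_paths are made up for the example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library() returns None, CDLL(None)
# still exposes the C library symbols of the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Signature of the comparison callback expected by qsort().
CompareFunc = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_char_p),
    ctypes.POINTER(ctypes.c_char_p),
)


def compare_paths(left, right):
    """Called from C with pointers to two elements of the array."""
    a, b = left[0], right[0]
    return (a > b) - (a < b)


libc.qsort.argtypes = [
    ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CompareFunc,
]
libc.qsort.restype = None

# A composite datatype: an array of C strings, built without a typemap.
paths = (ctypes.c_char_p * 3)(b"trunk", b"branches", b"tags")
libc.qsort(paths, len(paths), ctypes.sizeof(ctypes.c_char_p),
           CompareFunc(compare_paths))
sorted_paths = list(paths)
print(sorted_paths)
```

A Subversion callback (a log receiver, say) would be declared the same
way, with a CFUNCTYPE matching the C prototype from the header file.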

If you use ctypes, you won't have to write (or modify) any code in C.
You can write everything in Python. As a result, your compile/debug
cycle will be very quick -- you won't have to recompile Subversion or
our Python bindings in order to test out a change to your higher-level
layer.

Overall, I think that it would be easier to write higher-level Python
bindings using ctypes than it would be to write them using our existing
SWIG bindings. In this case, you would truly be able to depend on
having a clear interface between your bindings and the underlying C
implementation.
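
One piece the higher-level layer would still need is the error
conversion from the list above (svn_error_t * to a Python exception).
Here is a sketch of how that might look with ctypes. The SvnError
layout is my assumption of svn_error_t in svn_types.h, and
SubversionException and check_error are hypothetical names. The error
chain is built by hand, with made-up error codes, so that the example
runs without Subversion. Real code would also have to free the error
(svn_error_clear) once it has been converted.

```python
import ctypes


class SvnError(ctypes.Structure):
    """Assumed mirror of svn_error_t (check against svn_types.h)."""


SvnError._fields_ = [
    ("apr_err", ctypes.c_int),            # apr_status_t
    ("message", ctypes.c_char_p),
    ("child", ctypes.POINTER(SvnError)),  # next error in the chain
    ("pool", ctypes.c_void_p),
    ("file", ctypes.c_char_p),
    ("line", ctypes.c_long),
]


class SubversionException(Exception):
    """Hypothetical exception type for the higher-level layer."""

    def __init__(self, code, messages):
        Exception.__init__(self, ": ".join(messages))
        self.code = code


def check_error(err_ptr):
    """Raise SubversionException if err_ptr is a non-NULL svn_error_t *."""
    if not err_ptr:
        return
    code = err_ptr.contents.apr_err
    messages = []
    while err_ptr:  # walk the chain until the NULL child pointer
        text = err_ptr.contents.message
        if text:
            messages.append(text.decode("utf-8", "replace"))
        err_ptr = err_ptr.contents.child
    raise SubversionException(code, messages)


# Hand-built error chain with made-up codes, for demonstration only.
inner = SvnError(apr_err=2, message=b"No such file or directory")
outer = SvnError(apr_err=1, message=b"Path not found",
                 child=ctypes.pointer(inner))
caught = None
try:
    check_error(ctypes.pointer(outer))
except SubversionException as exc:
    caught = exc
    print(exc.code, exc)
```

Setting a function's restype to ctypes.POINTER(SvnError) and calling
check_error on the result would give every wrapped call the same
behaviour.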

As part of your proposal, I'd love it if you could write a simple Python
program which does a few simple things using Subversion. Perhaps you
can move a file from one location to another, directly in the
repository? This shouldn't be too hard to implement using ctypes.
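
As a possible first step towards such a program (it does not move
anything yet), here is a sketch which loads libsvn_client via ctypes
and asks it for its version. It assumes the library can be found under
the name "svn_client-1" and that svn_version_t has the layout shown;
please check both against your installation. A real move would also
need APR initialization, a pool and a client context, which I've left
out.

```python
import ctypes
import ctypes.util


class SvnVersion(ctypes.Structure):
    """Assumed mirror of svn_version_t (check against svn_version.h)."""
    _fields_ = [
        ("major", ctypes.c_int),
        ("minor", ctypes.c_int),
        ("patch", ctypes.c_int),
        ("tag", ctypes.c_char_p),
    ]


def client_version():
    """Return libsvn_client's version string, or None if unavailable."""
    path = ctypes.util.find_library("svn_client-1")  # assumed name
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # const svn_version_t *svn_client_version(void);
    lib.svn_client_version.argtypes = []
    lib.svn_client_version.restype = ctypes.POINTER(SvnVersion)
    ver = lib.svn_client_version().contents
    tag = (ver.tag or b"").decode("ascii", "replace")
    return "%d.%d.%d%s" % (ver.major, ver.minor, ver.patch, tag)


version = client_version()
print(version if version else "libsvn_client-1 not found")
```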

Don't forget to budget lots of time for test cases and documentation!

Cheers,

David

Received on Tue Mar 20 20:31:15 2007

This is an archived mail posted to the Subversion Dev mailing list.