
Python Command-line Client Bindings

From: David James <james82_at_gmail.com>
Date: 2005-06-29 04:33:20 CEST

This summer, as part of the Google Summer of Code initiative, I'm
planning to create a port of the Subversion Command-Line client that
uses the Python/SWIG bindings. Once this new software is complete,
we'll be able to use the standard tests for the regular Subversion
Command-Line Client to identify bugs in the Python bindings.

The current state of the Python tests is glibly described in
tools/test-scripts/svntest-bindings.sh:
 "Hey! My friends have got beautiful and shining unit tests,
  but I have been left out in the cold. This is soooo unfair!"

Here's what I'm planning to do in the next few weeks:

- Refactor the C command-line client:
  * Create a libsvn_cmdline library by extracting functions from
    the main command-line client and moving them to the
    subversion/libsvn_cmdline directory.
  * Create new API header file for the libsvn_cmdline library
    based on cl.h. I'm thinking of calling the new header file
    svn_cmdline_cl.h.
  * Simplify the svn_cmdline_cl.h file so that SWIG can have
    an easier time parsing it. This means we'll have to change
    functions declared as callbacks into vanilla function
    definitions. The standard syntax isn't as compact,
    but it's SWIG-compatible.
  * Simplify the main function in the command-line client by
    splitting it into several smaller functions. This refactoring
    will make it easier for me to slowly convert the main function
    from C into Python, piece by piece.
- Expand the functionality of the Python/SWIG bindings:
  * Create new svn_cmdline module for the new libsvn_cmdline library
  * Create new svn_diff module for the libsvn_diff library. This is
    mostly a stub for now.
  * Create new svn_apr module for the APR library. Create mappings
    for any functionality that is used in the command-line client.
    Two candidates right now: the apr_getopt.h and apr_allocator.h
    header files.
  * Create bindings for the libsvn_cmdline library
  * Create new typemaps for types such as char **
- Refactor the Python/SWIG bindings:
  * Move the FILE* typemap from core.i to a new file called 'file.i'
    so that it can be reused in the new svn_cmdline bindings.
  * Remove the svn_cmdline_init function from the svn_delta.i module.
    This function has been added to the svn_cmdline module instead.
- Convert the C command-line client to Python, piece by piece.
  I'll start with the main function. Then I'll move on to the
  implementations of each basic Subversion command:
   * add, blame, cat, checkout, cleanup, commit, copy, delete,
     diff, export, help, import, info, list, lock, log, merge,
     mkdir, move, propdel, propedit, propget, proplist, propset,
     resolved, revert, status, switch, unlock, update
  To help keep the implementation and maintenance of the new
  functions simple, the C code for each function will be
  directly transliterated into Python. I'll try to use the
  exact same API calls and the exact same logic but in Python
  syntax.

  Along the way, I'll be sure to run into a few issues with
  the C code and the SWIG bindings. Some of these issues have
  already been identified above.
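
  As a rough illustration of the piece-by-piece conversion, here is a
  minimal sketch of a Python command table in the spirit of
  svn_cl__cmd_table. Everything in it is hypothetical: the cmd_*
  functions are placeholders rather than real bindings calls, and only
  three commands are shown. The aliases "co" and "ci" are the ones the
  regular client accepts for checkout and commit.

```python
class UnknownCommand(Exception):
    """Raised when argv names a subcommand that is not in the table."""


# Placeholder implementations (hypothetical). A real version would call
# the same client API functions as the C code it replaces.
def cmd_add(args):
    return "add %s" % " ".join(args)


def cmd_checkout(args):
    return "checkout %s" % " ".join(args)


def cmd_commit(args):
    return "commit %s" % " ".join(args)


# One row per subcommand: (canonical name, aliases, implementation).
COMMAND_TABLE = [
    ("add", (), cmd_add),
    ("checkout", ("co",), cmd_checkout),
    ("commit", ("ci",), cmd_commit),
]


def build_lookup(table):
    """Map every canonical name and alias to its implementation."""
    lookup = {}
    for name, aliases, func in table:
        for key in (name,) + tuple(aliases):
            if key in lookup:
                raise ValueError("duplicate command name: %s" % key)
            lookup[key] = func
    return lookup


def dispatch(argv, lookup):
    """Run the subcommand named by argv[0] with the remaining arguments."""
    if not argv:
        raise UnknownCommand("no subcommand given")
    try:
        func = lookup[argv[0]]
    except KeyError:
        raise UnknownCommand("unknown command: %r" % argv[0])
    return func(argv[1:])
```

  With a table like this, each row can be switched from a thin wrapper
  around the C function to a pure Python implementation independently
  of the others.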

Questions:
- Is it OK to create the new libsvn_cmdline library? It will make my
  command-line SWIG work this summer much easier.
- Is it possible to mark the svn_cmdline_cl.h API as experimental
  and subject to change? I think that the API for the command-line
  client could go through quite a few changes and improvements this
  summer and I don't want to have to worry about revving the API yet.
- Is it OK to use malloc / free in the SWIG bindings? In some cases,
  this seems to be the only solution because APR pools are not always
  available in a typemap.

For those who haven't read my Python proposal, I've attached it below.
Feedback and advice are always welcome.

Project Title: Command-line Bindings for Python

----------------------------------------------------------------------
Synopsis
----------------------------------------------------------------------

Subversion is not just a version-control system. It is also a library.
The Subversion library officially supports five programming languages:
C, Perl, Python, Ruby, and Java. As Subversion is updated to fix bugs
and support new features, it is often difficult to know whether these
changes will cause problems in the various programming language
bindings. This problem of hidden bugs is particularly acute in the
Python/SWIG bindings because they do not have an automated test suite.

To help Subversion developers more quickly identify bugs in the Python
bindings, I will implement a clone of the standard command-line
client using the Python/SWIG bindings. This clone will allow the
existing test suite for the Subversion command-line client to also test
the Python command-line client. If this testing reveals bugs or missing
features in the underlying Python/SWIG bindings, the Subversion
developers will automatically be notified via the svn-breakage mailing
list.

----------------------------------------------------------------------
Benefits
----------------------------------------------------------------------

Benefits for Python Developers:
- Simple and consistent usage: Type the same commands into Python as
 you would on the command-line.
- Because the new interface is implemented based on the existing
 low-level library, high-level command-line calls and low-level
 library calls can be intermixed during a single session.
- The existing high-performance Python/SWIG bindings now have proven
 reliability, thanks to the extensive automated test suite for the
 command-line client.

Benefits for Subversion Developers:
- Increased adoption of Subversion in the Python community.
- Automatic nightly test suite will notify developers if changes to the
 Subversion code break the Python bindings.
- Upgraded Python bindings will be easy to maintain because they build
 upon the framework established in the existing SWIG bindings (e.g.
 Ruby, Perl, Python)

----------------------------------------------------------------------
Deliverables
----------------------------------------------------------------------

Code:
- 30 functions. Each implements a basic command
 * add, blame, cat, checkout, cleanup, commit, copy, delete, diff,
   export, help, import, info, list, lock, log, merge, mkdir, move,
   propdel, propedit, propget, proplist, propset, resolved, revert,
   status, switch, unlock, update

- 40 command-line options
 * auto-props, config-dir, diff-cmd, diff3-cmd, dry-run, editor-cmd,
   encoding, extensions, file, force, force-log, ignore-ancestry,
   ignore-externals, incremental, limit, message, native-eol, new,
   no-auth-cache, no-auto-props, no-diff-deleted, no-ignore,
   no-unlock, non-interactive, non-recursive, notice-ancestry,
   old, password, quiet, recursive, relocate, revision, revprop,
   show-updates, stop-on-copy, strict, targets, username, verbose,
   version

Testing:
- The test suite for the standard client will be adapted so that it can
 test the Python client
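
One way to picture the adapted testing is a helper that runs the same
arguments through both clients and reports any difference. This is only
a sketch under stated assumptions: it is not how the existing test
suite is structured, the script name "svn.py" is hypothetical, and it
uses the subprocess module from a modern Python.

```python
import subprocess
import sys

# Hypothetical command prefixes for the two clients under comparison.
C_CLIENT = ["svn"]
PYTHON_CLIENT = [sys.executable, "svn.py"]


def run_client(client_argv, args):
    """Run one client and return (exit code, stdout, stderr)."""
    proc = subprocess.run(
        list(client_argv) + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def compare_clients(reference, candidate, args):
    """Return (field, reference value, candidate value) for each mismatch."""
    ref = run_client(reference, args)
    cand = run_client(candidate, args)
    names = ("exit code", "stdout", "stderr")
    return [(n, r, c) for n, r, c in zip(names, ref, cand) if r != c]
```

A call such as compare_clients(C_CLIENT, PYTHON_CLIENT, ["status", "wc"])
returns an empty list when both clients agree on exit code, standard
output, and standard error.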

----------------------------------------------------------------------
Implementation Plan
----------------------------------------------------------------------

1. Write a simple script which creates a Python command-line parser
  based on the svn_cl__options and svn_cl__cmd_table structures from
  main.c in subversion/clients/cmdline. The Python standard optparse
  module will do most of the work, but we will also need to write
  custom code to parse Subversion revision numbers and ranges.

2. Upgrade our script to dispatch each command to the appropriate
  command-line client function. This initial prototype will provide
  the full functionality of the Subversion client, but will only test
  the surface functionality of SWIG.

3. Upgrade the Subversion automated test suite to test our new script
  using the command-line test suite. Fix any errors that are found.

4. Replace each command-line C function in the Python command-line
  client with an appropriate Python implementation. Each function
  should be implemented, tested, and committed as a separate patch. If
  adding a new function reveals a bug in the underlying SWIG/Python
  library, these bugs should be reported to the Subversion development
  list.
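
Step 1 can be sketched as follows. This is a minimal illustration, not
the planned implementation: it wires only three of the options into
optparse, the helper names are hypothetical, and the revision parser
handles plain numbers, the HEAD, BASE, COMMITTED and PREV keywords, and
N:M ranges, but not {DATE} revisions.

```python
from optparse import OptionParser, OptionValueError

REVISION_KEYWORDS = ("HEAD", "BASE", "COMMITTED", "PREV")


def parse_revision(text):
    """Turn one revision word into a (kind, value) pair."""
    if text.isdigit():
        return ("number", int(text))
    # This sketch accepts keywords in any letter case.
    if text.upper() in REVISION_KEYWORDS:
        return ("keyword", text.upper())
    raise ValueError("invalid revision: %r" % text)


def parse_revision_range(text):
    """Parse 'N' or 'N:M' into (start, end); end is None for a single revision."""
    start, sep, end = text.partition(":")
    return (parse_revision(start), parse_revision(end) if sep else None)


def _revision_callback(option, opt_str, value, parser):
    # optparse calls this with the raw string that followed -r/--revision.
    try:
        parser.values.revision = parse_revision_range(value)
    except ValueError as err:
        raise OptionValueError("%s: %s" % (opt_str, err))


def build_parser():
    """Build a parser for a small, illustrative subset of the options."""
    parser = OptionParser(usage="%prog SUBCOMMAND [options] [args]")
    parser.add_option("-r", "--revision", type="string", dest="revision",
                      action="callback", callback=_revision_callback,
                      help="a revision or a range of revisions")
    parser.add_option("-q", "--quiet", action="store_true", default=False,
                      help="print as little as possible")
    parser.add_option("-m", "--message", dest="message",
                      help="specify commit message ARG")
    return parser
```

Calling build_parser().parse_args(["log", "-r", "10:HEAD", "trunk"])
leaves the subcommand and its target in the positional arguments, ready
to be handed to the dispatch code described in step 2.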

----------------------------------------------------------------------
Project Schedule
----------------------------------------------------------------------

This two-month plan assumes that I will start work on this project
on June 25. The project will be complete by September 1, 2005.

1. Planning and Approval (4 days)
- Send the technical details of my plan to Subversion developers
  and solicit feedback
- Revise my plan as necessary to meet the needs of the Subversion
  developers
- Apply to be a partial committer for the Python bindings

2. Initial Prototype (2 weeks)
- Write a simple script which creates a Python command-line parser
- Upgrade our script to dispatch each command to the appropriate C
  function
- Upgrade the Subversion automated test suite to test our new script

3. Implementation, Documentation & Testing (7 weeks)
- Replace each command-line C function in the Python command-line
  client with an appropriate Python implementation. Implement 5 or 6
  API functions per week.
- Monitor Subversion developer list and fix issues as required

4. Project is complete

----------------------------------------------------------------------
Appendix A: Why stick with the existing Python/SWIG bindings?
----------------------------------------------------------------------

Subversion developers love SWIG because it saves them time. "Sharing
the core of the bindings implementation across languages", writes
Daniel Rall, "is powerful reuse." [1]

Nevertheless, the Python/SWIG bindings are not the only libraries which
offer access to the Subversion library. The PySVN and SvnCpp projects
are both written in C++, and they both offer documented and tested
bindings for the Subversion library. However, both sets of bindings
suffer from the same key problem: they only wrap a small subset of
Subversion's functionality, and they reimplement functionality which is
already available in the existing SWIG bindings for Subversion.
Maintaining the two sets of bindings in parallel would be too large a
task for the Subversion development team.

As a result of this situation, Python developers are forced to make a
difficult choice between ease of use and complete functionality.
According to Max Bowsher, "PySVN wraps much less of the API than the
SWIG-Python bindings do, but does it in a higher level (and documented)
way -- it's all about tradeoffs, really." [2] The situation with SvnCpp
is much the same as with PySVN, except that SvnCpp does not
directly support Python.

In 2004, Ben Reser announced that, for Python, the "SWIG stuff is
pretty much done. You could write the OO layer entirely in Python." [3]
To a developer who considered reimplementing the bindings in Pyrex, Ben
Reser advised: "I think your time would be better spent working on
writing the OO layer on top of SWIG." [3]

Greg Stein also attests to the quality of the SWIG/Python bindings:
"I've been using the Python Bindings for years. Literally." [4]

In building a command-line client in Python, we will get an extensive
test suite and a rudimentary object-oriented interface to the
SWIG/Python bindings for free. While this interface will initially only
support the basic functionality of the Subversion client, we can in
future extend this interface to support additional functionality.

[1]: http://svn.haxx.se/dev/archive-2004-04/1044.shtml
[2]: http://svn.haxx.se/dev/archive-2005-02/0748.shtml
[3]: http://svn.haxx.se/dev/archive-2004-04/1395.shtml
[4]: http://svn.haxx.se/dev/archive-2004-05/0407.shtml

----------------------------------------------------------------------
Appendix B: Functionality Listing
----------------------------------------------------------------------

All documented commands of the command-line client will be supported:

 * add: Add files, directories, or symbolic links to your working
             copy and schedule them for addition to the repository.
 * blame: Show author and revision information in-line for the
             specified files or URLs
 * cat: Output the contents of the specified files or URLs
 * checkout: Check out a working copy from a repository
 * cleanup: Recursively clean up the working copy
 * commit: Send changes from your working copy to the repository
 * copy: Copy a file or directory in a working copy or in the
             repository
 * delete: Delete an item from a working copy or the repository
 * diff: Display the differences between two paths
 * export: Export a clean directory tree
 * help: Describe the usage of this program or its subcommands
 * import: Recursively commit a copy of PATH to URL
 * info: Print information about PATHs
 * list: List directory entries in the repository
 * lock: Lock working copy paths or URLs in the repository, so
             that no other user can commit changes to them.
 * log: Display commit log messages
 * merge: Apply the differences between two sources to a working
             copy path
 * mkdir: Create a new directory under version control
 * move: Move a file or directory
 * propdel: Remove a property from an item
 * propedit: Edit the property of one or more items under version
             control
 * propget: Print the value of a property
 * proplist: List all properties
 * propset: Set PROPNAME to PROPVAL on files, directories, or
             revisions
 * resolved: Remove 'conflicted' state on working copy files or
             directories
 * revert: Undo all local edits
 * status: Print the status of working copy files and directories
 * switch: Update working copy to a different URL
 * unlock: Unlock working copy paths or URLs
 * update: Update your working copy

The following command-line options will be supported:

 * auto-props: enable automatic properties
 * config-dir: read user configuration files from directory ARG
 * diff-cmd: use ARG as diff command
 * diff3-cmd: use ARG as merge command
 * dry-run: try operation but make no changes
 * editor-cmd: use ARG as external editor
 * encoding: treat value as being in specified charset
                     encoding
 * extensions: pass ARG to --diff-cmd as options
 * file: read data from specified file
 * force-log: force validity of log message source
 * force: force operation to run
 * help: show help on a subcommand
 * ignore-ancestry: ignore ancestry when calculating merges
 * ignore-externals: ignore externals definitions
 * incremental: give output suitable for concatenation
 * limit: maximum number of log entries
 * message: specify commit message ARG
 * native-eol: use a different EOL marker than the standard
                     system marker for files with a native svn:eol-
                     style property. ARG may be one of 'LF', 'CR',
                     'CRLF'
 * new: use ARG as the newer target
 * no-auth-cache: do not cache authentication tokens
 * no-auto-props: disable automatic properties
 * no-diff-deleted: do not print differences for deleted files
 * no-ignore: disregard default and svn:ignore property ignores
 * no-unlock: don't unlock the targets
 * non-interactive: do no interactive prompting
 * non-recursive: operate on single directory only
 * notice-ancestry: notice ancestry when calculating differences
 * old: use ARG as the older target
 * password: specify a password
 * quiet: print as little as possible
 * recursive: descend recursively
 * relocate: relocate via URL-rewriting
 * revision: a revision or a range of revisions
 * revprop: operate on a revision property (use with -r)
 * show-updates: display update information
 * stop-on-copy: do not cross copies while traversing history
 * strict: use strict semantics
 * targets: pass contents of file ARG as additional args
 * username: specify a username
 * verbose: print extra information
 * version: print client version info
 * xml: output in XML

----------------------------------------------------------------------
Appendix C: Biography
----------------------------------------------------------------------

David James is an undergraduate Computer Science student at the
University of Toronto. In Fall 2004, David helped write Subversion
bindings for a Java-based academic groupware system. Since then, David
has been a regular contributor to the Subversion project, submitting
17 patches which were reviewed and accepted by Subversion developers.
David's contributions have made the Java and Ruby bindings easier to
compile, test, and install. Recently, David added Ruby support to the
automated nightly test-suite, so that Subversion developers can be
notified by email whenever a Ruby test fails. David is a partial
committer for the Ruby bindings.

At the University of Toronto, David has researched improved statistical
models for understanding natural language, earning three research
awards and a teaching assistantship in the process. As part of his
teaching assistantship, David taught Python to Computational
Linguistics graduate students.

For more information on David James, please see his resume:
 http://www.cs.toronto.edu/~james/David_James_Resume_2005.html

-- 
David James -- http://www.cs.toronto.edu/~james
Received on Wed Jun 29 04:34:27 2005

This is an archived mail posted to the Subversion Dev mailing list.