[PATCH]On branch swig-py3: accept both of bytes/str for input char * args

From: Yasuhito FUTATSUKI <futatuki_at_yf.bsdclub.org>
Date: Thu, 10 Jan 2019 23:49:21 +0900


I've made a patch against swig-py3@1850520 that accepts both bytes and
str objects for input arguments corresponding to char * arguments of the
Subversion API in the Python 3 wrapper functions. This is just an interim
report, because there are some points I want to clarify and/or need to
fix. (Moreover, I have not yet touched the code that converts return
values of Python callback functions.)

The attached patch modifies four kinds of input argument translation.

(1) typemap(in) char * (with/without const modifiers); does not allow NULL,
     typemap(in) const char * MAY_BE_NULL; allows NULL
These were implemented with the 'parse=' typemap modifier, but there is no
PyArg_Parse() format unit that converts both str and bytes in py3.
So I added a function svn_swig_py_string_to_cstring() to swigutil_py.c,
and use it in the new typemap definitions.
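
As a rough illustration of the intended behaviour (this is not the C code
in the patch), the conversion can be modelled in Python. The name
to_cstring_model and its may_be_null parameter are hypothetical, and
encoding str as UTF-8 is an assumption; only the rule "accept bytes or
str" comes from the description above.

```python
def to_cstring_model(obj, may_be_null=False):
    """Hypothetical model of a bytes-or-str input conversion."""
    if obj is None:
        if may_be_null:
            return None  # corresponds to passing NULL to the C API
        raise TypeError("expected bytes or str object, not None")
    if isinstance(obj, bytes):
        return obj  # bytes pass through unchanged
    if isinstance(obj, str):
        # Assumption: str is encoded as UTF-8. A str containing lone
        # surrogates raises UnicodeEncodeError at this point.
        return obj.encode("utf-8")
    raise TypeError(
        "expected bytes or str object, not %s" % type(obj).__name__)
```

With this model, to_cstring_model(b"trunk") and to_cstring_model("trunk")
give the same bytes, and None is accepted only for the MAY_BE_NULL case.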

* For py2, my patch code uses svn_swig_py_string_to_cstring()
   - It does not allow Unicode objects as input, whereas the 's' and 'z'
     format units of PyArg_Parse() accept Unicode in py2. If Unicode must
     be accepted in py2, this needs to be fixed (either use
     svn_swig_py_string_to_cstring() on py3 only, or add conversion code
     for py2).
   - The TypeError exception message differs. PyArg_Parse() reports the
     argument by its argument number in the Python wrapper function.
     However, that number cannot be determined in typemap definition
     code, so my patch code uses the argument symbol instead.
* For py3, a friendlier exception message seems to be needed for
  UnicodeEncodeError, which can be raised if the input str contains
  surrogate data (U+DC00 - U+DCFF).
* A test case for UnicodeEncodeError is needed.
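
To make the UnicodeEncodeError case concrete, here is a small Python 3
sketch. It assumes the str is encoded as UTF-8; encode_arg and its message
wording are hypothetical and only illustrate what a friendlier message
naming the argument symbol might look like.

```python
# Bytes that are not valid UTF-8, decoded with the 'surrogateescape'
# handler, yield a str containing a lone surrogate (here U+DCE9).
raw = b"caf\xe9"
text = raw.decode("utf-8", "surrogateescape")


def encode_arg(name, value):
    """Hypothetical helper: encode a str argument, naming it on failure."""
    try:
        return value.encode("utf-8")
    except UnicodeEncodeError as exc:
        # UnicodeEncodeError is a subclass of ValueError, so callers that
        # catch ValueError see both the original and the wrapped error.
        raise ValueError(
            "argument '%s' cannot be encoded as UTF-8: %s"
            % (name, exc.reason)) from exc
```

Calling encode_arg("path", text) raises an error that mentions "path",
while encode_arg("path", "trunk") simply returns the encoded bytes.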

(2) in core.i, typemap for svn_stream_write

* As this typemap does not seem to accept Unicode on py2, there seems to
  be no regression like the one described in (1).
* As this typemap is only used for the svn_stream_write wrapper, which has
  only one char * argument, the default UnicodeEncodeError message seems
  to be sufficient.

(3) typemap(in) apr_hash_t * (for various types)
These use make_string_from_ob() for hash key string conversion,
and typemap(in) apr_hash_t *HASH_CSTRING uses it for hash value
conversion, too. Similarly, typemap(in) apr_hash_t *PROPHASH uses
make_svn_string_from_ob() for hash value conversion.

* It seems some APIs (e.g. svn_prop_diffs()) allow NULL as a hash value,
  but the current implementation of the conversion functions does not
  allow it. Shouldn't NULL be allowed? (I added a test for this case, but
  it is disabled until this is clarified.)
* Test cases for UnicodeEncodeError are needed (for both hash keys and
  hash values).
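
The open question about NULL hash values can be pictured with a Python
model of the dict conversion. This is only a sketch: to_bytes_model and
prophash_model are hypothetical names, UTF-8 encoding is an assumption,
and the allow_null_values switch stands for the behaviour under
discussion, not for anything the patch currently implements.

```python
def to_bytes_model(obj):
    """Hypothetical stand-in for the key/value string conversion."""
    if isinstance(obj, bytes):
        return obj
    if isinstance(obj, str):
        return obj.encode("utf-8")  # assumption: UTF-8
    raise TypeError(
        "expected bytes or str object, not %s" % type(obj).__name__)


def prophash_model(mapping, allow_null_values=False):
    """Convert a dict of properties; optionally let None through as NULL."""
    result = {}
    for key, value in mapping.items():
        if value is None:
            if not allow_null_values:
                raise TypeError("hash value must be bytes or str, not None")
            result[to_bytes_model(key)] = None  # would become NULL in C
        else:
            result[to_bytes_model(key)] = to_bytes_model(value)
    return result
```

With allow_null_values=False the model rejects None values, which mirrors
the current behaviour described above; with True it passes them through.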

(4) typemap(in) apr_array_header_t *STRINGLIST
This typemap is using svn_swig_py_unwrap_string() through the function

* A test case for UnicodeEncodeError is needed (for both hash key and
  hash value).



Received on 2019-01-10 15:50:37 CET

This is an archived mail posted to the Subversion Dev mailing list.