
Re: [PATCH] On branch swig-py3: accept both of bytes/str for input char * args

From: Yasuhito FUTATSUKI <futatuki_at_yf.bsdclub.org>
Date: Mon, 14 Jan 2019 18:30:12 +0900

On 1/14/19 3:10 PM, Yasuhito FUTATSUKI wrote:
> In article <CAEZNtZ+SKqiATLwDvzyVf=SULPGhT7usxhZEawNfextPWsbEag_at_mail.gmail.com>
> troycurtisjr_at_gmail.com writes:
>> On Thu, Jan 10, 2019 at 9:50 AM Yasuhito FUTATSUKI
>> <futatuki_at_yf.bsdclub.org> wrote:
>
>>> The attached patch modifies 4 kinds of input argument translations.
>>>
>>> (1) typemap(in) char * (with/without const modifiers); does not allow NULL,
>>> typemap(in) const char * MAY_BE_NULL; allows NULL
>>> These were done using the 'parse=' typemap modifier; however, there is no
>>> PyArg_Parse() format unit that converts both str and bytes in py3.
>>> So I made a function svn_swig_py_string_to_cstring() in swigutil_py.c,
>>> and use it in the new typemap definitions.
>>>
>>> Consideration:
>>> * For py2, my patch code uses svn_swig_py_string_to_cstring()
>>> - It doesn't allow Unicode for input; however, the 's' and 'z' format
>>> units of PyArg_Parse() accept Unicode in py2. If Unicode needs to be
>>> accepted in py2, this needs to be fixed (use
>>> svn_swig_py_string_to_cstring() in py3 only, or add conversion code for py2).
>>
>>
>> Yes I think you should support Unicode in this case, but it turns out
>> you are most of the way there. If you just remove the IS_PY3
>> conditional, it will support unicode! The "PyBytes_*" and "PyStr_*"
>> functions are wrappers provided by the py3c library. The names point
>> to the concept that they target, and then map it appropriately in Py2
>> and Py3. So
>>
>> PyBytes: Sequence of byte values, e.g. "raw data"
>> In Py2: str
>> In Py3: bytes
>>
>> PyStr: Character data
>> In Py2: Unicode
>> In Py3: str
>
> Unfortunately, PyStr in the py3c compatibility layer API is the intersection
> of PyString in Python 2 and PyUnicode in Python 3, so we must explicitly
> use PyUnicode_* for handling Unicode in py2.

In py2, there seems to be no C API that directly returns a (const) char *
buffer for a Unicode object. In py3, PyUnicode_AsUTF8() returns a
(const) char * without creating an extra object reference (and py3c uses it
to implement PyStr_AsUTF8()).

So, for py2, we should explicitly convert the Unicode object into a new Bytes
object, hold a reference to it for the duration of the API call, and release
it after the call in %typemap(freearg).

Alternatively, we could revert to %typemap(in, parse="s") and
%typemap(in, parse="z") in py2 only.
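
As a rough illustration of the behavior I have in mind (a Python 3 model only,
not the C code in the patch; the name to_cstring_bytes and its may_be_null
parameter are hypothetical, and UTF-8 is assumed as the encoding), the rule
"accept both bytes and str, optionally None" could be sketched like this:

```python
def to_cstring_bytes(obj, may_be_null=False):
    """Hypothetical model of the input conversion for char * arguments.

    Returns the bytes that would back the C string, or None when None
    is passed and the argument is allowed to be NULL.
    """
    if obj is None:
        if may_be_null:
            return None  # corresponds to passing NULL to the C API
        raise TypeError("None is not allowed for this argument")
    if isinstance(obj, bytes):
        return obj  # already raw bytes; use the buffer as-is
    if isinstance(obj, str):
        # Text input: encode to a new bytes object (UTF-8 assumed here).
        return obj.encode("utf-8")
    raise TypeError("expected bytes or str, got %s" % type(obj).__name__)
```

In the real py2 typemap, the newly encoded object would have to be kept alive
during the API call and released in %typemap(freearg); Python's own reference
counting stands in for that step in this sketch.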

-- 
Yasuhito FUTATSUKI
Received on 2019-01-14 10:32:49 CET

This is an archived mail posted to the Subversion Dev mailing list.
