[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [swig-py3][patch] interfacing bytes object instead of str

From: <futatuki_at_poem.co.jp>
Date: Thu, 27 Dec 2018 01:51:19 +0900

On 12/25/18 12:20, Troy Curtis Jr wrote:
> On Mon, Dec 24, 2018 at 1:11 PM Yasuhito FUTATSUKI <futatuki_at_poem.co.jp>
> wrote:

> Awesome! I'm really glad you are helping work on this, helps to keep me
> motivated as well ;)

I'm glad if I could some help, too :)
  
> So I wonder, as a guiding principle if we could support either Str or Bytes
> as an input? That would be akin to most
> of the standard module interfaces. I think that would be fairly easy to
> support, and would help ensure that the
> behavior is much closer to the typical expectation. it would also reduce
> the number of b"" that are necessary :P Of
> course, in many cases for the standard library interfaces, the input type
> is used as a hint to indicate what type to
> return, which is much more difficult/impossible to fully support in these
> bindings. That seems like it would provide
> a way to make using the bindings more familiar, though we'd still have to
> have the policy of just always returning Bytes
> objects as output. That input/output disconnect might undo any ease of use
> advantage than just always using Bytes
> ...I'm not sure exactly what my opinion on that is...

It is just I wanted to confirm, how deal with inputs :)

I also want that both of Bytes and Str is acceptable input values,
especially for paths on file system (not Subversion's internal paths).
So, I'll tweak typemaps for it.

> The patch inline for comments:
>
> diff --git a/build/ac-macros/swig.m4 b/build/ac-macros/swig.m4
>> index d865689de5..c797ae69d8 100644
>> --- a/build/ac-macros/swig.m4
>> +++ b/build/ac-macros/swig.m4
>> @@ -154,7 +154,7 @@ AC_DEFUN(SVN_FIND_SWIG,
>> ])
>>
>> if test "$ac_cv_python_is_py3" = "yes"; then
>> - SWIG_PY_OPTS="-python -py3"
>> + SWIG_PY_OPTS="-python -py3 -DPY3"
>> else
>> SWIG_PY_OPTS="-python -classic"
>> fi
>>
>
> We shouldn't define a new preprocessor value here, but rather make use of
> the existing IS_PY3 value defined by py3c.

It was really needed because IS_PY3 macro symbol is not recognized to
SWIG, because it is C preprocessers macro, not SWIG macro. I've looked
for distinguish py2 and py3 on SWIG context, but I couldn't find suitable
macro. (however this is no more needed)

> diff --git a/subversion/bindings/swig/core.i
>> b/subversion/bindings/swig/core.i
>> index e5a234c5e2..c309d48070 100644
>> --- a/subversion/bindings/swig/core.i
>> +++ b/subversion/bindings/swig/core.i
>>
>
> [snip]
>
> @@ -442,16 +442,18 @@
>> */
>> #ifdef SWIGPYTHON
>> %typemap(in) (const char *data, apr_size_t *len) ($*2_type temp) {
>> - if (!PyStr_Check($input)) {
>> + char *tmpdata;
>> + Py_ssize_t length;
>> + if (!PyBytes_Check($input)) {
>> PyErr_SetString(PyExc_TypeError,
>> - "expecting a string for the buffer");
>> + "expecting a bytes object for the buffer");
>> SWIG_fail;
>> }
>> - /* Explicitly cast away const from PyStr_AsUTF8AndSize (in Python 3).
>> The
>> - swig generated argument ($1) here will be "char *", but since its
>> only
>> - use is to pass directly into the "const char *" argument of the
>> wrapped
>> - function, this is safe to do. */
>> - $1 = (char *)PyStr_AsUTF8AndSize($input, &temp);
>>
>
>
>> + if (PyBytes_AsStringAndSize($input, &tmpdata, &length) == -1) {
>>
>
> Why not just use "&$1" here instead of introducing a new temp variable?

It's to avoid C compiler warning to discard 'const' quarifier, but this
can be also safely accomplished by cast.

> + SWIG_fail;
>> + }
>> + temp = ($*2_type)length;
>> + $1 = tmpdata;
>> $2 = ($2_ltype)&temp;
>> }
>> #endif
>>
> @@ -504,8 +506,8 @@
>> SWIG_fail;
>> }
>>
>> - if (PyStr_Check($input)) {
>> - const char *value = PyStr_AsString($input);
>> + if (PyBytes_Check($input)) {
>> + const char *value = PyBytes_AsString($input);
>> $1 = apr_pstrdup(_global_pool, value);
>> }
>> else if (PyLong_Check($input)) {
>>
>
> Since this function does a whole list of checks and conversions, perhaps it
> makes sense to support Str here in
> addition to bytes? Of course in Py2 the "PyStr_Check()" would effectively
> be performed twice if we did that, but
> I don't expect that to be a tremendous impact, and with the "cost" of an
> addition "IS_PY3" preprocessor check,
> that could be eliminated from the Py2 build.

I Agreed.

>> diff --git a/subversion/bindings/swig/include/svn_global.swg
>> b/subversion/bindings/swig/include/svn_global.swg
>> index 191bbdd531..f7364be28f 100644
>> --- a/subversion/bindings/swig/include/svn_global.swg
>> +++ b/subversion/bindings/swig/include/svn_global.swg
>> @@ -31,6 +31,12 @@
>> #define SVN_DEPRECATED
>> #endif
>>
>> +#ifdef SWIGPYTHON
>> +%begin %{
>> +#define SWIG_PYTHON_STRICT_BYTE_CHAR
>> +%}
>> +#endif
>> +
>> %include typemaps.i
>> %include constraints.i
>> %include exception.i
>> @@ -136,9 +142,15 @@ static PyObject * _global_py_pool = NULL;
>> /* Python format specifiers. Use Python instead of SWIG to parse
>> these basic types, because Python reports better error messages
>> (with correct argument numbers). */
>> +#if defined(PY3)
>>
>
> This is where we'd want to use py3c's "IS_PY3" value.

including header file py3c/compath.h is done by indirectly as

#ifdef SWIGPYTHON
%{
#include "swigutil_py.h"
#include "swigutil_py3c.h"
%}
#endif

however, it is enclosed by %{ %} so it is not defined as SWIG macro
symbol. and following

>> +%typemap (in, parse="y")
>> + char *, char const *, char * const, char const * const "";
>> +#else
>> %typemap (in, parse="s")
>> char *, char const *, char * const, char const * const "";
>> +#endif

is a SWIG context switch. this is why -DPY3 was needed.
However, if we want to allow both of Bytes and Str for input, `parse'
directive cannot be used here and must write code here to type check
and to convert ourself, it isn't needed.

Rest of all your suggestions convinced me, so I'll start to modify.

Cheers,

-- 
Yasuhito FUTATSUKI
Received on 2018-12-26 17:58:52 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.