
Re: Python 3 Bindings Query

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: Tue, 17 Oct 2017 03:33:29 +0000

On Mon, Oct 16, 2017 at 3:36 PM Daniel Shahaf <d.s_at_daniel.shahaf.name>
wrote:

> > It has been quite some time since I was last on here, but as I was
> > looking
>
> Almost a decade, according to contribulyzer. :-)

Yeah, you know it has been a while when your feature shows up in RHEL... two
releases ago (RHEL6)!

>
> There hasn't been any suggestion to deprecate swig-py. Moreover, they
> are our favourite bindings for tools/ scripts, so I don't anticipate
> them to be deprecated, either.
>
> That said, the bindings see few changes nowadays, and we have always had
> few swig-savvy devs around; so any help would be most welcome.
>

Perfect, that's great to hear. This is what I assumed, but I wanted to be
sure.

> > I also wanted to know of any partial efforts that might have
> > already been started, or if there were discussions related to the
> > implementation that my searches did not turn up.
>
> There are several separate uses of Python in the source tree. I recall
> patches to build/, tools/, and subversion/tests/cmdline/ that improve
> 3.x compatibility, but I don't recall any such changes to the bindings.
> Note that we have both SWIG bindings at subversion/bindings/swig/python/
> and ctypes bindings at subversion/bindings/ctypes-python/.
>
> I take it that of all these, you're interested in the swig-py bindings?
> Or are the build system and test suite also within your scope?
>

Well, I certainly don't want to do it halfway, so while my initial target
was the swig bindings, I'll take a look at the full set.

> You said the patch was going to be a largeish one. How large/invasive
> are we talking about? (This affects how easy it'd be for us to
> review/apply it)
>

Yes, I suspect the effort will touch a fair amount of code, but much of it
will be direct substitution, and thus should be reasonable to review.
Plus, it'll certainly be broken up into smaller, more manageable commits.
I also plan to build the python3 and python2 bindings simultaneously,
since I'll be eyeing future Fedora (and presumably other distro) packaging.
That will touch a few front-end build pieces.

In fact, to get the conversation going I've attached a patch which gives a
sense of the road ahead. This is where I got to yesterday before deciding
that I should probably start talking to the dev team about desires and
direction. I believe the work should consist mostly of replacing various
functions deprecated/removed in Py3 with wrappers, so that all the
conditional compilation is consolidated in a common location, and then
substituting the wrappers wherever those functions are used.
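
To illustrate the "one common location" idea, here is a minimal sketch in
Python rather than C. It is only an analogy for the pure-Python parts of the
tree, not the C wrappers in the attached patch, and the names PY3, text_type,
binary_type, is_text and is_binary are hypothetical:

```python
import sys

# Single place where the Py2/Py3 difference is decided (hypothetical names).
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
    binary_type = bytes
else:
    # This branch only runs on Python 2, where 'unicode' is a builtin.
    text_type = unicode  # noqa: F821
    binary_type = str


def is_text(obj):
    """Return True if obj is a text (unicode) string on this interpreter."""
    return isinstance(obj, text_type)


def is_binary(obj):
    """Return True if obj is a byte string on this interpreter."""
    return isinstance(obj, binary_type)
```

Callers would then use is_text()/is_binary() and never test the interpreter
version themselves.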

My initial assessment is that there are really only a handful of the
deprecated functions in use by the current subversion python bindings.
However, it may make sense to use the py3c project [1], which already
provides most (all?) of the necessary shims and compatibility functions.
That said, I'm not sure whether this (header-only) dependency is something
you really want to pick up or not. As I don't think there is a wide
variety of functions that need wrappers, it may not be worth the new dep.

There is one fairly big decision to be made.

As you may or may not know, one of the primary user-facing changes of
Python 3 is the migration of the "string" type to a "unicode" type.
Previously in python 2 the "string" type basically represented a sequence
of bytes, and when you printed it or did a few other manipulations you
assumed/hoped/prayed it was actually in a valid encoding (typically assumed
to be ASCII/latin-1/utf8). Now in Python 3 basically all I/O operations
result in a more accurate 'bytes' object, which you then 'decode' into a
"unicode" object, explicitly indicating the encoding format.
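
A minimal sketch of that distinction under Python 3 (the sample bytes are
made up for illustration and do not come from the bindings):

```python
# Raw data as it might arrive from I/O: a 'bytes' object (UTF-8 encoded here).
raw = b"caf\xc3\xa9"

# Decoding with an explicit encoding produces a text ('str') object.
text = raw.decode("utf-8")

# Encoding goes the other way and recovers the original bytes.
round_trip = text.encode("utf-8")

# The two types measure different things: 5 bytes, but 4 characters.
byte_count = len(raw)
char_count = len(text)
```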

In practice, you can almost always use 'utf8', but of course that is
'almost'. I'm sure there are scenarios where you know the data coming in
or going out uses some other encoding and you would need to specify it
explicitly, but I suspect that is quite rare. I believe for this
code-set, which will target being compatible with both Py2 and Py3 at the
same time, it is perfectly reasonable to assume 'utf8'. Indeed, this is
the same approach the aforementioned py3c project took [2].
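
As a rough sketch of what assuming utf8 means in practice: the helper
to_text() below is hypothetical (it is not part of the bindings or of py3c),
and the path is invented for the example. Data in another encoding fails
loudly unless the caller names that encoding:

```python
def to_text(data, encoding="utf-8"):
    """Hypothetical helper: decode bytes to text, pass text through as-is."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# The same (invented) path encoded two different ways.
utf8_path = u"trunk/r\u00e9sum\u00e9.txt".encode("utf-8")
latin1_path = u"trunk/r\u00e9sum\u00e9.txt".encode("latin-1")

# The default assumption handles the UTF-8 bytes.
decoded = to_text(utf8_path)

# The latin-1 bytes are not valid UTF-8, so the default assumption raises.
try:
    to_text(latin1_path)
    latin1_failed = False
except UnicodeDecodeError:
    latin1_failed = True

# The rare non-utf8 case still works when the encoding is given explicitly.
decoded_latin1 = to_text(latin1_path, encoding="latin-1")
```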

So I believe (and it seems the py3c developer agrees) that making the
assumption of utf8 encoding is reasonable, especially given that the
current py2 code is likely making that same assumption implicitly (though
ASCII/latin-1 might be closer to the truth in many py2 scenarios). There
is actually a Py2 unicode object that could be an option to convert to, but
moving Py2 over to it, coupled with the fairly large differences in the
necessary unicode conversions, would mean a lot more code churn for likely
little gain.

The questions to decide on are:
1. Are you generally comfortable with the build changes necessary to
(optionally) build Py2 and Py3 bindings at the same time?
2. Do you want to pick up 'py3c' as a new dep, or implement the handful of
necessary wrappers?
3. Is the assumption of utf8 encoding sufficiently reasonable?

My general plan of attack will be:
1. Replace deprecated Py2 swig functions and syntax with Py2/Py3
cross-compatible versions or wrappers.
2. Ensure existing full Py2 support works correctly.
3. Update the build to build both Py2 and Py3 bindings, and get Py3 working.
4. Look at the remainder of the python usage to ensure Py3 compliance.

> Cheers,
>
> Daniel
>

That's all I have for now,
Troy Curtis, Jr

[1]: https://github.com/encukou/py3c
[2]: https://github.com/encukou/py3c/blob/master/include/py3c/compat.h

Received on 2017-10-17 05:33:57 CEST

This is an archived mail posted to the Subversion Dev mailing list.
