On 2020-06-29 12:11, Manuel Jacob wrote:
> Hi,
>
> In a Python application, I want to convert a path (as native Unix
> bytes) to a file URL (and later probably also other paths between the
> "file system encoding" and UTF-8). There are functions for this in the
> Subversion binding. However, for the sake of being able to deal with
> the familiar Python exceptions, I’d like to do the decoding/encoding
> in Python. For that, I need to find out the encoding that Subversion
> uses for converting UTF-8 to the "file system encoding".
>
> Subversion seems to use the encoding returned by
> apr_os_locale_encoding(), which is however not exposed by the Python
> bindings.
>
> lib = ctypes.CDLL(libsvn._core.__file__)
> lib.apr_os_locale_encoding.argtypes = [ctypes.c_void_p]
> lib.apr_os_locale_encoding.restype = ctypes.c_char_p
> with util.with_lc_ctype():
I forgot to mention what `with util.with_lc_ctype()` does. It calls
`setlocale(LC_CTYPE, '')` on entering the block and restores the previous
LC_CTYPE setting on leaving it. I put it around all calls to the
Subversion bindings so that Subversion works correctly while the
locale-dependent str methods on Python 2 behave as before everywhere else.
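
For illustration, a minimal sketch of such a helper. This is not the
actual code from my `util` module, only an assumption of how it could
look using the standard `locale` and `contextlib` modules:

```python
import contextlib
import locale


@contextlib.contextmanager
def with_lc_ctype():
    # Remember the current LC_CTYPE setting so it can be restored later.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        # An empty string selects the locale from the environment
        # (LC_ALL, LC_CTYPE, LANG). This raises locale.Error if the
        # environment names a locale the system does not support.
        locale.setlocale(locale.LC_CTYPE, '')
        yield
    finally:
        # Restore whatever was active before the block.
        locale.setlocale(locale.LC_CTYPE, saved)
```

Note that setlocale() changes process-wide state, so a helper like this
is not safe if other threads depend on the locale at the same time.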
>     es = lib.apr_os_locale_encoding(int(svn.core.application_pool.this))
> fsencoding = codecs.lookup(es).name
>
> Is there an easier way? I could emulate what apr_os_locale_encoding()
> is doing, which is calling nl_langinfo() and falling back to
> ISO-8859-1 on systems which are supported by Python. Is it reasonable
> to assume that this logic will stay? Or, asked differently, which
> approach is least likely to stop yielding the "file system encoding":
> the ctypes code, or using nl_langinfo (falling back to ISO-8859-1)?
>
> Thanks,
> Manuel
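
To make the second alternative concrete, here is a rough sketch of the
nl_langinfo emulation. The function name `guess_svn_fsencoding` is made
up for this example, and the code only mirrors the behaviour described
above (nl_langinfo, else ISO-8859-1); I have not verified that it
matches APR in every configuration:

```python
import codecs
import locale


def guess_svn_fsencoding(default='ISO-8859-1'):
    # Assumes LC_CTYPE has already been set from the environment,
    # e.g. via setlocale(LC_CTYPE, ''); otherwise nl_langinfo()
    # reports the codeset of whatever locale is currently active.
    name = ''
    # nl_langinfo() and CODESET are not available on every platform.
    if hasattr(locale, 'nl_langinfo') and hasattr(locale, 'CODESET'):
        name = locale.nl_langinfo(locale.CODESET)
    if not name:
        # Fall back to ISO-8859-1 if no codeset is reported.
        name = default
    # Normalize to Python's codec name; raises LookupError if Python
    # has no codec for the reported name.
    return codecs.lookup(name).name
```

The result can then be passed to bytes.decode() and str.encode() in
place of the value obtained through ctypes.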
Received on 2020-06-29 13:36:36 CEST