
Determining "file system encoding" from Python

From: Manuel Jacob <me_at_manueljacob.de>
Date: Mon, 29 Jun 2020 12:11:09 +0200


In a Python application, I want to convert a path (as native Unix bytes)
to a file URL, and later probably also convert other paths between the
"file system encoding" and UTF-8. There are functions for this in the
Subversion bindings. However, to be able to deal with the familiar
Python exceptions, I’d like to do the decoding/encoding in Python. For
that, I need to find out which encoding Subversion uses when converting
between UTF-8 and the "file system encoding".
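
To make the goal concrete, here is a minimal sketch of the conversion in pure Python 3, assuming the encoding name is already known. The function name path_to_file_url is hypothetical, the path is assumed to be absolute, and the set of characters Subversion itself escapes when canonicalizing URLs may differ from what urllib.parse.quote() escapes.

```python
from urllib.parse import quote


def path_to_file_url(path, fsencoding):
    """Convert an absolute native path (bytes) to a UTF-8 based file URL.

    Hypothetical helper for illustration; not part of the Subversion bindings.
    """
    # Decoding in Python means failures surface as UnicodeDecodeError.
    text = path.decode(fsencoding)
    # Percent-encode the UTF-8 bytes, keeping "/" as the path separator.
    return "file://" + quote(text.encode("utf-8"), safe="/")
```

For example, with an ISO-8859-1 file system encoding, path_to_file_url(b"/tmp/caf\xe9 x", "iso8859-1") gives "file:///tmp/caf%C3%A9%20x".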

Subversion seems to use the encoding returned by
apr_os_locale_encoding(), which is however not exposed by the Python
bindings. Currently I retrieve it through ctypes:

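import codecs
import ctypes
import libsvn._core
import svn.core

lib = ctypes.CDLL(libsvn._core.__file__)
lib.apr_os_locale_encoding.argtypes = [ctypes.c_void_p]
lib.apr_os_locale_encoding.restype = ctypes.c_char_p
with util.with_lc_ctype():  # the application's own helper, not shown here
    es = lib.apr_os_locale_encoding(int(svn.core.application_pool.this))
# c_char_p yields bytes on Python 3, but codecs.lookup() needs str there.
fsencoding = codecs.lookup(es.decode('ascii')).name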

Is there an easier way? I could emulate what apr_os_locale_encoding()
does on the systems supported by Python, which is calling nl_langinfo()
and falling back to ISO-8859-1. Is it reasonable to assume that this
logic will stay? Or, asked differently, which approach is less likely to
stop returning the correct "file system encoding" in the future: the
ctypes code, or calling nl_langinfo() with a fallback to ISO-8859-1?
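
For comparison, the nl_langinfo() alternative could be sketched roughly as below. This is only an approximation under stated assumptions: the name guess_fs_encoding is hypothetical, LC_CTYPE is switched to the user's default locale for the duration of the call, and the extra fallback for codeset names unknown to Python is an addition of this sketch rather than something APR does.

```python
import codecs
import locale

FALLBACK = "ISO-8859-1"  # the fallback described above


def guess_fs_encoding():
    """Emulate the nl_langinfo() lookup with an ISO-8859-1 fallback.

    Hypothetical helper; assumes LC_CTYPE should follow the environment.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, "")
        name = locale.nl_langinfo(locale.CODESET)
    except (locale.Error, AttributeError):
        # Unsupported locale, or a platform without nl_langinfo().
        name = ""
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Empty or unknown codeset name: use the fallback (sketch's choice).
        return codecs.lookup(FALLBACK).name
```

The result is a normalized Python codec name, so it can be passed directly to bytes.decode() and str.encode().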

Received on 2020-06-29 13:36:35 CEST

This is an archived mail posted to the Subversion Users mailing list.