
Re: String formatting with APR_INT64_T etc. & gettext localization

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Sun, 27 Apr 2014 15:13:03 +0100 (BST)

Branko Čibej wrote:
>>>+    -e 's/APR_SSIZE_T_FMT/"ld"/g' \
>>>+    -e 's/APR_SIZE_T_FMT/"lu"/g' \
>>>+    -e 's/APR_OFF_T_FMT/"ld"/g' \
>>
>> Surely these must be wrong on IL32P64 platforms?
>
> It's wrong on all platforms, unless you also explicitly cast the
> values to the declared format types.

Yes, it's wrong to substitute a fixed literal string such as "%ld" before translation. (I'm not considering casting the values.)

Our available options are:

  * Substitute one of the <inttypes.h> tokens (PRId64, etc.) that 'gettext' handles.

  * Do what 'gettext' does for 'PRId64', which is to arrange for the substitution to happen *after* translation, at build time.

Let me explain how substitutions like this one work:

+    -e 's/APR_INT64_T_FMT/PRId64/g' \

The way 'gettext' handles PRId64 (etc.) is:

First, when 'gettext' is extracting translatable strings from the C source files into the .pot (PO template) file, it replaces the source token 'PRId64' with the literal string "<PRId64>". So the translators will see strings like "The number is %<PRId64>". The translators are expected to leave this part unchanged so the translation is for example "Le numéro est %<PRId64>".

Second, after the translations have been prepared in the .po files, 'msgfmt' substitutes "<PRId64>" in any translated string with one of "d", "ld", etc. in the resulting message catalog (.mo) file, according to the system 'msgfmt' is running on at build time (or packaging time).

Therefore the executable that looks up the strings at run time gets a translated string containing "%d" or "%ld" etc., depending on where the message catalogs were built. The idea, of course, is that these will match the corresponding strings that were compiled directly into the executable.
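
To make the compile-time half concrete, here is a minimal sketch of how such a string looks in C source. The `_` macro is only a stand-in for the real gettext call, so that the sketch builds without libintl, and `format_number` is a hypothetical name rather than anything from our code base.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the usual gettext macro (an assumption for this sketch);
   a real program would look the string up in the message catalog. */
#define _(s) (s)

/* The compiler joins the adjacent literals, so it sees
   "The number is %ld" or "The number is %lld" etc., depending on the
   platform.  The extraction step would instead record the msgid as
   "The number is %<PRId64>". */
int format_number(char *buf, size_t size, int64_t value)
{
  return snprintf(buf, size, _("The number is %" PRId64), value);
}
```

The point is that the source never spells out "ld" or "lld"; only the compiler and the catalog build do, and they need to agree.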

The documentation about that facility [1] is rather slim. I learnt the above from a quick read of the source code and the original patch [2]. As far as I can tell, though, we're safe to use this feature.

[1] <http://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html>
[2] <http://www.sourceware.org/ml/libc-alpha/2002-07/msg00202.html>

Whether we can use that for APR_OFF_T etc. depends on whether we can determine, at build time, which recognized <inttypes.h> type to map each of them to. I haven't confirmed this, but I think we probably can.
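
One way to probe this at build time might look like the following sketch. It assumes that matching on width and signedness is good enough, which is not strictly true: two distinct integer types of the same size can need different length modifiers, and a compiler may warn about the mismatch. Both function names are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: name the <inttypes.h> token for an integer type
   of the given width and signedness, or NULL if neither width fits. */
const char *inttypes_token(size_t size, int is_signed)
{
  switch (size)
    {
    case 4: return is_signed ? "PRId32" : "PRIu32";
    case 8: return is_signed ? "PRId64" : "PRIu64";
    default: return NULL;
    }
}

/* Hypothetical helper: write a sed expression of the kind used in the
   patch, e.g. s/APR_OFF_T_FMT/PRId64/g.  Returns the snprintf result,
   or -1 if no token fits the type. */
int sed_expression(char *buf, size_t buf_size, const char *apr_token,
                   size_t type_size, int is_signed)
{
  const char *token = inttypes_token(type_size, is_signed);

  if (token == NULL)
    return -1;
  return snprintf(buf, buf_size, "s/%s/%s/g", apr_token, token);
}
```

A small configure-time program could call it as `sed_expression(buf, sizeof buf, "APR_OFF_T_FMT", sizeof(apr_off_t), (apr_off_t)-1 < 0)` and pass the result to the extraction script.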

An alternative option is to emulate the functionality that's built in to 'gettext', for our own types, by pre-processing 'APR_OFF_T_FMT' to '<APR_OFF_T_FMT>' and then post-processing at 'msgfmt' time from that to the correct replacement string. In effect, this approach is exactly the same as hacking the 'gettext' system to support APR_OFF_T_FMT etc., except we could do it as pre-processing and post-processing steps.
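
The post-processing half could be sketched roughly as follows. This is only an illustration under assumptions: `struct token_map` and `expand_placeholders` are hypothetical names, and the replacement strings would have to come from the build platform's APR definitions rather than being hard-coded.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from a bracketed placeholder, as the translators
   would see it, to the conversion string for the build platform. */
struct token_map
{
  const char *placeholder;
  const char *replacement;
};

/* Copy 'in' to 'out', replacing every placeholder listed in 'map'.
   Returns 0 on success, or -1 if 'out' is too small. */
int expand_placeholders(const char *in, const struct token_map *map,
                        size_t n_map, char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return -1;

  while (*in != '\0')
    {
      const char *text = in;  /* default: copy one byte unchanged */
      size_t len = 1;
      size_t skip = 1;
      size_t i;

      for (i = 0; i < n_map; i++)
        {
          size_t plen = strlen(map[i].placeholder);

          if (plen > 0 && strncmp(in, map[i].placeholder, plen) == 0)
            {
              text = map[i].replacement;
              len = strlen(text);
              skip = plen;
              break;
            }
        }

      if (used + len >= out_size)  /* keep room for the terminator */
        return -1;
      memcpy(out + used, text, len);
      used += len;
      in += skip;
    }

  out[used] = '\0';
  return 0;
}
```

Run over each translated string before the catalog is compiled, with a map such as `{ "<APR_OFF_T_FMT>", "ld" }` chosen for the build platform, this would turn "%<APR_OFF_T_FMT>" into "%ld" in the same way the built-in mechanism treats "<PRId64>".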

Any further thoughts on whether this approach is feasible?

- Julian
Received on 2014-04-27 16:13:37 CEST

This is an archived mail posted to the Subversion Dev mailing list.
