
I18n: The gettext proposal

From: Nicolás Lichtmaier <nick_at_reloco.com.ar>
Date: 2004-03-30 05:55:34 CEST

Hi, I propose to use gettext for i18n.

First, some info about how gettext works.

Gettext translates user messages at runtime. This is done by the
gettext() function, which takes a string and returns its localized
counterpart. The "weird" thing about gettext is that it uses the English
message as the look-up key: not a number, not an ID, but the real and
usable English message. Normally you won't see "gettext()" in the code,
because this define is used instead:

#define _(x) gettext(x)

...so what you would see is:

printf(_("I'm a localized message and I'm proud of it"));
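For a complete picture, here is a minimal sketch of the setup a program needs before _() can return translations. The domain name "demo", the directory "./locale" and the function names are made up for illustration; setlocale(), bindtextdomain(), textdomain() and gettext() are the standard calls from <locale.h> and <libintl.h>.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

#define _(x) gettext(x)

/* Hypothetical names for this sketch; a real program would use its own. */
#define DEMO_DOMAIN "demo"
#define DEMO_LOCALEDIR "./locale"

/* Call once at startup, before the first lookup. */
void init_i18n(void)
{
    setlocale(LC_ALL, "");                        /* adopt the user's locale   */
    bindtextdomain(DEMO_DOMAIN, DEMO_LOCALEDIR);  /* where the .mo files live  */
    textdomain(DEMO_DOMAIN);                      /* default domain for _()    */
}

/* Returns the translation, or the English text itself if no catalog
   is installed for the current locale. */
const char *proud_message(void)
{
    return _("I'm a localized message and I'm proud of it");
}
```

A caller would run init_i18n() once and then printf("%s\n", proud_message()). When no translation is found, gettext() returns the string it was given, so the program keeps working in English.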

Once all the localizable strings are marked with _(), it's time to
run xgettext. This tool will scan a list of files and produce just one
file: subversion.pot. This file contains entries, and each entry
looks like this one:

#: subversion/clients/cmdline/blame-cmd.c:145
#, c-format
msgid "Skipping binary file: '%s'\n"
msgstr ""

The subversion.pot file contains every localizable message, and serves
as a starting point for new localizations. A new localization starts by
copying this file to (e.g.) es.po. There, the translator fills the empty
string with the proper translation:

#: subversion/clients/cmdline/blame-cmd.c:145
#, c-format
msgid "Skipping binary file: '%s'\n"
msgstr "Omitiendo el archivo binario: '%s'\n"

The .pot file is never edited by hand; it's always overwritten by
xgettext. There's a tool called msgmerge which will merge a new
subversion.pot with an old XX.po. This tool is very smart: it even
suggests translations for new messages when they closely resemble
old ones.

Each of these XX.po files (where XX is an ISO language code) is compiled
by msgfmt to produce a .mo file, which is installed to
/usr/share/locale/XX/LC_MESSAGES/subversion.mo . So if Subversion
supported 3 additional languages, it would ship 3 .mo files, which would
be used at runtime according to the user's locale settings.
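To make the install layout concrete, here is a small sketch that spells out the path pattern. gettext does this lookup internally (and also tries fallbacks such as es_AR before es), so mo_path() is a hypothetical helper for illustration only, not something the library exposes.

```c
#include <stdio.h>

/* Builds "<localedir>/<lang>/LC_MESSAGES/<domain>.mo" into buf.
   Returns 0 on success, -1 if buf is too small or formatting fails. */
int mo_path(char *buf, size_t size, const char *localedir,
            const char *lang, const char *domain)
{
    int n = snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo",
                     localedir, lang, domain);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Calling mo_path(buf, sizeof buf, "/usr/share/locale", "es", "subversion") fills buf with the Spanish catalog path in the layout described above.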

A system may have several programs using gettext. They don't collide
with each other because each one gives its .mo files a different name.
This name is called the domain. A domain is a set of messages which
need translation. We would create a "subversion" domain.

I told you before about gettext(). In fact, that function is not the one
used in the patch I've sent. Why? Because svn is also a library. When
svn runs as a library, the process is already configured to use some
other domain. Suppose that Gnumeric added Subversion support. Gnumeric
already uses gettext, with a "gnumeric" domain (so its files are in
/usr/share/locale/XX/LC_MESSAGES/gnumeric.mo). We don't want to use that
domain, but we don't want to disturb the main application either. So we
use dgettext(), a version of the function that takes the domain as an
argument in each invocation. That way the svn libraries will provide
properly localized messages even when run inside some other
application.
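A rough sketch of what this looks like inside a library follows. The macro and function names are invented for illustration and are not taken from the patch; dgettext() and bindtextdomain() are the standard <libintl.h> calls.

```c
#include <libintl.h>

/* Assumed domain name and install prefix for this sketch. */
#define SVN_DOMAIN "subversion"
#define SVN_LOCALEDIR "/usr/share/locale"

/* Library code names its own domain on every lookup. */
#define _(x) dgettext(SVN_DOMAIN, x)

/* Registers where the library's catalogs live. It deliberately does not
   call textdomain(), so the host application's default domain and its
   own gettext() calls are left untouched. */
void svn_i18n_init(void)
{
    bindtextdomain(SVN_DOMAIN, SVN_LOCALEDIR);
}

/* Returns the (possibly translated) format string for the blame message. */
const char *skipping_binary_fmt(void)
{
    return _("Skipping binary file: '%s'\n");
}
```

The library also leaves setlocale() to the host application, since the locale is process-wide state that the application owns.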

To properly implement i18n, some things will need to change.
It's impossible to handle plurals like this:
printf("%d file%s", n, n>1 ? "s": "");
Other languages don't form plurals by appending a suffix, and some have
more than two forms. gettext provides its own way of handling this, which
requires keeping the two versions of the message as separate strings:
"%d file" and "%d files" (look for ngettext in the manual).
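As a sketch, the plural case could be written with ngettext() like this. The function name format_file_count() is hypothetical; library code would use dngettext() with the "subversion" domain instead.

```c
#include <libintl.h>
#include <stdio.h>

/* Writes e.g. "1 file" or "3 files" into buf, using whichever plural
   form the active catalog selects for n. Without a catalog, ngettext()
   falls back to the English rule: the first string when n == 1, the
   second otherwise. Returns what snprintf() returns. */
int format_file_count(char *buf, size_t size, int n)
{
    return snprintf(buf, size,
                    ngettext("%d file", "%d files", (unsigned long)n), n);
}
```

Both English forms end up in the .pot file as msgid and msgid_plural, and each translation supplies as many msgstr[N] forms as its language needs.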

Another thing is this:
printf("We are at revision " I_M_THE_REV_FMT "\n", rev);
This clashes with the gettext scheme because it breaks the xgettext
scanning. xgettext doesn't process includes and can't figure out what
the message is supposed to be. One way to fix this, if we can't get rid
of the define, is to format the number in a separate step, probably in an
auxiliary function:

printf("We are at revision %s\n", fmt_rev(buf, rev));
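One possible shape for such a helper is sketched below. The revnum_t type, the "%ld" definition of I_M_THE_REV_FMT and REV_BUF_SIZE are stand-ins chosen for this example, and print_revision() is a hypothetical caller.

```c
#include <libintl.h>
#include <stdio.h>

#define _(x) gettext(x)

/* Stand-ins for this sketch; the real type and format macro may differ. */
typedef long revnum_t;
#define I_M_THE_REV_FMT "%ld"
#define REV_BUF_SIZE 32

/* Formats rev into buf (which must hold at least REV_BUF_SIZE bytes)
   and returns buf so the call can be used directly as an argument. */
const char *fmt_rev(char *buf, revnum_t rev)
{
    snprintf(buf, REV_BUF_SIZE, I_M_THE_REV_FMT, rev);
    return buf;
}

/* The translatable string is now one plain literal that xgettext can
   extract; the platform-specific format stays inside fmt_rev(). */
void print_revision(revnum_t rev)
{
    char buf[REV_BUF_SIZE];
    printf(_("We are at revision %s\n"), fmt_rev(buf, rev));
}
```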

One thing which is still unresolved is how to translate server messages.
I think these should be handled (in DAV) with the Accept-Language
header. This header will tell the server which locales the client is
willing to accept, so that the server can choose the proper .mo and
serve the right messages. This would be implemented in a second stage
since there are some issues that should be resolved.
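Just to illustrate the idea, here is a very rough sketch of how a server might pick a language from that header. Everything in it is an assumption: pick_language() is a made-up name, q-values are ignored (entries are simply tried left to right), and a tag like "es-AR" does not fall back to "es". A real implementation would have to deal with all of that.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/* Returns the entry of supported[] matching the earliest language tag in
   the header value, or NULL if no tag matches.
   Simplifications: q-values are ignored; tags must match exactly,
   apart from letter case. */
const char *pick_language(const char *header,
                          const char *const supported[], size_t count)
{
    const char *p = header;
    while (*p != '\0') {
        /* Skip separators and whitespace before a language tag. */
        while (*p == ',' || *p == ' ' || *p == '\t')
            p++;
        /* A tag ends at ',' (next entry), ';' (parameters) or whitespace. */
        size_t len = strcspn(p, ",; \t");
        for (size_t i = 0; len > 0 && i < count; i++) {
            if (strlen(supported[i]) == len &&
                strncasecmp(p, supported[i], len) == 0)
                return supported[i];
        }
        /* Jump past any ";q=..." parameters to the next entry. */
        p += strcspn(p, ",");
    }
    return NULL;
}
```

With supported languages {"es", "de"}, the header "fr;q=0.9, es-AR, de;q=0.5, es" yields "de" in this sketch, because "fr" and "es-AR" are not in the list and "de" comes before "es".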

Well, I'll end this for now. This mail has got too long.

Bye! =)

Received on Tue Mar 30 05:55:48 2004

This is an archived mail posted to the Subversion Dev mailing list.
