On May 23, 2013 4:43 PM, "Dongsheng Song" <dongsheng.song_at_gmail.com> wrote:
>
> On Thu, May 23, 2013 at 10:06 PM, Philip Martin
> <philip.martin_at_wandisco.com> wrote:
> > Dongsheng Song <dongsheng.song_at_gmail.com> writes:
> >
> >> On Thu, May 23, 2013 at 9:28 PM, Philip Martin
> >> <philip.martin_at_wandisco.com> wrote:
> >>> Dongsheng Song <dongsheng.song_at_gmail.com> writes:
> >>>
> >>>> On Thu, May 23, 2013 at 9:11 PM, Philip Martin
> >>>> <philip.martin_at_wandisco.com> wrote:
> >>>>> Philip Martin <philip.martin_at_wandisco.com> writes:
> >>>>>
> >>>>>> So it appears the UTF8 to native conversion is missing from
> >>>>>> repos_notify_handler. I think repos_notify_handler should be using
> >>>>>> svn_stream_printf_from_utf8 rather than svn_stream_printf.
> >>>>>
> >>>>> I've fixed trunk to use svn_cmdline_cstring_from_utf8 and proposed it
> >>>>> for 1.8.
> >>>>>
> >>>>
> >>>> As the GETTEXT(3) man page says, your commit is OK if and only if
> >>>> HAVE_BIND_TEXTDOMAIN_CODESET is defined.
> >>>>
> >>>> So you should check HAVE_BIND_TEXTDOMAIN_CODESET when you use
> >>>> svn_cmdline_cstring_from_utf8.
> >>>
> >>> Are you saying there is a problem with my change? If there is a problem,
> >>> doesn't it already apply to all other uses of
> >>> svn_cmdline_cstring_from_utf8?
> >>>
> >>
> >> I think so. In the subversion/libsvn_subr/nls.c file:
> >>
> >> #ifdef HAVE_BIND_TEXTDOMAIN_CODESET
> >> bind_textdomain_codeset(PACKAGE_NAME, "UTF-8");
> >> #endif /* HAVE_BIND_TEXTDOMAIN_CODESET */
> >>
> >> bind_textdomain_codeset is only called when HAVE_BIND_TEXTDOMAIN_CODESET
> >> is defined. In that case, you can assume the string returned by
> >> GETTEXT(3) is UTF-8 encoded.
> >
> > I still don't understand if you are claiming my change has a problem or
> > if there is a problem in all uses of svn_cmdline_cstring_from_utf8.
> >
> > I recall a related thread from last year:
> >
> > http://svn.haxx.se/dev/archive-2012-08/index.shtml#34
> >
> > http://mail-archives.apache.org/mod_mbox/subversion-dev/201208.mbox/%3Cop.wilcelggnngjn5@tortoise%3E
> >
> > I think we assume that the translations are UTF-8.
> >
> > Is there some code change you think we should make?
> >
>
> Even if ALL the translations are UTF-8, GETTEXT(3) still returns the
> string encoded in the ***current locale's codeset***.
>
> Here is a snippet from the GETTEXT(3) man page:
>
> In both cases, the functions also use the LC_CTYPE locale facet in
> order to convert the translated message from the translator's
> codeset to the ***current locale's codeset***, unless overridden by a
> prior call to the bind_textdomain_codeset function.
>
> So svn_cmdline_printf SHOULD NOT assume the input string is UTF-8
> encoded; it is encoded in the ***current locale's codeset***.
But we call the codeset function to make sure we do not generate output in
the current locale encoding.
> I think the best solution is: DO NOT convert the messages returned by
> GETTEXT(3); write them ***AS IS***, since GETTEXT(3) already does the
> correct conversion for us.
Well, even though gettext may want us to believe otherwise, this doesn't
work for cross-platform applications: e.g. on Windows the locale for output
on the console may be different from the locale for other uses. Back when
we went with gettext (2004?), we hashed this through pretty thoroughly.
I hope that discussion is still available in the archives.
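
For illustration only, here is a minimal sketch of the pattern described
above: bind the message catalog to UTF-8 so gettext does no locale
conversion, then convert explicitly at the output boundary. This is not
Subversion's actual code. The names demo_init_messages and
demo_utf8_to_codeset are hypothetical, and plain iconv stands in for
whatever conversion layer the application really uses.

```c
#include <assert.h>
#include <iconv.h>
#include <libintl.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: ask gettext to hand back translations as UTF-8
   instead of converting them to the current locale's codeset. */
static void demo_init_messages(const char *domain)
{
  bind_textdomain_codeset(domain, "UTF-8");
}

/* Hypothetical helper: convert NUL-terminated UTF-8 text to `codeset`
   (e.g. the console encoding), writing a NUL-terminated result to `out`.
   Returns 0 on success, -1 on any failure (unknown codeset, invalid or
   unconvertible input, or output buffer too small). */
static int demo_utf8_to_codeset(const char *utf8, const char *codeset,
                                char *out, size_t outlen)
{
  iconv_t cd;
  char *in = (char *)utf8;      /* iconv takes char **, input is not modified */
  size_t inleft, outleft, rc;

  assert(utf8 != NULL && codeset != NULL && out != NULL);
  if (outlen == 0)
    return -1;

  cd = iconv_open(codeset, "UTF-8");
  if (cd == (iconv_t)-1)
    return -1;

  inleft = strlen(utf8);
  outleft = outlen - 1;         /* reserve one byte for the terminator */
  rc = iconv(cd, &in, &inleft, &out, &outleft);
  if (rc != (size_t)-1)
    rc = iconv(cd, NULL, NULL, &out, &outleft);  /* flush any shift state */
  iconv_close(cd);

  if (rc == (size_t)-1)
    return -1;
  *out = '\0';                  /* iconv advanced `out` past the last byte */
  return 0;
}
```

The point of the split is that the output codeset is chosen by the caller
at print time, so console output and other output can use different
encodings, which is what the current-locale conversion in gettext cannot
express.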
Bye,
Erik.
Received on 2013-05-23 17:10:12 CEST