
Re: svn commit: r1515023 - /subversion/branches/1.8.x/STATUS

From: Daniel Shahaf <danielsh_at_apache.org>
Date: Sun, 18 Aug 2013 19:07:45 +0000

On Sun, Aug 18, 2013 at 10:01:51PM +0300, Daniel Shahaf wrote:
> Ivan Zhakov wrote on Sun, Aug 18, 2013 at 22:04:58 +0400:
> > > * r1514785
> > > ra_serf: Improve SSL certificate verification failure message.
> > > @@ -211,6 +210,8 @@ Candidate changes:
> > > informative. Regression from Subversion 1.7.x
> > > Votes:
> > > +1: ivan, stefan2
> > > + danielsh: I believe chopping off the last 2 bytes is wrong, _(", ") would
> > > + be longer than two bytes in Japanese locale.
> >
> > Actually not, because we use UTF8 internally so ', ' will be always two
> > bytes long.
>
> Yes, ", " will be two bytes long, but _(", ") may be any number of
> bytes. It is not guaranteed that the localised version ends with an
> ASCII comma and an ASCII space; it might end with a character whose
> representation has three bytes.
>

Case in point:

>>> import unicodedata
>>> unicodedata.lookup('ARABIC COMMA').encode('utf-8')
    b'\xd8\x8c'

If we add an Arabic localization, the localised version would end with the
bytes D8 8C 20 followed by the terminating NUL (00). Chopping off two bytes
would leave a bytestring that ends with D8 00, and a lone D8 lead byte with no
continuation byte is invalid UTF-8.
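The same point as a small runnable sketch. This is only an illustration in Python, not the ra_serf code: the separator (ARABIC COMMA plus an ASCII space), the sample message, and the helper names chop_two_bytes, chop_separator and is_valid_utf8 are all assumptions made up for the example.

```python
import unicodedata

# Hypothetical localized separator: ARABIC COMMA followed by an ASCII space.
# In UTF-8 this is three bytes (D8 8C 20), not two.
separator = unicodedata.lookup('ARABIC COMMA') + ' '

# Hypothetical message built by appending the separator after every item.
message = ('reason one' + separator + 'reason two' + separator).encode('utf-8')


def chop_two_bytes(data):
    """Drop a fixed two bytes, assuming the separator is always ', '."""
    return data[:-2]


def chop_separator(data, sep):
    """Drop the trailing separator using its actual encoded length."""
    sep_bytes = sep.encode('utf-8')
    if data.endswith(sep_bytes):
        return data[:-len(sep_bytes)]
    return data


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


if __name__ == '__main__':
    print(is_valid_utf8(chop_two_bytes(message)))
    print(is_valid_utf8(chop_separator(message, separator)))
```

Trimming by the encoded length of whatever separator was actually appended keeps the result valid regardless of what the translation contains; the fixed two-byte chop only works while the separator happens to be exactly two bytes.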

Daniel

> > The string will be converted to the required console locale if needed.
> > The code could be improved, btw: remove ', ' and ': ' from the localized
> > strings and add them separately, to prevent translators from accidentally
> > breaking the output. But that does not prevent backporting this change, IMHO.
Received on 2013-08-18 21:07:54 CEST

This is an archived mail posted to the Subversion Dev mailing list.