On Sun, Aug 18, 2013 at 10:01:51PM +0300, Daniel Shahaf wrote:
> Ivan Zhakov wrote on Sun, Aug 18, 2013 at 22:04:58 +0400:
> > > * r1514785
> > > ra_serf: Improve SSL certificate verification failure message.
> > > @@ -211,6 +210,8 @@ Candidate changes:
> > > informative. Regression from Subversion 1.7.x
> > > Votes:
> > > +1: ivan, stefan2
> > > + danielsh: I believe chopping off the last 2 bytes is wrong, _(", ") would
> > > + be longer than two bytes in Japanese locale.
> >
> > Actually not, because we use UTF8 internally so ', ' will be always two
> > bytes long.
>
> Yes, ", " will be two bytes long, but _(", ") may be any number of
> bytes. It is not guaranteed that the localised version ends with an
> ASCII comma and an ASCII space; it might end with a character whose
> representation has three bytes.
>
Case in point:
>>> import unicodedata
>>> unicodedata.lookup('ARABIC COMMA').encode('utf-8')
b'\xd8\x8c'
If we add an Arabic localization, the localised version would end with bytes
D8 8C 20 00, and chopping off two bytes would result in a bytestring that ends
with D8 00, which is invalid UTF-8.
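
To make the failure concrete, here is a minimal sketch in Python rather than
the actual ra_serf C code. The message text "expired" and the helper names are
hypothetical, and the Arabic separator is only an assumption about what a
future translation of ", " might look like. It compares chopping a fixed two
bytes with chopping the encoded length of the separator that was appended.

```python
# Hypothetical translation of ", ": ARABIC COMMA (U+060C) followed by a space.
separator = "\u060c "

def chop_last_two_bytes(message):
    """Mimic truncating a UTF-8 buffer by a fixed two bytes."""
    return message.encode("utf-8")[:-2]

def chop_separator(message, sep):
    """Truncate by the encoded length of the separator actually appended."""
    encoded = message.encode("utf-8")
    return encoded[: len(encoded) - len(sep.encode("utf-8"))]

def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# "expired" stands in for a localized failure reason (an assumption).
message = "expired" + separator
broken = chop_last_two_bytes(message)        # leaves a dangling 0xD8 lead byte
fixed = chop_separator(message, separator)   # removes all three separator bytes
```

With these inputs, is_valid_utf8(broken) is False and is_valid_utf8(fixed) is
True, because the separator occupies three bytes (D8 8C 20), not two.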
Daniel
> > The string will be converted to the required console locale if needed.
> > The code could be improved btw: remove ', ' and ': ' from localized strings
> > and add them separately, to prevent translators from accidentally breaking
> > the output. But that should not prevent backporting this change IMHO.
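
The improvement suggested above could look roughly like the following sketch,
again in Python rather than the real C code. The translate() stand-in, the
build_message() helper and the message strings are all hypothetical. The
translatable strings carry no punctuation, and the ': ' and ', ' separators are
added by the code, so nothing has to be chopped off afterwards.

```python
def translate(msgid):
    """Stand-in for gettext's _(); a real build would look up a translation."""
    return msgid

def build_message(intro, reasons):
    """Join translated pieces with separators that are not translatable."""
    # Joining avoids the trailing separator entirely, so no truncation is needed.
    return translate(intro) + ": " + ", ".join(translate(r) for r in reasons)

# Hypothetical message strings, used only for illustration.
text = build_message(
    "certificate verification failed",
    ["certificate has expired", "issuer is not trusted"],
)
```

Whether a fixed ASCII separator reads naturally in every language is a
separate question; this only shows that the byte-counting problem goes away.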
Received on 2013-08-18 21:07:54 CEST