
Re: Error doing a svnsync

From: Stephen Butler <sbutler_at_elego.de>
Date: Thu, 9 Feb 2012 00:27:36 +0100

[ cc'ing users@ again]

On Feb 8, 2012, at 23:25 , Phil Pinkerton wrote:

> Thanks. When I try using --source-prop-encoding I get
>
> svnsync: E000022: Safe data 'Server Currency ' was followed by
> non-ASCII byte 150: unable to convert to/from UTF-8

What was the name of the (non-UTF-8) encoding? I think

  --source-prop-encoding cp1252

might work, thanks to a handy table I found at

  http://www.prismnet.com/~jdawson/cp1252.html
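The table lookup can also be reproduced locally. Here is a rough sketch in Python 3 (not the Python 2 used in the session below) that decodes the single byte 0x96 under a few candidate encodings. The candidate list is only my guess at likely suspects, not something taken from the repository:

```python
import unicodedata

# Byte taken from the svnsync error message (decimal 150 == 0x96).
suspect = bytes([0x96])

# Candidate encodings are guesses for illustration, not an exhaustive list.
candidates = ["cp1252", "latin-1", "cp437"]

decoded = {}
for enc in candidates:
    ch = suspect.decode(enc)
    decoded[enc] = ch
    # Control characters have no Unicode name, hence the fallback string.
    name = unicodedata.name(ch, "<no name: control character>")
    print("%-8s U+%04X %s" % (enc, ord(ch), name))
```

Under cp1252 the byte comes out as U+2013 EN DASH, which is plausible inside a log message. Under latin-1 it maps to an unnamed control character, which is much less likely to be what the committer typed.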

[[[
>>> s = "".join([chr(int(n, 16)).decode("cp1252") for n in "53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20 96 20 42 61".split()])
>>> with open("log.txt", "w") as f:
...     f.write(s.encode("utf-8"))
]]]

When I open log.txt in Emacs I see a long dash:

  Server Currency – Ba

Bingo! :-)
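The same check can be written as a small fallback loop. This is a sketch in Python 3. The helper name `try_decode` is made up for this example, and the hex strings are copied from the svnsync error message:

```python
# Hex dumps copied from the svnsync error message.
valid_hex = "53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20"
invalid_hex = "96 20 42 61"

raw = bytes.fromhex(valid_hex + " " + invalid_hex)


def try_decode(data, encodings):
    """Return (encoding, text) for the first encoding that decodes cleanly.

    Hypothetical helper for illustration; returns (None, None) if none fit.
    """
    for enc in encodings:
        try:
            return enc, data.decode(enc)
        except UnicodeDecodeError:
            continue
    return None, None


# UTF-8 is tried first; it fails on the 0x96 byte, so cp1252 is used.
encoding, text = try_decode(raw, ["utf-8", "cp1252"])
print(encoding, ascii(text))

# Re-encode as UTF-8, which is what the 1.7 server expects.
utf8_bytes = text.encode("utf-8")
```

A clean decode only shows that cp1252 is consistent with these bytes. It does not prove which encoding the old client actually used, so it is worth eyeballing the result as I did in Emacs.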

Steve

>
> On Wed, Feb 8, 2012 at 4:28 PM, Stephen Butler <sbutler_at_elego.de> wrote:
>>
>> On Feb 8, 2012, at 20:00 , Phil Pinkerton wrote:
>>
>>> We have been doing a few hundred svnsyncs from 1.6.5 repositories
>>> to 1.7.2 repositories.
>>>
>>> For the most part this has gone quite well, but we have encountered an
>>> error that is not too clear, and we seek any insight into it:
>>>
>>> svnsync: E000022: Valid UTF-8 data
>>> (hex: 53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20)
>>> followed by invalid UTF-8 sequence
>>> (hex: 96 20 42 61)
>>
>>
>> Indeed, the 0x96 is invalid in UTF-8.
>>
>> >>> "".join([chr(int(n, 16)).decode("utf-8") for n in "53 65 72 76 65 72 20 43 75 72 72 65 6e 63 79 20".split()])
>> u'Server Currency '
>>
>> >>> "".join([chr(int(n, 16)).decode("utf-8") for n in "96 20 42 61".split()])
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
>>     return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 0: invalid start byte
>>
>> Does that text appear in a log message? The 1.7 server is stricter
>> about UTF-8.
>>
>> The svnsync command has a new option --source-prop-encoding,
>> which may be useful if some old client committed a log message in
>> some other encoding.
>>
>> Regards,
>> Steve
>
>
>
> --
> " The fundamental principle here is that the justification for a
> physical concept lies exclusively in its clear and unambiguous
> relation to the facts that it can be experienced" AE
>
> Please Feed and Educate the Children... it's the least any of us can do.

--
Stephen Butler | Consultant
elego Software Solutions GmbH
Gustav-Meyer-Allee 25, 13355 Berlin, Germany
tel: +49 30 2345 8696 | mobile: +49 163 25 45 015
fax: +49 30 2345 8695 | http://www.elego.de
Geschäftsführer: Olaf Wagner | Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg HRB 77719
Received on 2012-02-09 00:28:10 CET

This is an archived mail posted to the Subversion Users mailing list.
