
Re: AW: AW: AW: svnlook proplist & unicode characters

From: Branko Čibej <brane_at_wandisco.com>
Date: Thu, 18 Dec 2014 07:13:45 +0100

On 17.12.2014 21:48, Matthias Ludwig wrote:
> There is an error in this string "//a//o\u0308".
> It should be "/a/o\u0308". But this does not change the behaviour. I've
> tried it again, the problem persists.
>
> / -> slash for path separator
> a -> name of subfolder
> / -> slash for path separator
> o -> for "o"
> \u -> escape: a UTF-16 code unit in hex follows
> 0308 -> Unicode character 'COMBINING DIAERESIS'
>
> Java Strings are stored internally as Unicode (UTF-8, UTF-16 or whatever,
> it's internal; you tell Java what you want when you pull it out).
> The String is therefore converted in the Runtime.getRuntime().exec method.

... and what does Java do with the String arguments to this method? The
javadocs don't say ... I suspect it converts them to some default
character set, which may well be ISO-8859-1. That would explain why the
combining diaeresis gets converted to its non-combining version; of
course, that would still be a conversion bug, but at least it's sort of
understandable. :)
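
To make the encoding question concrete, here is a minimal Java sketch of what happens to the path "/a/o\u0308" when its characters are turned into bytes. It does not show what Runtime.exec actually does internally (that remains unconfirmed above); it only shows how the standard String.getBytes and java.text.Normalizer calls treat the combining diaeresis. The class name DiaeresisDemo and its helper methods are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical demo class; not part of Subversion or JavaHL.
class DiaeresisDemo {

    /** Formats bytes as lowercase hex pairs separated by spaces. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Composes "o" + COMBINING DIAERESIS into the single character U+00F6. */
    static String composeNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Decomposed form: 'o' followed by U+0308 COMBINING DIAERESIS.
        String path = "/a/o\u0308";

        // UTF-8 can represent U+0308 (as the two bytes cc 88).
        System.out.println("UTF-8:       "
                + toHex(path.getBytes(StandardCharsets.UTF_8)));

        // ISO-8859-1 has no combining diaeresis; getBytes substitutes '?' (3f).
        System.out.println("ISO-8859-1:  "
                + toHex(path.getBytes(StandardCharsets.ISO_8859_1)));

        // After NFC composition the precomposed U+00F6 fits into ISO-8859-1 (f6).
        System.out.println("NFC, Latin-1: "
                + toHex(composeNfc(path).getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

A plain encode to ISO-8859-1 loses the combining mark rather than composing it, so if the precomposed character shows up on the other side, some normalization step must be happening in addition to the charset conversion.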

Since you're driving Subversion from Java, I'd really recommend using
JavaHL here instead of jumping through command-line hoops: it's a lot
more consistent about string representation.

-- Brane
Received on 2014-12-18 07:15:06 CET

This is an archived mail posted to the Subversion Users mailing list.
