Re: invalid UTF-8 sequence

From: Vincent Lefevre <vincent+svn_at_vinc17.org>
Date: 2007-05-31 14:30:59 CEST

On 2007-05-31 13:39:21 +0200, Jan Torben Heuer wrote:
> svn: Valid UTF-8 data
> (hex: 6b 2f 30 37 67 65 6f 73 6f 66 74 31 2f 61 62 67 61 62 65 6e 2f 4d 5f
> 48)
> followed by invalid UTF-8 sequence
> (hex: fc 62 6e 65)

<fc> is ü encoded in ISO-8859-1 (or ISO-8859-15).
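This can be checked by decoding the two hex dumps from the error message. The following is only a small sketch in Python (not something svn itself provides); the variable names are mine, and the byte values are copied from the message above.

```python
# Bytes copied from the svn error message quoted above.
valid = bytes.fromhex(
    "6b 2f 30 37 67 65 6f 73 6f 66 74 31 2f 61 62 67 61 62 65 6e 2f 4d 5f 48"
)
invalid = bytes.fromhex("fc 62 6e 65")

# The first part is plain ASCII, hence also valid UTF-8.
print(valid.decode("utf-8"))          # k/07geosoft1/abgaben/M_H

# 0xfc cannot start a UTF-8 sequence, so strict decoding fails.
try:
    invalid.decode("utf-8")
except UnicodeDecodeError as err:
    print("not UTF-8:", err.reason)

# Interpreted as ISO-8859-1, the same bytes give readable text.
print(invalid.decode("iso-8859-1"))   # übne

# In UTF-8, ü would be the two bytes c3 bc instead of the single byte fc.
print("ü".encode("utf-8").hex(" "))   # c3 bc
```

ISO-8859-1 and ISO-8859-15 assign the same character to 0xfc, which is why the dump alone cannot tell which of the two was used.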

> I am using the locale:
> jtheuer@farpoint ~/daten $ locale
> LANG=en_US.UTF-8
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
>
> and the problem seems to be files with german umlauts (äöü).
> Replacing them is unacceptable.

It seems that the file names are encoded in ISO-8859-1 (or ISO-8859-15),
which is incorrect in a UTF-8 locale. There are tools to convert them into UTF-8.
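For illustration, here is a rough sketch of what such a conversion does, written in Python. The function names `to_utf8_name` and `convert_tree` are hypothetical, the source encoding is assumed to be ISO-8859-1, and a dedicated tool such as convmv is probably the better choice for real data. The sketch works on byte-string paths so that names that are not valid UTF-8 can be handled, and it only lists the renames unless `dry_run` is set to `False`.

```python
import os


def to_utf8_name(name: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode a file name as UTF-8; names already valid as UTF-8 are kept."""
    try:
        name.decode("utf-8")
        return name
    except UnicodeDecodeError:
        # Assumption: anything that is not UTF-8 is in source_encoding.
        return name.decode(source_encoding).encode("utf-8")


def convert_tree(root: bytes, dry_run: bool = True) -> list:
    """Collect (old, new) path pairs under root; rename them if dry_run is False."""
    renames = []
    # Bottom-up walk: entries are renamed before their parent directory,
    # so the paths built from dirpath stay valid during the walk.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = to_utf8_name(name)
            if new_name != name:
                old_path = os.path.join(dirpath, name)
                new_path = os.path.join(dirpath, new_name)
                renames.append((old_path, new_path))
                if not dry_run:
                    os.rename(old_path, new_path)
    return renames


if __name__ == "__main__":
    # Dry run over the current directory: print what would be renamed.
    for old, new in convert_tree(b"."):
        print(old, "->", new)
```

The conversion should be done before the files are added to the working copy; for names that are already under version control, the rename would have to go through svn instead of a plain `os.rename`.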

-- 
Vincent Lefèvre <vincent_at_vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
Received on Thu May 31 15:19:48 2007

This is an archived mail posted to the Subversion Users mailing list.
