
Re: Problem with non-UTF8

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: Wed, 19 Mar 2008 12:56:13 +0100

On 3/19/08, Ben Schonle <ben_at_amigos24.net> wrote:
> Ryan Schmidt wrote:
> > On Mar 18, 2008, at 03:50, Ben Schonle wrote:
> >
> >> Lars Grunewaldt wrote:
> >>
> >>> On 12.03.2008 at 17:04, Ben Schonle wrote:
> >>>
> >>>> When importing a folder with multiple subfolders and files I get the
> >>>> error:
> >>>>
> >>>> svn: Valid UTF-8 data
> >>>> (hex: 4b 61 69 72 6f 73 20 70)
> >>>> followed by invalid UTF-8 sequence
> >>>> (hex: f5 68 69 76)
> >>>>
> >>>> As I understand it, some characters in the file names are not valid
> >>>> UTF-8. I was thinking of renaming the respective files / folders,
> >>>> but would first need to know their names and locations.
> >>>>
> >>>> How do you suggest I proceed?
> >>>
> >>> For me, this mostly happens when entering a commit message that
> >>> contains German umlauts or other non-ASCII characters on a Unix
> >>> terminal. Maybe that's the case for you, too?
> >>>
> >>> Otherwise, you could hex-decode the string (treat the numbers as
> >>> 8-bit ANSI characters); the result should be part of your file name.
> >>
> >> For me the problem is that some folder / file names contain German
> >> umlauts.
> >>
> >> According to Lars I should hex-decode the string. Thus the questions:
> >>
> >> * How do I find out which folders / files I need to hex-decode?
> >> * How do I hex-decode them once found?
> >
> > Take the hex in the error message and turn it into characters.
> >
> >>>> svn: Valid UTF-8 data
> >>>> (hex: 4b 61 69 72 6f 73 20 70)
> >
> > That's all ASCII data (all bytes are less than hex 80), so in any
> > ASCII-compatible character encoding that's "Kairos p".
> >
> >>>> followed by invalid UTF-8 sequence
> >>>> (hex: f5 68 69 76)
> >
> > If we assume the character encoding of these bytes is ISO-8859-1, then
> > this is "õhiv".
> >
> > One solution is to rename the files to have no non-ASCII characters.
> >
> > What you probably want to do, though, is set the LANG environment
> > variable correctly so that svn knows what character encoding to use to
> > read the file names.
> >
> Hey Ryan,
>
> I have now renamed the respective files to use only ASCII characters.
> However, I would still be interested to know WHERE to set the LANG
> environment variable. Are you referring to the OS or to the SVN settings?

Which operating system are you using? (There will probably be some more
questions depending on the answer.)
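
As an aside on the advice quoted above to turn the hex from the error
message into characters, here is a minimal Python sketch. The helper name
`decode_hex` is made up for illustration, and ISO-8859-1 is only the
assumption made in the quoted reply about the second byte run, not
something svn itself reports.

```python
def decode_hex(hex_bytes: str, encoding: str) -> str:
    """Turn a space-separated hex dump into text using the given encoding."""
    return bytes.fromhex(hex_bytes).decode(encoding)


# Byte runs copied from the svn error message in this thread.
valid_part = "4b 61 69 72 6f 73 20 70"
invalid_part = "f5 68 69 76"

# Every byte is below 0x80, so plain ASCII is enough here.
print(decode_hex(valid_part, "ascii"))

# Assumption: the file name was written in ISO-8859-1 (Latin-1).
print(decode_hex(invalid_part, "iso-8859-1"))

# Decoding the same bytes as UTF-8 fails, which is what svn complains about.
try:
    decode_hex(invalid_part, "utf-8")
except UnicodeDecodeError as err:
    print("not valid UTF-8:", err.reason)
```

If the names were written in some other 8-bit encoding, the second result
would differ, so treat the decoded text as a hint for locating the file
rather than as proof of the encoding.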

Bye,

Erik
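
On the earlier question in this thread of how to find the names and
locations of the affected files, a rough Python sketch along the following
lines could walk the folder before the import and list every entry whose
name is not valid UTF-8. The function names `is_valid_utf8` and
`find_non_utf8_names` are hypothetical, and the sketch assumes a
POSIX-style file system where file names are raw bytes.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file name decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_names(root: bytes):
    """Yield the full path (as bytes) of each entry with a non-UTF-8 name.

    Passing a bytes root makes os.walk return raw bytes names, so nothing
    is decoded behind our back.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)
```

For example, `for path in find_non_utf8_names(b"."): print(repr(path))`
prints the raw bytes of each offending path under the current directory,
which can then be compared with the hex shown in the svn error.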
Received on 2008-03-19 12:56:57 CET

This is an archived mail posted to the Subversion Users mailing list.
