Re: Call For Votes: converting log messages to UTF-8

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-06-04 15:20:24 CEST

Karl Fogel <kfogel@newton.ch.collab.net> writes:

> I hesitate to even mention it again, especially as it doesn't support
> the side I'm favoring here :-), but UTF-8 is not always reversible.
>
> As I wrote earlier:
>
> > For example, suppose you write your log message in stateless
> > encoding FOO (it may be fixed-width or not, but it's not stateful).
> > But Subversion mistakenly deduces from your locale that it's in
> > *stateful* encoding BAR. When it converts to UTF-8, the (alleged)
> > escape sequences of what svn took to be BAR will be lost. You
> > cannot get the original string back now.
>
> In practice, this probably wouldn't happen often, and more
> importantly, I think people will rarely if ever be in the position of
> actually having to recover an original bit-string from UTF-8 anyway.
> But we just can't promise that it's always recoverable.

It should be extremely uncommon. I can't imagine how a system locale
using a stateful character encoding would actually work. Just
making filenames work in the filesystem would be more work for the OS
implementor than it could reasonably be worth.
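
For anyone who wants to see the failure Karl describes, here is a rough
sketch in Python. It is only an illustration under stated assumptions:
Latin-1 stands in for the stateless encoding FOO, ISO-2022-JP stands in
for the stateful encoding BAR, and round_trip() and the sample message
are hypothetical names, not anything from Subversion's code.

```python
def round_trip(raw: bytes, assumed_encoding: str) -> bytes:
    """Convert raw to UTF-8 as if it were in assumed_encoding, then convert back.

    Hypothetical helper: it mimics storing a log message as UTF-8 and later
    trying to recover the original bytes using the same assumed encoding.
    """
    stored_utf8 = raw.decode(assumed_encoding).encode("utf-8")
    return stored_utf8.decode("utf-8").encode(assumed_encoding)


# A made-up log message whose bytes happen to contain ESC ( B, which
# ISO-2022-JP treats as a "switch to ASCII" escape sequence.
original = b"fix \x1b(B handling"

# Guessing a stateless encoding (Latin-1 here) keeps every byte.
stateless_ok = round_trip(original, "latin-1") == original

# Guessing a stateful encoding consumes the escape sequence while decoding,
# so the recovered bytes no longer match what the user typed.
stateful_result = round_trip(original, "iso2022_jp")
stateful_ok = stateful_result == original

print("stateless guess recovers original:", stateless_ok)
print("stateful guess recovers original: ", stateful_ok)
```

Latin-1 is a convenient stand-in because every byte value maps to exactly
one code point; other stateless encodings may still reject or alter some
bytes, so this only shows the best case for the stateless side.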

  // Marcus

Received on Tue Jun 4 15:25:03 2002

This is an archived mail posted to the Subversion Dev mailing list.
