Re: Call For Votes: converting log messages to UTF-8

From: Karl Fogel <kfogel_at_newton.ch.collab.net>
Date: 2002-06-04 05:27:31 CEST

Hontvari Jozsef <hontvari@solware.com> writes:
> If somebody worries about the possible data loss: you can recover the
> original data from all three systems. Email and binary store the original
> byte stream, UTF-8 is reversible.

I hesitate to even mention it again, especially as it doesn't support
the side I'm favoring here :-), but UTF-8 is not always reversible.

As I wrote earlier:

> For example, suppose you write your log message in stateless
> encoding FOO (it may be fixed-width or not, but it's not stateful).
> But Subversion mistakenly deduces from your locale that it's in
> *stateful* encoding BAR. When it converts to UTF-8, the (alleged)
> escape sequences of what svn took to be BAR will be lost. You
> cannot get the original string back now.

In practice, this probably wouldn't happen often, and more
importantly, I think people will rarely if ever be in the position of
actually having to recover an original bit-string from UTF-8 anyway.
But we just can't promise that it's always recoverable.
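
To make the scenario concrete, here is a minimal Python sketch. It
assumes Latin-1 as the stateless encoding FOO and ISO-2022-JP as the
stateful encoding BAR; those choices, and the helper name round_trip,
are illustrative only and are not taken from Subversion's code.

```python
# Hypothetical stand-ins: FOO = Latin-1 (stateless), BAR = ISO-2022-JP (stateful).
WRONG_GUESS = "iso2022_jp"


def round_trip(raw: bytes, guessed: str = WRONG_GUESS) -> bytes:
    """Convert raw bytes to UTF-8 via a guessed encoding, then try to go back."""
    utf8 = raw.decode(guessed).encode("utf-8")      # what gets stored
    return utf8.decode("utf-8").encode(guessed)     # attempted recovery


# In Latin-1 these are seven ordinary bytes plus ESC, '(' and 'B'.
# Read as ISO-2022-JP, ESC ( B is a "switch to ASCII" escape sequence.
original = b"abc\x1b(Bdef"
recovered = round_trip(original)

print(original)
print(recovered)
print("recoverable:", recovered == original)
```

The escape sequence is consumed as state during decoding, so it never
reaches the UTF-8 text, and re-encoding has no way to know it was
there. A message with no such byte sequences survives the same trip
unchanged, which is why the problem would be rare in practice.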

Received on Tue Jun 4 05:31:34 2002

This is an archived mail posted to the Subversion Dev mailing list.