
charset neutral? pls solve this

From: Greg Stein <gstein_at_lyra.org>
Date: 2002-05-31 22:51:37 CEST

Something that Karl mentioned got me thinking, and made me realize that we
*already* have a bug due to charset issues. If we are charset neutral, then
I see no possible way to solve this:

    In the log message output, we count the number of newlines, and display
    that count (see clients/cmdline/log-cmd.c::num_lines).

Without knowing the charset, the client cannot know that a '\n' or '\r' byte
is actually a newline. It might be one half of a UCS-2 encoding of a
character, or part of a shifted character in Shift-JIS or the like.
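
To make the problem concrete, here is a minimal sketch in Python. It is not the actual num_lines code; the function names and the sample log message are made up for illustration, and Python's "utf-16-be" codec stands in for UCS-2 (the two agree for the characters used here). It compares counting raw 0x0A bytes against counting newlines after decoding with a known charset:

```python
def count_newline_bytes(raw: bytes) -> int:
    """Charset-neutral approach: count every 0x0A byte, whatever it means."""
    return raw.count(b"\n")


def count_newlines_decoded(raw: bytes, charset: str) -> int:
    """Charset-aware approach: decode first, then count newline characters."""
    return raw.decode(charset).count("\n")


# Hypothetical log message. U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE)
# encodes as the bytes 0x01 0x0A, so its second byte looks like '\n'.
message = "\u010a fix\nsecond line"
raw = message.encode("utf-16-be")  # assumption: stand-in for UCS-2

byte_count = count_newline_bytes(raw)
char_count = count_newlines_decoded(raw, "utf-16-be")

print("0x0A bytes:        ", byte_count)
print("newline characters:", char_count)
```

The message contains one real newline, but the byte-level count also picks up the low byte of U+010A, so the two numbers disagree.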

At a minimum, I'd like to use this datapoint to demonstrate that being
charset neutral really isn't a good option.

[ plus my comments about the fact that SVN is library-based and binds with
  apps a lot more tightly than CVS ever did or will; thus, we actually
  operate in a different realm than CVS; thus, CVS's neutrality is not
  really a very good empirical example of the viability of charset-neutral ]


Greg Stein, http://www.lyra.org/
Received on Sat Jun 1 14:10:55 2002

This is an archived mail posted to the Subversion Dev mailing list.