On Thu, 2002-05-30 at 18:45, Greg Stein wrote:
> 3) untenable for the clients.
I'd like to keep a little perspective here.
If we don't solve the log message character set problem, then projects
are happy as long as:
* They are willing to stick with ASCII log messages, or
* All their developers use the same character set, or
* All their developers use a UTF-8 native locale
(That third statement is a little forward-looking, but there has been
some progress in that direction.)
I believe this covers quite a lot of users--everyone who is happy with
CVS, for instance. Subversion is not going to fail on account of not
doing character set conversion.
This is why I would be happy being charset-neutral and 8-bit clean (not
necessarily binary-clean) for all text fields. Possibly happier, since
we would never be responsible for misconverting text when LC_CTYPE isn't
set properly, or anything like that. Plus our code would be simpler.
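To make the misconversion risk concrete, here is a minimal sketch in Python. It is not Subversion code: the function names are hypothetical, and the assumed charset simply stands in for whatever a client would derive from LC_CTYPE.

```python
# Hypothetical illustration only; none of these names come from Subversion.

def store_with_conversion(raw: bytes, assumed_charset: str) -> bytes:
    """Convert a log message to UTF-8, trusting the charset we assume
    the committer's locale (LC_CTYPE) implies."""
    return raw.decode(assumed_charset).encode("utf-8")


def store_8bit_clean(raw: bytes) -> bytes:
    """Charset-neutral storage: keep the bytes exactly as given."""
    return raw


# A log message typed by a developer whose terminal uses ISO-8859-1.
raw = "naïve fix".encode("iso-8859-1")

# Correct guess: the text survives the round trip.
good = store_with_conversion(raw, "iso-8859-1").decode("utf-8")

# Wrong guess (locale says ISO-8859-2): conversion "succeeds" but the
# stored text silently differs from what the developer wrote.
bad = store_with_conversion(raw, "iso-8859-2").decode("utf-8")

# No usable locale (plain ASCII): conversion fails outright.
try:
    store_with_conversion(raw, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# The 8-bit clean path never alters the message.
untouched = store_8bit_clean(raw)
```

In the 8-bit clean case the repository is never the party that corrupts the text; whoever reads the message has to know, or guess, its character set.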
On the other hand, there seems to be a fairly broad consensus for doing
UTF-8/$LC_CTYPE character set conversion for filenames. I am...
confused as to why anyone advocates converting filenames and not log
messages, since they are both text. File contents are binary data.
Property values... might be binary data; that seems to be the consensus
for now, anyway, although that leads to questions about how svn:ignore
should be interpreted and such. But log messages are definitely text.
Received on Sat Jun 1 14:14:59 2002