It seems to me that everyone's pretty much stated their reasons for
and against now. We're no longer adding new material to the
discussion; we're just reiterating points already made.
So, I'd like to propose a vote.
I hope we all agree that we're just choosing a default behavior for
the client here -- users can get the alternate behavior by setting or
unsetting a config option in ~/.subversion/options. I.e., we should
offer conversion to UTF-8 for those who want it, and should not
unconditionally *force* conversion to UTF-8 for those who know they
don't want it. The only question is how we behave out-of-the-box.
(If this is controversial, I guess we're not ready to vote yet.)
The two choices are:
[ ] By default, recode log messages from user input to UTF-8, using
the locale to get a best guess for the original encoding of the
user input.
[ ] By default, do no re-encoding of log messages. Store exactly
the byte sequence the user enters. When printing log messages,
the svn client would simply assume that the byte '\n' is a line
end (it prints out the number of lines in each message as part
of the msg header). When printing out the log message as XML,
we'd do our best to escape bytes that are incompatible with
being XML content; this probably implies treating the message
as Latin-1 or something, but I haven't thought carefully about
that.
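To make the two choices concrete, here is a rough sketch in Python (not Subversion code). The function names are hypothetical. Replacing XML-illegal control bytes with '?', emitting non-ASCII characters as numeric character references, and counting a trailing unterminated line are assumptions made for illustration, not decisions anyone has taken.

```python
import locale
from typing import Optional
from xml.sax.saxutils import escape


def recode_to_utf8(raw: bytes, source_encoding: Optional[str] = None) -> bytes:
    """Choice 1: guess the input encoding from the locale, recode to UTF-8."""
    if source_encoding is None:
        # Best guess only: the locale describes the user's environment,
        # not what the bytes really are.
        source_encoding = locale.getpreferredencoding(False)
    # Raises UnicodeDecodeError if the bytes do not fit the guessed encoding.
    return raw.decode(source_encoding).encode("utf-8")


def count_lines(raw: bytes) -> int:
    """Choice 2: count lines by the newline byte alone, ignoring encoding."""
    if not raw:
        return 0
    # Assumption: a final line without a trailing newline still counts.
    return raw.count(b"\n") + (0 if raw.endswith(b"\n") else 1)


def xml_escape_raw(raw: bytes) -> str:
    """Choice 2: treat the bytes as Latin-1 and escape them for XML output."""
    # Latin-1 decoding never fails: every byte maps to exactly one character.
    text = raw.decode("latin-1")
    # XML 1.0 forbids control characters below 0x20 except tab, LF and CR.
    # Assumption: replace them with '?'.
    cleaned = "".join(
        ch if ch in "\t\n\r" or ord(ch) >= 0x20 else "?" for ch in text
    )
    # Escape &, <, > and emit non-ASCII characters as character references.
    return escape(cleaned).encode("ascii", "xmlcharrefreplace").decode("ascii")
```

For example, given the Latin-1 bytes b"Fix caf\xe9\n", recode_to_utf8(..., "latin-1") stores b"Fix caf\xc3\xa9\n", while the second choice stores the input untouched and only interprets it when printing.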
Don't worry about implementation difficulty. Both of these choices
are easy (and indeed, if we want to support both, we need to implement
both, which implies changing the log_msg params to counted-length
strings internally no matter what).
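To illustrate why counted-length strings matter once raw bytes are allowed: a byte sequence the user enters may contain zero bytes (UTF-16 text does, for instance), and a NUL-terminated string would cut the message short at the first one. A small Python sketch of that effect; c_strlen is a hypothetical helper that mimics C's strlen and is not anything in Subversion.

```python
def c_strlen(buf: bytes) -> int:
    """Length a NUL-terminated C string would report for this buffer."""
    idx = buf.find(b"\0")
    return len(buf) if idx == -1 else idx


# A log message entered as UTF-16 (little-endian) has a zero byte after
# every ASCII character.
raw = "log".encode("utf-16-le")   # b"l\x00o\x00g\x00"

counted_length = len(raw)         # what a counted-length string records
terminated_length = c_strlen(raw) # what a NUL-terminated string would see
```

Here the counted length is 6 bytes, while the NUL-terminated view stops after the first byte.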
-Karl
Received on Sat Jun 1 14:11:57 2002