Re: Call For Votes: converting log messages to UTF-8

From: Jon Trowbridge <trow_at_ximian.com>
Date: 2002-05-31 17:46:28 CEST

On Fri, 2002-05-31 at 10:06, Karl Fogel wrote:
> I hope we all agree that we're just choosing a default behavior for
> the client here -- users can get the alternate behavior by setting or
> unsetting a config option in ~/.subversion/options. I.e., we should
> offer conversion to UTF-8 for those who want it, and should not
> unconditionally *force* conversion to UTF-8 for those who know they
> don't want it. The only question is how we behave out-of-the-box.

Just so that I understand: does this proposal imply a policy that all
log messages are stored in UTF-8? Or does a +1 here imply support for
storing log messages in an unknown and unknowable charset?

-JT

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

ting the message
> as Latin-1 or something, but I haven't thought carefully about
> that.

The two behaviors, in my mind, boil down to a matter of choosing a
risk:

  1. do we risk munging userdata at *input* time, by attempting to
      guess at a charset to convert to UTF-8?

      OR

  2. do we risk munging userdata at *output* time, i.e. not knowing
      how to display the logmsg properly, because we don't know its
      charset?

In my mind, risk #1 is much more dangerous. If the logmsg is
accidentally corrupted at input-time, it's gone forever. This is much
worse than possibly seeing a garbled display in some GUI textbox --
that problem is fixable by heuristics (or project policy).
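
To make the difference concrete, here is a rough Python sketch (an illustration only, not Subversion code). It assumes a logmsg typed in Latin-1 and a client that wrongly guesses the input is UTF-8 and substitutes a replacement character for bytes it cannot decode; the variable names are made up for the example.

```python
# Hypothetical illustration of the two risks; none of this is Subversion code.

# A log message as the user typed it, in Latin-1 (assumed for this example).
original = "café".encode("latin-1")          # b'caf\xe9'

# Risk 1: convert at input time with a wrong guess.
# The client assumes UTF-8 and replaces anything it cannot decode.
stored_converted = original.decode("utf-8", errors="replace").encode("utf-8")
# The 0xE9 byte is now U+FFFD; nothing in the stored data says what it was.

# Risk 2: store the bytes untouched and guess at output time.
stored_raw = original
garbled = stored_raw.decode("utf-8", errors="replace")   # wrong guess: ugly display
fixed = stored_raw.decode("latin-1")                     # right guess: recovered

print(repr(stored_converted))
print(repr(garbled))
print(repr(fixed))
```

In the second case a bad guess only costs a garbled display, and a later, better guess still gets the original text back, because the stored bytes were never changed.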

We already have this scenario going on in our code -- the
svn:eol-style property. By default, we've chosen *not* to start
munging userdata until the user activates this property. That's a
sensible default.

Therefore, I agree with Karl (the 2nd checkbox), because it's less
risky. Given that we support both behaviors via some kind of
~/.subversion/ config option, I think the sensible default is not to
munge data at input time. If users want to flip a switch and force
all log messages into UTF-8, that's totally fine. But I think that's a
decision that the *user* must make, not one for our client-app to make
right out of the box.
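
As a sketch of what "off by default, on by request" could look like, assuming a made-up helper `prepare_log_message` and a made-up `convert_to_utf8` switch (neither is a real Subversion function or option name):

```python
from typing import Optional


def prepare_log_message(raw: bytes,
                        convert_to_utf8: bool = False,
                        source_charset: Optional[str] = None) -> bytes:
    """Hypothetical client-side hook run before a log message is committed."""
    if not convert_to_utf8:
        # Default: leave the user's bytes exactly as they were typed.
        return raw
    if source_charset is None:
        # Refuse to guess; the user who opted in must name the charset.
        raise ValueError("conversion requested but no source charset given")
    # Opt-in path: strict decode, so a wrong charset fails loudly
    # instead of silently corrupting the message.
    return raw.decode(source_charset).encode("utf-8")


# Out of the box: bytes pass through unchanged.
print(prepare_log_message(b"caf\xe9"))
# User flipped the switch and named the charset.
print(prepare_log_message(b"caf\xe9", convert_to_utf8=True, source_charset="latin-1"))
```

The point of the strict decode is that even the opt-in path never guesses: either the user-supplied charset fits the bytes or the commit stops with an error.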

Received on Sat Jun 1 14:12:06 2002

This is an archived mail posted to the Subversion Dev mailing list.