Re: [RFC/PATCH] commit messages not 8-bit compatible

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2002-05-30 14:48:47 CEST

On Thu, 2002-05-30 at 05:45, Greg Stein wrote:
> > We don't "avoid all this character
> > set nonsense" if we do the translation, but not doing it means users'
> > tools must all use UTF-8 (including all tools which interact with
> > pathnames in the working directory).
>
> We avoid it within the library and its APIs.

No, not really.

Every time we open a file, we have to convert the pathname from UTF-8 to
the local character encoding. In an ideal world APR might take care of
this for us, but it doesn't. (Fortunately, we can just wrap
apr_file_open() with our own function.)
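
As a rough illustration of the wrapper idea above (a Python sketch, not the C/APR code under discussion): the names utf8_path_to_local and open_utf8_path are hypothetical, and treating the locale's preferred encoding as "the local character encoding" is an assumption.

```python
import locale
from typing import Optional


def utf8_path_to_local(path_utf8: bytes, encoding: Optional[str] = None) -> bytes:
    """Re-encode a UTF-8 pathname into the local character encoding."""
    # Assumption: the locale's preferred encoding is "the local encoding".
    enc = encoding or locale.getpreferredencoding(False)
    # Strict on purpose: a name the local charset cannot represent raises
    # UnicodeEncodeError instead of silently opening a different file.
    return path_utf8.decode("utf-8").encode(enc)


def open_utf8_path(path_utf8: bytes, mode: str = "rb", encoding: Optional[str] = None):
    """Hypothetical stand-in for a wrapper around the real open call."""
    return open(utf8_path_to_local(path_utf8, encoding), mode)
```

Every caller would go through the wrapper instead of the raw open call, so the conversion lives in one place.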

Every time we display a message, we have to convert it. Again, APR
might conceivably take care of this for us, but it doesn't.
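
The display-side conversion might look roughly like the following Python sketch. The function name is hypothetical, and substituting "?" for characters the local charset lacks is an assumed policy, not something this thread settles.

```python
def utf8_message_to_local(msg_utf8: bytes, encoding: str) -> bytes:
    """Convert a UTF-8 message to the local encoding for display."""
    # Assumed policy: unrepresentable characters become "?" so that a
    # diagnostic message never fails to print.
    return msg_utf8.decode("utf-8").encode(encoding, errors="replace")
```

Unlike a pathname, a message can tolerate lossy output, which is why this sketch replaces characters rather than raising an error.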

When we prompt the user for a log message via $EDITOR, what we get back
is in the local character encoding. Hard to imagine APR taking care of
this.
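
The reverse direction, turning what the editor hands back into UTF-8, could be sketched in Python as below. The function name is hypothetical, and the caller is assumed to know which local encoding the editor wrote.

```python
def local_log_message_to_utf8(raw: bytes, encoding: str) -> bytes:
    """Convert a log message from the local encoding to UTF-8."""
    # Strict decoding: bytes that are invalid in the claimed encoding raise
    # UnicodeDecodeError rather than being stored as corrupt text.
    return raw.decode(encoding).encode("utf-8")
```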

There are more interactions as well. The libraries interact not just
with the client, but with the operating system.

> To simplify the data storage and flow, we just say "it's all UTF-8". At the
> boundaries between the SVN libraries and the client programs, the program
> can (as appropriate) recode from UTF-8 to another charset.

There are certainly advantages to the UTF-8 approach, but "avoiding
character set nonsense" in the libraries is not one of them.

Received on Sat Jun 1 14:34:37 2002

This is an archived mail posted to the Subversion Dev mailing list.
