Re: [RFC/PATCH] commit messages not 8-bit compatible

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2002-05-29 19:02:10 CEST

Karl Fogel <kfogel@newton.ch.collab.net> writes:
> The first problem in the pipeline is that the `log_msg' variable to a
> lot of internal functions is now `const char *' instead of stringbuf.
> As long as people stick to UTF-8, this is fine. If we want true
> binary log message support, we'll need to go back to stringbufs for
> that data (not a difficult change).

Some corrections:

  1. svn_string_t, not svn_stringbuf_t, is the type for arbitrary
     binary data.
  2. Being safe for "8-bit data" is different from being safe for
     "binary data." No 8-bit character set uses the octet 0 (I'm pretty
     certain of that), so you can still use C strings for international
     text unless you want to support UTF-16 or some such.
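
To make point 2 concrete, here is a minimal C sketch. The `counted_string` type, the helper functions, and the sample data are hypothetical stand-ins for illustration, not Subversion's actual definitions. It shows that UTF-8 text survives C-string handling because it contains no zero octet, while UTF-16 text is cut off at the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: a pointer plus an explicit length, so
   zero octets inside the data are preserved. */
typedef struct counted_string {
  const char *data;
  size_t len;
} counted_string;

/* Return 1 if the data contains a zero octet, 0 otherwise. */
static int has_zero_octet(counted_string s)
{
  assert(s.data != NULL);
  return memchr(s.data, 0, s.len) != NULL;
}

/* Number of octets a C-string function would see: everything up to,
   but not including, the first zero octet. */
static size_t c_string_length(counted_string s)
{
  const char *nul;

  assert(s.data != NULL);
  nul = memchr(s.data, 0, s.len);
  return nul ? (size_t)(nul - s.data) : s.len;
}

/* "cafe" with an acute accent on the e, in UTF-8: 5 octets, none zero. */
static const char utf8_text[] = { 'c', 'a', 'f', (char)0xC3, (char)0xA9 };

/* The same word in UTF-16LE: 8 octets, every second one zero. */
static const char utf16le_text[] = { 'c', 0, 'a', 0, 'f', 0, (char)0xE9, 0 };

static const counted_string utf8_sample = { utf8_text, sizeof utf8_text };
static const counted_string utf16le_sample =
  { utf16le_text, sizeof utf16le_text };
```

With these samples, `c_string_length(utf8_sample)` is the full 5 octets, but `c_string_length(utf16le_sample)` is 1: a `const char *` interface would see only the leading 'c'.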

Marcus's patch takes a reasonable approach, since it means people can
write log messages in different character encodings and get sane (if not
always perfect) results. It would still be nicer if people's tools just
used UTF-8, of course, so that applications didn't have to know about
character sets.

The alternative is to base64-encode log messages when they're stuffed
into XML documents. That approach is charset-neutral, as CVS is.
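
As a rough sketch of the base64 alternative, the C code below encodes arbitrary octets into plain ASCII that can sit inside an XML document regardless of the original character set. The function names are hypothetical and this is not Subversion's own encoder; it is only a standard base64 encoding with `=` padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode LEN octets of DATA as base64.  Returns a malloc'ed,
   NUL-terminated string that the caller must free, or NULL if the
   allocation fails.  (Hypothetical helper, for illustration only.) */
static char *base64_encode(const void *data, size_t len)
{
  const unsigned char *in = data;
  size_t out_len = 4 * ((len + 2) / 3);
  char *out, *p;
  size_t i;

  assert(data != NULL || len == 0);
  out = malloc(out_len + 1);
  if (out == NULL)
    return NULL;
  p = out;

  /* Full 3-octet groups become 4 output characters each. */
  for (i = 0; i + 2 < len; i += 3) {
    *p++ = b64_alphabet[in[i] >> 2];
    *p++ = b64_alphabet[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
    *p++ = b64_alphabet[((in[i + 1] & 0x0F) << 2) | (in[i + 2] >> 6)];
    *p++ = b64_alphabet[in[i + 2] & 0x3F];
  }

  /* One or two leftover octets are padded out with '='. */
  if (i < len) {
    *p++ = b64_alphabet[in[i] >> 2];
    if (i + 1 < len) {
      *p++ = b64_alphabet[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
      *p++ = b64_alphabet[(in[i + 1] & 0x0F) << 2];
    } else {
      *p++ = b64_alphabet[(in[i] & 0x03) << 4];
      *p++ = '=';
    }
    *p++ = '=';
  }
  *p = '\0';
  return out;
}

/* Convenience wrapper for a log message held as a C string. */
static char *base64_encode_cstring(const char *msg)
{
  assert(msg != NULL);
  return base64_encode(msg, strlen(msg));
}
```

For example, `base64_encode_cstring("Man")` yields "TWFu", and the two octets 0xE9 0x00 (which a C string could not carry) encode to "6QA=". The cost is that the encoded message is about a third larger and no longer readable in the raw XML.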

Received on Sat Jun 1 14:24:46 2002

This is an archived mail posted to the Subversion Dev mailing list.
