Re: [RFC/PATCH] commit messages not 8-bit compatible

From: Greg Stein <gstein_at_lyra.org>
Date: 2002-05-30 01:08:29 CEST

On Wed, May 29, 2002 at 10:55:22AM -0500, cmpilato@collab.net wrote:
>...
> > The next problem in the pipeline (based on what Ulf Tigerstedt
> > encountered) is that the message has to be properly XML-encoded before
> > being sent over the wire -- necessary whether UTF-8 or full binary.
>
> The message already is being XML-encoded to some extent, in that '<'
> and '>' and other such special chars are being converted to entity
> representations, IIRC. I think all we need to do is to make sure that
> all this stuff is first converted to UTF-8, and then just add the
> "charset" XML attribute thingy that states that this particular XML
> document is in UTF-8.
>
> Am I remembering XML specs correctly?

Well, first, any XML "document" needs to choose a character set for its
body. Then you declare that in the <?xml?> declaration (via its encoding
attribute), or in the Content-Type header. Subversion normally uses the
latter:

  Content-Type: text/xml; charset="utf-8"

Placing the character set in the HTTP header is a bit better than using the
<?xml?> declaration.
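
As a rough illustration of the header approach, here is a minimal Python
sketch of how a receiver might pull the charset out of a Content-Type value
like the one above. This is not Subversion code; the helper name
charset_from_content_type is hypothetical, and it leans on the standard
library's email.message.Message to do the parameter parsing.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper for illustration only.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() strips the quotes and lowercases the value;
    # it returns None when no charset parameter is present.
    return msg.get_content_charset()


if __name__ == "__main__":
    print(charset_from_content_type('text/xml; charset="utf-8"'))
```

What to fall back on when the parameter is missing is left to the caller,
since that choice is a policy decision the sketch does not try to make.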

Anyway, once the charset is decided, you need to escape certain
characters (as somebody already stated: <, >, &, and, inside attribute
values, ' and ").
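
The two steps can be sketched in Python as follows. This is only an
illustration, not what Subversion does internally; the function names are
made up for the example. In Python a str is already Unicode, so the sketch
escapes first and encodes to UTF-8 afterwards, which gives the same bytes
because the escaped characters and their replacements are all ASCII.

```python
from xml.sax.saxutils import escape


def escape_text(message: str) -> str:
    """Escape &, < and > for use in XML character data."""
    return escape(message)


def escape_attr(value: str) -> str:
    """Escape &, <, > and both quote characters for an attribute value."""
    return escape(value, {'"': "&quot;", "'": "&apos;"})


def to_wire(message: str) -> bytes:
    """Escape a log message, then encode it as UTF-8 for the request body."""
    return escape_text(message).encode("utf-8")


if __name__ == "__main__":
    print(to_wire("fix <b> & caf\u00e9"))
    print(escape_attr('say "hi"'))
```

The body produced this way matches a header of
Content-Type: text/xml; charset="utf-8".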

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
Received on Sat Jun 1 14:22:48 2002

This is an archived mail posted to the Subversion Dev mailing list.
