RE: Bug: Control char in commit message

From: John Barstow <John_Barstow_at_gfsg.co.nz>
Date: 2002-12-04 21:03:36 CET

> But the client layer shouldn't even let things get that far, because
> it should be trying to convert to UTF-8. If it succeeds, the result
> is XML-safe. If it fails, the commit doesn't proceed anyway.

Sorry, but UTF-8 is *not* automatically XML safe. It's just an encoding, and
XML accepts only a subset of the characters that can be encoded this way. In
particular, the ASCII control characters (except tab, CR, and LF), the
surrogate blocks, U+FFFE, and U+FFFF all have valid UTF-8 encodings but are
not valid XML characters.
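
To make the distinction concrete, here is a minimal sketch of the extra check
a client would need after a successful UTF-8 conversion. It assumes the Char
production from the XML 1.0 specification; the function names and the sample
log message are hypothetical and are not taken from the Subversion sources.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)            # tab, LF, CR
        or 0x20 <= cp <= 0xD7FF          # excludes the other C0 control characters
        or 0xE000 <= cp <= 0xFFFD        # skips surrogates, stops before U+FFFE
        or 0x10000 <= cp <= 0x10FFFF     # supplementary planes
    )


def first_invalid_xml_char(text: str) -> int:
    """Return the index of the first character XML 1.0 rejects, or -1 if none."""
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index
    return -1


# Hypothetical log message containing a stray control character (U+0001).
message = "Fix bug\x01 in parser"
encoded = message.encode("utf-8")        # the UTF-8 conversion itself succeeds
print(first_invalid_xml_char(message))   # position of the offending character
```

The point of the sketch is that the encoding step and the XML validity check
are separate tests: the first can pass while the second still fails.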

John C Barstow

Received on Wed Dec 4 20:59:07 2002

This is an archived mail posted to the Subversion Dev mailing list.
