
RE: Re: [RFC/PATCH] commit messages not 8-bit compatible

From: Bill Tutt <rassilon_at_lyra.org>
Date: 2002-05-29 18:55:29 CEST

The really weird cases for HTML/XML encoding are non-UTF8 transmissions
of XML/HTML data.

To turn your stomach further: I recently had to port some code that
supported this horrendous edge case.

The example goes something like this:
I want (for whatever bizarre reason) to transmit my HTML/XML in a Korean
character set (Windows character set 1361, to pick a specific one).

The data I want to send looks like this: AA'BC
(Pretend for a second that A' really is a capital letter A with an acute
accent over it.)

Now, unsurprisingly A' isn't representable in Korean. Therefore, I'd
like to be able to transform this into an HTML/XML entity. The helper
function that I had to call from C# produced this output for the above
string:
A&Aacute;BC
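
For illustration only, here is a rough Python sketch of that kind of
fallback. It is not the helper I called from C#; the handler name
"namedentityreplace" and the function encode_with_entities are made up
for this example, and it assumes Python's "johab" codec is a fair
stand-in for Windows character set 1361. Characters the target encoding
can't represent become named entities where HTML 4 defines one, and
numeric character references otherwise.

```python
import codecs
from html.entities import codepoint2name  # HTML 4 named entities only


def _named_entity_replace(err):
    """Codec error handler: swap unencodable characters for entities."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    pieces = []
    for ch in err.object[err.start:err.end]:
        cp = ord(ch)
        name = codepoint2name.get(cp)
        # Prefer a named entity; fall back to a numeric reference.
        pieces.append(f"&{name};" if name else f"&#{cp};")
    # Return the replacement text and the position to resume encoding at.
    return "".join(pieces), err.end


# "namedentityreplace" is a hypothetical handler name chosen for this sketch.
codecs.register_error("namedentityreplace", _named_entity_replace)


def encode_with_entities(text, encoding="johab"):
    """Encode text, replacing unrepresentable characters with entities."""
    return text.encode(encoding, errors="namedentityreplace")


if __name__ == "__main__":
    print(encode_with_entities("A\u00c1BC"))  # b'A&Aacute;BC'
```

Note that this only handles characters the encoding rejects. It does not
escape markup characters such as & or < that are already in the text, and
Python's built-in "xmlcharrefreplace" handler covers the numeric-only
variant of the same idea.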

Yes, this is evil. Yes, this is an unbelievably obscure edge case. Yes,
I have no idea who the hell needs this, but I had to port access to the
code to our new .NET API, whee.

Stomach-turning i18n factoid of the day,
Bill

----
Do you want a dangerous fugitive staying in your flat?
No.
Well, don't upset him and he'll be a nice fugitive staying in your flat.
Received on Sat Jun 1 14:24:48 2002

This is an archived mail posted to the Subversion Dev mailing list.
