
Re: Control characters in log message cause failure

From: Philip Martin <philip_at_codematters.co.uk>
Date: 2004-12-02 20:40:02 CET

kfogel@collab.net writes:

> Mark Benedetto King <mbk@lowlatency.com> writes:
>> Right, as long as fuzzy_escape() is assumed to be lossy, since
>> we're always going to use unescape_xml() on the receiver-side.
>>
>> Considering, then, that the receiver is capable of unescaping with
>> a single function, why isn't the transmitter capable of the same?
>>
>> It sounds to me that we just have a lossy encoding, and dressing it
>> up with is_valid_utf8() and svn_xml_is_xml_safe() just obscures that
>> fact. The lossy encoding would just not be lossy if both of those
>> conditions were met. Or maybe I misunderstand what fuzzy_escape()
>> does. Does it incorrectly handle valid utf8, valid xml characters?
>
> I think your logic is sound, and we should just implement it that way.
> However, I had to take this route to that fact to fully grok it.
> Sorry if it seemed a bit roundabout to you :-).

I'm finding it hard to determine exactly what everyone means in this
discussion, so apologies if I am repeating an argument already made.

The repository could contain any or all of:

- valid UTF-8 that is valid XML
- valid UTF-8 that is invalid XML
- invalid UTF-8 that is valid XML
- invalid UTF-8 that is invalid XML
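
To make the four cases concrete, here is a minimal Python sketch that
sorts a raw log message into one of them. It is only an illustration:
the helper names are hypothetical, and "valid XML" is assumed here to
mean nothing more than "contains no control bytes other than TAB, LF
and CR", which may be looser or stricter than what
svn_xml_is_xml_safe() actually checks.

```python
# Control bytes assumed to be acceptable in XML character data.
XML_SAFE_CONTROLS = {0x09, 0x0A, 0x0D}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def is_xml_safe(data: bytes) -> bool:
    """Byte-level check (an assumption): no control bytes except TAB, LF, CR."""
    return all(b >= 0x20 or b in XML_SAFE_CONTROLS for b in data)


def classify(data: bytes) -> str:
    """Name which of the four cases from the list above the message falls into."""
    utf8 = "valid" if is_valid_utf8(data) else "invalid"
    xml = "valid" if is_xml_safe(data) else "invalid"
    return f"{utf8} UTF-8 that is {xml} XML"
```

For example, classify(b"fix\x01bug") reports valid UTF-8 that is
invalid XML, because U+0001 is a legal UTF-8 sequence but not a legal
XML character.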

Now the invalid XML needs to be escaped, otherwise ra_dav doesn't
work. As I understand it, we are free to choose whether to escape the
invalid UTF-8 or not: we could choose to ignore UTF-8 validity on the
server and rely on the client to handle it.

To escape invalid UTF-8 we will need to use a Subversion-specific
mechanism. That means all clients will need to understand that
mechanism if they want to obtain the raw log message. On the other
hand, if we don't escape it, all clients will need to be capable of
handling invalid UTF-8. It's not clear to me which is better, and
it's also not clear to me which we are planning to do.
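
As a sketch of what the first option might involve, the Python below
shows one possible lossless escape together with the matching
unescape that every client would then need. The scheme is purely
hypothetical (a backslash is doubled, and each invalid or XML-unsafe
byte becomes \xHH); it is not what fuzzy_escape() does, and the
function names are made up for illustration. "XML-unsafe" is again
assumed to mean control bytes other than TAB, LF and CR.

```python
def escape(raw: bytes) -> str:
    """Turn arbitrary bytes into XML-safe text (hypothetical scheme)."""
    out = []
    # surrogateescape maps each undecodable byte 0xNN to U+DCNN,
    # so invalid UTF-8 can be spotted character by character.
    for ch in raw.decode("utf-8", errors="surrogateescape"):
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")                      # keep the scheme reversible
        elif 0xDC80 <= cp <= 0xDCFF:
            out.append("\\x%02X" % (cp - 0xDC00))   # invalid UTF-8 byte
        elif cp < 0x20 and cp not in (0x09, 0x0A, 0x0D):
            out.append("\\x%02X" % cp)              # XML-unsafe control
        else:
            out.append(ch)
    return "".join(out)


def unescape(text: str) -> bytes:
    """Recover the raw bytes; assumes text was produced by escape()."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
        elif text[i + 1] == "\\":
            out.append(0x5C)                        # literal backslash
            i += 2
        else:
            out.append(int(text[i + 2:i + 4], 16))  # "\xHH" -> one raw byte
            i += 4
    return bytes(out)
```

With this scheme unescape(escape(raw)) == raw for any byte string, so
nothing is lost, but a client that does not know the scheme sees the
escapes as literal text. Under the second option neither function
exists and the client receives the invalid bytes as they are.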

-- 
Philip Martin
Received on Thu Dec 2 20:41:24 2004

This is an archived mail posted to the Subversion Dev mailing list.