"Peter N. Lundblad" <peter@famlundblad.se> writes:
> > I'm happy with just UTF-8, that is, no control chars except LF, CR,
> > and TAB.
>
> By "just UTF-8" it seems like you're still mixing what's valid UTF-8 and
> what are valid XML 1.0 characters. Just wanting to make sure there is no
> confusion left here. Control chars are valid UTF-8 but invalid in XML 1.0,
> wherever they occur. You just can't have them, without another layer of
> encoding. And a correct XML parser shouldn't give the application any
> control chars (except the whitespace ones named above).
Yes. Sorry -- I'm totally clear on it now, but earlier today I was
still wrecked among heathen dreams.
What you say above matches my understanding.
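
To make the distinction concrete, here is a rough sketch in Python of the XML 1.0 "Char" production applied to UTF-8 input. The function names are hypothetical, chosen only for illustration; this is not code from Subversion or from any XML parser.

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x09, 0x0A, 0x0D)        # TAB, LF, CR
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_invalid_xml10_char(data: bytes):
    """Return the index of the first character not allowed in XML 1.0,
    or None if every character is allowed.

    Decoding raises UnicodeDecodeError if data is not valid UTF-8, which
    is a separate question from XML validity.
    """
    text = data.decode("utf-8")
    for i, ch in enumerate(text):
        if not is_xml10_char(ch):
            return i
    return None
```

For instance, the bytes b"a\x01b" decode fine as UTF-8, but the check reports index 1 because U+0001 is not an XML 1.0 character.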
> > Okay. +1 on prohibiting control characters except LF, CR, TAB. I
> > doubt any users are going to suffer much if we do that. There doesn't
> > seem to be much opposition to it on this list either (am I forgetting
> > anyone?), and clearly it'll make Peter Lundblad very happy :-).
>
> Hehehehe... Somewhere, I remember having seen something about newlines
> and the dump format. If there is a problem there, we need to ban it as
> well. I don't know. Else, +1 from me as well.
AAAAaaargh. I misspoke. Let me try again:
Valid Subversion paths are UTF-8 strings, but with no control chars
allowed except for TAB.
That is what I meant to +1.
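
A minimal sketch of that rule in Python, for illustration only. The name is_valid_path is hypothetical, and "control chars" is assumed here to mean U+0000 through U+001F plus U+007F (DEL); the thread does not pin the range down, so treat that as an assumption rather than a settled definition.

```python
def is_valid_path(path: bytes) -> bool:
    """Check the proposed rule: valid UTF-8, no control chars except TAB.

    Assumption: control chars are U+0000..U+001F and U+007F.
    """
    try:
        text = path.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not UTF-8 at all
    for ch in text:
        cp = ord(ch)
        if (cp < 0x20 or cp == 0x7F) and ch != "\t":
            return False  # LF and CR are rejected too; only TAB passes
    return True
```

Under this reading, b"trunk/foo\tbar" is accepted while b"trunk/foo\nbar" and b"trunk/foo\rbar" are rejected.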
Received on Wed Dec 1 22:16:52 2004