Branko Čibej wrote:
> Bhuvaneswaran A wrote:
> > On Fri, 2009-12-04 at 12:36 +0000, Julian Foad wrote:
> >
> >> Bhuvaneswaran A wrote:
> >>
> >>> Please find attached the revised patch. I incorporated the following
> >>> feedback:
> >>> a) Fix the array slicing part
> >>> b) Escape using ord() instead of removing those characters
> >>> c) Handle "]]>" in CDATA section
> >>> d) Define the ascii table globally (once) and re-use
> >>>
> >>> I also verified this fix by generating the junit files for tests having
> >>> special characters and simulating a test that has "]]>" in failure text.
> >>> With this patch, it generates a valid junit file.
> >>>
> >> It looks great. You could also move the definition of 'chars_to_remove'
> >> out of the function, but either way it's fine. Go on, commit it!
> >>
> >
> > Branko, Julian: Thank you for the review comments.
Branko wrote (elsewhere in the thread):
> Julian Foad wrote:
> > I searched on the web and didn't find a really really simple way to
> > escape a set of characters. I think something like
> >
> > for c in chars_to_remove:
> >     data = data.replace(c, '%%%0x' % ord(c))
> >
> > would do it.
>
> Please read what I wrote earlier. Second, this is URL escaping, not XML
> quoting. But first, there is no way to represent such control chars in
> XML. Only CR and LF are valid according to the XML spec. Others are
> not; and you can't use character references, e.g., &#27;, to represent
> ESC. That's not valid XML.
Sure. The idea is to make the occasional unexpected control character
appear as something that is valid in an XML CDATA section and is
human-readable. URL escaping rules seemed a fine choice. Have I missed
something?
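
To make that concrete, here is a minimal sketch of the idea, not the
code that was committed. The function names and the regular expression
are hypothetical and chosen only for illustration. It assumes XML 1.0,
which allows TAB, LF and CR but no other character below 0x20, and it
also splits any "]]>" so that the text cannot close the CDATA section
early.

```python
import re

# Assumption: XML 1.0 rules. C0 control characters are forbidden except
# TAB (0x09), LF (0x0A) and CR (0x0D).
_INVALID_XML_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def escape_control_chars(data):
    """Replace each forbidden control character with %XX (two hex digits)."""
    return _INVALID_XML_CHARS.sub(lambda m: '%%%02X' % ord(m.group()), data)


def cdata_section(data):
    """Wrap data in a CDATA section after escaping control characters.

    Any ']]>' in the text is split across two adjacent CDATA sections so
    that it cannot terminate the section prematurely.
    """
    safe = escape_control_chars(data).replace(']]>', ']]]]><![CDATA[>')
    return '<![CDATA[' + safe + ']]>'
```

For example, cdata_section('a\x1bb') puts "a%1Bb" inside the section.
Literal '%' characters in the input are left alone, so the result is
meant for human reading and cannot be decoded back unambiguously.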
> > Incorporated the above suggestion and committed in r887178.
>
> Wait, you committed a script that does URL quoting on XML contents? Did
> you look at the output?
I didn't look at any real output, only a hand-crafted test string. Did
you? Why do you ask?
- Julian
Received on 2009-12-04 15:45:04 CET