Re: TortoiseMerge bug - character encoding

From: Stuart Celarier <SCelarier_at_corillian.com>
Date: 2006-01-14 18:19:05 CET

Stefan wrote:
>Sure, for xml it's publicly specified, but it still
>requires heavy parsing to get the encoding from those
>files - maybe even a full xml parser.

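One would only have to process the XML declaration to see if it has an
encoding declaration [1]. The declaration uses only ASCII characters and
must come first in the file, so a byte order mark or the byte pattern of
the leading "<?xml" is enough to identify the encoding family and read
it (see Appendix F of the XML specification). That's not much parsing
and far short of a full XML parser.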
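
As a rough illustration of how little parsing this takes, here is a
minimal Python sketch. It is not how TortoiseMerge works; the function
name sniff_xml_encoding and the "utf-8" default are assumptions made for
the example. It follows a simplified form of the autodetection idea in
Appendix F of the XML specification and ignores EBCDIC and BOM-less
UTF-32 input.

```python
import codecs
import re

# Byte order marks, UTF-32 first so its LE mark is not taken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Byte patterns of a leading "<?" in UTF-16 files that carry no BOM.
_NO_BOM = [
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]

# XML declaration with an encoding pseudo-attribute (EncName per the spec).
_DECL = re.compile(
    r"""^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(head: bytes, default: str = "utf-8") -> str:
    """Guess the encoding of an XML file from its first bytes (hypothetical helper)."""
    family = default
    for bom, name in _BOMS:
        if head.startswith(bom):
            head = head[len(bom):]
            family = name
            break
    else:
        for prefix, name in _NO_BOM:
            if head.startswith(prefix):
                family = name
                break
    # The declaration is ASCII-only, so decoding with the detected family
    # is enough to read it even if the rest of the file uses another charset.
    text = head.decode(family, errors="replace")
    match = _DECL.match(text)
    return match.group(1) if match else family
```

Passing the first couple of hundred bytes of a file is normally enough,
for example sniff_xml_encoding(open(path, "rb").read(200)), where path
is whatever file is being compared.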

This is not relevant to us: we are not using TortoiseMerge. It is too
simplistic for an important job.

For real XML jobs, the OP probably wants to use something that compares
and merges XML infosets (e.g., so that attribute order is correctly
ignored), ideally post-schema-validation infosets (PSVI), so that
default attribute values are also handled correctly. At that point,
encoding issues are moot.
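
To make the infoset point concrete, here is a small Python sketch that
compares two parsed documents rather than their bytes. The helper name
same_infoset is hypothetical, it uses only the standard
xml.etree.ElementTree module, and it is only an approximation: it trims
surrounding whitespace in text nodes and does no schema validation, so
default attribute values (the PSVI case) are not covered.

```python
import xml.etree.ElementTree as ET


def same_infoset(a, b):
    """Approximate structural equality of two parsed elements (hypothetical helper)."""
    # attrib is a dict, so attribute order in the source file does not matter.
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # Simplification: ignore leading/trailing whitespace in text and tail.
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if (a.tail or "").strip() != (b.tail or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(same_infoset(x, y) for x, y in zip(a, b))


# Same element, different attribute order and quoting.
left = ET.fromstring('<item id="1" lang="en">text</item>')
right = ET.fromstring("<item lang='en' id='1'>text</item>")

# Same content, different byte encodings on disk.
latin1 = ET.fromstring(
    '<?xml version="1.0" encoding="ISO-8859-1"?><a>\xe9</a>'.encode("latin-1")
)
utf8 = ET.fromstring("<a>\xe9</a>".encode("utf-8"))

print(same_infoset(left, right), same_infoset(latin1, utf8))
```

Because the parser has already decoded both inputs, the comparison never
sees the original byte encoding, which is the sense in which encoding
stops mattering at this level.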

Stuart Celarier | Corillian Corporation

[1] http://www.w3.org/TR/REC-xml/#NT-EncodingDecl

Received on Sat Jan 14 18:19:15 2006

This is an archived mail posted to the TortoiseSVN Users mailing list.