
Re: [TSVN] Re: New info on hooks in docs

From: SteveKing <steveking_at_gmx.ch>
Date: 2004-11-19 21:30:05 CET

Norbert Unterberg wrote:

> For what I know about UNICODE, this is not entirely true.
> See:
>
> #1: unicode.org FAQ
> http://www.unicode.org/faq/utf_bom.html#28
> "Q: How I should deal with BOMs?

- 'Microsoft conventions for txt files' require BOMs
- BOMs are optional only where the precise data stream is known. Do you
suggest TSVN keep a list of file formats and parse each of those
individually with its own parser for the correct encoding (like the
encoding attribute in the xml declaration)?
- since TSVN has to deal with _all_ text-like files/streams, there's
just _no_ way to know the encoding of the data. Yes, e.g. for xml files
you could parse the xml declaration, but programs dealing with those
usually have a built-in xml parser and can easily do that. TSVN can't.

Well, all I can say is that if a file doesn't have a BOM, TSVN
(TortoiseMerge in particular) just treats it as a plain ASCII file
encoded with the current system encoding. That's it.
Since there's no 100% reliable way to determine the encoding of a file
without a BOM, I won't even try to implement some guessing function for
that (even if I could get up to a reliability of >95%). There would
always be cases where the guessing wouldn't work, and that would keep me
busy forever trying to fix it.
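
The rule described above (use the BOM if there is one, otherwise fall back to the current system encoding and do no guessing) can be sketched roughly as follows. This is only an illustration in Python, not TortoiseMerge's actual code; the names `detect_encoding` and `decode_text` are hypothetical, and `locale.getpreferredencoding` stands in for "the current system encoding".

```python
import codecs
import locale

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(data: bytes):
    """Return (encoding_name, bom_length) for a raw byte buffer.

    With a BOM the encoding is defined. Without one, no guessing is
    attempted: the system's preferred encoding is returned instead.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # Assumption: this approximates "the current system encoding".
    return locale.getpreferredencoding(False), 0


def decode_text(data: bytes) -> str:
    """Decode a buffer using the BOM rule, skipping the BOM itself."""
    encoding, skip = detect_encoding(data)
    return data[skip:].decode(encoding, errors="replace")
```

For example, `detect_encoding(codecs.BOM_UTF8 + b"abc")` yields `("utf-8", 3)`, while a buffer without a BOM comes back with a BOM length of 0 and whatever encoding the system reports.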

So I stick to my recommendation to always use BOMs. They are sometimes
optional, but never illegal. And if there's a BOM, the encoding of the
file is defined; if there's none, it could be anything.
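
As a small illustration of following that recommendation when generating files, here is a sketch in Python that writes UTF-8 text with a leading BOM and checks for it afterwards. The helper names `write_with_bom` and `has_utf8_bom` are hypothetical; the `utf-8-sig` codec is part of the Python standard library and emits the BOM on write.

```python
import codecs


def write_with_bom(path: str, text: str) -> None:
    """Write text as UTF-8 with a leading BOM (EF BB BF)."""
    # "utf-8-sig" prepends the BOM automatically when writing.
    # newline="" keeps line endings exactly as given in `text`.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


def has_utf8_bom(path: str) -> bool:
    """Return True if the file starts with the UTF-8 BOM."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

Reading the file back with `encoding="utf-8-sig"` strips the BOM again, so the round trip returns the original text.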

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.tigris.org
Received on Fri Nov 19 21:31:10 2004

This is an archived mail posted to the TortoiseSVN Dev mailing list.
