Re: cvs2svn.py : about --encoding option

From: Greg Stein <gstein_at_lyra.org>
Date: 2003-07-15 23:58:06 CEST

On Tue, Jul 15, 2003 at 02:37:01PM -0700, Jack Repenning wrote:
> At 2:35 PM -0700 7/15/03, Greg Stein wrote:
> >Mike and I are somewhat familiar with kanjilib.py. Great module, but note
> >that its autodetection can return "yes, no, maybe". That third case can be
> >troublesome :-)
>
> That "maybe" case is inherent in the problem, isn't it? Some codes
> just plain are ambiguous.

Yup, definitely.

In this case, ShiftJIS and EUC share some codepoints. If a string contains
characters *only* in that shared space, then you get a "maybe". If there is
*any* character in the string which falls into one of the exclusive
codepoint spaces, then you get to say "yes" or "no".
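
The three-way answer described above can be sketched in Python. This is
only an illustration, not kanjilib.py's actual algorithm: it assumes that a
strict trial decode with Python's built-in "shift_jis" and "euc_jp" codecs
is a fair stand-in for checking the exclusive byte ranges, and the names
decodes_as() and is_shift_jis() are hypothetical.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


def is_shift_jis(data: bytes) -> str:
    """Answer 'yes', 'no', or 'maybe' to: is this string Shift_JIS?

    Assumption: the only other candidate encoding is EUC-JP.
    """
    sjis_ok = decodes_as(data, "shift_jis")
    euc_ok = decodes_as(data, "euc_jp")
    if not sjis_ok:
        return "no"      # some byte is illegal in Shift_JIS
    if not euc_ok:
        return "yes"     # legal in Shift_JIS, illegal in EUC-JP
    return "maybe"       # every byte lies in the shared space


if __name__ == "__main__":
    samples = {
        "ascii only": b"cvs2svn",
        "shift_jis text": "日本語".encode("shift_jis"),
        "euc_jp text": "日本語".encode("euc_jp"),
    }
    for label, raw in samples.items():
        print(label, "->", is_shift_jis(raw))
```

A pure-ASCII string is the simplest "maybe": it is valid in both encodings,
so nothing in it can settle the question.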

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
Received on Tue Jul 15 23:52:08 2003

This is an archived mail posted to the Subversion Dev mailing list.