
RE: Removing the --enable-utf8 flag

From: Bill Tutt <rassilon_at_lyra.org>
Date: 2002-07-18 22:44:33 CEST

I think this is fine for Alpha. I'm not so sure for 1.0 though.

FYI,
Bill

----
Do you want a dangerous fugitive staying in your flat?
No.
Well, don't upset him and he'll be a nice fugitive staying in your flat.
 
> -----Original Message-----
> From: Karl Fogel [mailto:kfogel@newton.ch.collab.net]
> Sent: Thursday, July 18, 2002 1:30 PM
> To: dev@subversion.tigris.org
> Subject: Removing the --enable-utf8 flag
> 
> I'd like to remove the --enable-utf8 configuration option from
> Subversion, even though HEAD of apr/apr-util doesn't have working i18n
> at the moment.  Here's how this would work:
> 
> Currently, subversion/libsvn_subr/utf.c has two compile-time
> conditional code paths:
> 
>    * If --enable-utf8, then attempt conversion from/to native/utf8.
>      If a conversion function returns error, then bomb out entirely.
> 
>    * Else if not --enable-utf8, then never attempt conversion, but
>      just check for "illegal" chars in the data we would have
>      converted.  (Illegal here means any byte with the eighth bit set,
>      or any non-whitespace control character.  See check_non_ascii()
>      in utf.c.)
> 
> Here's how this would become a run-time decision:
> 
>    * Always attempt conversion.  If the conversion fails (for example
>      because the underlying translation mechanism isn't working, as is
>      currently the case), *then* check for non_ascii, and bomb only if
>      there are illegal characters in the data.  Otherwise, we proceed,
>      effectively treating the data as if it were already UTF-8,
>      because we know it's all safe ascii characters.
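
The run-time fallback described above could be sketched roughly as follows. This is only an illustration, not the code in utf.c: the names conv_status, convert_fn, is_safe_ascii, to_utf8_with_fallback and broken_convert are hypothetical, and the character test is an assumption based on the description of check_non_ascii() in the quoted message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch (not Subversion's svn_error_t). */
typedef enum { CONV_OK = 0, CONV_FAILED = 1 } conv_status;

/* Hypothetical converter: native -> UTF-8, writing at most dst_size bytes. */
typedef conv_status (*convert_fn)(const char *src, char *dst, size_t dst_size);

/* Return 1 if no byte has the eighth bit set and no byte is a
   non-whitespace control character (assumed reading of "illegal"). */
static int is_safe_ascii(const char *data, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)data[i];
        int is_control = (c < 0x20 || c == 0x7f);
        int is_whitespace = (c >= 0x09 && c <= 0x0d); /* \t \n \v \f \r */
        if (c >= 0x80 || (is_control && !is_whitespace))
            return 0;
    }
    return 1;
}

/* Always attempt conversion first. Only if it fails, check the input:
   safe ASCII is passed through unchanged (it is already valid UTF-8),
   anything else is an error. */
static conv_status to_utf8_with_fallback(convert_fn convert, const char *src,
                                         char *dst, size_t dst_size)
{
    size_t len = strlen(src);

    if (convert != NULL && convert(src, dst, dst_size) == CONV_OK)
        return CONV_OK;

    if (!is_safe_ascii(src, len) || len + 1 > dst_size)
        return CONV_FAILED;

    memcpy(dst, src, len + 1); /* copy including the terminating NUL */
    return CONV_OK;
}

/* Stand-in for a translation layer that is not working. */
static conv_status broken_convert(const char *src, char *dst, size_t dst_size)
{
    (void)src; (void)dst; (void)dst_size;
    return CONV_FAILED;
}
```

With broken_convert plugged in, a plain path such as "trunk/README" passes through untouched, while input containing an eighth-bit byte or an ESC (0x1b, as used by the shifted encodings) is rejected.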
> 
> Thus we remove a compile-time option, become more robust, make
> everyone's lives simpler, and fulfill our requisite ten hours of
> mandatory asteroid mining per week.
> 
> Does anyone see any problems with this?
> 
> Even the shifted charset encodings use ESC or something to signal the
> shift, so I feel pretty confident that check_non_ascii() will rarely
> allow a false positive to pass.  But i18n is a treacherous minefield
> -- anyone who sees a hole in this plan, please speak up now.
> 
> -K
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Jul 18 22:44:59 2002

This is an archived mail posted to the Subversion Dev mailing list.
