
Re: started applying Marcus' patch

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-07-18 23:16:53 CEST

Branko Čibej <brane@xbc.nu> writes:

> >Explicitly telling subversion which properties are really binary and
> >which are really text also allows for commands like proplist -v to
> >avoid printing binary data to stdout and messing up the users tty.
> >
>
> Uh, that would mean adding property properties. -0.999999.

Or having some syntactic convention of the name (other than "svn:") as
I suggested in the other mail, or some other mechanism. I don't think
a generic metaproperty system is a necessary or even particularly
good solution.

> >Moot point. If the native locale is EBCDIC, we're gonna crash and
> >burn anyway, since lots of code assumes that ASCII can be used without
> >transcoding.
> >
>
> Really? Which code is that? I don't think there can be very many such
> places.

There are basically two interesting cases.

1) You're compiling on an EBCDIC system, in which case the character
   set of the C compiler is EBCDIC.

   In this case, all code that assumes that character and string
   literals containing only letters and digits can be passed off as
   UTF-8, or sent as (part of) an HTTP request, will be at fault.

2) You're compiling on an ASCII based system, but selecting an EBCDIC
   locale when running the program.

   In this case, anything that e.g. printfs a string literal to stdout
   without converting it will be at fault.
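
To make case 1 concrete, here is a minimal sketch of the assumption
involved. The function name literals_look_ascii is made up for
illustration and is not part of Subversion or APR. It only compares a
few character literals against their ASCII code values, which is what
code implicitly relies on when it hands a literal to something that
expects UTF-8:

```c
/* Hypothetical helper for illustration only.
   Returns 1 if the compiler's execution character set encodes these
   sample characters the way ASCII does, 0 otherwise.  On an EBCDIC
   compiler 'A' is 0xC1 rather than 0x41, so this would return 0. */
static int
literals_look_ascii(void)
{
  return 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30 && ' ' == 0x20;
}
```

Any code that passes a literal such as "GET" straight into a request
is effectively assuming that this function returns 1.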

I can point out numerous places in the code for each of these cases.

> >apr_isascii is equivalent to the arithmetic test, by the way.
> >
>
> No it's not, it's implemented in terms of the libc's isascii(), which
> should work correctly even in EBCDIC locales. APR definitely does
> support those.

Ahem? Notice the occurrence of "ASCII" in the name "isascii()". It's
not called "isebcdic()" or "ischaracter()" or anything, but
"isascii()". It should return 1 for integers in the range 0-127,
since that is the range for "ascii".

From the Solaris manpage:

| isascii()
| Tests for any ASCII character, code between 0 and 0177

If isascii() is going to return true for EBCDIC characters on some
system, then we can't use it here. The check_non_ascii function must
return an error if possible when fed with non-ASCII text. That is its
sole purpose.
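
For reference, the arithmetic test I mean looks roughly like the
sketch below. This is not the actual check_non_ascii from the patch;
the name is_all_ascii and its signature are made up for illustration.
The point is that it depends only on the byte values, not on the
locale or on what the libc's isascii() happens to do:

```c
#include <stddef.h>

/* Hypothetical helper, not Subversion's actual check_non_ascii.
   Returns 1 if every byte in DATA[0..LEN) is in the ASCII range
   0-127, and 0 otherwise.  The test is purely arithmetic, so the
   result does not depend on the current locale. */
static int
is_all_ascii(const char *data, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    {
      /* Go through unsigned char so bytes >= 0x80 are not negative. */
      if ((unsigned char) data[i] > 127)
        return 0;
    }
  return 1;
}
```

A caller would turn a 0 result into whatever error it needs to
report.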

  // Marcus

Received on Thu Jul 18 23:23:15 2002

This is an archived mail posted to the Subversion Dev mailing list.
