Re: svn commit: r12029 - trunk/subversion/libsvn_client

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2004-11-25 17:08:52 CET

On Wed, 24 Nov 2004 kfogel@tigris.org wrote:

> Author: kfogel
> Date: Wed Nov 24 20:00:46 2004
> New Revision: 12029
>
> Modified:
> trunk/subversion/libsvn_client/prop_commands.c
> Log:
> Fix issue #1832: Don't use potentially locale-sensitive tests to
> determine if characters are in a specific subset of the ASCII subset
> of UTF8. Thanks to Joe Orton <joe@light.plus.com> for analysis.
>
> * subversion/libsvn_client/prop_commands.c
> (is_valid_prop_name): Test numeric values of characters directly.
>
>
> Modified: trunk/subversion/libsvn_client/prop_commands.c
> Url: http://svn.collab.net/viewcvs/svn/trunk/subversion/libsvn_client/prop_commands.c?view=diff&rev=12029&p1=trunk/subversion/libsvn_client/prop_commands.c&r1=12028&p2=trunk/subversion/libsvn_client/prop_commands.c&r2=12029
> ==============================================================================
> --- trunk/subversion/libsvn_client/prop_commands.c (original)
> +++ trunk/subversion/libsvn_client/prop_commands.c Wed Nov 24 20:00:46 2004
> @@ -46,14 +46,27 @@
> {
> const char *p = name;
>
> - /* Each byte of a UTF8-encoded non-ASCII character has its high bit set and
> - * so will be rejected by this function. */
> - if (! isalpha (*p) && ! strchr ("_:", *p))
> + /* The characters we allow use identical representations in UTF8
> + and ASCII, so we can just test for the appropriate ASCII codes.
> + But we can't use standard C character notation ('A', 'B', etc)
> + because there's no guarantee that this C environment is using
> + ASCII. So we hardcode the numbers below. */
> +
> + if (! ((*p >= 65 && *p <= 90) /* ASCII 'A' to 'Z' */
> + || (*p >= 97 && *p <= 122) /* ASCII 'a' to 'z' */
> + || (*p == 95) /* ASCII '_' */
> + || (*p == 58))) /* ASCII ':' */
> return FALSE;
> p++;
> for (; *p; p++)
> {
> - if (! isalnum (*p) && ! strchr (".-_:", *p))
> + if (! ((*p >= 65 && *p <= 90) /* ASCII 'A' to 'Z' */
> + || (*p >= 97 && *p <= 122) /* ASCII 'a' to 'z' */
> + || (*p >= 48 && *p <= 57) /* ASCII '0' to '9' */
> + || (*p == 95) /* ASCII '_' */
> + || (*p == 58) /* ASCII ':' */
> + || (*p == 46) /* ASCII '.' */
> + || (*p == 45))) /* ASCII '-' */

We have to decide, once and for all, whether we support systems whose
execution character set isn't ASCII-based (i.e. doesn't have ASCII as a
subset). As brane points out, other places in the code use character
constants. Unless you are going to fix all of those, it makes no sense to
write 65 instead of 'A'. If we want to support EBCDIC and the like, we
should define symbolic constants instead of spreading these numbers all
over the code.
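
A minimal sketch of what such symbolic constants might look like. All
names here (ASCII_UPPER_A, ascii_is_alpha, is_valid_prop_name_sketch, and
so on) are hypothetical and not taken from the Subversion tree; the
accepted character set simply mirrors the ranges in the diff above.

```c
/* Hypothetical symbolic names for the ASCII codes used in r12029.
   None of these identifiers exist in Subversion; they are illustrative. */
enum {
  ASCII_MINUS      = 45,
  ASCII_DOT        = 46,
  ASCII_0          = 48,
  ASCII_9          = 57,
  ASCII_COLON      = 58,
  ASCII_UPPER_A    = 65,
  ASCII_UPPER_Z    = 90,
  ASCII_UNDERSCORE = 95,
  ASCII_LOWER_A    = 97,
  ASCII_LOWER_Z    = 122
};

/* Locale-independent test: is C an ASCII letter? */
static int
ascii_is_alpha (unsigned char c)
{
  return (c >= ASCII_UPPER_A && c <= ASCII_UPPER_Z)
         || (c >= ASCII_LOWER_A && c <= ASCII_LOWER_Z);
}

/* Locale-independent test: is C an ASCII digit? */
static int
ascii_is_digit (unsigned char c)
{
  return c >= ASCII_0 && c <= ASCII_9;
}

/* Same acceptance rules as the patched is_valid_prop_name, assuming
   NAME holds UTF-8 (hence ASCII-compatible) bytes.  Returns 1 if valid,
   0 otherwise. */
static int
is_valid_prop_name_sketch (const char *name)
{
  const unsigned char *p = (const unsigned char *) name;

  /* First byte: letter, '_' or ':'.  An empty name is rejected here. */
  if (! (ascii_is_alpha (*p)
         || *p == ASCII_UNDERSCORE
         || *p == ASCII_COLON))
    return 0;

  /* Remaining bytes: letter, digit, '_', ':', '.' or '-'. */
  for (p++; *p; p++)
    {
      if (! (ascii_is_alpha (*p)
             || ascii_is_digit (*p)
             || *p == ASCII_UNDERSCORE
             || *p == ASCII_COLON
             || *p == ASCII_DOT
             || *p == ASCII_MINUS))
        return 0;
    }
  return 1;
}
```

With something like this, the numbers live in one place, and the other
call sites that currently use character constants could be converted
gradually if we do decide to support non-ASCII execution character sets.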

An aside: why limit property names to ASCII? Is there a reason other than
"it is simple for now"? If not, I'll file an issue about supporting the
whole Name production in XML 1.1. (Or do we have to stick to XML 1.0?)

//Peter

Received on Thu Nov 25 17:00:35 2004

This is an archived mail posted to the Subversion Dev mailing list.