[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r12091 - in trunk/subversion: include libsvn_client libsvn_subr

From: <kfogel_at_collab.net>
Date: 2004-12-01 19:41:01 CET

brane@tigris.org writes:
> Log:
> Introduce character classification functions that work correctly with
> (the ASCII subset of) UTF-8 strings regardless of locale settings.
>
> * subversion/include/svn_ctype.h: New.
> * subversion/libsvn_subr/ctype.c: New.
> * subversion/libsvn_subr/genctype.py: Generates the class table in ctype.c.
>
> * subversion/libsvn_client/prop_commands.c: Include svn_ctype.h, not ctype.h.
> (is_valid_prop_name): Use the new svn_ctype macros to test characters.
> Follow-up to r12029.

This commit is a masterpiece.

A few minor comments below:

> --- (empty file)
> +++ trunk/subversion/include/svn_ctype.h Mon Nov 29 19:56:04 2004
> @@ -0,0 +1,156 @@
> +/**
> + * @copyright
> + * ====================================================================
> + * Copyright (c) 2000-2004 CollabNet. All rights reserved.
> + *
> + * This software is licensed as described in the file COPYING, which
> + * you should have received as part of this distribution. The terms
> + * are also available at http://subversion.tigris.org/license-1.html.
> + * If newer versions of this license are posted there, you may use a
> + * newer version instead, at your option.
> + *
> + * This software consists of voluntary contributions made by many
> + * individuals. For exact contribution history, see the revision
> + * history and logs, available at http://subversion.tigris.org/.
> + * ====================================================================
> + * @endcopyright
> + *
> + * @file svn_ctype.h
> + * @brief Character classification routines
> + * @since New in 1.2.
> + */
> +
> +
> +#ifndef SVN_CTYPE_H
> +#define SVN_CTYPE_H

At the end of this file, you have

   #ifdef __cplusplus
   }
   #endif /* __cplusplus */

But here at the beginning, you don't have the corresponding opening
form.

> +#include <apr.h>
> +
> +/** Table of flags for character classification. */
> +extern const apr_uint32_t *const svn_ctype_table;
> +
> +
> +/** Check if @a c is in the character class described by @a flags.
> + * The @a flags is a bitwise-or combination of @t SVN_CTYPE_* *
> + * constants.
> + */
> +#define svn_ctype_test(c, flags) \
> + (0 != (svn_ctype_table[(unsigned char)(c)] & (flags)))

Just curious, why the "0 !="? Did you want to fold the boolean value
to either 1 or 0, instead of non-zero or 0?

> +/**
> + * @defgroup ctype_basic Basic character classification - ASCII only
> + * @{
> + */
> +
> +/* Basic character classes */
> +#define SVN_CTYPE_CNTRL 0x0001 /**< Control character */
> +#define SVN_CTYPE_SPACE 0x0002 /**< Whitespace */
> +#define SVN_CTYPE_DIGIT 0x0004 /**< Decimal digit */
> +#define SVN_CTYPE_UPPER 0x0008 /**< Uppercase letter */
> +#define SVN_CTYPE_LOWER 0x0010 /**< Lowercase letter */
> +#define SVN_CTYPE_PUNCT 0x0020 /**< Punctuation mark */
> +#define SVN_CTYPE_XALPHA 0x0040 /**< Hexadecimal digits A to F */
> +#define SVN_CTYPE_ASCII 0x0080 /**< ASCII subset*/
> +
> +/* Derived character classes */
> +/** ASCII letter */
> +#define SVN_CTYPE_ALPHA (SVN_CTYPE_LOWER | SVN_CTYPE_UPPER)
> +/** ASCII letter or decimal digit */
> +#define SVN_CTYPE_ALNUM (SVN_CTYPE_ALPHA | SVN_CTYPE_DIGIT)
> +/** ASCII hexadecimal digit */
> +#define SVN_CTYPE_XDIGIT (SVN_CTYPE_DIGIT | SVN_CTYPE_XALPHA)
> +/** Printable ASCII except space */
> +#define SVN_CTYPE_GRAPH (SVN_CTYPE_PUNCT | SVN_CTYPE_ALNUM)
> +/** All printable ASCII */
> +#define SVN_CTYPE_PRINT (SVN_CTYPE_GRAPH | SVN_CTYPE_SPACE)
> +
> +
> +/** Check if @c is an ASCII control character. */
> +#define svn_ctype_iscntrl(c) svn_ctype_test((c), SVN_CTYPE_CNTRL)

Here and throughout this file, you've accidentally left out the "c"
after "@c", thus formatting the word "is" instead. I won't bother to
list out all the other instances.

Also, you might want to state explicitly in this header that "ASCII"
means 7-bit ASCII, and foreshadow the UTF8 stuff coming later. I know
this might technically be redundant, but in colloquial use, many
people imagine a beast "ASCII" with two parts, upper and lower (i.e.,
8-bit and 7-bit). They might wonder why we seem to ignore upper.

> +static const apr_uint32_t ctype_table[256] =
> + {
> + /* nul */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* soh */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* stx */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* etx */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* eot */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* enq */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* ack */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* bel */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* bs */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* ht */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL | SVN_CTYPE_SPACE,
> + /* nl */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL | SVN_CTYPE_SPACE,
> + /* vt */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL | SVN_CTYPE_SPACE,
> + /* np */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL | SVN_CTYPE_SPACE,
> + /* cr */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL | SVN_CTYPE_SPACE,

We're going to remember, when it comes time to take care of that
"#TBD" for XMLNAME and URISAFE, that VT and NP are special, right?

Not at all a complaint about this commit, of course. I'm sure the
question of what officially is and isn't whitespace in ASCII was
settled at the Council of Elrond long ago, and we can only humbly
reflect those decisions here. I'm just making a mental note to myself
(well, and to this list :-) ) that for XMLNAME purposes, only three of
these SVN_CTYPE_SPACEs are useable (plus SP below, of course).

> + /* so */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* si */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* dle */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* dc1 */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* dc2 */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* dc3 */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* dc4 */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* nak */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* syn */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* etb */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* can */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* em */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* sub */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* esc */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* fs */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* gs */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* rs */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* us */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* sp */ SVN_CTYPE_ASCII | SVN_CTYPE_SPACE,
> + /* ! */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* " */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* # */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* $ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* % */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* & */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ' */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ( */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ) */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* * */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* + */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* , */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* - */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* . */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* / */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* 0 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 1 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 2 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 3 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 4 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 5 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 6 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 7 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 8 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* 9 */ SVN_CTYPE_ASCII | SVN_CTYPE_DIGIT,
> + /* : */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ; */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* < */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* = */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* > */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ? */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* @ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* A */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER | SVN_CTYPE_XALPHA,
> + /* B */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER | SVN_CTYPE_XALPHA,
> + /* C */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER | SVN_CTYPE_XALPHA,
> + /* D */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER | SVN_CTYPE_XALPHA,
> + /* E */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER | SVN_CTYPE_XALPHA,
> + /* F */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER | SVN_CTYPE_XALPHA,
> + /* G */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* H */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* I */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* J */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* K */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* L */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* M */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* N */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* O */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* P */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* Q */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* R */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* S */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* T */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* U */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* V */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* W */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* X */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* Y */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* Z */ SVN_CTYPE_ASCII | SVN_CTYPE_UPPER,
> + /* [ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* \ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ] */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ^ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* _ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ` */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* a */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER | SVN_CTYPE_XALPHA,
> + /* b */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER | SVN_CTYPE_XALPHA,
> + /* c */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER | SVN_CTYPE_XALPHA,
> + /* d */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER | SVN_CTYPE_XALPHA,
> + /* e */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER | SVN_CTYPE_XALPHA,
> + /* f */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER | SVN_CTYPE_XALPHA,
> + /* g */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* h */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* i */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* j */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* k */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* l */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* m */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* n */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* o */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* p */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* q */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* r */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* s */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* t */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* u */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* v */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* w */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* x */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* y */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* z */ SVN_CTYPE_ASCII | SVN_CTYPE_LOWER,
> + /* { */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* | */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* } */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* ~ */ SVN_CTYPE_ASCII | SVN_CTYPE_PUNCT,
> + /* del */ SVN_CTYPE_ASCII | SVN_CTYPE_CNTRL,
> + /* x80 */ SVN_CTYPE_UTF8CONT,
> + /* x81 */ SVN_CTYPE_UTF8CONT,
> + /* x82 */ SVN_CTYPE_UTF8CONT,
> + /* x83 */ SVN_CTYPE_UTF8CONT,
> + /* x84 */ SVN_CTYPE_UTF8CONT,
> + /* x85 */ SVN_CTYPE_UTF8CONT,
> + /* x86 */ SVN_CTYPE_UTF8CONT,
> + /* x87 */ SVN_CTYPE_UTF8CONT,
> + /* x88 */ SVN_CTYPE_UTF8CONT,
> + /* x89 */ SVN_CTYPE_UTF8CONT,
> + /* x8a */ SVN_CTYPE_UTF8CONT,
> + /* x8b */ SVN_CTYPE_UTF8CONT,
> + /* x8c */ SVN_CTYPE_UTF8CONT,
> + /* x8d */ SVN_CTYPE_UTF8CONT,
> + /* x8e */ SVN_CTYPE_UTF8CONT,
> + /* x8f */ SVN_CTYPE_UTF8CONT,
> + /* x90 */ SVN_CTYPE_UTF8CONT,
> + /* x91 */ SVN_CTYPE_UTF8CONT,
> + /* x92 */ SVN_CTYPE_UTF8CONT,
> + /* x93 */ SVN_CTYPE_UTF8CONT,
> + /* x94 */ SVN_CTYPE_UTF8CONT,
> + /* x95 */ SVN_CTYPE_UTF8CONT,
> + /* x96 */ SVN_CTYPE_UTF8CONT,
> + /* x97 */ SVN_CTYPE_UTF8CONT,
> + /* x98 */ SVN_CTYPE_UTF8CONT,
> + /* x99 */ SVN_CTYPE_UTF8CONT,
> + /* x9a */ SVN_CTYPE_UTF8CONT,
> + /* x9b */ SVN_CTYPE_UTF8CONT,
> + /* x9c */ SVN_CTYPE_UTF8CONT,
> + /* x9d */ SVN_CTYPE_UTF8CONT,
> + /* x9e */ SVN_CTYPE_UTF8CONT,
> + /* x9f */ SVN_CTYPE_UTF8CONT,
> + /* xa0 */ SVN_CTYPE_UTF8CONT,
> + /* xa1 */ SVN_CTYPE_UTF8CONT,
> + /* xa2 */ SVN_CTYPE_UTF8CONT,
> + /* xa3 */ SVN_CTYPE_UTF8CONT,
> + /* xa4 */ SVN_CTYPE_UTF8CONT,
> + /* xa5 */ SVN_CTYPE_UTF8CONT,
> + /* xa6 */ SVN_CTYPE_UTF8CONT,
> + /* xa7 */ SVN_CTYPE_UTF8CONT,
> + /* xa8 */ SVN_CTYPE_UTF8CONT,
> + /* xa9 */ SVN_CTYPE_UTF8CONT,
> + /* xaa */ SVN_CTYPE_UTF8CONT,
> + /* xab */ SVN_CTYPE_UTF8CONT,
> + /* xac */ SVN_CTYPE_UTF8CONT,
> + /* xad */ SVN_CTYPE_UTF8CONT,
> + /* xae */ SVN_CTYPE_UTF8CONT,
> + /* xaf */ SVN_CTYPE_UTF8CONT,
> + /* xb0 */ SVN_CTYPE_UTF8CONT,
> + /* xb1 */ SVN_CTYPE_UTF8CONT,
> + /* xb2 */ SVN_CTYPE_UTF8CONT,
> + /* xb3 */ SVN_CTYPE_UTF8CONT,
> + /* xb4 */ SVN_CTYPE_UTF8CONT,
> + /* xb5 */ SVN_CTYPE_UTF8CONT,
> + /* xb6 */ SVN_CTYPE_UTF8CONT,
> + /* xb7 */ SVN_CTYPE_UTF8CONT,
> + /* xb8 */ SVN_CTYPE_UTF8CONT,
> + /* xb9 */ SVN_CTYPE_UTF8CONT,
> + /* xba */ SVN_CTYPE_UTF8CONT,
> + /* xbb */ SVN_CTYPE_UTF8CONT,
> + /* xbc */ SVN_CTYPE_UTF8CONT,
> + /* xbd */ SVN_CTYPE_UTF8CONT,
> + /* xbe */ SVN_CTYPE_UTF8CONT,
> + /* xbf */ SVN_CTYPE_UTF8CONT,
> + /* xc0 */ 0,
> + /* xc1 */ SVN_CTYPE_UTF8LEAD,
> + /* xc2 */ SVN_CTYPE_UTF8LEAD,
> + /* xc3 */ SVN_CTYPE_UTF8LEAD,
> + /* xc4 */ SVN_CTYPE_UTF8LEAD,
> + /* xc5 */ SVN_CTYPE_UTF8LEAD,
> + /* xc6 */ SVN_CTYPE_UTF8LEAD,
> + /* xc7 */ SVN_CTYPE_UTF8LEAD,
> + /* xc8 */ SVN_CTYPE_UTF8LEAD,
> + /* xc9 */ SVN_CTYPE_UTF8LEAD,
> + /* xca */ SVN_CTYPE_UTF8LEAD,
> + /* xcb */ SVN_CTYPE_UTF8LEAD,
> + /* xcc */ SVN_CTYPE_UTF8LEAD,
> + /* xcd */ SVN_CTYPE_UTF8LEAD,
> + /* xce */ SVN_CTYPE_UTF8LEAD,
> + /* xcf */ SVN_CTYPE_UTF8LEAD,
> + /* xd0 */ SVN_CTYPE_UTF8LEAD,
> + /* xd1 */ SVN_CTYPE_UTF8LEAD,
> + /* xd2 */ SVN_CTYPE_UTF8LEAD,
> + /* xd3 */ SVN_CTYPE_UTF8LEAD,
> + /* xd4 */ SVN_CTYPE_UTF8LEAD,
> + /* xd5 */ SVN_CTYPE_UTF8LEAD,
> + /* xd6 */ SVN_CTYPE_UTF8LEAD,
> + /* xd7 */ SVN_CTYPE_UTF8LEAD,
> + /* xd8 */ SVN_CTYPE_UTF8LEAD,
> + /* xd9 */ SVN_CTYPE_UTF8LEAD,
> + /* xda */ SVN_CTYPE_UTF8LEAD,
> + /* xdb */ SVN_CTYPE_UTF8LEAD,
> + /* xdc */ SVN_CTYPE_UTF8LEAD,
> + /* xdd */ SVN_CTYPE_UTF8LEAD,
> + /* xde */ SVN_CTYPE_UTF8LEAD,
> + /* xdf */ SVN_CTYPE_UTF8LEAD,
> + /* xe0 */ 0,
> + /* xe1 */ SVN_CTYPE_UTF8LEAD,
> + /* xe2 */ SVN_CTYPE_UTF8LEAD,
> + /* xe3 */ SVN_CTYPE_UTF8LEAD,
> + /* xe4 */ SVN_CTYPE_UTF8LEAD,
> + /* xe5 */ SVN_CTYPE_UTF8LEAD,
> + /* xe6 */ SVN_CTYPE_UTF8LEAD,
> + /* xe7 */ SVN_CTYPE_UTF8LEAD,
> + /* xe8 */ SVN_CTYPE_UTF8LEAD,
> + /* xe9 */ SVN_CTYPE_UTF8LEAD,
> + /* xea */ SVN_CTYPE_UTF8LEAD,
> + /* xeb */ SVN_CTYPE_UTF8LEAD,
> + /* xec */ SVN_CTYPE_UTF8LEAD,
> + /* xed */ SVN_CTYPE_UTF8LEAD,
> + /* xee */ SVN_CTYPE_UTF8LEAD,
> + /* xef */ SVN_CTYPE_UTF8LEAD,
> + /* xf0 */ 0,
> + /* xf1 */ SVN_CTYPE_UTF8LEAD,
> + /* xf2 */ SVN_CTYPE_UTF8LEAD,
> + /* xf3 */ SVN_CTYPE_UTF8LEAD,
> + /* xf4 */ SVN_CTYPE_UTF8LEAD,
> + /* xf5 */ SVN_CTYPE_UTF8LEAD,
> + /* xf6 */ SVN_CTYPE_UTF8LEAD,
> + /* xf7 */ SVN_CTYPE_UTF8LEAD,
> + /* xf8 */ 0,
> + /* xf9 */ SVN_CTYPE_UTF8LEAD,
> + /* xfa */ SVN_CTYPE_UTF8LEAD,
> + /* xfb */ SVN_CTYPE_UTF8LEAD,
> + /* xfc */ 0,
> + /* xfd */ SVN_CTYPE_UTF8LEAD,
> + /* xfe */ 0,
> + /* xff */ 0
> + };
> +
> +const apr_uint32_t *const svn_ctype_table = ctype_table;

By the way, I noticed some nice UTF8 tests in the Putty code at

   $ svn cat svn://ixion.tartarus.org/main/putty/charset/utf8.c

Looks like the license is compatible with ours, if you want to swipe
the tests :-).
 
> --- (empty file)
> +++ trunk/subversion/libsvn_subr/genctype.py Mon Nov 29 19:56:04 2004
> @@ -0,0 +1,93 @@
> +#! /usr/bin/python
> +"""getctype.py - Generate the svn_ctype character classification table.
> +"""

If the table was autogenerated from this file, then a comment in the C
file ought to say so, so people realize they're dealing with generated
content.

-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Dec 1 19:47:43 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.