[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] #1 Update on port to OS400/EBCDIC

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2006-02-10 20:20:47 CET

Paul Burba wrote:
>
> Ouch, I always thought redundant code was a good thing ;-)

Heh! (Actually, that reminds me, one day I want to work on software that does
make use of redundancy for improved reliability. Obviously that requires a
monitoring function that checks if one of the multiple memories/algorithms/etc.
gives a different result from the others. The idea is used for fault-tolerant
RAID storage, but that's only redundancy in data, not code. With redundant
self-healing code you can write software that can run from RAM that is
suffering random corruptions from cosmic rays or faulty hardware or whatever.
Of course the monitoring function itself must be redundantly distributed.
Interesting stuff.)

> [...] I confess I'm a bit uneasy about how I modify argv. I pass &argv to
> svn_cmdline_args_from_ebcdic(), convert the args to utf-8 in a separately
> allocated char ** and then cast away argv's "constness" so I can point it
> at the converted char **. Is avoiding const like this legit? (Somewhere
> in the back of my head the ghostly voice of a CS TA is saying
> "Woooo...don't cast away a const!" :-)

Indeed casting away "const" is generally wrong. You've just got a bit confused
over the use of "const", which is hardly surprising with so many levels of
indirection going on in this pointer. See below.

> In hindsight I should have presented that next patch first, since this
> (argv) patch is dependent on it. Briefly, this is because IBM uses ints
> not strings as arguments to the apr_xlate_* functions. So a call like
> "svn_utf_cstring_to_utf8_ex (&argv_utf8[i], (*argv)[i], "0",
> SVN_UTF_ETOU_XLATE_HANDLE, pool)" in this patch won't work without the
> upcoming patch which converts "0" to 0.

OK... maybe. I'm afraid I don't understand the encoding parameters so I can't
comment on that aspect. This present patch does compile, though?

>
> [[[
> OS400/EBCDIC Port: Convert command-line arguments from EBCDIC to UTF-8.
>
> This is the first of several patches to allow Subversion to run on under IBM's
> OS400 V5R4. Despite IBM's building of APR/Apache with UTF support in V5R4,
> command line arguments to main() are still encoded in EBCDIC.

Nice.

>
> * subversion/include/svn_cmdline.h
> (svn_cmdline_args_from_ebcdic): New function declaration.
>
> * subversion/libsvn_subr/cmdline.c
> (SVN_UTF_ETOU_XLATE_HANDLE): New xlate key for EBCDIC (CCSID 0) to
> UTF-8 (CCSID 1208) conversions.
> (svn_cmdline_args_from_ebcdic): New function definition.
>
> * subversion/svn/main.c
> * subversion/svnadmin/main.c
> * subversion/svndumpfilter/main.c
> * subversion/svnlook/main.c
> * subversion/svnserve/main.c
> * subversion/svnsync/main.c
> (main): Convert command line arguments from EBCDIC to UTF-8
> with svn_cmdline_args_from_ebcdic().
> ]]]
>
> Index: subversion/include/svn_cmdline.h
> ===================================================================
[...]
> +#ifdef AS400_UTF8
> +/** Convert each of the @argc strings in @argv from ebcdic to utf-8.
> + * @a pool is used for all allocations.
> + */

That would be, "@a argc" and "@a argv" for Doxygen notation.

Let's rename the argument to "argv_p", as it's a pointer to what's normally
called "argv", and this is just too confusing otherwise!

Be a bit more explicit about what this function actually does with its
parameters. So, something like:

   /** Convert each of the @a argc strings in @a *argv_p from ebcdic to utf-8.
    * Replace @a *argv_p with a pointer to a new array of the new strings, all
    * allocated in @a pool.
    */

(or being even more explicit wouldn't hurt; this is still a bit opaque).

> +svn_error_t *
> +svn_cmdline_args_from_ebcdic (int argc,
> + char * const * const *argv,
> + apr_pool_t *pool);

I think you've got the "const"s in the wrong place; this presently says that
the individual characters are writable (which isn't what's wanted) and that the
argument itself points to something constant, whereas you want to write to what
the argument points to. So:

   const char *const **argv_p,

> Index: subversion/libsvn_subr/cmdline.c
> ===================================================================
> --- subversion/libsvn_subr/cmdline.c (revision 18362)
> +++ subversion/libsvn_subr/cmdline.c (working copy)
> @@ -46,6 +46,10 @@
> #define SVN_UTF_CONTOU_XLATE_HANDLE "svn-utf-contou-xlate-handle"
> #define SVN_UTF_UTOCON_XLATE_HANDLE "svn-utf-utocon-xlate-handle"
>
> +#ifdef AS400_UTF8
> +#define SVN_UTF_ETOU_XLATE_HANDLE "svn_utf-etou_xlate_handle"

In this value (between the quotes), can we have all dashes rather than mixing
with some underscores, to match the style of the existing handles just above.

> @@ -446,3 +450,22 @@
> +
> +#ifdef AS400_UTF8
> +svn_error_t *
> +svn_cmdline_args_from_ebcdic (int argc,
> + char * const * const *argv,
> + apr_pool_t *pool)
> +{
> + int i;
> + char **argv_utf8 = apr_palloc (pool, sizeof(char *) * argc);

Then if you declare this as "const char **"...

> + for (i = 0; i < argc; i++)
> + {
> + SVN_ERR (svn_utf_cstring_to_utf8_ex (&argv_utf8[i], (*argv)[i], "0",

... it will firstly pass the right kind of pointer as the first arg here (I was
getting a warning) ...

> + SVN_UTF_ETOU_XLATE_HANDLE,
> + pool));
> + }
> + *((char***)argv) = argv_utf8;

... and then you won't need any cast at all here, and all is well.

> + return SVN_NO_ERROR;
> +}
> +#endif
> \ No newline at end of file

Oops - please let the last line of the file end with a newline character. It
would be a trivial matter except that it's an error to some compilers (at least
it was an error if it occurred on a preprocessor line on one compiler that I
used to use).

> Index: subversion/svn/main.c
> ===================================================================
> --- subversion/svn/main.c (revision 18362)
> +++ subversion/svn/main.c (working copy)
> @@ -830,6 +830,10 @@
> return EXIT_FAILURE;
> }
>
> +#ifdef AS400_UTF8
> + SVN_INT_ERR (svn_cmdline_args_from_ebcdic (argc, &argv, pool));
> +#endif
> +

Beautiful.

> /* Else, parse options. */
> apr_getopt_init (&os, pool, argc, argv);
> os->interleave = 1;

Thanks. Nearly there.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Feb 10 20:24:50 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.