
Re: Is --enable-utf8 working everywhere?

From: Branko Čibej <brane_at_xbc.nu>
Date: 2002-07-17 05:57:53 CEST

Karl Fogel wrote:

>Branko Čibej <brane@xbc.nu> writes:
>
>
>>Karl, about the file name conversions in Subversion: remember what I
>>said about APR using UTF-8 file names directly on some platforms?
>>Well, the conversions in SVN should probably be coded like this:
>>
>>
>
>Yeah, remember that. But does the platform-specific conditional code
>have to go in Subversion? I understand that APR rightly shouldn't do
>any conversion here, since the output would be the same as the input,
>but perhaps APR could somehow signal this fact back? (The fact that
>no conversion is necessary, that is). After all, it is our
>portability layer... Then Subversion, on receiving this signal, could
>simply dup the input in the right pool and return it, without having
>any platform-specific tests in its own code.
>

We'd have to add some kind of predicate into APR that lets us check for
that. Something like this, implemented in the platform-specific filepath.c:

APR_DECLARE(int) apr_filepath_is_utf8 (void);

The Unix implementation (and probably most others) would just return 0.
On Windows, it would be:

APR_DECLARE(int) apr_filepath_is_utf8 (void)
{
#if APR_HAS_UNICODE_FS
    IF_WIN_OS_IS_UNICODE
        return 1;
#endif
#if APR_HAS_ANSI_FS
    ELSE_WIN_OS_IS_ANSI
        return 0;
#endif
}

and the conversions would become, e.g.,

    char *utf8_filename = (something);
    char *native_filename;
    if (apr_filepath_is_utf8())
        native_filename = apr_pstrdup(pool, utf8_filename);
    else
        native_filename = convert_utf8_to_locale_charset(utf8_filename);
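
To make the shape of that branch concrete, here is a rough, self-contained
sketch that compiles without APR. Everything in it is a stand-in:
filepath_is_utf8() plays the role of the proposed apr_filepath_is_utf8(),
malloc() stands in for pool allocation, and utf8_to_native_stub() is a
hypothetical placeholder for the real charset conversion (it just replaces
non-ASCII bytes with '?'). The point is only that the caller always gets a
fresh copy, whichever branch is taken.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the proposed apr_filepath_is_utf8().
   The flag lets us exercise both branches on any platform. */
static int filepath_is_utf8_flag = 0;

static int filepath_is_utf8(void)
{
    return filepath_is_utf8_flag;
}

/* Hypothetical placeholder for a real UTF-8 -> locale conversion.
   It only replaces non-ASCII bytes with '?'; it does NOT transcode. */
static char *utf8_to_native_stub(const char *utf8)
{
    size_t len = strlen(utf8);
    char *out = malloc(len + 1);
    size_t i;

    if (out == NULL)
        return NULL;
    for (i = 0; i <= len; i++)   /* i == len copies the terminating NUL */
        out[i] = ((unsigned char)utf8[i] < 0x80) ? utf8[i] : '?';
    return out;
}

/* Always returns newly allocated memory (malloc here, a pool in APR),
   so the caller never aliases the input string. */
static char *native_filename(const char *utf8)
{
    if (filepath_is_utf8()) {
        size_t size = strlen(utf8) + 1;
        char *copy = malloc(size);

        if (copy != NULL)
            memcpy(copy, utf8, size);
        return copy;
    }
    return utf8_to_native_stub(utf8);
}
```

With the predicate in place, the only platform knowledge left in the caller
is the single call to the predicate; the #if logic stays behind it.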

>(Btw we *do* have to dup into the right pool, can't just assign back,
>as Marcus and I learned the hard way :-) ).
>

Bang? Crash? Clobber, even? :-)

BTW, I just noticed that the apr_filepath_* functions on Windows can
potentially fail horribly if the paths are not UTF-8 (so, not
IF_WIN_OS_IS_UNICODE) and the locale uses Shift-JIS, because '\' can be
the second byte in a SJIS double-byte char. Talk about fun.
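
A small illustration of the Shift-JIS problem, as a sketch only. It assumes
the usual Shift-JIS lead-byte ranges (0x81..0x9F and 0xE0..0xFC); the helper
names are made up for this example and are not APR functions. A byte-wise
search for '\' finds the 0x5C trail byte of a double-byte character, while a
scan that skips trail bytes finds the real separator.

```c
#include <stddef.h>
#include <string.h>

/* Assumed Shift-JIS lead-byte ranges: 0x81..0x9F and 0xE0..0xFC. */
static int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Return a pointer to the last '\\' that is a real separator, i.e. not
   the trail byte of a double-byte character; NULL if there is none. */
static const char *sjis_last_separator(const char *path)
{
    const unsigned char *p = (const unsigned char *)path;
    const char *last = NULL;

    while (*p != '\0') {
        if (sjis_is_lead_byte(*p) && p[1] != '\0') {
            p += 2;              /* skip lead byte and its trail byte */
            continue;
        }
        if (*p == '\\')
            last = (const char *)p;
        p++;
    }
    return last;
}
```

For example, the path "C:\\dir\\\x95\x5c.txt" contains the double-byte
character 0x95 0x5C after the second separator. strrchr(path, '\\') returns
the trail byte at offset 8, whereas sjis_last_separator(path) returns the
separator at offset 6.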

-- 
Brane Čibej   <brane_at_xbc.nu>   http://www.xbc.nu/brane/
Received on Wed Jul 17 05:58:22 2002

This is an archived mail posted to the Subversion Dev mailing list.