
Re: Subversion 1.9.10/1.10.4 choke on non-ASCII resources on HP-UX

From: Branko Čibej <brane_at_apache.org>
Date: Wed, 1 May 2019 02:17:40 +0200

On 30.04.2019 20:18, Osipov, Michael wrote:
> Hi folks,
>
> I am facing an issue I did not previously have with 1.9.4 on another
> HP-UX machine. I installed a new machine and compiled 1.10.4, which does
> not work; I then downgraded to 1.9.10 and hit the same issue.
>
> Using HP-UX 11.31 with "aCC: HP C/aC++ B3910B A.06.29 [Oct 18 2016]".
>
> configure:
>> export PREFIX=/opt/ports
>> export LIBDIR=$PREFIX/lib/hpux32
>> export CC=/opt/aCC/bin/aCC
>> export CONFIGURE="./configure --prefix=$PREFIX --libdir=$LIBDIR"
>> export CPPFLAGS="-I$PREFIX/include"
>> export LDFLAGS="-L$LIBDIR"
>> $CONFIGURE --with-apr=$PREFIX --with-apr-util=$PREFIX --without-apxs
>> --without-berkeley-db --with-serf=$PREFIX --disable-nls
>
> Everything compiles fine, except that the portable object files cannot
> be properly converted to machine objects:
>> bash-5.0# /opt/ports/bin/msgfmt -c -o subversion/po/zh_TW.mo
>> subversion/po/zh_TW.po
>> subversion/po/zh_TW.po: warning: Charset "UTF-8" is not supported.
>> msgfmt relies on iconv(),
>>                                  and iconv() does not support "UTF-8".
>>                                  Installing GNU libiconv and then
>> reinstalling GNU gettext
>>                                  would fix this problem.
>>                                  Continuing anyway.
>
> Therefore, I have disabled nls for now. (Installing GNU libiconv made
> no difference.)

You'd most likely have to recompile (or at least relink) gettext to use
the GNU libiconv.

> All other deps have been compiled on the same machine with the most
> recent version. I have also run the utf8proc tests (swapped getline()
> for fgets()):

utf8proc doesn't convert between encodings, it just handles the Unicode
transformation forms. It's completely self-contained and doesn't use
libiconv.
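To make the distinction concrete, here is a minimal sketch using Python's standard unicodedata module as a stand-in for utf8proc (an assumption for illustration; it is not utf8proc's API). Normalization maps code points to code points, whereas encoding conversion, the part that iconv handles, maps code points to bytes.

```python
import unicodedata

# Illustrative input: "e" followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "e\u0301"

# Normalization: code points in, code points out. No byte encoding involved.
nfc = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9
nfd = unicodedata.normalize("NFD", nfc)         # back to "e" + U+0301

# Encoding conversion is a separate step: code points to bytes.
utf8_bytes = nfc.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = nfc.encode("latin-1")  # one byte:  0xE9

print(len(decomposed), len(nfc), utf8_bytes, latin1_bytes)
```

Passing normalization tests therefore says nothing about whether the locale-dependent conversion layer works.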

>> bash-5.0# gmake check
>> gmake -C bench
>> gmake[1]: Entering directory
>> '/tmp/system-compile/apache/utf8proc/utf8proc-2.3.0-patched/bench'
>> gmake[1]: Nothing to be done for 'all'.
>> gmake[1]: Leaving directory
>> '/tmp/system-compile/apache/utf8proc/utf8proc-2.3.0-patched/bench'
>> test/normtest data/NormalizationTest.txt
>> line 42: Part0 # Specific cases
>> line 70: Part1 # Character by character test
>> checking line 1000...
>> checking line 2000...
>> checking line 3000...
>> checking line 4000...
>> checking line 5000...
>> checking line 6000...
>> checking line 7000...
>> checking line 8000...
>> checking line 9000...
>> checking line 10000...
>> checking line 11000...
>> checking line 12000...
>> checking line 13000...
>> checking line 14000...
>> checking line 15000...
>> checking line 16000...
>> line 16969: Part2 # Canonical Order Test
>> checking line 17000...
>> checking line 18000...
>> line 18696: Part3 # PRI #29 Test
>> Passed tests after 18874 lines!
>> test/graphemetest data/GraphemeBreakTest.txt
>> checking line 100...
>> checking line 200...
>> checking line 300...
>> checking line 400...
>> checking line 500...
>> checking line 600...
>> Passed tests after 630 lines!
>> test/charwidth
>> Mismatches with system wcwidth (not necessarily errors):
>>    ... (positive widths for 135297 chars unknown to wcwidth) ...
>> Character-width tests SUCCEEDED.
>> test/misc
>> NFC "ṛ̇" -> "ṛ̇" vs. "ṛ̇"
>> NFD "ṛ̇" -> "ṛ̇" vs. "ṛ̇"
>> NFKC_Casefold "X⁥È­ᴬ" -> "xèa" vs. "xèa"
>> NFKC_Casefold "X⁥È­ᴬ" -> "x⁥èa" vs. "x⁥èa"
>> Unicode version: Makefile has 12.0.0, has API 12.0.0
>> Misc tests SUCCEEDED.
>> test/valid
>> Validity tests SUCCEEDED.
>> test/iterate
>> utf8proc_iterate tests SUCCEEDED, (673) tests passed.
>> test/case
>> More up-to-date than OS unicode tables for 2746 tests.
>> utf8proc case conversion tests SUCCEEDED.
>> test/custom
>> mapped "AaSba" -> "abssba"
>> map_custom tests SUCCEEDED.
>
> Here is the failure:
>> $ svn co https://deblndw011x.ad001.siemens.net/repos/svn/Playground
>> A    Playground/test1234
>> A    Playground/a
>> A    Playground/{U+043F}{U+0440}{U+0438}{U+0432}{U+0435}{U+0442}.txt
>> A    Playground/a/b.txt
>> svn: E155009: Failed to run the WC DB work queue associated with
>> '/net/home/osipovmi/Playground/a', work item 1 (file-install 16
>> {U+043F}{U+0440}{U+0438}{U+0432}{U+0435}{U+0442}.txt 1 0 1 1)
>> svn: E000022: Safe data '/net/home/osipovmi/Playground/' was followed
>> by non-ASCII byte 208: unable to convert to/from UTF-8
>
> but SQLite works:
>> $ sqlite3 Playground/.svn/wc.db
>> SQLite version 3.28.0 2019-04-16 19:49:53
>> Enter ".help" for usage hints.
>> sqlite> select * from WORK_QUEUE;
>> 1|(file-install 16 привет.txt 1 0 1 1)
>> 2|(file-install a/b.txt 1 0 1 1)
>> sqlite> .quit
>
> The terminal and locale are fine though:
>> $ locale
>> LANG=de_DE.utf8
>> LC_CTYPE="de_DE.utf8"
>> LC_COLLATE="de_DE.utf8"
>> LC_MONETARY="de_DE.utf8"
>> LC_NUMERIC="de_DE.utf8"
>> LC_TIME="de_DE.utf8"
>> LC_MESSAGES="de_DE.utf8"
>> LC_ALL=

LC_ALL should probably not be empty. It could be that on HP-UX, the empty
value causes Subversion to use the default C (or POSIX) locale. Try setting

    LC_ALL="de_DE.utf8"; export LC_ALL

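For illustration, the reported "non-ASCII byte 208" is consistent with the path being treated as plain ASCII, as it would be in the C locale. A small Python sketch (Python is used here only to show the byte values; it is not how Subversion performs the conversion):

```python
# The file name from the checkout, written out instead of the {U+....} escapes.
name = "привет.txt"

utf8 = name.encode("utf-8")
first_byte = utf8[0]  # 0xD0, the lead byte of U+043F in UTF-8

def decode_as_ascii(data: bytes) -> str:
    """Decode strictly as ASCII, as a C/POSIX locale would assume."""
    return data.decode("ascii")

try:
    decode_as_ascii(utf8)
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(first_byte, len(utf8), ascii_ok)
```

The first byte is 208 (0xD0), and the name is 16 bytes long, which appears to match the "16" in the file-install work item.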
Subversion tries setlocale(LC_ALL, "") first and only if that fails, it
tries setlocale(LC_CTYPE, ""). Evidently the first call succeeds, which
is strange but not strictly wrong. I think.
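The fallback order described above can be sketched roughly as follows. This uses Python's locale module, which wraps the C setlocale() call; the function name init_locale is hypothetical, and the sketch mirrors only the order stated in this mail, not Subversion's actual source.

```python
import locale

def init_locale() -> str:
    """Try LC_ALL from the environment first, then LC_CTYPE only.

    Returns which attempt succeeded: "LC_ALL", "LC_CTYPE", or "none".
    """
    try:
        locale.setlocale(locale.LC_ALL, "")
        return "LC_ALL"
    except locale.Error:
        pass
    try:
        locale.setlocale(locale.LC_CTYPE, "")
        return "LC_CTYPE"
    except locale.Error:
        return "none"

chosen = init_locale()
# Calling setlocale() without a locale argument only queries the current setting.
current_ctype = locale.setlocale(locale.LC_CTYPE)
print(chosen, current_ctype)
```

If this prints "C" or "POSIX" as the LC_CTYPE locale on the affected machine, that would support the theory that the environment is not being picked up.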

>
> as well as
>> $ curl https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
>
> gives me http://home.apache.org/~michaelo/term%20utf8.png
>
> Where can I start digging now to make this work again?
>
> Regards,
>
> Michael
Received on 2019-05-01 02:18:02 CEST

This is an archived mail posted to the Subversion Users mailing list.