On 2020/09/19 5:26, Yasuhito FUTATSUKI wrote:
> Hi,
>
> While I tried to make an example that escape_path() does not work as
> expected in specific locale such as ja_JP.SJIS or CP932 (suggested by
> Jun), I found it seems svn_cmdline__edit_file_externally() always
> passes the file name as UTF-8 even if the LC_CTYPE is other than
> UTF-8.
>
> I think it is also need svn_path_cstring_from_utf8() conversion for
> file_name in svn_cmdline__edit_file_externally().
As far as I read the code, svn_cmdline__edit_file_externally() is
called with paths come from svn_io_open_unique_file3() or
"local_abspath" from svn_client_conflict_t, so I believe it is
needed.
I wrote a patch address it.
(attached fix-edit-file-externally-patch.txt)
[[[
Fix file name to edit from utf8 to local style.
* subversion/libsvn_subr/cmdline.c (svn_cmdline__edit_file_externally):
Apply svn_path_cstring_from_utf8() to target file name.
]]]
> Here is a reproucing script. (Using ja_JP.UTF8 and ja_JP.SJIS locale).
As it is quite few system can use ja_JP.SJIS locale (I only know
FreeBSD and macOS) and there is also escape_path issue in SJIS locale,
I modified this script to use ja_JP.eucJP locale.
[[[
#!/bin/sh
# assuming UTF-8 encoding in this file
testdir=/tmp/svn-conflict-edit-filename-test
if [ ! -d ${testdir} ]; then
mkdir -p ${testdir}
fi
reposdir=${testdir}/testrepo
reposurl=file://${reposdir}
:${svn:=svn}
svnadmin create ${reposdir}
cat > ${testdir}/record_filename.sh <<EOF
#!/bin/sh
LC_CTYPE=C; export LC_CTYPE
echo \$* > ${testdir}/svn-conflict-edit-file-name.txt
exit 0
EOF
LC_CTYPE=ja_JP.UTF-8 ; export LC_CTYPE
# add a file "予定表.txt" (it means schedule in Japanese)
# in UTF-8 working copy.
# "予定表.txt" represented in hex are followings:
# e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 (UTF-8)
# 97 5c 92 e8 95 5c 2e 74 78 74 (SJIS; contains two '\'(== 0x5c))
# cd bd c4 ea 9c bd 2e 74 78 74 (EUC-JP)
schedfn_utf8="予定表.txt"
schedfn_sjis=`echo ${schedfn_utf8} | iconv -f utf-8 -t sjis`
schedfn_eucjp=`echo ${schedfn_utf8} | iconv -f utf-8 -t euc-jp`
${svn} checkout ${reposurl} ${testdir}/wc-utf-8
cd ${testdir}/wc-utf-8
cat > ${schedfn_utf8} <<EOF
2020/09/19 foo
EOF
${svn} add ${schedfn_utf8}
${svn} commit -m 'add schedule memo.'
# prepare EUC-JP locale wc.
(LC_CTYPE=ja_JP.eucJP; export LC_CTYPE ; \
${svn} checkout $reposurl ${testdir}/wc-eucjp)
# update the file in UTF-8 wc and commit it
cat >> ${schedfn_utf8} <<EOF
2020/09/20 bar
EOF
${svn} commit -m 'add schedule at 2020/09/20'
# add local modification in EUC-JP wc
LC_CTYPE=ja_JP.eucJP ; export LC_CTYPE
cd ${testdir}/wc-eucjp
cat >> ${schedfn_eucjp} <<EOF
2020/09/21 baz
EOF
${svn} update --force-interactive --accept edit \
--editor-cmd "/bin/sh ${testdir}/record_filename.sh"
LC_CTYPE=C ; export LC_CTYPE
ls | od -t x1
od -t x1 ${testdir}/svn-conflict-edit-file-name.txt
]]]
The result with unpatched svn (last 4 lines):
[[[
0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a
0000013
0000000 e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 0a
0000016
]]]
The result with patched svn (last 4 lines):
[[[
0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a
0000013
0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a
0000013
]]]
Cheers,
--
Yasuhito FUTATSUKI <futatuki_at_yf.bsclub.org>
Received on 2020-09-20 16:07:58 CEST