
svn_cmdline__edit_file_externally() may not be able to open the target file in locale other than UTF-8

From: Yasuhito FUTATSUKI <futatuki_at_yf.bsdclub.org>
Date: Sat, 19 Sep 2020 05:26:30 +0900

Hi,

While trying to construct an example in which escape_path() does not
work as expected in specific locales such as ja_JP.SJIS or CP932 (as
suggested by Jun), I found that svn_cmdline__edit_file_externally()
seems to always pass the file name as UTF-8, even if LC_CTYPE specifies
an encoding other than UTF-8.

I think file_name in svn_cmdline__edit_file_externally() also needs an
svn_path_cstring_from_utf8() conversion.
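
To illustrate why the missing conversion matters, here is a minimal
sketch (not Subversion code) that compares the byte sequences of the
same file name, "予定表.txt", in UTF-8 and in SJIS. The helper name hex
and the variable names are made up for this illustration; the byte
values are the ones listed in the reproducing script's comments. It
assumes a POSIX sh with printf, od, tr and sed.

```shell
#!/bin/sh
# Illustration only: build both encodings of the name from octal escapes,
# so the sketch does not depend on iconv or on the current locale.
name_utf8=$(printf '\344\272\210\345\256\232\350\241\250.txt')
name_sjis=$(printf '\227\134\222\350\225\134.txt')

# hex STRING: print the bytes of STRING as space-separated hex pairs.
hex() {
  printf '%s' "$1" | od -An -t x1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

utf8_hex=$(hex "$name_utf8")
sjis_hex=$(hex "$name_sjis")

echo "UTF-8: $utf8_hex"
echo "SJIS : $sjis_hex"

# The two sequences differ, so a UTF-8 name handed to an editor running
# in an SJIS locale does not name the file that exists on disk.
if [ "$utf8_hex" != "$sjis_hex" ]; then
  echo "byte sequences differ"
fi
```

Under this reading, an editor started with the UTF-8 bytes in an SJIS
working copy would be asked to open a file that is not there.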

Here is a reproducing script (using the ja_JP.UTF-8 and ja_JP.SJIS locales).
[[[
#!/bin/sh
# assuming UTF-8 encoding in this file

testdir=/tmp/svn-conflict-edit-filename-test

if [ ! -d ${testdir} ]; then
  mkdir -p ${testdir}
fi

reposdir=${testdir}/testrepo
reposurl=file://${reposdir}

svnadmin create ${reposdir}
cat > ${testdir}/record_filename.sh <<EOF
#!/bin/sh
LC_CTYPE=C; export LC_CTYPE
echo \$* > ${testdir}/svn-conflict-edit-file-name.txt
exit 0
EOF

LC_CTYPE=ja_JP.UTF-8 ; export LC_CTYPE

# add a file "予定表.txt" (it means schedule in Japanese)
# in UTF-8 working copy.
# "予定表.txt" is represented in hex as follows:
# e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 (UTF-8)
# 97 5c 92 e8 95 5c 2e 74 78 74 (SJIS; contains two '\' == 0x5c)
# cd bd c4 ea c9 bd 2e 74 78 74 (EUC-JP)
schedfn_utf8="予定表.txt"
schedfn_sjis=`echo ${schedfn_utf8} | iconv -f utf-8 -t sjis`
schedfn_eucjp=`echo ${schedfn_utf8} | iconv -f utf-8 -t euc-jp`

svn checkout ${reposurl} ${testdir}/wc-utf-8
cd ${testdir}/wc-utf-8

cat > ${schedfn_utf8} <<EOF
2020/09/19 foo
EOF

svn add ${schedfn_utf8}
svn commit -m 'add schedule memo.'

# prepare SJIS locale wc.
(LC_CTYPE=ja_JP.SJIS; export LC_CTYPE ; \
    svn checkout $reposurl ${testdir}/wc-sjis)

# update the file in UTF-8 wc and commit it
cat >> ${schedfn_utf8} <<EOF
2020/09/20 bar
EOF
svn commit -m 'add schedule at 2020/09/20'

# add local modification in SJIS wc
LC_CTYPE=ja_JP.SJIS ; export LC_CTYPE
cd ${testdir}/wc-sjis
cat >> ${schedfn_sjis} <<EOF
2020/09/21 baz
EOF

svn update --force-interactive --accept edit \
  --editor-cmd "/bin/sh ${testdir}/record_filename.sh"

LC_CTYPE=C ; export LC_CTYPE
ls | od -t x1
od -t x1 ${testdir}/svn-conflict-edit-file-name.txt
]]]

The last two lines of this script produce hex dumps of the conflicted
file name: the actual name in the working copy, and the name passed to
the editor command. The two outputs should be the same; however, I got
the following:

[[[
Checked out revision 0.
A 予定表.txt
Adding 予定表.txt
Transmitting file data .done
Committing transaction...
Committed revision 1.
A /tmp/svn-conflict-edit-filename-test/wc-sjis/�\��\.txt
Checked out revision 1.
Sending 予定表.txt
Transmitting file data .done
Committing transaction...
Committed revision 2.
Updating '.':
C �\��\.txt
Updated to revision 2.
Merge conflicts in '�\��\.txt' marked as resolved.
Summary of conflicts:
  Text conflicts: 0 remaining (and 1 already resolved)
0000000 97 5c 92 e8 95 5c 2e 74 78 74 0a
0000013
0000000 e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 0a
0000016
]]]
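
The comparison of the two dumps could also be automated. The following
is only a sketch: names_match is a hypothetical helper that is not part
of the script above, and the two temporary files are filled with the
bytes shown in the od output rather than produced by a real svn run. It
assumes mktemp -d and cmp are available.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when both files hold identical bytes.
names_match() {
  cmp -s "$1" "$2"
}

tmpdir=$(mktemp -d)

# Bytes from the first dump (output of ls in the SJIS working copy).
printf '\227\134\222\350\225\134.txt\n' > "$tmpdir/actual"
# Bytes from the second dump (argument recorded by the editor command).
printf '\344\272\210\345\256\232\350\241\250.txt\n' > "$tmpdir/recorded"

if names_match "$tmpdir/actual" "$tmpdir/recorded"; then
  result=match
else
  result=mismatch
fi
echo "$result"

rm -rf "$tmpdir"
```

In the reproducing script itself, the same helper could be pointed at
the output of ls saved to a file and at
${testdir}/svn-conflict-edit-file-name.txt.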

Cheers,

-- 
Yasuhito FUTATSUKI <futatuki_at_yf.bsdclub.org>
Received on 2020-09-18 22:27:39 CEST

This is an archived mail posted to the Subversion Dev mailing list.
