[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] svn_cmdline__edit_file_externally() may not be able to open the target file in locale other than UTF-8

From: Stefan Sperling <stsp_at_elego.de>
Date: Sun, 20 Sep 2020 16:41:15 +0200

On Sun, Sep 20, 2020 at 11:06:21PM +0900, Yasuhito FUTATSUKI wrote:
> On 2020/09/19 5:26, Yasuhito FUTATSUKI wrote:
> > Hi,
> >
> > While I tried to make an example that escape_path() does not work as
> > expected in specific locale such as ja_JP.SJIS or CP932 (suggested by
> > Jun), I found it seems svn_cmdline__edit_file_externally() always
> > passes the file name as UTF-8 even if the LC_CTYPE is other than
> > UTF-8.
> >
> > I think it is also need svn_path_cstring_from_utf8() conversion for
> > file_name in svn_cmdline__edit_file_externally().
>
> As far as I read the code, svn_cmdline__edit_file_externally() is
> called with paths come from svn_io_open_unique_file3() or
> "local_abspath" from svn_client_conflict_t, so I believe it is
> needed.

Your patch looks good. I have one question below:
 
> I wrote a patch address it.
> (attached fix-edit-file-externally-patch.txt)
> [[[
> Fix file name to edit from utf8 to local style.
>
> * subversion/libsvn_subr/cmdline.c (svn_cmdline__edit_file_externally):
> Apply svn_path_cstring_from_utf8() to target file name.
> ]]]
>
> > Here is a reproucing script. (Using ja_JP.UTF8 and ja_JP.SJIS locale).
>
> As it is quite few system can use ja_JP.SJIS locale (I only know
> FreeBSD and macOS) and there is also escape_path issue in SJIS locale,
> I modified this script to use ja_JP.eucJP locale.

It looks like apr_escape_shell() will escape 0x5c (backslash, \) when
looking for ASCII character bytes. When 0x5c is escaped this breaks the
SJIS-encoded byte sequence.

Have you already tried always using escape_path() on the UTF-8 version of
the string, and then converting the escaped path to the locale's encoding?
In other words: First use escape_path, then use svn_path_cstring_from_utf8?
Perhaps that will make SJIS work?

Cheers,
Stefan

> [[[
> #!/bin/sh
> # assuming UTF-8 encoding in this file
>
> testdir=/tmp/svn-conflict-edit-filename-test
>
> if [ ! -d ${testdir} ]; then
> mkdir -p ${testdir}
> fi
>
> reposdir=${testdir}/testrepo
> reposurl=file://${reposdir}
> :${svn:=svn}
>
> svnadmin create ${reposdir}
> cat > ${testdir}/record_filename.sh <<EOF
> #!/bin/sh
> LC_CTYPE=C; export LC_CTYPE
> echo \$* > ${testdir}/svn-conflict-edit-file-name.txt
> exit 0
> EOF
>
> LC_CTYPE=ja_JP.UTF-8 ; export LC_CTYPE
>
> # add a file "予定表.txt" (it means schedule in Japanese)
> # in UTF-8 working copy.
> # "予定表.txt" represented in hex are followings:
> # e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 (UTF-8)
> # 97 5c 92 e8 95 5c 2e 74 78 74 (SJIS; contains two '\'(== 0x5c))
> # cd bd c4 ea 9c bd 2e 74 78 74 (EUC-JP)
> schedfn_utf8="予定表.txt"
> schedfn_sjis=`echo ${schedfn_utf8} | iconv -f utf-8 -t sjis`
> schedfn_eucjp=`echo ${schedfn_utf8} | iconv -f utf-8 -t euc-jp`
>
> ${svn} checkout ${reposurl} ${testdir}/wc-utf-8
> cd ${testdir}/wc-utf-8
>
> cat > ${schedfn_utf8} <<EOF
> 2020/09/19 foo
> EOF
>
> ${svn} add ${schedfn_utf8}
> ${svn} commit -m 'add schedule memo.'
>
> # prepare EUC-JP locale wc.
> (LC_CTYPE=ja_JP.eucJP; export LC_CTYPE ; \
> ${svn} checkout $reposurl ${testdir}/wc-eucjp)
>
> # update the file in UTF-8 wc and commit it
> cat >> ${schedfn_utf8} <<EOF
> 2020/09/20 bar
> EOF
> ${svn} commit -m 'add schedule at 2020/09/20'
>
> # add local modification in EUC-JP wc
> LC_CTYPE=ja_JP.eucJP ; export LC_CTYPE
> cd ${testdir}/wc-eucjp
> cat >> ${schedfn_eucjp} <<EOF
> 2020/09/21 baz
> EOF
>
> ${svn} update --force-interactive --accept edit \
> --editor-cmd "/bin/sh ${testdir}/record_filename.sh"
>
> LC_CTYPE=C ; export LC_CTYPE
> ls | od -t x1
> od -t x1 ${testdir}/svn-conflict-edit-file-name.txt
> ]]]
>
> The result with unpatched svn (last 4 lines):
> [[[
> 0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a
> 0000013
> 0000000 e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 0a
> 0000016
> ]]]
>
> The result with patched svn (last 4 lines):
> [[[
> 0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a
> 0000013
> 0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a
> 0000013
> ]]]
>
> Cheers,
> --
> Yasuhito FUTATSUKI <futatuki_at_yf.bsclub.org>

> Index: subversion/libsvn_subr/cmdline.c
> ===================================================================
> --- subversion/libsvn_subr/cmdline.c (revision 1881848)
> +++ subversion/libsvn_subr/cmdline.c (working copy)
> @@ -1400,6 +1400,7 @@
> apr_pool_t *pool)
> {
> const char *editor, *cmd, *base_dir, *file_name, *base_dir_apr;
> + const char *file_name_local;
> char *old_cwd;
> int sys_err;
> apr_status_t apr_err;
> @@ -1423,9 +1424,11 @@
> return svn_error_wrap_apr
> (apr_err, _("Can't change working directory to '%s'"), base_dir);
>
> + SVN_ERR(svn_path_cstring_from_utf8(&file_name_local, file_name, pool));
> /* editor is explicitly documented as being interpreted by the user's shell,
> and as such should already be quoted/escaped as needed. */
> - cmd = apr_psprintf(pool, "%s %s", editor, escape_path(pool, file_name));
> + cmd = apr_psprintf(pool, "%s %s", editor,
> + escape_path(pool, file_name_local));
> sys_err = system(cmd);
>
> apr_err = apr_filepath_set(old_cwd, pool);
Received on 2020-09-20 16:41:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.