
Re: Unicode characters in filenames on windows

From: Branko Čibej <brane_at_wandisco.com>
Date: Wed, 12 Jun 2013 04:37:44 +0200

On 12.06.2013 02:58, Варфоломеев Игорь wrote:
> Hi all,
> I'm still not sure whether it's a bug or whether I'm doing something wrong, but I'm unable to get the TortoiseSVN
> command-line tool to work with files that have non-ASCII Unicode characters in their names.
> (1.7.10 r1485443, part of TortoiseSVN 1.7.13 @Win7 x64)
>
>
> I posted the following message here
> ( http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3057388 )
> but was advised
> ( http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3057494 )
> to redirect it to users_at_subversion.apache.org :
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------
> *** THE ISSUE ***
>
>
> Workflow:
>
> 1. Create "c:\temp\UNCtest\R_UNCtest\" folder
> 2. Create a repository with default file structure in it
> 3. Checkout "trunk" dir to "c:\temp\WC\trunk"
> 4. Create file "c:\temp\WC\trunk\1‐2.txt".
> Note that the file name (before the extension) consists of 3 characters, and the one in the middle is "HYPHEN" (U+2010, &#8208;); see http://www.fileformat.info/info/unicode/char/2010/index.htm
>
> 5. Add and commit this file with the Tortoise GUI.
> (this works OK)
> 6. Start Windows cmd.
> 7. Make sure your cmd is set to use a Unicode-capable font, for example "Consolas" (see http://stackoverflow.com/questions/10764920/utf-16-on-cmd-exe/10765469#10765469 ).
> 8. Navigate to "c:\temp\WC\trunk".
> 9. Type "dir" - you should see the listing correctly, including the "1‐2.txt" file.
> 10. Type "mkdir 1‐2" - this should correctly create a directory.
> 11. Type "svn info 1‐2.txt"
> Result:
> --------------------------------------------------
> svn: warning: W155010: The node 'C:\TEMP\UNCtest\WC\trunk\1?2.txt' was not found
> .
>
> svn: E200009: Could not display info for all targets because some targets don't
> exist

I believe this happens because the "chcp" command changes the OEM
(console) code page, but Subversion uses the ANSI (Windows) code page
for input and output conversion. In other words, "chcp" does not
affect the command-line client in any way.
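
As a rough illustration of why the name comes back as "1?2.txt", here is
a small Python sketch. It is not Subversion's actual conversion code:
cp1252 and cp1251 are only assumed examples of an ANSI code page, and
Python's errors="replace" handler stands in for whatever lossy
conversion happens on the way to the native encoding. Neither code page
has a slot for U+2010, so the character cannot survive the conversion.

```python
# Hypothetical stand-ins, not Subversion code: cp1252/cp1251 play the role
# of the ANSI code page, errors="replace" plays the role of a lossy
# conversion to the native encoding.
FILENAME = "1\u20102.txt"  # '1', U+2010 HYPHEN, '2', then ".txt"


def to_native(name: str, codepage: str) -> bytes:
    """Convert a file name to a single-byte code page, replacing
    characters the code page cannot represent with '?'."""
    return name.encode(codepage, errors="replace")


for cp in ("cp1252", "cp1251"):
    # U+2010 is missing from both code pages, so it becomes b'?'.
    print(cp, to_native(FILENAME, cp))

# UTF-16, by contrast, round-trips the name without loss.
print(FILENAME.encode("utf-16-le").decode("utf-16-le") == FILENAME)
```

Once the hyphen has been replaced, the client ends up looking for a
node whose name really does contain "?", which matches the W155010
warning quoted above.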

> * Am I doing something wrong?

Not as such. :)

> * Or could this situation be treated as a bug?
> * Or, maybe a “feature request”?

I think the only way to actually get this right is to change the way we
read from and write to the console on Windows. Instead of converting
strings to some native encoding and using the ordinary output functions,
we should convert to UTF-16 and use wide-char output functions instead.
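
To make the idea concrete, here is a minimal Python sketch of the
output side, assuming the Win32 WriteConsoleW function is called
through ctypes. It only illustrates the approach and is not a patch for
Subversion; the helper names utf16_units and console_write are made up
for this example. When output is redirected or the platform is not
Windows, it falls back to the ordinary stream.

```python
import sys

STD_OUTPUT_HANDLE = 0xFFFFFFF5  # (DWORD)-11, the Win32 standard output handle id


def utf16_units(text: str) -> int:
    """Number of UTF-16 code units, which is what wide-char APIs count."""
    return len(text.encode("utf-16-le")) // 2


def console_write(text: str) -> None:
    """Hypothetical helper: write via WriteConsoleW on a Windows console,
    otherwise via the ordinary stdout stream."""
    if sys.platform == "win32" and sys.stdout.isatty():
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetStdHandle.argtypes = [wintypes.DWORD]
        kernel32.GetStdHandle.restype = wintypes.HANDLE
        kernel32.WriteConsoleW.argtypes = [
            wintypes.HANDLE, wintypes.LPCWSTR, wintypes.DWORD,
            ctypes.POINTER(wintypes.DWORD), wintypes.LPVOID,
        ]
        kernel32.WriteConsoleW.restype = wintypes.BOOL

        sys.stdout.flush()  # keep ordering with anything already buffered
        handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
        written = wintypes.DWORD(0)
        # The string is passed as UTF-16; no code page is involved.
        if kernel32.WriteConsoleW(handle, text, utf16_units(text),
                                  ctypes.byref(written), None):
            return
    # Not a Windows console (or the call failed): use the normal stream.
    sys.stdout.write(text)
```

On a Windows console, calling console_write("1\u20102.txt\n") should
then show the hyphen regardless of the active code page, provided the
console font has a glyph for it. The input side (command-line
arguments) would need the equivalent wide-char treatment, which this
sketch does not cover.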

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com
Received on 2013-06-12 04:38:23 CEST

This is an archived mail posted to the Subversion Users mailing list.
