
Re: AW: svnlook proplist & unicode characters

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Wed, 17 Dec 2014 10:27:54 +0000

"Matthias Ludwig" <matthias.ludwig_at_stl-software.de> writes:

> run(pathToSvn, pathToTest, repo, env,pathToSvn+"\\svnadmin","create",repo.getAbsolutePath());
> run(pathToSvn, pathToTest, repo, env,pathToSvn+"\\svn","checkout",url,wc.getAbsolutePath());
>
>
> new File(wc.getAbsoluteFile()+"\\a\\o\u0308").mkdirs();

That shows you passing a literal decomposed character through a Java
String to the OS without going through Subversion. Are you using the
65001 code page here? Does Java String do any conversion on decomposed
literals? The filename will be UTF-16 on disk, so some conversion has
happened somewhere.
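
One way to check the String side of that question: a Java String holds
UTF-16 code units and is not normalized implicitly, so the literal should
stay decomposed until something normalizes it explicitly. What the OS call
behind mkdirs() does with it is a separate matter that this does not
answer. A minimal sketch, where the class name DecomposedLiteralDemo and
the helper codePoints are hypothetical and not part of your test:

```java
import java.text.Normalizer;

final class DecomposedLiteralDemo {
    /** Formats each code point of s as U+XXXX, separated by spaces. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same literal as in the test: 'o' followed by a combining diaeresis.
        String decomposed = "o\u0308";
        // Explicit canonical composition; Java only does this when asked.
        String composed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);

        System.out.println(codePoints(decomposed)); // U+006F U+0308
        System.out.println(codePoints(composed));   // U+00F6
    }
}
```

If the first line shows two code points, the String itself was not
converted and any change must have happened later.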

>
> run(pathToSvn, pathToTest, repo, env,pathToSvn+"\\svn","add",wc.getAbsolutePath()+"\\a","--depth","infinity");
>
> run(pathToSvn, pathToTest, repo, env,pathToSvn+"\\svn","commit",wc.getAbsolutePath(),"-m","comment");

That does not involve the decomposed literal. Subversion will get
something from the OS when it looks inside 'a', but whether it is
decomposed depends on what conversion happened above. I don't know what
tools are available on Windows to look at the encoding of file names,
but you could run "svnadmin dump" on the repository and see what
encoding was used in the repository.
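
Since the test is already in Java, another option might be to read the
names back from the directory and print their UTF-16 code units. This is
only a sketch: the class name ListNameCodeUnits and the helper utf16Units
are hypothetical, the directory is taken from the first command-line
argument, and it shows what Java receives from the OS, which may not be
exactly what is stored on disk.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ListNameCodeUnits {
    /** Formats each UTF-16 code unit of name as four hex digits. */
    static String utf16Units(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect, e.g. the 'a' directory inside the working copy.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                // One line per entry: the code units of its file name.
                System.out.println(utf16Units(entry.getFileName().toString()));
            }
        }
    }
}
```

A decomposed name would show up as "006F 0308" and a composed one as
"00F6".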

When your mail got to me it included:

A wc\a\o?

so I can't tell what encoding was used on disk.

> run(pathToSvn, pathToTest, repo, env,pathToSvn+"\\svnlook","proplist",repo.getAbsolutePath(),"//a//o\u0308");
>

Now you are attempting to pass the decomposed literal to Subversion and
using code page 65001. I don't know what, if any, conversion Java
String will do.

Your mail included:

svnlook: E160013: Pfad »/a/o¨« existiert nicht

and what I see is 'o' '0xC2' '0xA8', so the combining diaeresis U+0308
has been converted to the spacing diaeresis U+00A8. I don't know if that
conversion happened during the test or as part of the email process. As
far as Subversion is concerned, a path containing U+0308 and a path
containing U+00A8 are different paths.
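
To illustrate the difference at the byte level, here is a small sketch
that prints the UTF-8 bytes of both variants and checks whether Unicode
canonical composition (NFC) would make them equal. The class name
DiaeresisBytes and its helpers are hypothetical, and the sketch says
nothing about how Subversion itself compares paths.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

final class DiaeresisBytes {
    /** Formats the UTF-8 encoding of s as 0xNN bytes separated by spaces. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** True if a and b are equal once both are normalized to NFC. */
    static boolean sameAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String combining = "o\u0308"; // 'o' + COMBINING DIAERESIS
        String spacing = "o\u00A8";   // 'o' + DIAERESIS (spacing)

        System.out.println(utf8Hex(combining)); // 0x6F 0xCC 0x88
        System.out.println(utf8Hex(spacing));   // 0x6F 0xC2 0xA8
        System.out.println(sameAfterNfc(combining, spacing)); // false
    }
}
```

The bytes 0xC2 0xA8 in the error message therefore match the spacing
variant, not the literal used in the test.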

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
Received on 2014-12-17 11:28:26 CET

This is an archived mail posted to the Subversion Users mailing list.
