
Re: Unicode, dump files, and file names

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: 2006-07-29 09:26:54 CEST

On 7/29/06, Kenneth Porter <shiva@sewingwitch.com> wrote:
> I'm assisting with development of the vss2svn conversion program for
> converting a Visual Source Safe repository to a Subversion dump file.
> <http://www.pumacode.org/projects/vss2svn/>
> In my VSS repo I have some source files that include character 0x85
> (ellipsis) in the name. A typical filename is "Move to Point...-D.bmp",
> where the "..." is the single-byte CP1252 character 0x85.
> <http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT>
> vss2svn extracts the filename from CP1252-encoded VSS repository DB and
> writes it to a UTF-8-encoded XML file.
> What should we put in the dump file for the file name? Should it be
> UTF-8-encoded?

Definitely. Subversion is all UTF-8 internally and with the dump
files, you're immediately in those internals.
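As a rough illustration of what that means for the filename in question, here is a minimal Python sketch. The helper name vss_name_to_utf8 is hypothetical and is not part of vss2svn or Subversion; it only assumes that the name comes out of the VSS database as raw CP1252 bytes.

```python
# Hypothetical helper (not part of vss2svn): convert a raw CP1252 filename
# from the VSS database into the UTF-8 bytes that belong in a dump file path.
def vss_name_to_utf8(raw: bytes) -> bytes:
    text = raw.decode("cp1252")   # byte 0x85 becomes U+2026 HORIZONTAL ELLIPSIS
    return text.encode("utf-8")   # U+2026 becomes the three bytes E2 80 A6


raw_name = b"Move to Point\x85-D.bmp"
utf8_name = vss_name_to_utf8(raw_name)
print(utf8_name)
```

The single byte 0x85 should end up as the three-byte sequence E2 80 A6 in the path written to the dump.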

> What does "svnadmin load" expect?

'svnadmin load' expects a dump file that looks as if it was generated
by 'svnadmin dump', which at least means that all path names are UTF-8.
Files with an svn:eol-style property are expected to have line endings
that match the property value: CRLF eols when it is set to CRLF, LF
eols when it is set to LF, and CR eols when it is set to CR. If you use
svn:eol-style 'native', the file should have LF eols in the dump.
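One way a converter could enforce this before writing file content to the dump is sketched below in Python. The function normalize_eols is a hypothetical name, and the sketch assumes the whole file content is available as bytes and that the file really is text.

```python
import re

# Line ending expected in the dump for each svn:eol-style value.
EOL_FOR_STYLE = {
    "native": b"\n",   # 'native' files are stored with LF in the dump
    "LF": b"\n",
    "CRLF": b"\r\n",
    "CR": b"\r",
}

# CRLF is listed first so it is treated as one line ending, not two.
ANY_EOL = re.compile(rb"\r\n|\r|\n")


def normalize_eols(content: bytes, eol_style: str) -> bytes:
    """Hypothetical helper: rewrite every line ending to match eol_style."""
    target = EOL_FOR_STYLE[eol_style]
    return ANY_EOL.sub(lambda match: target, content)


mixed = b"one\r\ntwo\rthree\n"
print(normalize_eols(mixed, "native"))
```

An unknown property value raises KeyError here, which seems safer for a conversion tool than silently writing mismatched content.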

If you use svn:keywords, the keywords should be in the dump in
unexpanded form; i.e. $Id$, not $Id: $ or $Id: <id value> $.
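If the files coming out of VSS already contain expanded keywords, they could be collapsed before being written to the dump. The following Python sketch shows one way to do that; unexpand_keywords is a hypothetical helper, and it only handles the keyword names passed to it (just Id by default, as in the example above).

```python
import re


def unexpand_keywords(content: bytes, keywords=("Id",)) -> bytes:
    """Hypothetical helper: collapse '$Id: ... $' back to '$Id$'."""
    names = b"|".join(re.escape(k.encode("ascii")) for k in keywords)
    # A keyword name, a colon, then anything up to the next '$' on the line.
    pattern = re.compile(rb"\$(" + names + rb"):[^$\r\n]*\$")
    return pattern.sub(lambda m: b"$" + m.group(1) + b"$", content)


line = b"/* $Id: point.c 42 2006-07-29 someone $ */\n"
print(unexpand_keywords(line))
```

Keywords that are already unexpanded, and keyword names that were not requested, are left untouched.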



> (I'm seeing gibberish in
> the filename in the resulting WC, and I suspect that double-encoding is
> happening somewhere in the conversion process.)
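For reference, this is roughly what double-encoding would do to that filename. The Python sketch below is only an illustration; it assumes the mistake is that already-UTF-8 bytes get interpreted as CP1252 and are then encoded to UTF-8 a second time, which may or may not be what actually happens in the conversion.

```python
# The correct UTF-8 form of the name: the ellipsis is the bytes E2 80 A6.
correct = "Move to Point\u2026-D.bmp".encode("utf-8")

# Assumed mistake: read the UTF-8 bytes as CP1252, then encode to UTF-8 again.
double = correct.decode("cp1252").encode("utf-8")

# What a UTF-8-aware client would then display for the file name.
garbled = double.decode("utf-8")
print(garbled)

# Reversing the extra round trip recovers the correct bytes.
repaired = double.decode("utf-8").encode("cp1252")
print(repaired == correct)
```

If the gibberish in the working copy shows three odd characters where the ellipsis should be, that would be consistent with this kind of double-encoding.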

Received on Sat Jul 29 09:27:23 2006

This is an archived mail posted to the Subversion Dev mailing list.
