
Re: [RFC] Canonical Paths

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-08-29 21:55:29 CEST

Russ Allbery <rra@stanford.edu> writes:

> I'm not sure if this matters for the purposes of your use of this, but
> quite a lot of UTF-8 software will refuse characters like this. There was

We're talking about the internal path format of Subversion, so the only
UTF-8 software we have to consider is Subversion itself. Other apps will
never see paths in this format. A path is either converted to local notation
(if we're going to use it locally, for example when creating a file in the
wc), or URL-escaped using the special "fix" I outlined, before being passed
to anything else.

> much discussion of this a while back and variant representations have been
> explicitly banned in the UTF-8 spec now, so data containing such sequences
> is invalid UTF-8.

Which, as I said to Brane, is a plus, since then we are guaranteed not
to get them from iconv.
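
A minimal sketch of that guarantee, using Python's strict UTF-8 codec as a
stand-in for iconv (an assumption; iconv itself is not called here). The
helper name `is_valid_utf8` is hypothetical. It shows that a conforming
encoder only ever emits 0x2f for a slash, and that a strict decoder rejects
the overlong form 0xc0 0xaf:

```python
# Python's strict UTF-8 codec stands in for iconv here (an assumption).
OVERLONG_SLASH = bytes([0xC0, 0xAF])  # overlong two-byte form of U+002F


def is_valid_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the byte string."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A conforming encoder emits the one-byte form for '/', never the overlong one.
encoded = "dir/file".encode("utf-8")
overlong_seen = OVERLONG_SLASH in encoded

# A strict decoder refuses the overlong form outright.
overlong_accepted = is_valid_utf8(OVERLONG_SLASH)
```

So converted input should never collide with a reserved internal sequence,
as long as the converter is strict.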

> The justification was security worries about having multiple
> representations for special characters.

Here we don't have that problem though, since we are canonicalizing. The
special path separator character would always be encoded as 0x2f, and
the slash character _when not used as a path separator_ (conceptually
a different character) would always be encoded as 0xc0 0xaf (or 0xfe,
if we decide that's better).
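
The canonical form described above can be sketched roughly as follows. This
is an illustration in Python, not Subversion code: the names `to_internal`
and `from_internal` are hypothetical, and the sketch assumes the 0xc0 0xaf
choice (0xfe would work the same way with a different constant).

```python
from typing import List

SEPARATOR = b"\x2f"          # path separator, always the single byte 0x2f
LITERAL_SLASH = b"\xc0\xaf"  # a slash that is NOT a separator (assumed choice)


def to_internal(components: List[str]) -> bytes:
    """Build the internal path from components (hypothetical helper).

    A '/' inside a component is a literal slash, so it is rewritten to
    0xc0 0xaf. Valid UTF-8 never contains the byte 0x2f inside a multibyte
    sequence, so a plain byte replace is safe.
    """
    encoded = [c.encode("utf-8").replace(b"/", LITERAL_SLASH) for c in components]
    return SEPARATOR.join(encoded)


def from_internal(path: bytes) -> List[str]:
    """Split an internal path back into components (hypothetical helper)."""
    parts = path.split(SEPARATOR)  # every remaining 0x2f is a separator
    return [p.replace(LITERAL_SLASH, b"/").decode("utf-8") for p in parts]


internal = to_internal(["dir", "a/b.txt"])
round_trip = from_internal(internal)
```

Each of the two characters has exactly one encoding in this form, so two
internal paths are equal exactly when their bytes are equal.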

  // Marcus

Received on Thu Aug 29 21:57:06 2002

This is an archived mail posted to the Subversion Dev mailing list.
