Canonicalizing relative URLs

From: C. Michael Pilato <cmpilato_at_red-bean.com>
Date: Tue, 11 Jan 2011 16:16:02 -0500

I'm looking at issue #3601, and am reworking the way that
svn_uri_canonicalize() behaves -- namely, I'm teaching it to normalize the
case of hex-digit pairs (of the "%AB" variety) for all URIs, not just URLs.
(Currently, it does this only for URIs with scheme data.) But I find
myself with a small problem: what to do about calls like this one in the
externals definition parser:

      if (canonicalize_url)
          item->url = svn_uri_canonicalize(item->url, pool);

where item->url -- which is parsed from an externals definition -- looks
like "^/foo/bar". Or "//foo/bar". They aren't really well-formed URIs.
They aren't really URLs. We don't really want to resolve them into full
scheme-bearing URLs at this point because some callers need them explicitly
unresolved. But we'd like to canonicalize the non-magical-prefix parts of
them, for sure.

Note that even today the trunk code has problems. If you have an external
definition that makes use of the "//server.com/bar" format, it gets munged
by the above code to just "/server.com/bar", which ain't right.

Externals definitions are weird in this respect, but I think the only place
we need to pay this much attention to them is in the aforementioned parser
function. Perhaps it needs to just examine the URL strings itself,
determine the portions of them which are safe for canonicalization, and then
tack the canonicalized bits back onto the old prefixes before returning?
Anybody have any other bright ideas?
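
To make the split-and-reattach idea concrete, here is a rough sketch in
plain C. Nothing in it is Subversion API: magic_prefix_len(),
toy_canonicalize() and canonicalize_relative_url() are hypothetical names,
the set of protected prefixes ("^/", "//", leading "../" segments) is an
assumption, and the toy canonicalizer only upper-cases "%xx" escapes and
collapses repeated slashes as a stand-in for the real thing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Subversion API): length of the leading "magic"
   part of a relative externals URL that must be left untouched.
   Assumed prefixes: "^/", "//", and any run of leading "../" segments. */
static size_t
magic_prefix_len(const char *url)
{
  size_t len = 0;

  if (strncmp(url, "^/", 2) == 0 || strncmp(url, "//", 2) == 0)
    return 2;
  while (strncmp(url + len, "../", 3) == 0)
    len += 3;
  return len;
}

/* Stand-in for a real canonicalizer: upper-cases the two hex digits of
   each "%xx" escape and collapses runs of '/' into a single '/'.
   OUT must have room for at least strlen(in) + 1 bytes. */
static void
toy_canonicalize(const char *in, char *out)
{
  while (*in)
    {
      if (in[0] == '%' && isxdigit((unsigned char)in[1])
          && isxdigit((unsigned char)in[2]))
        {
          *out++ = '%';
          *out++ = (char)toupper((unsigned char)in[1]);
          *out++ = (char)toupper((unsigned char)in[2]);
          in += 3;
        }
      else if (in[0] == '/' && in[1] == '/')
        in++;                   /* drop the extra slash of a run */
      else
        *out++ = *in++;
    }
  *out = '\0';
}

/* Copy the magic prefix verbatim, canonicalize only what follows it.
   OUT must have room for at least strlen(url) + 1 bytes. */
static void
canonicalize_relative_url(const char *url, char *out)
{
  size_t prefix;

  assert(url != NULL && out != NULL);
  prefix = magic_prefix_len(url);
  memcpy(out, url, prefix);
  toy_canonicalize(url + prefix, out + prefix);
}
```

With this, canonicalize_relative_url("//server.com/bar%2fbaz", buf) leaves
the leading "//" alone and yields "//server.com/bar%2Fbaz", whereas feeding
the whole string to toy_canonicalize() would munge it to start with a
single slash, which is the same kind of damage described above.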

C. Michael Pilato <cmpilato@red-bean.com> | http://cmpilato.blogspot.com/
Received on 2011-01-11 22:16:43 CET

This is an archived mail posted to the Subversion Dev mailing list.