[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r917523 - mod_dav_svn/mirror.c - Are the URIs URI-encoded?

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Wed, 17 Mar 2010 13:21:09 -0400

C. Michael Pilato wrote:
> Julian Foad wrote:
>> On Mon, 2010-03-01, kameshj_at_apache.org wrote:
>>> Author: kameshj
>>> Date: Mon Mar 1 13:48:01 2010
>>> New Revision: 917523
>>>
>>> URL: http://svn.apache.org/viewvc?rev=917523&view=rev
>>> Log:
>>> With the below apache configuration(See the <space> character "/svn 1/").
>>>
>>> <Location "/svn 1/">
>> [...]
>>> Write through proxy is *not* happening and commit happens *directly* inside the slave.
>> Hi Kamesh.
>>
>> Today I have studied this change thoroughly, and it appears to me that
>> it changes the interpretation of the URL given in the <Location>
>> directive (and the SVNMasterURI directive) in a way that is probably
>> undesirable.
>>
>> Previously, the URL was assumed to be URI-encoded, and thus this one
>> with a space in it did not work because it did not match the canonical
>> equivalent ("/svn%201/"). That caused the "bug" that you were trying to
>> fix, I believe.
>>
>> After this patch, it appears to me that the URL given in the <Location>
>> directive (and the one in the SVNMasterURI directive) is assumed to be
>> NOT encoded, and mod_dav_svn will encode it internally.
>>
>> The new behaviour will be wrong for any Location URL that is already
>> written with %XX encoding, because it will encode it again.
>>
>> If we agree that the URL should be assumed already to be URI-encoded (at
>> least to the extent necessary to make "%" characters unambiguous), I
>> think there are two possible solutions:
>>
>> 1) Tell the user to use proper (canonically URI-encoded) URIs.
>>
>> 2) Assume the URI is already in (partly) URI-encoded form but
>> re-encode it into canonical URI-encoded form, encoding characters such
>> as SPACE but not re-encoding the "%" character, and also unencode any
>> characters that should not be encoded in canonical URI-encoded form.
>>
>> (1) is rather unfriendly.
>
> This agreement isn't something "we" need to have. <Location> is a directive
> fully owned by the Apache HTTP Server software, and there are many years of
> precedent for its proper usage. We need only consult the Apache docs to
> determine whether the path provided to the <Location> directive is defined
> to be URI-encoded or not (then consult the Apache devs if the docs are
> insufficient) and then we need to make sure that our code behaves according
> to those expectation.
>
> The SVNMasterURI directive is ours, and we can define its encoding level any
> way we wish to do so. (I would suggest that it is most wise to define it
> similarly to mod_proxy's directives.) Whatever we decide to go with, our
> code must behave accordingly. That might mean extra work performing our own
> internal canonicalization (beyond whatever rules core Apache might have in
> place for its directives). But it's the only sane way to go about things.
>
> Solution (2) is a recipe for disaster. The URI is or it isn't URI-encoded,
> period. Choose one, document it, enforce it, and move on.
>

FWIW, the Apache docs have this to say about <Location>:

   For all origin (non-proxy) requests, the URL to be matched is a URL-path
   of the form /path/. No scheme, hostname, port, or query string may be
   included. For proxy requests, the URL to be matched is of the form
   scheme://servername/path, and you must include the prefix.

Not the clearest thing in the world. Some testing seems to indicate that
the path to Location is *not* expected to be URI-encoded, though.

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on 2010-03-17 18:21:41 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.