[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Backport review - r1035992 in branches/1.6.x-r917523: ./ subversion/mod_dav_svn/mirror.c

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Wed, 17 Nov 2010 18:57:48 +0000

I've reviewed and tested this change. Here are the results. Executive
summary: basically works in the simple case, but I have too many
reservations to give a +1.

The patch attempts to fix DAV mirroring when the Location directive
value contains a space, such as <Location "/repo 1">. For the original
log message of r917523 on trunk, see below [1].

I tested this change using the attached 'dav-mirror-autocheck.sh' [2].

Using dav-mirror-autocheck.sh configured internally as:
  MASTER_LOCATION="repo 1"
  SLAVE_LOCATION="repo 1"
  SYNC_LOCATION="sync"
  [...]
  # slave 'normal' location
    <Location "/${SLAVE_LOCATION}">
  # slave 'sync' location
    <Location "/${SYNC_LOCATION}">
  # master
    <Location "/${MASTER_LOCATION}">

the script's test passed. Without this patch, the svnsync commits
occurred only on the slave repository and so the script's test failed
with:

  dav-mirror-autocheck.sh: FAIL: master, slave are at rev 0, 4, not 4

So in terms of achieving its goal of allowing a space in a Location
directive, it works.

I then tried
  SYNC_LOCATION="sync 1"

and that didn't work - the initial sync failed with:

svnsync: Server sent unexpected return value (405 Method Not Allowed) in
response to PROPPATCH request for '/sync%25201/!svn/bln/0'

That makes me wonder whether the patch is missing a bit that it should
include, or whether in fact that similar-looking case is handled
entirely separately.

Reviewing the code:

On Wed, 2010-11-17, julianfoad_at_apache.org wrote:
> New Revision: 1035992
>
> URL: http://svn.apache.org/viewvc?rev=1035992&view=rev
> Log:
> On the 1.6.x-r917523 branch: Merge r917523 from trunk, resolving conflicts.
>
> Modified:
> subversion/branches/1.6.x-r917523/ (props changed)
> subversion/branches/1.6.x-r917523/subversion/mod_dav_svn/mirror.c

> Modified: subversion/branches/1.6.x-r917523/subversion/mod_dav_svn/mirror.c
> URL: http://svn.apache.org/viewvc/subversion/branches/1.6.x-r917523/subversion/mod_dav_svn/mirror.c?rev=1035992&r1=1035991&r2=1035992&view=diff
> ==============================================================================
> --- subversion/branches/1.6.x-r917523/subversion/mod_dav_svn/mirror.c (original)
> +++ subversion/branches/1.6.x-r917523/subversion/mod_dav_svn/mirror.c Wed Nov 17 11:57:58 2010
> @@ -40,8 +40,10 @@ static void proxy_request_fixup(request_
>
> r->proxyreq = PROXYREQ_REVERSE;
> r->uri = r->unparsed_uri;
> - r->filename = apr_pstrcat(r->pool, "proxy:", master_uri,
> - uri_segment, NULL);
> + r->filename = (char *) svn_path_uri_encode(apr_pstrcat(r->pool, "proxy:",
> + master_uri,
> + uri_segment,
> + NULL), r->pool);

This change effectively says, "We are deciding that the MASTER_URI and
URI_SEGMENT arguments to this local function should be passed as
unencoded strings".

> r->handler = "proxy-server";
> ap_add_output_filter("LocationRewrite", NULL, r, r->connection);
> ap_add_output_filter("ReposRewrite", NULL, r, r->connection);
> @@ -74,7 +76,7 @@ int dav_svn__proxy_merge_fixup(request_r
> of the repository isn't guaranteed to have on hand. */
> if (r->method_number == M_PROPFIND ||
> r->method_number == M_GET) {
> - seg = ap_strstr(r->unparsed_uri, root_dir);
> + seg = ap_strstr(r->uri, root_dir);
> if (seg && ap_strstr_c(seg,
> apr_pstrcat(r->pool, special_uri,
> "/wrk/", NULL))) {
> @@ -87,7 +89,7 @@ int dav_svn__proxy_merge_fixup(request_r
> /* If this is a write request aimed at a public URI (such as
> MERGE, LOCK, UNLOCK, etc.) or any as-yet-unhandled request
> using a "special URI", we have to doctor it a bit for proxying. */
> - seg = ap_strstr(r->unparsed_uri, root_dir);
> + seg = ap_strstr(r->uri, root_dir);
> if (seg && (r->method_number == M_MERGE ||
> r->method_number == M_LOCK ||
> r->method_number == M_UNLOCK ||
> @@ -134,9 +136,11 @@ apr_status_t dav_svn__location_in_filter
> ctx = f->ctx = apr_pcalloc(r->pool, sizeof(*ctx));
>
> apr_uri_parse(r->pool, master_uri, &ctx->uri);
> - ctx->remotepath = ctx->uri.path;
> + /* We are url encoding the current url and the master url
> + as incoming(from client) request body has it encoded already. */
> + ctx->remotepath = svn_path_uri_encode(ctx->uri.path, r->pool);

Note: ctx->uri is not used elsewhere, so is acting here as a temporary
variable. So this change is equivalent to having encoded MASTER_URI,
which is the return value of dav_svn__get_master_uri().

> ctx->remotepath_len = strlen(ctx->remotepath);
> - ctx->localpath = dav_svn__get_root_dir(r);
> + ctx->localpath = svn_path_uri_encode(dav_svn__get_root_dir(r), r->pool);

Here we encode the return value of dav_svn__get_root_dir().

Note: dav_svn__get_master_uri() and dav_svn__get_root_dir() appear to
return raw strings from the config file, however the values were written
there. In this test, some debug logging reveals they are
'http://127.0.0.2:28929/repo 1' and '/repo 1' respectively.

> ctx->localpath_len = strlen(ctx->localpath);
> ctx->pattern = apr_strmatch_precompile(r->pool, ctx->localpath, 0);
> ctx->pattern_len = ctx->localpath_len;
> @@ -187,6 +191,7 @@ apr_status_t dav_svn__location_header_fi
> const char *master_uri;
>
> master_uri = dav_svn__get_master_uri(r);
> + master_uri = svn_path_uri_encode(master_uri, r->pool);

Here we encode the return value of dav_svn__get_master_uri().

>
> if (!r->main && master_uri) {
> const char *location, *start_foo = NULL;
> @@ -203,6 +208,7 @@ apr_status_t dav_svn__location_header_fi
Context here: new_uri = ap_construct_url(pool, apr_pstrcat(pool,
> dav_svn__get_root_dir(r),
> start_foo, NULL),
> r);
> + new_uri = svn_path_uri_encode(new_uri, r->pool);

So this encodes the result of dav_svn__get_root_dir(), and *also*
START_FOO; but START_FOO appears to be already encoded, as it is part of
one of the R->headers_out that has just been matched against an encoded
string.

A typical example of start_foo is
  '/!svn/wbl/5d58b555-c3c5-4cbb-a4ed-2d90d74e309f/3'. In this example,
encoding makes no difference, and it appears that only in extreme
configurations would encoding make any difference to it.

So technically wrong, practically probably OK.

> apr_table_set(r->headers_out, "Location", new_uri);
> }
> }
> @@ -229,9 +235,11 @@ apr_status_t dav_svn__location_body_filt
> ctx = f->ctx = apr_pcalloc(r->pool, sizeof(*ctx));
>
> apr_uri_parse(r->pool, master_uri, &ctx->uri);
> - ctx->remotepath = ctx->uri.path;
> + /* We are url encoding the current url and the master url
> + as incoming (from master) request body has it encoded already. */
> + ctx->remotepath = svn_path_uri_encode(ctx->uri.path, r->pool);
> ctx->remotepath_len = strlen(ctx->remotepath);
> - ctx->localpath = dav_svn__get_root_dir(r);
> + ctx->localpath = svn_path_uri_encode(dav_svn__get_root_dir(r), r->pool);
> ctx->localpath_len = strlen(ctx->localpath);
> ctx->pattern = apr_strmatch_precompile(r->pool, ctx->remotepath, 0);
> ctx->pattern_len = ctx->remotepath_len;

This block is the same as the one I commented on above.

Main issues:

* As a matter of style, I would like to see dav_svn__get_master_uri()
and dav_svn__get_root_dir() return properly encoded URI results, rather
than all these callers doing it. These callers here are the *only*
callers of those two functions. I added comments in the doc strings of
those functions on trunk, some time ago, noting that it's not clear how
the return value is encoded.

That makes for an equivalent but much simpler patch. Such a patch is
attached as 'simpler-percent-encoding-1.patch'.

* We should consider doing canonicalization and "autoescaping" instead
of this brute-force encoding, so as to automatically escape common
non-URI characters such as a space while not preventing users from using
percent escape sequences to include other special characters such as
accented characters.

* We should investigate the similar-looking issue of percent-escaping
the SYNC_LOCATION value in <Location "sync 1">. I tried but ran out of
time for tracking down where that needs to happen, or by the look of the
error message perhaps where it is incorrectly happening twice.

So as you can tell, I'm not really keen to +1 this back-port.

[1] Original log message:
[[[
With the below apache configuration(See the <space> character "/svn 1/").

<Location "/svn 1/">
  DAV svn
  SVNParentPath /repositories
</Location>
<Location "/svn 2/">
  DAV svn
  SVNParentPath /repositories-slave
  SVNMasterURI "http://localhost/svn 1"
</Location>

Write through proxy is *not* happening and commit happens *directly* inside the slave.

* subversion/mod_dav_svn/mirror.c
  (proxy_request_fixup): URI encode the to be proxied file name.
  (dav_svn__proxy_request_fixup): r->unparsed_uri is in url encoded form while
    root_dir is not in encoded form. So use r->uri to compare with root_dir.
  (dav_svn__location_in_filter): URL Encode the 'find & replace' urls as
    the request body has it in url encoded format.
  (dav_svn__location_header_filter): Encode the master_uri as the response from
    master has the Location header url encoded already. Set the outgoing Location
    header url encoded.
  (dav_svn__location_body_filter): URL Encode the 'find & replace' urls as
    the response body has it in url encoded format.
]]]

[2] 'dav-mirror-autocheck.sh' sets up an Apache httpd instance with two
virtual hosts and performs a simple svnsync mirroring test between them.
It was originally contributed by Dave Brown of WANdisco about a year
ago, and this version is tweaked by me. We might want to commit it in
subversion/tests/cmdline, alongside 'davautocheck.sh'.

Received on 2010-11-17 19:58:33 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.