
RE: Issue with merging files containing spaces

From: Matthew Inger <mattinger_at_gmail.com>
Date: Tue, 2 Sep 2008 14:50:08 -0400

It is also my opinion that this should be fixed in Subversion. I have found the
ne_path_escape function for doing precisely this. However, there's no easy way
for Subversion itself to find the point in the URI where the path starts. It
would probably be helpful if neon provided a way for Subversion to do this. At
the moment, this functionality is embedded in the ne_uri_parse function.
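
As a rough illustration of what "the point in the URI where the path starts"
means, here is a standalone sketch that does not depend on neon. The name
find_path_offset is hypothetical, and the sketch assumes a simple absolute URL
of the form scheme://authority/path; it does not validate the scheme, userinfo,
IP literals, or ports the way a real URI parser would.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of neon: return the offset of the path
 * within a simple absolute "scheme://authority/path" URL, or -1 if the
 * string contains no "://".  An empty path yields strlen(url). */
static long find_path_offset(const char *url)
{
    const char *auth, *path;

    assert(url != NULL);

    auth = strstr(url, "://");
    if (auth == NULL)
        return -1;

    auth += 3;                          /* skip "://" => auth = authority */
    path = auth + strcspn(auth, "/");   /* first '/' after authority, or end */

    return (long)(path - url);
}
```

For the URL from the original report, the returned offset is the index of the
'/' that follows "host"; everything before that index can be copied verbatim,
and only the remainder needs escaping.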

Perhaps at the very least a function such as this:

int ne_path_find(const char *uri) {
    const char *p, *s;

    p = s = uri;

    /* => s = p = URI-reference */

    if (uri_lookup(*p) & URI_ALPHA) {
        while (uri_lookup(*p) & URI_SCHEME)
            p++;

        if (*p == ':') {
            s = p + 1;
        }
    }

    /* => s = hier-part, or s = relative-part */

    if (s[0] == '/' && s[1] == '/') {
        const char *pa;

        /* => s = "//" authority path-abempty (from expansion of
         * either hier-part or relative-part) */

        /* authority = [ userinfo "@" ] host [ ":" port ] */

        s = pa = s + 2; /* => s = authority */

        while (*pa != '/' && *pa != '\0')
            pa++;
        /* => pa = path-abempty */

        p = s;
        while (p < pa && uri_lookup(*p) & URI_USERINFO)
            p++;

        if (*p == '@') {
            s = p + 1;
        }
        /* => s = host */

        if (s[0] == '[') {
            p = s + 1;

            while (*p != ']' && p < pa)
                p++;

            if (p == pa || (p + 1 != pa && p[1] != ':')) {
                /* Ill-formed IP-literal. */
                return -1;
            }

            p++; /* => p = colon */
        } else {
            /* Find the colon. */
            p = pa;
            while (*p != ':' && p > s)
                p--;
        }

        if (p == s) {
            p = pa;
            /* No colon; => p = path-abempty */
        } else if (p + 1 != pa) {
            /* => p = colon */
        }

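        s = pa; /* => s = path-abempty, possibly empty */
    }

    /* => s = path-abempty / path-absolute / path-rootless
     * / path-empty / path-noscheme */

    /* s still points into uri, so the offset is a pointer difference.
     * An empty path yields strlen(uri), the position where a path
     * would begin. */
    return (int)(s - uri);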
}

Its purpose would be to return the index at which the path starts within the
URI. From that point, it would be simple for Subversion code to take the
earlier part of the string and append the escaped path:

static svn_error_t *
parse_url(ne_uri *uri, const char *url)
{
  char * uri_esc;

  int ppos = ne_path_find(url);
  if (ppos != -1) {
    char * pe = ne_path_escape(url + ppos);
    uri_esc = ne_malloc(ppos + strlen(pe) + 1);
    strncpy(uri_esc, url, ppos);
    strcpy(uri_esc + ppos, pe);
    ne_free(pe);
  }
  else {
    uri_esc = ne_malloc(strlen(url) + 1);
    strcpy(uri_esc, url);
  }

  if (ne_uri_parse(uri_esc, uri)
      || uri->host == NULL || uri->path == NULL || uri->scheme == NULL)
    {
      /* Free the escaped copy on the error path too, so it is not leaked. */
      ne_free(uri_esc);
      ne_uri_free(uri);
      return svn_error_createf(SVN_ERR_RA_ILLEGAL_URL, NULL,
                               _("URL '%s' is malformed or the "
                                 "scheme or host or path is missing"), url);
    }
  if (uri->port == 0)
    uri->port = ne_uri_defaultport(uri->scheme);

  ne_free(uri_esc);

  return SVN_NO_ERROR;
}

This seems to have done the trick for me in my local environment, and is
purely a client-side fix.
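
To show the effect of the splice without linking against neon, here is a
minimal standalone sketch. Both helper names (is_segment_char and
escape_from_offset) are hypothetical stand-ins, not neon or Subversion APIs,
and the set of characters left unescaped is simply the segment-character list
from the quoted message below. In particular, '%' is passed through untouched,
which keeps existing escapes intact but does not encode a literal percent sign;
ne_path_escape may well behave differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: nonzero if c may stay unescaped in a path, using
 * the segment-character list from the quoted message. */
static int is_segment_char(unsigned char c)
{
    return isalnum(c)
        || (c != '\0' && strchr("/%+!$&'()*,;=:-._~", c) != NULL);
}

/* Hypothetical helper: return a newly allocated copy of url in which every
 * byte at or after offset ppos that is not a segment character is
 * percent-encoded.  Returns NULL on allocation failure; the caller frees
 * the result with free(). */
static char *escape_from_offset(const char *url, size_t ppos)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len, i;
    char *out, *w;

    assert(url != NULL);

    len = strlen(url);
    if (ppos > len)
        ppos = len;

    /* Worst case: every byte after ppos expands to three characters. */
    out = malloc(ppos + 3 * (len - ppos) + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, url, ppos);             /* scheme and authority, verbatim */
    w = out + ppos;

    for (i = ppos; i < len; i++) {
        unsigned char c = (unsigned char)url[i];

        if (is_segment_char(c)) {
            *w++ = (char)c;
        } else {
            *w++ = '%';
            *w++ = hex[c >> 4];
            *w++ = hex[c & 0x0F];
        }
    }
    *w = '\0';

    return out;
}
```

Called with the URL from the original report and the offset of the first '/'
after the host, this turns "dir with spaces" into "dir%20with%20spaces" and
leaves the scheme and host untouched.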

On Tue, Sep 2, 2008 at 1:59 PM, C. Michael Pilato <cmpilato_at_collab.net> wrote:

> Matthew Inger wrote:
> > I'm running into an issue when doing branch merges. It seems that file
> > names with spaces in them cannot properly be merged, resulting in the
> > merging error:
> >
> > svn: URL 'http://host/svn/repos/trunk/dir with spaces/file.txt' is
> > malformed or the scheme or host or path is missing.
> >
> > I've traced the cause of the problem to the following function:
> >
> > int ne_uri_parse(const char *uri, ne_uri *parsed);
> >
> > It seems that when trying to parse the path component, it is doing the
> > following:
> >
> > while (uri_lookup(*p) & URI_SEGCHAR)
> >     p++;
> >
> >
> > The problem here is that the definition of URI_SEGCHAR includes only the
> > following:
> >
> > FS -- /
> > PC -- %
> > PS -- +
> > SD -- ! $ & ' ( ) * + , ; =
> > CL -- :
> > AL -- Alphabet
> > DG -- Digit
> > DS -- dash
> > DT -- .
> > US -- _
> > TD -- ~
> >
> > Notice that spaces are not included in this definition.
> >
> > So the solution is either to change how neon parses the URI, or to
> > properly escape the spaces with a "+" symbol in the "session.c" file,
> > which calls into the neon library.
> >
> > Any thoughts?
>
> Subversion should be properly URI-encoding the paths it sends through Neon,
> and apparently it is not. That's a bug. (Though, the correct URI encoded
> form of a space character is "%20" -- the plus sign is, I believe, only for
> when the space occurs in the query parameters portion of a URL, not the
> path part.)
>
> --
> C. Michael Pilato <cmpilato_at_collab.net>
> CollabNet <> www.collab.net <> Distributed Development On Demand
>
Received on 2008-09-02 20:52:24 CEST

This is an archived mail posted to the Subversion Dev mailing list.
