[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

SVN-DEV HELP NEEDED: What to do about the ra-get-log interface (Was: Clarification for paths argument to svn_ra_get_log2)

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Tue, 03 Jun 2008 12:20:27 -0400

C. Michael Pilato wrote:
> C. Michael Pilato wrote:
>> Martin von Gagern wrote:
>>> Resending, as I forgot to cc the list... sorry.
>>>
>>> C. Michael Pilato wrote:
>>> | I seem to recall that recently I realized that NULL and empty had two
>>> | different interpretations in 'svn log'. I think empty resulted in
>>> (b),
>>> | and (a surprise to me) NULL in (c).
>>>
>>> If that is indeed the case and intended this way, it would certainly
>>> merit documentation.
>>
>> Okay, looking in svn_repos_get_logs4(), the code does the following
>> pretty early:
>>
>> if (! paths)
>> paths = apr_array_make(pool, 0, sizeof(const char *));
>>
>> So it would seem that NULL and empty have the same meaning.
>
> Ohohoh. But that code isn't in the 1.4.x line. I see what you mean,
> now. (And maybe this explains what I *thought* I saw in testing recently.)
>
> So, in 1.4.x, svn_repos_get_logs3() does *not* have the code above. In
> fact, it explicitly treats a NULL 'paths' array as the same as "Gimme
> the logs for all revisions, regardless of what was changed in them."
> Somewhere along the 1.5.x way, that got dropped (probably by me, but
> maybe by Hyrum, as the 'log -g' stuff was being added). And that's bad.
>
> So, first of all, there's a bug in the trunk and in 1.5.x regarding the
> handling of NULL paths in svn_repos_get_logs4().
>
> Interestingly thouth, in 1.4.x there doesn't *appear* to be any way via
> the WebDAV RA layers to pass NULL paths. You can't do it through
> mod_dav_svn, because mod_dav_svn's log REPORT parser always instantiates
> an empty 'paths' array and puts provided paths (if any) into it. (And
> it doesn't seem to care if it never find paths in the REPORT.) svnserve
> and ra_local do the same. So, unless I'm overlooking something, in
> 1.4.x, you cannot make use of this special behavior regarding NULL paths
> via the RA interfaces.
>
> Looking at the 1.5.x codebase, I see the same basic behaviors as in 1.4.x.
>
> So I remain confused both about your compatability concern and the
> behavior I thought I saw the other day.
>
> By the way, I remember where I saw this discrepency now. It was when I
> was writing libsvn_client/merge.c:remove_noop_merge_ranges().
> Originally I passed NULL into the svn_ra_get_logs2() call, but I thought
> I was seeing every single revision coming back through the log_entry
> receiver. So I instead passed an empty array. I was testing against
> svn.collab.net, which was running a 1.5.x RC at the time.
>

Okay, in IRC, Martin clarified some things, namely that it isn't just a NULL
'paths' array in svn_repos_get_logs3() that get special treatment -- it's
also the case where 'paths' is non-NULL and contains exactly one empty path
(which refers to the root of the repository).

So while there is no known way of getting NULL through the RA APIs, there
*was* a way of getting a single root-pointing path there. In the past, some
third-party consumers of our APIs (such as bzr-svn) were violating the
svn_ra_get_logs2() interface by passing absolute paths instead of paths
relative to the session URL through this interface. The RA implementations
were sloppy in their handling, and threw everything through svn_path_join(),
which of course effectively discard the base path when the to-be-added bit
is absolute. This effectively meant that you could request logs of any path
in the repository with any arbitrary session URL.

Some time many months ago, I got frustrated by the sheer number of bugs that
were coming out of inconsistent use of our RA layer within our own codebase,
so I taught the RA interface how to enforce the rules it claimed to have had
since its inception. We now know one side-effect of that change: those
folks who were misusing this API are no longer able to do so, and it is
causing complications for them.

Martin tells me that bzr-svn was banking on this ability to request all
revision logs from any location, because reparenting to the root URL and
fetching logs from there might not be permissible due to access
restrictions. So, if we assume a scenario where I'm allowed to read /trunk
but not /, presumably bzr-svn would use /trunk for the session URL (which is
allowed), but then ask for logs on /. Because we only validate the returned
logs, this would work fine -- we'd just wind up return incomplete metadata
for any revisions that contained changed to paths outside of /trunk. As a
general statement about the usability of Subversion in auth-restricted
scenarios, this kinda makes sense. (We've seen other situations where, say,
I can't do a move of /trunk/foo to /trunk/bar because we try to anchor the
commit editor at /. Those kinds of things are easy to explain to a
Subversion developer, but users are left bewildered.)

So, what to do? Do we explicitly allow absolute paths through all of our RA
interfaces? (Please don't make inconsistent policy here and give per-method
exceptions.) Do we stick with the published APIs and send our best wishes
and a vase of flowers to folks who were previously misusing the APIs and
have now been caught doing so?

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on 2008-06-03 18:20:50 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.