On 29.07.2014 14:55, Stefan Fuhrmann wrote:
> On Tue, Jul 29, 2014 at 1:52 PM, Branko Čibej <brane_at_wandisco.com
> <mailto:brane_at_wandisco.com>> wrote:
>
> On 29.07.2014 13:34, Stefan Fuhrmann wrote:
>> On Mon, Jul 28, 2014 at 7:38 PM, Branko Čibej <brane_at_wandisco.com> <mailto:brane_at_wandisco.com> wrote:
>>> On 28.07.2014 19:19, stefan2_at_apache.org <mailto:stefan2_at_apache.org> wrote:
>>>> Author: stefan2
>>>> Date: Mon Jul 28 17:19:47 2014
>>>> New Revision: 1614080
>>>>
>>>> URL: http://svn.apache.org/r1614080
>>>> Log:
>>>> On the authzperf branch: Implement the notion of path rule ordering
>>>> by making svn_config_t iterate through sections in declaration order.
>>>> This is done using a simple linked list because we can't remove
>>>> sections but only add them.
>>> No no no no!
>> What exactly did my change break?
>>
>> Right now, the svn_config interface returns the sections in random order.
>> I changed the code such that this random order happens to be the same
>> as the (static and dynamic) declaration order.
>>
>>> Changing the svn_config_t structure is not the right to do this. The correct
>>> approach here is to parametrise the svn_config parser with a constructor
>>> method, then create a new constructor for the authz parser which will then
>>> get all entries in the file order. Please revert these svn_config changes.
>> I reverted the changes for now as I agree that the new behaviour
>> should be somewhat more explicit.
>>
>> However, since this is not about construction (the result would
>> still be a hash)
>
> You are making way too many assumptions about that. See my
> svn_config parser parametrization in r1614213
>
>
>
>
>> but about iteration, I see only two basic options:
>>
>> 1. Invent a new data structure that duplicates 80+% of svn_config_t
>> to provide the same functionality but different ordering through
>> its 'enumerate sections' interface. This basically would be r1614080
>> minus a few bits that we happen not to call in the authz context.
>>
>> 2. Reapply r1614080 but let 'enumerate sections' return data in
>> hash order as on /trunk. Bump the 'enumerate sections' API
>> to include a "use declaration order' option.
>
> 3. Feed the authz in-memory representation directly from the
> config parser, without using an intermediate svn_config_t. This is
> what we should do. The authz structure is inherently different
> from a config structure, because it has different semantics and
> access patterns.
>
>
> Except that this will be equivalent to 1. but worse and here is why.
>
> In many situation, two "users" will be used with the same connection,
> anonymous and the actual user. Having svn_config_t as an intermediate
> representation prevents us from parsing the file twice. Just reading
> the in-memory config representation twice should be faster than parsing
> the data twice with no intermediate. Throw svnserve's ability to cache
> the intermediate rep into the mix and things look even worse for direct
> parsing.
Well then, change the in-memory representation so that which it doesn't
depend on which user's behalf the lookup is being done. I believe Ben
already suggested that, and for exactly the reason you mention above.
The in-memory representation should be immutable once parsed. That
implies that you can optimize path lookup, and you can optimize (alias,
group) -> user mappings, you can even cache path->ACL mappings, but you
cannot cache (user, path)->access mappings.
I believe that last is exactly what you're doing, and I very strongly
disagree with this idea. Here's why: while we do currently parse the
authz file once per connection in mod_authz_svn, it's entirely
reasonable to someday not do that in svnserve (or some future long-lived
server) and instead either pick up changes automatically (by looking at
the file timestamp, or noticing changes at commit time), or implementing
a SIGHUP or similar mechanism to refresh the in-memory configuration.
Another reason is that during authz lookup, we must not ignore rules
that do not (implicitly) mention the user: we must first find the rule
that is the best match for the path, and only later look at what
permissions, if any, it gives the user, because you can't assume that if
the user doesn't have access to /foo, she won't have access to /foo/bar/baz.
I'm also not at all convinced that re-parsing the whole authz
configuration for every new user is going to save time; the opposite is
more likely.
Implementing my suggestion does imply that the ACL has to become a
rather smart structure that can be used to quickly determine the access
rights for a given user.
[...]
> The second point is more severe, though, as it prevents a nicer
> intermediate
> model than what svn_config_t already provides. The following is a
> legal authz
> file:
I'm going to skip commenting on this because I don't see how it's relevant.
[...]
> Well, I disagree, I'd expect an exact match to override a wildcard match.
>
> I agree that we could impose such a rule. However, it's semantics
> may not be obvious to everyone because we always apply rules to whole
> subtrees. Consider:
>
> [/foo/bar]
> user = r
>
> [/**/bar]
> user =
** is a special case because it matches more than one path segment, so
in this case I would expect the longest match to take precedence.
I agree though that it complicates matters, not so much in the
implementation, but in documentation; explaining to users why '*'
behaves differently than '**' might be harder than explaining why a
wildcard defined later in the file completely overrides an exact match
defined earlier.
-- Brane
--
Branko Čibej | Director of Subversion
WANdisco | Realising the impossibilities of Big Data
e. brane_at_wandisco.com
Received on 2014-07-29 15:28:57 CEST