Re: svn commit: r1614080 - in /subversion/branches/authzperf: BRANCH-README subversion/libsvn_repos/authz.c subversion/libsvn_subr/config.c subversion/libsvn_subr/config_impl.h

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Tue, 29 Jul 2014 14:55:37 +0200

On Tue, Jul 29, 2014 at 1:52 PM, Branko ÄŒibej <brane_at_wandisco.com> wrote:

> On 29.07.2014 13:34, Stefan Fuhrmann wrote:
>
> On Mon, Jul 28, 2014 at 7:38 PM, Branko ÄŒibej <brane_at_wandisco.com> <brane_at_wandisco.com> wrote:
>
> On 28.07.2014 19:19, stefan2_at_apache.org wrote:
>
> Author: stefan2
> Date: Mon Jul 28 17:19:47 2014
> New Revision: 1614080
>
> URL: http://svn.apache.org/r1614080
> Log:
> On the authzperf branch: Implement the notion of path rule ordering
> by making svn_config_t iterate through sections in declaration order.
> This is done using a simple linked list because we can't remove
> sections but only add them.
>
> No no no no!
>
> What exactly did my change break?
>
> Right now, the svn_config interface returns the sections in random order.
> I changed the code such that this random order happens to be the same
> as the (static and dynamic) declaration order.
>
>
> Changing the svn_config_t structure is not the right to do this. The correct
> approach here is to parametrise the svn_config parser with a constructor
> method, then create a new constructor for the authz parser which will then
> get all entries in the file order. Please revert these svn_config changes.
>
> I reverted the changes for now as I agree that the new behaviour
> should be somewhat more explicit.
>
> However, since this is not about construction (the result would
> still be a hash)
>
>
> You are making way too many assumptions about that. See my svn_config
> parser parametrization in r1614213
>

>
> but about iteration, I see only two basic options:
>
> 1. Invent a new data structure that duplicates 80+% of svn_config_t
> to provide the same functionality but different ordering through
> its 'enumerate sections' interface. This basically would be r1614080
> minus a few bits that we happen not to call in the authz context.
>
> 2. Reapply r1614080 but let 'enumerate sections' return data in
> hash order as on /trunk. Bump the 'enumerate sections' API
> to include a "use declaration order' option.
>
>
> 3. Feed the authz in-memory representation directly from the config
> parser, without using an intermediate svn_config_t. This is what we should
> do. The authz structure is inherently different from a config structure,
> because it has different semantics and access patterns.
>

Except that this will be equivalent to 1. but worse and here is why.

In many situation, two "users" will be used with the same connection,
anonymous and the actual user. Having svn_config_t as an intermediate
representation prevents us from parsing the file twice. Just reading
the in-memory config representation twice should be faster than parsing
the data twice with no intermediate. Throw svnserve's ability to cache
the intermediate rep into the mix and things look even worse for direct
parsing.

The second point is more severe, though, as it prevents a nicer intermediate
model than what svn_config_t already provides. The following is a legal
authz
file:

[/]
@group = r

[groups]
group = &alias

[aliases]
alias = user

[/]
@other = w

[groups]
other = &other

[aliases]
other = user

It has to be interpreted as

[aliases]
other = user
alias = user

[groups]
group = &alias
other = &other

[/]
@group = r
@other = w

and is equivalent to

[/]
user = rw

> IMO, number 2 is the better choice as it gives full control over the
> added functionality without adding significant maintenance costs.
>
>
> You're already creating an in-memory structure that is significantly
> different from svn_config_t, and does not require the presence of an
> svn_config_t, nor does it have to enumerate anything, since it can be
> constructed incrementally and in-order during parsing.
>

This time, you are making to many assumptions. See above.

>
> [...]
>
> Are you saying that a wildcard path should match before a concrete path when
> it happens to appear later in the authz file? Perish the thought. Exact
> matches should always override fuzzy matches, anything else is not intuitive
> at all and we'll receive a ton of spurious bug reports about how authz files
> don't work.
>
> No, that is not what I'm saying. My rules are simple:
>
>
> Let's please have the rules documented in the wiki, then we can comment on
> them.
>

I will update the design to match the current implementation and then
upload it.

> * Select the path rule(s) that cover the largest section of the path.
> This is what we do today and there can be only one such match
> w/o the use of wildcards.
>
> * If there is more than one such rule, apply the one declared last.
> This seems to be the easiest rule to follow and understand.
>
>
> Well, I disagree, I'd expect an exact match to override a wildcard match.
>

I agree that we could impose such a rule. However, it's semantics
may not be obvious to everyone because we always apply rules to whole
subtrees. Consider:

[/foo/bar]
user = r

[/**/bar]
user =

Applying your suggestion, the user has access to /foo/bar and /for/bar/x
but not to /foo/bar/bar nor /foo/bar/bar/x. Basically, you are moving all
wildcard rules to the top of the file and then apply precedence.

There is no need to introduce the concept of "exact match" to model
the same set of rules. OTOH, complicates system behavior as it only
applies to the path itself but not its sub-paths. Moreover, we cannot fix
this by extending it to whole subtree because

[/]
* = r

is a common rule. It would then override any wildcard rule.

-- Stefan^2.
Received on 2014-07-29 14:56:09 CEST

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]