Re: subversion-authorization (other than authz)

From: Michel Brabants <michel.brabants_at_euphonynet.be>
Date: 2006-09-19 23:07:11 CEST


I'm not sure if I'll have a look at the patch, because I wouldn't like the
maintenance time it could cost me. Maybe I'll have a look ...

Let's see how I would design it ... :)

On Tuesday 19 September 2006 20:51, Lieven Govaerts wrote:
> michel.brabants@euphonynet.be wrote:
> > Hello,
> >
> > I checked the roadmap on the subversion site and can't seem to find any
> > plans to provide a kind of plugin-system to provide additional methods of
> > authorization. In my opinion, this is a point where subversion is lacking.
> > I could use apache to direct this, but I don't find this a good
> > solution. If I would use apache to limit access, those permissions
> > wouldn't be applied when ssh+svn would be used (which isn't the case).
> > However, we use trac, which also only shows the content that a user may
> > view, based on the authz-file. So, using apache for authorization would
> > allow users to view the content through the trac-browse-feature.
> > I could synchronize the files manually or use hooks to update the
> > authz-file, but I'm not sure if that covers everything (have to check) and
> > a plugin-system or a direct ldap-implementation would be better.
> >
> > So, are there any plans to implement ldap-authorization and is there a
> > schedule?
> >
> This has been discussed before, but there's no real set of requirements
> let alone a design of what that will look like. Feel free to step in :)

Maybe an important note: I'm not so familiar with the code of the
Ok, this is a quick try ...

* Requirements:
  It has to define who can access which data in the subversion-system.

* Which are the different types of data in the subversion-system, are there
dependencies between them and what type of data do they expose?

types of data:
 1) versioned data -> versioned files, directories and the relationships
between them.
 2) subversion-properties that are part of a particular version? For example:
log-data(?), bug-id, ...

I'll leave the dependencies a little bit open for the moment, although they
are at least partly mentioned above.

 + So, to have full control, one should have access-control on all data in a
change-set. Now, let's say that a file A appears in changesets 1, 2 and 5. An
entity John has access to the file in changesets 1 and 5, but not in 2.
However, the patch for file A in changeset 5 can be applied properly
without the changes in changeset 2. What should one do? Maybe another entity,
Jeffrey, was adding information to file A in changeset 2 that shouldn't be
disclosed to John, which is why John doesn't have access to the
patch for file A in changeset 2. However, this means that the view of what
file A should look like after changeset 2 is different for John and
Jeffrey. The "worst" part is that if there were a conflict because
John uploaded a patch after changeset 2, John would have a difficult time
resolving the problem, because he doesn't have access to the changes.
If one would do this in a "secure" way, one would encrypt each separate part
according to the persons who have access to the data. Maybe one could write
this down in a more formal way ... Maybe it helps. Of course, the data could
appear unencrypted afterwards ... So, if one applies restrictions to
already existing changesets, one can cause conflicts in the context of the
view that those restrictions create on the repository. One should be able to
see the conflicts that are created when the restrictions are applied ...
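
A minimal sketch of the John/Jeffrey example above, in Python. Everything in
it is hypothetical (the Change record, the DENIED table and view_of_file are
made up for illustration and are not Subversion code), and it assumes for
simplicity that each changeset only appends lines to file A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    changeset: int
    path: str
    added_lines: tuple  # lines this changeset appends to the file


# Hypothetical history of file A: changesets 1, 2 and 5 each append a line.
HISTORY = [
    Change(1, "A", ("line from changeset 1",)),
    Change(2, "A", ("secret line from changeset 2",)),
    Change(5, "A", ("line from changeset 5",)),
]

# Hypothetical rules: (entity, path) -> changesets the entity may NOT read.
DENIED = {("John", "A"): {2}}


def view_of_file(entity, path, up_to):
    """Rebuild the file as `entity` sees it, skipping denied changesets."""
    denied = DENIED.get((entity, path), set())
    lines = []
    for change in HISTORY:
        if change.path != path or change.changeset > up_to:
            continue
        if change.changeset in denied:
            continue  # this entity never sees what this changeset added
        lines.extend(change.added_lines)
    return lines


john = view_of_file("John", "A", up_to=5)
jeffrey = view_of_file("Jeffrey", "A", up_to=5)
```

John's view of file A skips the line from changeset 2, so John and Jeffrey
end up with different contents for the same revision. That difference is
where the conflicts described above would come from.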

 + Additional logic could be applied by saying that if a complete file/part(?)
is hidden completely from entity John in revision 7, then that file is also
hidden from John in revision 2 from then onwards, even though the file wasn't
hidden from John in revision 2 before. Of course, John could already have
downloaded the file as it existed in revision 2.
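
The rule above could be sketched like this in Python. The names hidden_since
and may_read are hypothetical, and the sketch assumes that a "hide" rule is
recorded together with the repository revision in which it was introduced.

```python
# Hypothetical table: (entity, path) -> revision in which the hide rule
# was introduced.
hidden_since = {("John", "A"): 7}


def may_read(entity, path, wanted_rev, head_rev):
    """May `entity` read `path` as of `wanted_rev`, asked when the
    repository is at `head_rev`?"""
    rule_rev = hidden_since.get((entity, path))
    if rule_rev is not None and head_rev >= rule_rev:
        # Once the rule exists, every revision of the file is hidden,
        # including older ones such as wanted_rev == 2.
        return False
    return True
```

Before revision 7 exists, John can still read revision 2 of file A; once the
repository reaches revision 7, the same request is refused. Copies that John
downloaded earlier are of course out of reach of such a check.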

2) Log-data can apply to parts of code, ... By this I mean that it could
reveal information about data that was hidden from the user. I'll stop here
for the moment ...

Maybe I already went further than what you meant, but I would like to have
versioned access-control. Why? Let's say that I move a file; then the
restrictions of the "old" version should still be applied to the new one.
However, one can say that I'm already describing behaviour that is not wanted
by everyone. This is true, and I think that it doesn't matter in my above
explanation when security is applied to parts within changesets. One could
describe one's own behaviour when the above explanation is applied (after
improvement maybe). Another example maybe: what happens when I have defined a
directory as a way of protecting my files and afterwards I move a file ...
This could actually be a wrong way of applying permissions. There should be a
permission for not being able to see/delete/... files. A directory is only a
grouping of files. However, one could maybe define logic-rules that would
allow the method of "a directory protecting files" to still be secure.
This seems to be turning into a security-implementation. This is actually
true, I think :).
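
A small Python sketch of the "restrictions follow the file when it moves"
idea. It assumes that restrictions are attached to a stable identifier for
the file instead of to its path; the Repository class and its methods are
hypothetical and only illustrate the idea.

```python
import itertools


class Repository:
    """Toy repository: restrictions hang on a stable node id, not a path."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._node_of = {}  # current path -> node id
        self._denied = {}   # node id -> entities that may not read it

    def add(self, path):
        self._node_of[path] = next(self._ids)

    def deny(self, entity, path):
        self._denied.setdefault(self._node_of[path], set()).add(entity)

    def move(self, old_path, new_path):
        # The node id travels with the file, so its restrictions do too.
        self._node_of[new_path] = self._node_of.pop(old_path)

    def may_read(self, entity, path):
        return entity not in self._denied.get(self._node_of[path], set())


repo = Repository()
repo.add("private/plan.txt")
repo.deny("John", "private/plan.txt")
repo.move("private/plan.txt", "public/plan.txt")
```

After the move, John still cannot read public/plan.txt, because the rule was
never tied to the private/ directory in the first place.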

So, to conclude: when a person applies a patch, the repository should call an
API that restricts it to the context of the user. More specifically, it
shows him the repository as the user would/should see it and then applies the
operation of the user. Of course, there could be conflicts with views (of the
repository) of other users ...
Maybe there already exist theories about security for versioned data ... it
is actually not "more", I think, than relationships between data which cause
the security-data to be applied to other data as well.
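
As a rough Python sketch of "restrict to the user's view, then apply the
operation": the helper names (visible_files, apply_patch, AccessDenied) and
the dictionaries are hypothetical, and a patch is simplified to a mapping
from path to new content.

```python
class AccessDenied(Exception):
    pass


def visible_files(files, hidden_paths):
    """The repository content as one user sees it."""
    return {p: text for p, text in files.items() if p not in hidden_paths}


def apply_patch(files, hidden_for, entity, patch):
    """Apply `patch` (path -> new content) on behalf of `entity`.

    Returns a new dict; the input `files` is left untouched.
    """
    view = visible_files(files, hidden_for.get(entity, set()))
    # Paths that exist in the repository but not in this user's view.
    blocked = sorted(p for p in patch if p in files and p not in view)
    if blocked:
        raise AccessDenied(f"{entity} may not change: {', '.join(blocked)}")
    updated = dict(files)
    updated.update(patch)
    return updated


files = {"a.txt": "1", "secret.txt": "x"}
hidden_for = {"John": {"secret.txt"}}
result = apply_patch(files, hidden_for, "John", {"a.txt": "2"})
```

A patch from John that touches secret.txt is refused here, which is one
simple way to surface a conflict with another user's view instead of
silently overwriting data John cannot see.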

Ok, I know that the authorization-information is not versioned now, but maybe
it would be interesting to version it as mentioned above.
Of course the api ...

+ getRepositoryView(entity) for getting the view on the repository. I don't
care if the security-information (what he may see) is available in a
text-file, in a database, an ldap-server or in whatever format it could exist.
If groups are also used, the groups that appear in the repository could be
fetched from an ldap-server. Of course, there is information-exposure to the
ldap-server, but ok I'll leave it at this for now :).

getRepositoryView returns a repository-object which allows one to browse the
repository as the user sees it. The repository-object contains files,
directories, properties and all the other (types of) data available in the
repository. One could implement the object that returns the view as a
different server (like it is maybe now), so that one could easily build a
(fuse-)filesystem or so on top of it ...
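
A possible shape for getRepositoryView, sketched in Python. None of this is
an existing Subversion API: RestrictionSource, InMemorySource,
RepositoryView and get_repository_view are hypothetical names, and the
in-memory source only stands in for a text-file, database or ldap backend.

```python
from abc import ABC, abstractmethod


class RestrictionSource(ABC):
    """Where the security-information comes from (file, database, ldap...)."""

    @abstractmethod
    def hidden_prefixes(self, entity):
        """Return the path prefixes that `entity` may not see."""


class InMemorySource(RestrictionSource):
    """Stand-in backend; a real one would query a file or a server."""

    def __init__(self, rules):
        self._rules = rules

    def hidden_prefixes(self, entity):
        return self._rules.get(entity, [])


class RepositoryView:
    """The repository as one entity sees it."""

    def __init__(self, files):
        self._files = dict(files)

    def paths(self):
        return sorted(self._files)

    def read(self, path):
        return self._files[path]  # KeyError if the path is not visible


def get_repository_view(files, source, entity):
    prefixes = source.hidden_prefixes(entity)

    def hidden(path):
        return any(
            path == p or path.startswith(p.rstrip("/") + "/") for p in prefixes
        )

    return RepositoryView({p: c for p, c in files.items() if not hidden(p)})


FILES = {"trunk/README": "hello", "trunk/secret/keys.txt": "s3cret"}
source = InMemorySource({"John": ["trunk/secret"]})
john_view = get_repository_view(FILES, source, "John")
```

The caller only talks to RepositoryView, so swapping the in-memory source for
another backend would not change the browsing code.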

The object that returns the repository could implement a function
+ getRepositoryView(DataRestrictions/DataAllowonce) (or an object that returns
this information)
The object that returns the information could be a text reader for a specific
format, a class that reads the information from a database or an ldap-server.
These classes would define what type of information they need to define the
Explicitly allowing data could maybe be useful for avoiding mistakes ... Why
should denying it later override allowing it earlier?
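
One possible reading of the allow/deny question, sketched in Python: an
explicit allow is not silently overridden by a later deny, and the
contradiction is reported so that somebody can look at it. The rule format
and the evaluate function are hypothetical, and treating "no matching rule"
as deny is an assumption of this sketch.

```python
def evaluate(rules, entity, path):
    """Decide access from an ordered list of (action, entity, path) rules.

    Returns (decision, conflict), where decision is "allow" or "deny" and
    conflict is True when both kinds of rule match the same entity and path.
    """
    actions = [a for a, e, p in rules if e == entity and p == path]
    if not actions:
        return ("deny", False)  # assumption: no rule means no access
    conflict = "allow" in actions and "deny" in actions
    # An explicit allow wins regardless of rule order.
    decision = "allow" if "allow" in actions else "deny"
    return (decision, conflict)


rules = [
    ("allow", "John", "A"),
    ("deny", "John", "A"),  # added later, contradicts the rule above
    ("deny", "John", "B"),
]
```

With these rules, evaluate(rules, "John", "A") keeps the allow but flags the
conflict, which is the kind of mistake an explicit allow could help catch.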

I hope that this information was useful. Greetings,


> Some time ago there was a patch on the dev list which might or might not
> do what you want:
> http://svn.haxx.se/dev/archive-2006-07/0107.shtml
> Lieven.

