Re: Changeset Signing

From: Ben Reser <ben_at_reser.org>
Date: Thu, 11 Jun 2015 23:44:39 -0700

(Resending to the list since I forgot to change the address from the Google
Groups email that was on the thread; my apologies to those who get this twice.)

On 6/11/15 7:25 PM, Ruchir Arya wrote:
> Hi Brane, I didn't get you. How can the server admin modify the content if the
> contents are signed? Let me give a scenario: suppose we implement Public Key
> Infrastructure in SVN, where each client generates its own private and public
> key and registers the public key with the server so that anyone can access it
> to verify the contents.
>
> Suppose the algorithm works this way:
>
> 1. The client computes a hash of (contents concatenated with some revision
> properties), signs this hash with its private key, and sends the signed
> hash along with the contents and revision properties.
> 2. Now if the server modifies any content, the server doesn't know the
> client's private key, so it can't generate a valid signed hash.
> 3. Hence I agree that the server can put in some garbage data. But the server
> won't be able to falsely accuse other clients. (In current SVN, the server
> can change the name of the client in the log and blame some other client
> for that particular commit.)
> 4. After implementing PKI, the server can't accuse another client. It can only
> corrupt data, which can also be detected when the signed hash is verified
> using the public key.

Your signing scheme only protects individual revisions. It does not protect
the sequence of revisions that make up the repository. Thus a server admin can
add or delete revisions. You have no way of knowing if this happened. Perhaps
that's sufficient for you if all you care about is validating that a particular
identity created the revision.
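To make that concrete, here is a minimal sketch in Python. It is not SVN code: the names (sign, verify, make_rev) are hypothetical, and an HMAC with a per-user secret stands in for a real private-key signature only because the Python standard library has no public-key signing. Each revision verifies on its own, so a history with a revision silently removed still passes.

```python
import hashlib
import hmac

def sign(key: bytes, contents: bytes, revprops: dict) -> str:
    # ASSUMPTION: HMAC is only a stand-in for a private-key signature.
    h = hashlib.sha256(contents)
    for name in sorted(revprops):
        h.update(name.encode() + b"\0" + revprops[name].encode() + b"\0")
    return hmac.new(key, h.digest(), hashlib.sha256).hexdigest()

def verify(key: bytes, rev: dict) -> bool:
    expected = sign(key, rev["contents"], rev["revprops"])
    return hmac.compare_digest(expected, rev["sig"])

def make_rev(key: bytes, contents: bytes, author: str, log: str) -> dict:
    revprops = {"svn:author": author, "svn:log": log}
    return {"contents": contents, "revprops": revprops,
            "sig": sign(key, contents, revprops)}

alice_key = b"alice-secret"  # hypothetical key material
history = [
    make_rev(alice_key, b"v1", "alice", "add file"),
    make_rev(alice_key, b"v2", "alice", "fix bug"),
    make_rev(alice_key, b"v3", "alice", "add feature"),
]

# The server admin silently drops the middle revision.
tampered = [history[0], history[2]]

# Every remaining revision still verifies, so the deletion goes unnoticed.
all_valid = all(verify(alice_key, rev) for rev in tampered)
print(all_valid)
```

Nothing in the signed data ties a revision to its neighbours, which is why the check cannot see the gap.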

But your scheme as suggested falls apart even for that if you have the server
be in charge of holding the public keys and vending them out. Because a server
admin can simply modify a revision, generate a new public/private key pair and
insert it into the public keys that it vends to you. You might conclude that's
fine because a human can pick out the fact that an extra keypair exists for the
identity (or possibly for an unknown identity). But when you start considering
very large repositories with large user bases that becomes unwieldy.

Consider that the ASF repository currently has 1,685,036 revisions. Last time
I checked there were several thousand committers. It's practically impossible
for anyone to know which keys are valid for all of those identities. Any sort
of automation would ultimately hide an extra key from human review.

You might solve that with a web of trust or a central authority that signs the
keys (the PGP model or the SSL model). But both of these have their flaws. I
think the whole thing would be very difficult to implement in a way that's useful.

If you wanted to try to solve the greater problem of whole repository integrity
then you pretty much need to handle signature chaining, i.e. taking a hash of
the predecessor revision and including it in what you sign. The problem with
that is that the client doesn't know what the predecessor revision is until it
receives back the revision from the commit command. The current design lets
multiple clients build up transactions simultaneously and only take out an
exclusive lock on the repository for the final merge. If a predecessor
conflicts with the transaction, the client receives an out-of-date error and
the user can then run update, deal with conflicts and commit again. Changing
this would result in a significant performance degradation.
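A rough sketch of what chaining means, in Python, purely as an illustration. None of this is SVN code: rev_digest, build_chain and first_broken_link are hypothetical names, and the digest computed here is what a client would sign in such a scheme.

```python
import hashlib

GENESIS = "0" * 64  # hypothetical predecessor digest for the first revision

def rev_digest(prev_digest: str, contents: bytes, log: str) -> str:
    # The predecessor's digest is part of what gets hashed (and then signed).
    h = hashlib.sha256()
    h.update(prev_digest.encode())
    h.update(hashlib.sha256(contents).digest())
    h.update(log.encode())
    return h.hexdigest()

def build_chain(revs):
    chain, prev = [], GENESIS
    for contents, log in revs:
        digest = rev_digest(prev, contents, log)
        chain.append({"contents": contents, "log": log,
                      "prev": prev, "digest": digest})
        prev = digest
    return chain

def first_broken_link(chain):
    # Returns the index of the first revision that does not follow from its
    # predecessor, or None if the whole chain is consistent.
    prev = GENESIS
    for i, rev in enumerate(chain):
        recomputed = rev_digest(prev, rev["contents"], rev["log"])
        if rev["prev"] != prev or recomputed != rev["digest"]:
            return i
        prev = rev["digest"]
    return None

chain = build_chain([(b"v1", "add file"), (b"v2", "fix bug"), (b"v3", "add feature")])
without_middle = [chain[0], chain[2]]
print(first_broken_link(chain), first_broken_link(without_middle))
```

Note that build_chain needs the previous digest before it can compute the next one, which is exactly the information a committing client does not have until the commit has landed. Removing revisions from the tip is also only caught if the verifier remembers the latest digest it has seen.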

Of course you could create a modified client that handles the simplistic
signing of commits and verification without any server involvement whatsoever.
You'd have to get everyone to use it for it to be useful, but that could be
required by way of hook scripts. If you really want this, I'd implement
something like this as a proof of concept.
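As a sketch of the hook side only, a pre-commit hook could reject any transaction that lacks a signature revision property. This assumes the modified client stores its signature in a revision property at commit time; the property name x:signature is made up for illustration, and the hook checks only that the property is present, not that the signature is valid.

```python
import subprocess
import sys

SIG_PROP = "x:signature"  # hypothetical property name, not an SVN convention

def svnlook_revprop(repos: str, txn: str, name: str) -> str:
    # Reads a revision property from the in-flight transaction via svnlook.
    result = subprocess.run(
        ["svnlook", "propget", "--revprop", "-t", txn, repos, name],
        capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else ""

def check_commit(get_revprop):
    # get_revprop is any callable mapping a property name to its value.
    if get_revprop(SIG_PROP):
        return 0, ""
    return 1, "commit rejected: missing %s revision property" % SIG_PROP

def main(argv):
    # A pre-commit hook is invoked with the repository path and the
    # transaction name as its arguments.
    repos, txn = argv[1], argv[2]
    code, message = check_commit(lambda name: svnlook_revprop(repos, txn, name))
    if message:
        print(message, file=sys.stderr)
    return code

# Only run as a hook when called with exactly the two hook arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

A non-zero exit status from the hook makes the server refuse the commit, so unsigned commits from an unmodified client would be turned away.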

But honestly, despite thinking this was a cool idea at one time I'm pretty much
not sold on the actual utility. As Branko alluded to in his email it's
actually really hard to change a repository after the fact (other than revision
properties). The client and the server both presume that revisions are
immutable. Changing them will break things. Which means all anyone can really
change is revision properties. So you can change the commit message, the
author and the date of the commit. That's not terribly useful.

In the case of an open source project, that information is typically sent to a
mailing list. Those lists are archived by many people. Modifications of
revision properties also trigger emails to these lists. Which means you could
search for any potentially malicious modification of the important revision
properties by simply going through multiple archives of the mailing list and
comparing them to the repository.
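The comparison itself is simple once the data has been extracted. A minimal Python sketch, assuming you have already parsed the commit mails into a dict keyed by revision number that reflects the latest legitimately announced values, and dumped the repository's current revision properties into the same shape (both structures and the function name are hypothetical):

```python
def find_revprop_mismatches(repo: dict, archive: dict) -> list:
    # Both arguments map revision number -> {property name: value}.
    mismatches = []
    for rev in sorted(archive):
        for prop, archived_value in sorted(archive[rev].items()):
            current = repo.get(rev, {}).get(prop)
            if current != archived_value:
                mismatches.append((rev, prop, archived_value, current))
    return mismatches

# Made-up sample data for illustration.
archive = {
    1: {"svn:author": "alice", "svn:log": "add file"},
    2: {"svn:author": "bob", "svn:log": "fix bug"},
}
repo = {
    1: {"svn:author": "alice", "svn:log": "add file"},
    2: {"svn:author": "carol", "svn:log": "fix bug"},
}

for rev, prop, archived, current in find_revprop_mismatches(repo, archive):
    print("r%d %s: archive has %r, repository has %r" % (rev, prop, archived, current))
```

Running the same check against several independently held archives is what makes it hard to defeat.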

That sounds like a lot of work, but then so is what you're proposing. And yet
I think in the end your solution is not any better, and in fact is far weaker,
because it'd be practically impossible to modify all the mailing list archives.
That leaves a malicious server admin having to stop the mail while they do
something untoward, and even then all they can really achieve is changing the
revision properties.

It's simply not worth it. Solving it might be an interesting but academic
puzzle. But I don't think it has any practical purpose. Signing commits for a
distributed version control system is of course a very different matter.
That's why you see git supporting this and SVN without this support.
Received on 2015-06-12 08:44:49 CEST

In reply to: Ruchir Arya: "Re: Changeset Signing"
