Re: Merkle trees in svn [was: Quick question about the sha1-checksum for directories in svn.]

From: Paul Hammant <paul_at_hammant.org>
Date: Tue, 10 Oct 2017 10:02:57 -0400

As my forthcoming multi-user app that uses Subversion as a backing store is
going to kill Svn with Depth-∞ PROPFINDs from root, I really want to see
this implemented.

Because I can't wait, I will implement something that calculates a SHA1 for
each directory and drops it into a '.sha1' file in that folder. It'll be a
job that runs perpetually and keeps committing as things change.
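
A minimal sketch of what such a job might compute for one directory, in Python rather than the shell tools used further down. It assumes the digest is taken over sorted "name sha1" lines like those in the `foo` example below, that subdirectories contribute their own directory digest (the Merkle part), and that `.sha1` and `.svn` are skipped. The function names are hypothetical, and committing the result is left out.

```python
import hashlib
from pathlib import Path

# Names left out of the listing (an assumption): the generated digest file
# itself and Subversion's working-copy metadata directory.
SKIPPED = {".sha1", ".svn"}


def file_sha1(path: Path) -> str:
    """SHA1 of a file's contents, read in chunks."""
    h = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def directory_listing(directory: Path) -> str:
    """One 'name sha1' line per entry, sorted by name."""
    lines = []
    for entry in sorted(directory.iterdir(), key=lambda p: p.name):
        if entry.name in SKIPPED:
            continue
        if entry.is_dir():
            # A subdirectory is represented by its own directory digest.
            lines.append(f"{entry.name}/ {directory_sha1(entry)}\n")
        elif entry.is_file():
            lines.append(f"{entry.name} {file_sha1(entry)}\n")
    return "".join(lines)


def directory_sha1(directory: Path) -> str:
    """SHA1 over the directory's listing text."""
    return hashlib.sha1(directory_listing(directory).encode("utf-8")).hexdigest()


def write_sha1_file(directory: Path) -> str:
    """Store the directory digest in '<directory>/.sha1' and return it."""
    digest = directory_sha1(directory)
    (directory / ".sha1").write_text(digest + "\n")
    return digest
```

Because `.sha1` is excluded from the listing, re-running the job on an unchanged directory yields the same digest, so it would only have something new to commit when real content changes.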

To cut through the permissions-based quandary that y'all are having, how
about ignoring the subsetting that's implicit from permissions and just
calculating the SHA1 for the 'all' situation, requiring the client-side end
user of this setup to correlate a locally calculated SHA1 for the permitted
set with the remote canonical SHA1 for the potential superset? If the same
algorithm is used on the client and the server, then all the end user can
determine is that they are not (for some directories) permitted to read all
files.

Crude example, using command line unix tools:

$ cat foo
2013.json 1674790a70b984c9041ab86c370f942861ead004
2016.json 194f6519cd60b773a82857cf1aeba8dad4a223ed
2020.json 20e3ff1ade2385c593f73fd44fd157391d2424e7
2050.json 19b2da433a273840deddb7a46b16891acab16e3f
2060.json 45418423999c155abc434e175d42ccf6534bee6d
2068.json 69576d3632c7ce8b0b2a42d87e9e75049bdaff9d
$ cat foo | sha1sum
c1874a0ba80cb51245cb78f567dbbd46271d7ba1 -
$ head -3 foo | sponge foo
$ cat foo
2013.json 1674790a70b984c9041ab86c370f942861ead004
2016.json 194f6519cd60b773a82857cf1aeba8dad4a223ed
2020.json 20e3ff1ade2385c593f73fd44fd157391d2424e7
$ cat foo | sha1sum
1cbbec9747ab817dac26892cfd437f72d7366858 -

^ the server side says c1874a... for the 'directory' that foo represents
and passes that to all clients that ask (provided they are permitted to the
directory). The client side holds that ref, *and* 1cbbec..., maybe in a
database of some sort. This way, it can track whether it has kept abreast
of the Svn server or not. It may determine that it needs to do a depth-1
PROPFIND because the server-side SHA1 for the directory changed, and in
the not-permitted-to-see-all situation may determine that that was
ultimately unnecessary.
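
The client-side bookkeeping described above could be sketched roughly as follows. This is only an illustration under assumptions: `DirRecord`, `refresh` and the `fetch_visible_listing` callback (standing in for a depth-1 PROPFIND that returns the "name sha1" lines the user is permitted to see) are hypothetical names, not Subversion APIs.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List


def listing_sha1(lines: List[str]) -> str:
    """SHA1 over newline-terminated 'name sha1' lines, as in the foo example."""
    data = "".join(line + "\n" for line in lines)
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


@dataclass
class DirRecord:
    """What the client remembers for one directory."""
    server_sha1: str  # canonical digest from the server (the 'all' set)
    local_sha1: str   # digest over the entries this user is permitted to see

    @property
    def sees_everything(self) -> bool:
        # Same algorithm on both sides: equal digests mean nothing is hidden.
        return self.server_sha1 == self.local_sha1


def refresh(record: DirRecord,
            advertised_server_sha1: str,
            fetch_visible_listing: Callable[[], List[str]]) -> str:
    """Bring the record up to date and report what happened."""
    if advertised_server_sha1 == record.server_sha1:
        return "up-to-date"  # no PROPFIND needed
    # The server digest moved, so the visible entries must be re-read.
    new_local = listing_sha1(fetch_visible_listing())
    visible_changed = new_local != record.local_sha1
    record.server_sha1 = advertised_server_sha1
    record.local_sha1 = new_local
    # "hidden-change-only": the fetch turned out to be unnecessary because
    # only entries this user cannot see were changed.
    return "changed" if visible_changed else "hidden-change-only"
```

A result of "hidden-change-only" corresponds to the case where the depth-1 PROPFIND was ultimately unnecessary; the client cannot know that in advance, only after comparing the recomputed local digest with the stored one.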

Of course I can alternatively correlate subversion revision numbers right
now with locally calc'd SHA1, but it doesn't feel very Merkle-like.
Received on 2017-10-10 16:03:04 CEST

This is an archived mail posted to the Subversion Dev mailing list.
