
Re: Revision Reconciliation Algorithm

From: Ben Collins-Sussman <sussman_at_red-bean.com>
Date: Wed, 2 Jan 2008 20:28:00 -0600

On Jan 2, 2008 8:17 PM, Sharmarke Aden <aden.list_at_gmail.com> wrote:
> Hi all,
> I have a few questions I'm hoping someone could help me answer.

If you read http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs_base/notes/structure
then all your questions below will be answered. :-)

> 1. I'm not entirely sure I understand how the DAG is stored. Does
> Subversion simply use the file system or is there some kind of db
> that's utilized to house DAG edges and vertices? I mean, when
> comparing two DAGs, where do the data structures being compared
> reside? I've seen references to Berkeley DB and I'm not sure if it
> plays a role in all of this.

We create our own C structures, then serialize them into a database.
It's the way the C structures point at each other that makes them a
DAG.

> 2. How are properties associated with nodes in the DAG and where/how
> are they stored?

They're part of the node-revision structure. See the doc I pasted above.
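
To make the answers to questions 1 and 2 concrete, here is a rough
sketch (not Subversion's actual code or on-disk format). It assumes a
plain dict standing in for the database, JSON standing in for the real
serialization, and hypothetical names (put_node, get_node, lookup). The
point is only that "pointers" between structures become stored IDs,
and that properties travel inside the node-revision record itself.

```python
import json

db = {}  # stand-in for a database table: node-revision ID -> serialized record


def put_node(node_id, kind, props=None, entries=None, text=""):
    """Serialize one hypothetical node-revision and store it under its ID."""
    record = {
        "kind": kind,              # "dir" or "file"
        "props": props or {},      # properties live inside the record
        "entries": entries or {},  # dir only: child name -> child node ID
        "text": text,              # file only: contents
    }
    db[node_id] = json.dumps(record).encode("utf-8")


def get_node(node_id):
    """Deserialize the record stored under node_id."""
    return json.loads(db[node_id].decode("utf-8"))


def lookup(root_id, path):
    """Follow stored child IDs from a root down to the node at path."""
    node_id = root_id
    for name in (p for p in path.split("/") if p):
        node_id = get_node(node_id)["entries"][name]
    return node_id


# Two roots whose entries refer to the same directory ID: the sharing of
# IDs between records is what makes the stored structure a DAG, not a tree.
put_node("f1", "file", props={"svn:eol-style": "native"}, text="hello\n")
put_node("d1", "dir", entries={"foo.c": "f1"})
put_node("r1", "dir", entries={"trunk": "d1"})
put_node("r2", "dir", entries={"trunk": "d1", "tags": "d1"})
```

Looking up /trunk/foo.c under r1 and /tags/foo.c under r2 both land on
the same stored record, f1, properties included.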

> 3. How do property changes impact nodes in the DAG?

Any change at all -- text or properties -- causes a new node_t
structure to be created.
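
As an illustration only, the sketch below models this with a
hypothetical Node class and set_prop helper (neither is Subversion's
API). It assumes that a change copies the affected node and each of its
ancestors up to the root, while untouched siblings are shared between
the old and new trees.

```python
class Node:
    """Hypothetical in-memory node: text, properties, and named children."""

    def __init__(self, text=None, props=None, children=None):
        self.text = text
        self.props = dict(props or {})        # shallow copy, so old nodes stay intact
        self.children = dict(children or {})  # name -> Node


def set_prop(root, path, name, value):
    """Return a new root in which the node at path has props[name] = value.

    Every node along the path is copied; all other nodes are reused as-is.
    """
    copy = Node(root.text, root.props, root.children)
    if not path:
        copy.props[name] = value
        return copy
    head, rest = path[0], path[1:]
    copy.children[head] = set_prop(root.children[head], rest, name, value)
    return copy


bloo = Node(text="hello")
blah = Node(children={"bloo": bloo})
other = Node(text="x")
r1 = Node(children={"blah": blah, "other": other})

# A property-only change still produces new nodes along the whole path.
r2 = set_prop(r1, ["blah", "bloo"], "svn:eol-style", "native")
```

After the call, r2, its "blah" child, and that child's "bloo" are all
new objects, "other" is the very same object in both trees, and r1 is
unchanged.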

> 4. How does filtering using properties work?

What filtering?

> 5. Is DAG comparison done on the fly and is it cached?

The C code walks over a network of node_t structures and compares
their identities. Both the BDB and FSFS implementations have various
degrees of internal LRU caching to speed things up.
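
A minimal sketch of that kind of walk, under stated assumptions: the
Node class, the node_id strings, and changed_paths are all made up for
illustration, and caching is left out. It only shows why comparing
identities is cheap: when two nodes have the same identity, the whole
subtree beneath them can be skipped.

```python
class Node:
    """Hypothetical node with an identity and named children."""

    def __init__(self, node_id, children=None):
        self.node_id = node_id
        self.children = dict(children or {})  # name -> Node


def changed_paths(old, new, path=""):
    """Yield every path whose node identity differs between two trees."""
    if old is not None and new is not None and old.node_id == new.node_id:
        return  # same identity: nothing below here can differ, so prune
    yield path or "/"
    old_kids = old.children if old is not None else {}
    new_kids = new.children if new is not None else {}
    for name in sorted(set(old_kids) | set(new_kids)):
        yield from changed_paths(
            old_kids.get(name), new_kids.get(name), f"{path}/{name}"
        )


shared = Node("x.1")
old_root = Node("r.1", {"blah": Node("b.1", {"bloo": Node("o.1")}), "other": shared})
new_root = Node("r.2", {"blah": Node("b.2", {"bloo": Node("o.2")}), "other": shared})

diff = list(changed_paths(old_root, new_root))
```

Here diff contains "/", "/blah", and "/blah/bloo"; "/other" is never
descended into because its identity is the same on both sides.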

> 6. When a client requests an "svn update", what does its request to the
> server look like, and what exactly is happening behind the scenes?

The client sends the minimum report possible: "I have revision 23 of
/trunk." If the working copy has mixed-revisions (which is extremely
common), then it sends the minimum report it can: "I have revision 23
of /trunk, but revision 27 of /trunk/blah, and in there I have
revision 25 of /trunk/blah/bloo", and so on. See the reporter_t
vtable in svn_ra.h.

The net result is that the server ends up with a precise description
of the working copy, and in most cases it's a lot more efficient than
CVS, which has to forcibly report the revision of *every* file in the
working copy. Because Subversion versions directories, it can simply
say "I have revision N of this path", and leave it at that.
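
The idea of a minimal report can be sketched as follows. This is not
the real reporter code: minimal_report and its dict input are
hypothetical, and the sketch assumes the only rule is "mention a path
only when its revision differs from what its parent directory already
implies".

```python
def minimal_report(wc_revs):
    """Reduce a full path -> revision map to the entries worth reporting.

    wc_revs maps every path in a (hypothetical) working copy to the
    revision it is at. A path is reported only if its revision differs
    from the revision of its parent directory.
    """
    report = []
    # Visit shallower paths first so a parent is always seen before its children.
    for path in sorted(wc_revs, key=lambda p: (p.count("/"), p)):
        parent = path.rsplit("/", 1)[0]
        inherited = wc_revs.get(parent)  # None for the top-level path
        if wc_revs[path] != inherited:
            report.append((path, wc_revs[path]))
    return report


wc = {
    "/trunk": 23,
    "/trunk/foo.c": 23,
    "/trunk/blah": 27,
    "/trunk/blah/a.c": 27,
    "/trunk/blah/bloo": 25,
    "/trunk/blah/bloo/b.c": 25,
}

report = minimal_report(wc)
```

For this six-entry working copy the report has just three entries:
/trunk at 23, /trunk/blah at 27, and /trunk/blah/bloo at 25. A
per-file report in the CVS style would need one entry for every file.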

But really, you need to read the 'structure' doc I posted above. It
should explain the low-level details. You should also read this:


...especially the part that shows how the DAG 'bubbles up' after each change.

If you've read both those documents and still have questions, please ask. :-)

Received on 2008-01-04 04:57:01 CET

This is an archived mail posted to the Subversion Dev mailing list.