Prototype design for merge tracking in Subversion

From: David James <djames_at_collab.net>
Date: 2006-02-12 23:45:52 CET

I'd like to help design a toy SVK-like wrapper around Subversion,
which stores merge-tracking information in an SQL database. This toy
program could be developed quickly without worrying about compatibility
concerns, and serve as a prototype for our final merge-tracking
implementation.

I've outlined a potential database format for this SVK-like wrapper
program below. Please review this format and check whether it could help
us solve merge tracking.

===============
Database format
===============

Tables
------

  Table #1: files
     Columns: file_id, path_id, first_rev, last_rev

     The 'files' table maps file_id / revision pairs to paths.

  Table #2: changes
     Columns: file_id, op_type, change_id, rev

     The 'changes' table registers operations made on files (e.g. add,
     edit, delete).

  Table #3: paths
     Columns: path_id, path

     The 'paths' table maps path_ids to paths.
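As a rough illustration, the three tables could be declared like this with
Python's built-in sqlite3 module. The column types, the composite key on
'changes', and the open_db helper are assumptions; the proposal itself only
names the tables and their columns.

```python
import sqlite3

# Hypothetical DDL: types and constraints are guesses, not part of the proposal.
SCHEMA = """
CREATE TABLE paths (
    path_id INTEGER PRIMARY KEY,
    path    TEXT NOT NULL UNIQUE
);
CREATE TABLE files (
    file_id   INTEGER NOT NULL,
    path_id   INTEGER NOT NULL REFERENCES paths(path_id),
    first_rev INTEGER NOT NULL,
    last_rev  INTEGER NOT NULL
);
CREATE TABLE changes (
    file_id   INTEGER NOT NULL,
    op_type   TEXT    NOT NULL,
    change_id INTEGER NOT NULL,
    rev       INTEGER NOT NULL,
    -- Assumed key: a copy re-registers existing change_ids under a new
    -- file_id, so change_id alone cannot be unique per row.
    PRIMARY KEY (file_id, change_id)
);
"""

def open_db(location=":memory:"):
    """Open (or create) a database and install the three tables."""
    conn = sqlite3.connect(location)
    conn.executescript(SCHEMA)
    return conn
```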

Columns
-------

file_id: Each newly created file is assigned a unique file_id upon
           creation. If a file is copied, it is treated as if it is a
           new file. Directories are treated as normal files, but they
           never have any edits.

op_type: One of: { ADD, EDIT, DELETE, RENAME }
           ADD operations create a file.
           EDIT operations change the contents of a file.
           DELETE operations delete a file.
           RENAME operations rename a file.

change_id: Every time a file is modified (added, deleted, renamed, or
           edited), this change is assigned a unique change_id. These
           ids are separate from revision numbers, because a single
           commit may contain several changes.

rev: A revision number (e.g. r1223 would be stored as simply 1223)

path_id: A unique id for the path associated with this operation. The
           'paths' table maps each path_id to its path.

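first_rev: The first rev in which this file existed at this path (the
           rev in which it was added, copied, or renamed here).

last_rev: The last rev in which this file existed at this path, i.e.
           the rev before it was deleted or renamed away. If neither
           has happened, last_rev is the maximum integer value.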
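To show how first_rev and last_rev bound a file's lifetime at a path, here is
a small in-memory sketch that looks up the path of a file_id at a given
revision. The MAX_REV value, the path_at helper, and the sample rows are
assumptions made for illustration only.

```python
MAX_REV = 2**63 - 1  # assumed sentinel for "not deleted"

def path_at(files, paths, file_id, rev):
    """Return the path file_id had at rev, or None if it did not exist then.

    files: rows of (file_id, path_id, first_rev, last_rev)
    paths: dict mapping path_id -> path
    """
    for fid, path_id, first_rev, last_rev in files:
        if fid == file_id and first_rev <= rev <= last_rev:
            return paths[path_id]
    return None

# Hypothetical data: file 10 was added as trunk/foo.c in r3 and renamed
# to trunk/bar.c in r7.
paths = {1: "trunk/foo.c", 2: "trunk/bar.c"}
files = [
    (10, 1, 3, 6),
    (10, 2, 7, MAX_REV),
]
```

With this data, path_at(files, paths, 10, 5) gives "trunk/foo.c", while any
revision from 7 onward gives "trunk/bar.c".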

===================================
Merge-tracking for basic operations
===================================

add(path)
1. Create entries in 'files' and 'paths' for the new file.
2. Create an 'add' operation in 'changes', for the new file.
3. Create an 'edit' operation in 'changes', for the contents of the
     new file.

modify(path)
1. Create an 'edit' operation in 'changes'
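
The add and modify steps above could look roughly like this against an
SQLite database. This is a sketch under assumptions: the extra rev argument
(the revision being committed), the MAX_REV sentinel, and the helpers
new_db, next_id, and record_change are not part of the proposal.

```python
import sqlite3

MAX_REV = 2**63 - 1  # assumed "not deleted" sentinel (largest SQLite integer)

def new_db():
    """Create an in-memory database with the three proposed tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE paths (path_id INTEGER PRIMARY KEY, path TEXT UNIQUE);
        CREATE TABLE files (file_id INTEGER, path_id INTEGER,
                            first_rev INTEGER, last_rev INTEGER);
        CREATE TABLE changes (file_id INTEGER, op_type TEXT,
                              change_id INTEGER, rev INTEGER);
    """)
    return db

def next_id(db, table, column):
    """Return one more than the largest id used so far (1 if the table is empty)."""
    sql = f"SELECT COALESCE(MAX({column}), 0) + 1 FROM {table}"
    return db.execute(sql).fetchone()[0]

def record_change(db, file_id, op_type, rev):
    """Insert one row into 'changes' under a fresh change_id."""
    change_id = next_id(db, "changes", "change_id")
    db.execute("INSERT INTO changes VALUES (?, ?, ?, ?)",
               (file_id, op_type, change_id, rev))
    return change_id

def add(db, path, rev):
    # Step 1: entries in 'paths' and 'files' for the new file.
    db.execute("INSERT OR IGNORE INTO paths (path) VALUES (?)", (path,))
    path_id = db.execute("SELECT path_id FROM paths WHERE path = ?",
                         (path,)).fetchone()[0]
    file_id = next_id(db, "files", "file_id")
    db.execute("INSERT INTO files VALUES (?, ?, ?, ?)",
               (file_id, path_id, rev, MAX_REV))
    # Steps 2 and 3: an ADD for the file, then an EDIT for its contents.
    record_change(db, file_id, "ADD", rev)
    record_change(db, file_id, "EDIT", rev)
    return file_id

def modify(db, path, rev):
    # Find the file that lives at this path in the revision being committed.
    row = db.execute(
        "SELECT f.file_id FROM files f JOIN paths p ON p.path_id = f.path_id "
        "WHERE p.path = ? AND f.first_rev <= ? AND f.last_rev >= ?",
        (path, rev, rev)).fetchone()
    if row is None:
        raise KeyError(path)
    return record_change(db, row[0], "EDIT", rev)
```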

copy(path1, path2, rev)
1. Create entries in 'files' and 'paths' for the new file.
2. Create an 'add' operation in 'changes', for the new file.
3. Copy the history of all 'EDIT' operations from the file at path1 to
     the new file_id for path2. Update the revs for the new history to
     the current revision.
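
A possible sketch of the copy steps follows. The key point is that the
copied EDIT rows keep their original change_ids, which is what lets later
queries find intersecting change_ids across files. It assumes that rev in
copy(path1, path2, rev) is the source revision, so a separate current_rev
argument is introduced here; the source is identified by file_id for
brevity, and new_db is a hypothetical helper.

```python
import sqlite3

MAX_REV = 2**63 - 1  # assumed "not deleted" sentinel

def new_db():
    """Create an in-memory database with the three proposed tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE paths (path_id INTEGER PRIMARY KEY, path TEXT UNIQUE);
        CREATE TABLE files (file_id INTEGER, path_id INTEGER,
                            first_rev INTEGER, last_rev INTEGER);
        CREATE TABLE changes (file_id INTEGER, op_type TEXT,
                              change_id INTEGER, rev INTEGER);
    """)
    return db

def copy(db, src_file_id, dst_path, src_rev, current_rev):
    """Register dst_path as a new file inheriting the source's EDIT change_ids."""
    # Step 1: entries in 'paths' and 'files' for the new file.
    path_id = db.execute("INSERT INTO paths (path) VALUES (?)",
                         (dst_path,)).lastrowid
    new_file_id = db.execute(
        "SELECT COALESCE(MAX(file_id), 0) + 1 FROM files").fetchone()[0]
    db.execute("INSERT INTO files VALUES (?, ?, ?, ?)",
               (new_file_id, path_id, current_rev, MAX_REV))
    # Step 2: an ADD operation with a fresh change_id.
    add_id = db.execute(
        "SELECT COALESCE(MAX(change_id), 0) + 1 FROM changes").fetchone()[0]
    db.execute("INSERT INTO changes VALUES (?, 'ADD', ?, ?)",
               (new_file_id, add_id, current_rev))
    # Step 3: copy the EDIT history up to src_rev, keeping each change_id
    # but stamping the rows with the current revision.
    db.execute(
        "INSERT INTO changes (file_id, op_type, change_id, rev) "
        "SELECT ?, op_type, change_id, ? FROM changes "
        "WHERE file_id = ? AND op_type = 'EDIT' AND rev <= ?",
        (new_file_id, current_rev, src_file_id, src_rev))
    return new_file_id
```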

rename(path1, path2)
1. Update the 'last_rev' for the 'files' entry for path1 to the
     previous revision.
2. Create a new 'files' entry for path2, with the same file_id.
3. Create a 'rename' operation in 'changes'
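
The rename steps might be sketched as follows. It assumes that "the
previous revision" means one less than the revision being committed (passed
in as an extra rev argument) and that the live 'files' entry is the one
whose last_rev still holds the MAX_REV sentinel; new_db is a hypothetical
helper.

```python
import sqlite3

MAX_REV = 2**63 - 1  # assumed "not deleted" sentinel

def new_db():
    """Create an in-memory database with the three proposed tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE paths (path_id INTEGER PRIMARY KEY, path TEXT UNIQUE);
        CREATE TABLE files (file_id INTEGER, path_id INTEGER,
                            first_rev INTEGER, last_rev INTEGER);
        CREATE TABLE changes (file_id INTEGER, op_type TEXT,
                              change_id INTEGER, rev INTEGER);
    """)
    return db

def rename(db, path1, path2, rev):
    """Move the live file at path1 to path2 in revision rev, keeping its file_id."""
    row = db.execute(
        "SELECT f.rowid, f.file_id FROM files f "
        "JOIN paths p ON p.path_id = f.path_id "
        "WHERE p.path = ? AND f.last_rev = ?", (path1, MAX_REV)).fetchone()
    if row is None:
        raise KeyError(path1)
    files_rowid, file_id = row
    # Step 1: close the old entry at the previous revision.
    db.execute("UPDATE files SET last_rev = ? WHERE rowid = ?",
               (rev - 1, files_rowid))
    # Step 2: a new 'files' entry for path2 with the same file_id.
    db.execute("INSERT OR IGNORE INTO paths (path) VALUES (?)", (path2,))
    path_id = db.execute("SELECT path_id FROM paths WHERE path = ?",
                         (path2,)).fetchone()[0]
    db.execute("INSERT INTO files VALUES (?, ?, ?, ?)",
               (file_id, path_id, rev, MAX_REV))
    # Step 3: a RENAME operation under a fresh change_id.
    change_id = db.execute(
        "SELECT COALESCE(MAX(change_id), 0) + 1 FROM changes").fetchone()[0]
    db.execute("INSERT INTO changes VALUES (?, 'RENAME', ?, ?)",
               (file_id, change_id, rev))
    return file_id
```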

delete(path)
1. Update the 'last_rev' for the 'files' entry for path to the
     previous revision.
2. Create a 'delete' operation in 'changes'

=======================
Advanced merge-tracking
=======================

Problem #1: What revisions need to be merged from "trunk" to "branch"?
----------------------------------------------------------------------

Find all change_ids present in "trunk" but not present in "branch".
The revisions associated with these change_ids need to be merged.

For the sake of simplicity from a user's point of view, we may want to
assume that if any portion of a revision has already been merged from
'trunk' to 'branch', then the entire revision does not need to be merged
again.

If the user specifies a verbose option, we may want to report on
'partially merged' revisions.
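
The rule above reduces to a set difference over change_ids. The sketch below
uses plain Python sets rather than SQL; the function name, its arguments, and
the sample data are assumptions for illustration. It returns both the
revisions to merge and the 'partially merged' ones that a verbose option
could report.

```python
def revisions_to_merge(trunk_rows, branch_change_ids, skip_partial=True):
    """Return (revisions to merge, partially merged revisions), both sorted.

    trunk_rows: (change_id, rev) pairs recorded on trunk
    branch_change_ids: set of change_ids already present on the branch
    skip_partial: treat a revision as merged if any of its changes is
                  already on the branch (the simplification described above)
    """
    merged_revs, missing_revs = set(), set()
    for change_id, rev in trunk_rows:
        if change_id in branch_change_ids:
            merged_revs.add(rev)
        else:
            missing_revs.add(rev)
    partial = missing_revs & merged_revs
    to_merge = missing_revs - merged_revs if skip_partial else missing_revs
    return sorted(to_merge), sorted(partial)

# Hypothetical data: r1 is fully merged, r3 only partially, r2 and r4 not at all.
trunk_rows = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 4)]
branch_change_ids = {1, 2, 4}
```

Here revisions_to_merge(trunk_rows, branch_change_ids) yields ([2, 4], [3]);
passing skip_partial=False yields ([2, 3, 4], [3]) instead.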

Problem #2: What branches contain this change?
----------------------------------------------

1. Find the change_ids associated with the change.
2. Check which branches contain intersecting change_ids.
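
A minimal sketch of these two steps, assuming the change_ids of each branch
have already been collected into a set (how a branch maps to its files is
not specified in the proposal, and the names below are hypothetical):

```python
def branches_containing(change_ids, branch_change_ids):
    """Return the sorted names of branches sharing any of the given change_ids.

    change_ids: the change_ids that make up the change of interest
    branch_change_ids: dict mapping branch name -> set of change_ids
    """
    wanted = set(change_ids)
    return sorted(name for name, ids in branch_change_ids.items()
                  if wanted & ids)

# Hypothetical data: the change consisting of change_ids 3 and 4 has
# reached branches/exp (in part) but not branches/1.x.
branch_change_ids = {
    "trunk": {1, 2, 3, 4},
    "branches/1.x": {1, 2},
    "branches/exp": {1, 2, 3},
}
```

For example, branches_containing({3, 4}, branch_change_ids) returns
["branches/exp", "trunk"].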

Problem #3: What branches contain this exact version of this file?
------------------------------------------------------------------

1. Find change_ids associated with the 'edits' on this file.
2. Check which files contain the exact same set of change_ids.

Problem #4: Where has this file been copied?
--------------------------------------------

1. Find change_ids associated with the 'edits' on this file.
2. Check which files contain intersecting change_ids.

Problem #5: Is this version of 'foo.c' the latest version? Has this file
            perhaps been updated on a branch?
------------------------------------------------------------------

1. Find change_ids associated with the 'edits' on this file.
2. Check which files contain intersecting change_ids.
3. If none of the 'intersecting' files contain additional change_ids,
   then this is the latest version.
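
Problems #3 through #5 all start from the set of EDIT change_ids of a file
and differ only in how that set is compared with other files: equality,
intersection, and "no related file has extra ids". The sketch below works on
in-memory rows of 'changes'; the function names and sample data are
assumptions, not part of the proposal.

```python
def edit_ids(changes, file_id):
    """Set of change_ids of the EDIT operations recorded for file_id."""
    return {cid for fid, op, cid, _rev in changes
            if fid == file_id and op == "EDIT"}

def other_files(changes, file_id):
    """All file_ids in 'changes' except file_id itself."""
    return {fid for fid, *_ in changes} - {file_id}

def exact_copies(changes, file_id):
    """Problem #3: files with exactly the same set of EDIT change_ids."""
    mine = edit_ids(changes, file_id)
    return sorted(f for f in other_files(changes, file_id)
                  if edit_ids(changes, f) == mine)

def related_files(changes, file_id):
    """Problem #4: files sharing at least one EDIT change_id."""
    mine = edit_ids(changes, file_id)
    return sorted(f for f in other_files(changes, file_id)
                  if edit_ids(changes, f) & mine)

def is_latest(changes, file_id):
    """Problem #5: true if no related file has additional EDIT change_ids."""
    mine = edit_ids(changes, file_id)
    return all(edit_ids(changes, f) <= mine
               for f in related_files(changes, file_id))

# Hypothetical rows of (file_id, op_type, change_id, rev):
# file 1 is the original, file 2 an untouched copy, file 3 a copy that
# received one more edit, file 4 an unrelated file.
changes = [
    (1, "ADD", 1, 1), (1, "EDIT", 2, 1), (1, "EDIT", 3, 2),
    (2, "ADD", 4, 3), (2, "EDIT", 2, 3), (2, "EDIT", 3, 3),
    (3, "ADD", 5, 4), (3, "EDIT", 2, 4), (3, "EDIT", 3, 4), (3, "EDIT", 6, 5),
    (4, "ADD", 7, 5), (4, "EDIT", 8, 5),
]
```

With this data, file 1 has one exact copy (file 2) and two related files
(2 and 3), and it is not the latest version because file 3 carries the
additional change_id 6.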

--
David James -- http://www.cs.toronto.edu/~james
Received on Sun Feb 12 23:46:21 2006

This is an archived mail posted to the Subversion Dev mailing list.