Revision graph cache implementation
Suppose the following repository
Action |
Path |
Copy src path |
Copy src revision |
Revision 1 |
A |
/trunk |
|
|
A |
/branches |
|
|
A |
/tags |
|
|
Revision 2 |
A |
/trunk/foo.txt |
|
|
A |
/trunk/bar.txt |
|
|
Revision 3 |
D |
/trunk |
|
|
Revision 4 |
A |
/branches/1.0 |
/trunk |
2 |
Algorithm to build the database
This database has three tables: files, change_paths and revisions.
I'm going to explain step by step how this tables are built. The revisions table
stores the information about authors, comments and dates. This table won't be shown
in the examples because its information it stores is not useful to understand
the cache.
In each revision I'm going to show the log information and then
how appear the files and change_paths tables after the update.
The rows inserted or the fields updated are shown in green.
In this algorithm all the log messages are stored and processed at the same time.
All branches and change paths are calculated at the same time.
Revision 1
The user creates three directories
Action |
Path |
Copy src path |
Copy src revision |
A |
/trunk |
|
|
A |
/branches |
|
|
A |
/tags |
|
|
Files table
File id |
Revision from |
Revision to |
Path |
1 |
1 |
NULL |
/trunk |
2 |
1 |
NULL |
/branches |
3 |
1 |
NULL |
/tags |
Change paths table
Path |
Revision id |
Copy src revision |
Copy src path |
Action |
File id |
/trunk |
1 |
NULL |
NULL |
A |
1 |
/branches |
1 |
NULL |
NULL |
A |
2 |
/tags |
1 |
NULL |
NULL |
A |
3 |
Revision 2
The user creates two files under the /trunk directory
Action |
Path |
Copy src path |
Copy src revision |
A |
/trunk/foo.txt |
|
|
A |
/trunk/bar.txt |
|
|
Files table
File id |
Revision from |
Revision to |
Path |
1 |
1 |
NULL |
/trunk |
2 |
1 |
NULL |
/branches |
3 |
1 |
NULL |
/tags |
4 |
2 |
NULL |
/trunk/foo.txt |
5 |
2 |
NULL |
/trunk/bar.txt |
Change paths table
Path |
Revision id |
Copy src revision |
Copy src path |
Action |
File id |
/trunk |
1 |
NULL |
NULL |
A |
1 |
/branches |
1 |
NULL |
NULL |
A |
2 |
/tags |
1 |
NULL |
NULL |
A |
3 |
/trunk/foo.txt |
2 |
NULL |
NULL |
A |
4 |
/trunk/bar.txt |
2 |
NULL |
NULL |
A |
5 |
Revision 3
The user deletes the /trunk directory.
The problem here is that the behaviour is not lineal.
There is one log message in this revision but
three rows are updated in each table.
Action |
Path |
Copy src path |
Copy src revision |
D |
/trunk |
|
|
Files table
Every row with a path starting with /trunk is marked
as deleted at the current revision number.
File id |
Revision from |
Revision to |
Path |
1 |
1 |
3 |
/trunk |
2 |
1 |
NULL |
/branches |
3 |
1 |
NULL |
/tags |
4 |
2 |
3 |
/trunk/foo.txt |
5 |
2 |
3 |
/trunk/bar.txt |
Change paths table
Path |
Revision id |
Copy src revision |
Copy src path |
Action |
File id |
/trunk |
1 |
NULL |
NULL |
A |
1 |
/branches |
1 |
NULL |
NULL |
A |
2 |
/tags |
1 |
NULL |
NULL |
A |
3 |
/trunk/foo.txt |
2 |
NULL |
NULL |
A |
4 |
/trunk/bar.txt |
2 |
NULL |
NULL |
A |
5 |
/trunk/foo.txt |
3 |
NULL |
NULL |
D |
4 |
/trunk/bar.txt |
3 |
NULL |
NULL |
D |
5 |
/trunk |
3 |
NULL |
NULL |
D |
1 |
Revision 4
A branch is created.
The problem here is similar to the problem when a directory is deleted.
One row is inserted for each subfolder and subfile of the directory copied.
Action |
Path |
Copy src path |
Copy src revision |
A |
/branches/1.0 |
/trunk |
2 |
Files table
File id |
Revision from |
Revision to |
Path |
1 |
1 |
3 |
/trunk |
2 |
1 |
NULL |
/branches |
3 |
1 |
NULL |
/tags |
4 |
2 |
3 |
/trunk/foo.txt |
5 |
2 |
3 |
/trunk/bar.txt |
1 |
4 |
NULL |
/branches/1.0 |
4 |
4 |
NULL |
/branches/1.0/foo.txt |
5 |
4 |
NULL |
/branches/1.0/bar.txt |
Change paths table
Path |
Revision id |
Copy src revision |
Copy src path |
Action |
File id |
/trunk |
1 |
NULL |
NULL |
A |
1 |
/branches |
1 |
NULL |
NULL |
A |
2 |
/tags |
1 |
NULL |
NULL |
A |
3 |
/trunk/foo.txt |
2 |
NULL |
NULL |
A |
4 |
/trunk/bar.txt |
2 |
NULL |
NULL |
A |
5 |
/trunk/foo.txt |
3 |
NULL |
NULL |
D |
4 |
/trunk/bar.txt |
3 |
NULL |
NULL |
D |
5 |
/trunk |
3 |
NULL |
NULL |
D |
1 |
/branches/1.0 |
4 |
2 |
/trunk |
A |
1 |
/branches/1.0/foo.txt |
4 |
2 |
/trunk/foo.txt |
A |
4 |
/branches/1.0/bar.txt |
4 |
2 |
/trunk/bar.txt |
A |
5 |
Queries to build the graph
Suppose the user wants to see the graph for the /branches/1.0/bar.txt file.
The user is working at revision 4.
With this row in the files table we know that the file_id for that file is: 4.
File id |
Revision from |
Revision to |
Path |
4 |
4 |
NULL |
/branches/1.0/foo.txt |
Now we query for all change_paths for that file_id
Path |
Revision id |
Copy src revision |
Copy src path |
Action |
File id |
/trunk/foo.txt |
2 |
NULL |
NULL |
A |
4 |
/trunk/foo.txt |
3 |
NULL |
NULL |
D |
4 |
/branches/1.0/foo.txt |
4 |
2 |
/trunk/foo.txt |
A |
4 |
This result joined with the revisions table,
that stores the information about author, comments and dates,
gives us all the information to build the graph.