On 8/30/06, Kamesh Jayachandran <kamesh@collab.net> wrote:
> Hi All,
> I have the following suggestions for improving the mergeinfo sqlite db schema.
>
> 1. There is duplicate information in the form of 'mergeinfo_changed.path'
> and 'N' records of 'mergeinfo.mergedto', where 'N' is the number of merges
> on the path as of this commit. This leads to an arithmetic series of
> records in 'mergeinfo' across commits. (I have a working patch for this
> normalization, but would like my other patches to land before
> posting this one.)
>
> 2. Currently we track 'mergedfrom' and 'mergedto' paths in the sqlite db.
> This is not correct; rather, we should maintain 'nodeid' and 'copyid'
Actually, this is wrong. You can use the paths just fine.
>
> 3. The schema+code change to avoid the 'arithmetic series' of records in
> the 'mergeinfo' table, as mentioned in point 1.
> (After 20 merges I could see 210 records in my
> 'mergeinfo' table. It grows as n(n+1)/2: 1000 rounds of '1 merge + 1
> commit' on the same target lead to 1000*1001/2 = 500500 records.)
>
> 4. The name of the table 'mergeinfo' should rather be 'mergeinfo_details'
> and the name of the table 'mergeinfo_changed' should be 'mergeinfo'.
>
>
> Would like to know your thoughts.
The mergeinfo schema has known N^2 space usage issues. I would
suggest you look at something like the alternative schema posted a
while ago if you want something to work on.
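
For reference, the growth quoted in point 3 can be checked with a few
lines of Python. This is only a sketch of the row count, assuming that
commit number k writes k rows (one per merge recorded on the target so
far). It is not the Subversion code, and the function names are made up
for illustration.

```python
def total_rows(commits: int) -> int:
    """Rows accumulated if commit k writes k rows (assumed model)."""
    return sum(range(1, commits + 1))


def closed_form(commits: int) -> int:
    """Arithmetic-series sum n(n+1)/2, as quoted in point 3."""
    return commits * (commits + 1) // 2


if __name__ == "__main__":
    for n in (20, 1000):
        # The explicit sum and the closed form must agree.
        assert total_rows(n) == closed_form(n)
        print(n, total_rows(n))
```

For 20 and 1000 commits this gives 210 and 500500 rows, matching the
figures in the quoted message; the total grows quadratically in the
number of commits.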
Received on Wed Aug 30 16:49:32 2006