When combining windows with source ops + new data only, almost all of
the current code in the delta combiner can go away.
This actually includes all of the range tree (which is actually the
majority of the code, and more complex to understand than the op fixups)
we currently use.
I've attached a document that describes my plans for how to do this.
It should be faster than the current delta combiner, which spends most
of its times in looking up operand indexes again and again, because it
has better cache behavior. Using an n log n sort, the new algorithm is
only some constant factor better than the old one.
At least, my prototype is faster by about 10-15% just using random-test
timings for it's combiner test
Also, on the testcase where combining windows was taking any amount of
time (55 seconds total on a 30000 revision blame), it now takes less
than half that time.
If we used an O(n) sort (or assume that quicksort is going to be O(n)
most of the time), the new one would be algorithmically faster than the
old one.
Comments welcome (including saying this is a waste of time, but any
claims of time wasting will have to give me my 27.5 seconds per blame
back :P).
Even with only source ops, most of the current delta combiner time is spend
searching for window_A operands that define a given piece of window_B's source.
The reason such an index is currently needed is because operands of window_B
are walked in the order they appear in window_B, not in window B's source
(window A's target) order. There is no correspondence between the portion of
the window_B source defined by window_A and the order in which we process the
operands. In simple terms, the two windows are not aligned.
I assume this was done because maintaining output order was viewed to be
"hard" if you don't
The key to maintaing output order is noticing that each original window B
operation can do one of three things in the combined window:
1. Stay the same operation
2. Be replaced with a single new operation
3. Be replaced with multiple operations
In case 1 and 2, you can maintain the output order by placing the operand in
the same relative place it was in window_B.
In case 3, to maintain output order, you need to place *all the new operands*
in the same relative place the original operation was in window_B.
This is actually easier than one thinks, however, you just need an array of size
"window_B->num_ops" that is a linked list of operations that each original
operand in window_B gets transformed into.
Output structure
struct output_op
{
svn_txdelta_op_t op;
struct output_op *next;
};
struct output_op *output_array[window_B->num_ops];
One only needs to know the original operand index in window B in order to put
it in the correct place in the output array.
The output array is then simply walked in order, like so:
for (i =0; i < window_B->num_ops; i++)
{
struct output_ops *op;
for (output_op = output_array[i]; output_op; output_op = output_op->next)
{
throw output_op->op into the combined window.
}
}
Now that we can maintain the relative output order, all we need to do is align
the window_A target offsets, and the window_B source offsets, and we can
simply walk both windows together.
struct window_a_piece
{
svn_txdelta_op_t op;
int target_offset;
int target_len;
} *window_a_piece_array[];
struct window_b_piece
{
svn_txdelta_op_t op;
int original_ndx_in_window_b;
int source_offset;
int source_len;
} *window_b_piece_array[];
(source_offset and len aren't actually necessary, since they are in op, i'm
just using this as a demonstration).
Process window_a into an array of window_a_piece, and then sort it by
target_offset, then by target_len.
Process window_b into an array of window_b_piece, and then sort it by
source_offset, then by source_len.
At this point, the two arrays should be aligned.
We should then be able to do the following:
aindex = 0;
bindex = 0;
for (bindex = 0; bindex < length of window_b_piece_array; bindex++)
{
window_b_piece = window_b_piece_array[bindex];
window_a_piece = window_a_piece_array[aindex];
while (window_a_piece->target_offset < window_b_piece->source_offset
&& window_a_piece->target_offset + window_a_piece->target_len <
window_b_piece->source_offset )
{
aindex++;
window_a_piece = window_a_piece_array[aindex];
}
startinga = aindex;
while (we haven't covered window_b_piece->source_len worth of data)
{
window_a_piece = window_a_piece_array[aindex];
transform the window_a_piece op into the approriate window b
op, and insert at the tail of the linked list at
output_array[window_b_piece->original_ndx_in_window_b].
This takes into account the target position vs the source offset, etc.
aindex++;
}
/* The next thing in window B may need the same operands from window
A, so we have to reset aindex to the point it was at before we started
copying operands, but after we advanced it to the position that
matches our source offset. */
aindex = startinga;
}
Note again that there are only three main cases for transformation.
Either you have
1. They are the same size
| ------ a op ------ |
| ------ b op ------ |
2. The matching a op is smaller than b op
| a op || next a op |
| ------ b_op ------ |
3. The matching a op is bigger than b op
| ------ a op ------ |
| b op || next b op |
There can be overlaps and underlaps as well, so that you have
-- a op -- || -- a op -- || -- a op -- |
| b op | | ----- b op ----- |
These are just special cases of what "the matching a op" is above.
The algorithmically slowest part of this algorithm is the sorting required.
However, sorting two arrays using qsort, once each, should be faster than
continually looking up operand indexes in the array.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Tue Mar 1 05:03:33 2005