Hi Stefan,
so these are my ideas for how to speed up
the log dialog.
First of all, I want to say that the dialog
has already become quite responsive up to at
least 10k revisions. The actual number seems
to vary across different machines. Most of the
time is spent expanding the data, which is
efficiently encoded in the log cache, into
ordinary strings.
Therefore, fundamental changes to the log
dialog data model are required. They can
be carried out on trunk without disturbing
the dialog's functionality, though.
I'm not sure how well this will work or what
changes to The Plan might be necessary. But
those issues will show up and can be discussed
as we go.
-- Stefan^2.
Step 1: Switch from plain strings to log
cache data model
----------------------------------------
(a) If we get data directly from SVN
(merge info or log cache is off),
feed it into a temporary cache object.
Keep the current data model as it is,
duplicating the data for the time being.
This has already been implemented for
the revision graph. Please note that
the log cache is only used for storage,
not for mimicking 'svn log'. All problems
with the caching since its introduction
were with the latter.
(b) Introduce the future main data index:
A plain list of <revision, level> pairs.
Initially, the data has to come from
the current data model. But once the
latter has been removed, the pair list
is the only thing that needs to be recorded
outside the log cache.
The current list view content uses a
filtered copy of said list.
(c) Incrementally (column by column) replace
access to the old DM with access to the
log cache DM.
(d) Drop the remnants of the old DM.
Introduce ILogReceiver2 that only reports
revision numbers.
At that stage, startup should be so fast
that the progress bar becomes useless
when receiving cached data (~10Mrevs/s).
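To make the index from (b) concrete, here is a minimal sketch of the pair list and the filtered copy used by the list view. All names (SRevisionLevel, RevisionIndex, FilteredCopy, RevisionAtRow) are made up for illustration and are not existing TortoiseSVN code. I am also assuming that "level" means the merge nesting depth.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical names; nothing here is taken from the actual sources.
using revision_t = std::uint32_t;

// One entry of the main index. Everything else (author, message,
// paths) would be looked up in the log cache by revision number.
struct SRevisionLevel
{
    revision_t revision;
    std::uint32_t level;  // assumption: merge nesting depth, 0 = top level
};

using RevisionIndex = std::vector<SRevisionLevel>;

// Build the list view content: a filtered copy of the main index.
// 'keep' stands in for whatever predicate the dialog applies.
inline RevisionIndex FilteredCopy(
    const RevisionIndex& all,
    const std::function<bool(const SRevisionLevel&)>& keep)
{
    RevisionIndex shown;
    shown.reserve(all.size());
    for (const SRevisionLevel& entry : all)
        if (keep(entry))
            shown.push_back(entry);
    return shown;
}

// Map a list view row back to the revision it displays.
inline revision_t RevisionAtRow(const RevisionIndex& shown, std::size_t row)
{
    assert(row < shown.size());
    return shown[row].revision;
}
```

The point of the sketch is only that the dialog itself holds nothing but small fixed-size pairs; no strings are copied when the filter changes.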
Step 2: Index-based filtering
-----------------------------
(a) Provide a class that maps an index_t to
{match, no_match, untested}.
Every item in the log cache is identified
by an index_t value. Different authors,
paths etc. are stored only once and need
to be evaluated only once.
(b) Use instances of that class when filtering
for the author column, actions and the paths.
The latter speeds things up considerably.
We should get a factor of 2 out of this.
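A rough sketch of what the class from (a) might look like. The class name CIndexMatchCache, the enum MatchState and the callback-based constructor are my assumptions for illustration; only index_t and the three states come from the description above. The width of index_t is a guess as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using index_t = std::uint32_t;  // assumption: the real width may differ

enum class MatchState : std::uint8_t { untested, match, no_match };

// Hypothetical memo: every distinct cached item (author, path, ...)
// is run through the expensive test at most once.
class CIndexMatchCache
{
public:
    CIndexMatchCache(std::size_t itemCount,
                     std::function<bool(index_t)> tester)
        : states(itemCount, MatchState::untested)
        , tester(std::move(tester))
    {
    }

    bool Matches(index_t index)
    {
        assert(index < states.size());
        MatchState& state = states[index];
        if (state == MatchState::untested)  // first request for this item
            state = tester(index) ? MatchState::match
                                  : MatchState::no_match;
        return state == MatchState::match;
    }

private:
    std::vector<MatchState> states;
    std::function<bool(index_t)> tester;
};
```

With a few hundred distinct authors but tens of thousands of revisions, the string comparison would then run a few hundred times instead of once per revision.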
Step 3: Further filter speedup
------------------------------
(a) Use different filter classes
- plain sub-string
- wildcard
- regex
implementing a common IFilter interface.
A factory class decides what filter class
would be most efficient for the filter
string (plain sub-string will be sufficient
in most cases).
(b) Create a filter instance per column, start
them in parallel and combine the results
afterwards.
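For (a), something along these lines is what I have in mind. Only the name IFilter is from the plan; the concrete class names, CreateFilter and especially the rule the factory uses to pick a class are placeholders. The rule shown is deliberately crude (it treats '.' as a literal, for instance), and the wildcard filter assumes "contains" semantics. Running one instance per column in parallel (b) is left out of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <regex>
#include <string>
#include <utility>

class IFilter
{
public:
    virtual ~IFilter() = default;
    virtual bool Matches(const std::string& text) const = 0;
};

class CSubStringFilter : public IFilter
{
public:
    explicit CSubStringFilter(std::string s) : needle(std::move(s)) {}
    bool Matches(const std::string& text) const override
    {
        return text.find(needle) != std::string::npos;
    }
private:
    std::string needle;
};

class CWildcardFilter : public IFilter
{
public:
    // Assumption: "contains" semantics, hence the implied '*' on both ends.
    explicit CWildcardFilter(const std::string& s) : pattern("*" + s + "*") {}
    bool Matches(const std::string& text) const override
    {
        const std::size_t npos = std::string::npos;
        std::size_t p = 0, t = 0, star = npos, mark = 0;
        while (t < text.size())
        {
            if (p < pattern.size() && pattern[p] == '*')
            {
                star = p++;   // remember the last '*' and where it started
                mark = t;
            }
            else if (p < pattern.size()
                     && (pattern[p] == '?' || pattern[p] == text[t]))
            {
                ++p;
                ++t;
            }
            else if (star != npos)
            {
                p = star + 1; // backtrack: let the '*' eat one more char
                t = ++mark;
            }
            else
                return false;
        }
        while (p < pattern.size() && pattern[p] == '*')
            ++p;
        return p == pattern.size();
    }
private:
    std::string pattern;
};

class CRegexFilter : public IFilter
{
public:
    explicit CRegexFilter(const std::string& s) : expression(s) {}
    bool Matches(const std::string& text) const override
    {
        return std::regex_search(text, expression);
    }
private:
    std::regex expression;
};

// Placeholder heuristic: pick the cheapest class that can handle the string.
inline std::unique_ptr<IFilter> CreateFilter(const std::string& spec)
{
    assert(!spec.empty());
    if (spec.find_first_of("\\^$|+()[]{}") != std::string::npos)
        return std::make_unique<CRegexFilter>(spec);
    if (spec.find_first_of("*?") != std::string::npos)
        return std::make_unique<CWildcardFilter>(spec);
    return std::make_unique<CSubStringFilter>(spec);
}
```

The callers only ever see IFilter, so the decision rule can be tuned later without touching the dialog code.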
Step 4: Multi-filter
--------------------
(a) The combined filter result for one filter
applied to all columns (3.b) shall be a
mapping: <rev in log> -> {true, false}
(b) Introduce a class that can combine multiple
such mappings left to right. The method
signature might be something like
Add (MapVector rhs, op, bool negate)
Supported ops are union (+), intersection
(default), difference / removal (-) and
symmetric difference (^). Each mapping can
optionally be negated before it is
combined.
(c) Parse the filter spec and create the
individual filter instances in parallel,
and combine the results.
Filters are separated by spaces (honoring
regex parentheses etc.). The operation
used to combine the results is given by
the first char of each filter.
Escaping via '\' is supported.
Examples:
.ppt +.xls TSVN -2009
-> (((matches(.ppt) or matches(.xls))
and matches(TSVN))
excluding matches(2009))
"all ppt or xls files about TSVN
but not of this year"
!me +!you \!
-> ((not matches(me) or not matches(you))
and matches(!))
"everything containing an exclamation
mark and not being about 'me' and 'you'
at the same time"
This should allow for sufficiently complex
queries without complicating the implementation.
The main point is, however, we can run all
filters at once.
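To check that the examples above work out, here is a small sketch of the combiner and a toy parser. Several things are my assumptions and not part of the plan: the names CFilterCombiner, ParseTerm and Evaluate, the rule that the first mapping is taken as-is regardless of its op, the reading of a leading '\' as "no op or negation prefix, the rest is literal", and splitting on plain whitespace (so regex parentheses are not honored here). Every term is a plain sub-string match, and everything runs sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using MapVector = std::vector<bool>;  // <rev in log> -> {true, false}

enum class Op { Union, Intersection, Difference, SymmetricDifference };

class CFilterCombiner
{
public:
    void Add(MapVector rhs, Op op, bool negate)
    {
        if (negate)
            rhs.flip();
        if (first)  // assumption: the first mapping is used unchanged
        {
            result = std::move(rhs);
            first = false;
            return;
        }
        assert(rhs.size() == result.size());
        for (std::size_t i = 0; i < result.size(); ++i)
        {
            const bool a = result[i];
            const bool b = rhs[i];
            switch (op)
            {
            case Op::Union:               result[i] = a || b;  break;
            case Op::Intersection:        result[i] = a && b;  break;
            case Op::Difference:          result[i] = a && !b; break;
            case Op::SymmetricDifference: result[i] = a != b;  break;
            }
        }
    }
    const MapVector& Result() const { return result; }
private:
    MapVector result;
    bool first = true;
};

struct FilterTerm { Op op; bool negate; std::string text; };

inline FilterTerm ParseTerm(const std::string& token)
{
    FilterTerm term{Op::Intersection, false, std::string()};
    std::size_t i = 0;
    if (!token.empty() && token[0] == '\\')  // escaped: rest is literal
    {
        term.text = token.substr(1);
        return term;
    }
    if (i < token.size())
    {
        switch (token[i])
        {
        case '+': term.op = Op::Union; ++i; break;
        case '-': term.op = Op::Difference; ++i; break;
        case '^': term.op = Op::SymmetricDifference; ++i; break;
        default: break;  // intersection is the default
        }
    }
    if (i < token.size() && token[i] == '!')
    {
        term.negate = true;
        ++i;
    }
    term.text = token.substr(i);
    return term;
}

inline MapVector Evaluate(const std::string& spec,
                          const std::vector<std::string>& messages)
{
    CFilterCombiner combiner;
    std::istringstream tokens(spec);
    std::string token;
    while (tokens >> token)
    {
        const FilterTerm term = ParseTerm(token);
        MapVector hits(messages.size());
        for (std::size_t i = 0; i < messages.size(); ++i)
            hits[i] = messages[i].find(term.text) != std::string::npos;
        combiner.Add(std::move(hits), term.op, term.negate);
    }
    return combiner.Result();
}
```

In the real thing, each `hits` vector would come from its own filter instance running in parallel, and only the cheap Add() calls would be sequential.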
------------------------------------------------------
Received on 2009-08-17 01:04:41 CEST