Stefan Fuhrmann wrote:
> Hi Stefan, hi devs,
>
> the revision graph as implemented on /trunk has a few
> issues with mid- and large-sized repositories.
>
> * analysis phase takes super-quadratic time
> (i.e. worse than O(revNum^2))
> * memory consumption becomes an issue with
> large repositories (~1GB RAM per 100.000 revs)
> * complex changes (e.g. renaming the project root
> folder) result in graphs that are hard to read
>
> For the repository at my office (37k revisions, >100
> tags, >100 branches, most of them already closed),
> the /trunk code takes 6 minutes to build the graph
> while eating about 300 megs RAM.
>
> Since I hear quite a few similar complaints from my
> colleagues, I wondered if I could do better ;) This is
> what I did on /branches/RevisionGraphRewrite:
>
> * Use the log caching data structures to represent
> the data (if l.c. is disabled, collect the data in a
> temporary container). This will simplify and speed
> up path operations while reducing memory consumption.
There seems to be a slight problem with that. I've enabled the cache.
Then I created the test repository
(http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk/test/BuildRevisionGraphTestRepo.bat).
Started the revision graph on /trunk of that repository.
I then get an assertion inside the Subversion lib about the passed path
not being canonicalized.
If I first show the log for / so the cache is built, it works.
(correction: if I show the log with my build from trunk, then it works.
If I show the log with the binaries from the revision graph branch, it
doesn't).
> * Use a breadth search algorithm to scan the thistory
> (i.e. follow all open branches simultaneously).
> Organize the search paths in a tree, so unaffected
> sub-trees (e.g. "tags" for most revisions) can be
> skipped efficiently. This brings the runtime down
> to an almost linear O (N log N).
>
> * Use a simplified placing of the graph nodes.
> At least for TSVN and for my office repository,
> it results in "nice" graphs. However, there is
> plenty of space for improvement without hurting
> performance. As a result, no complex routing
> for the graph's edges is necessary.
>
> * Simplify the algorithm that draws connections.
> Plain curved lines (Bezier splines) are in most
> cases well to tell apart and if they should cross
> a node box, the latter will still be readable.
> If a denser node placement is defined in future,
> we may of course add a more sophisticated
> routing scheme.
While I like the bezier lines, they're currently too simple. I get way
too many lines drawn over the nodes. See attached image.
> For the office repository, the graph shows up in
> less than a second, taking only 30 MB of RAM.
> (even if showing all changes = 37k nodes).
>
> Could you have a look at the result (and try the
> options buttons)? I would love to merge that
> into /trunk and then continue with the log caching /
> merge tracking / documentation stuff.
The graph looks good. But why are you drawing it with the lowest
revision at the top? It's 'backwards' that way. The important part of
the graph is IMHO the latest revisions, not the earliest. And the
important stuff should be on top.
It also seems to me that it is still very unstable. All the graphs I
tried didn't work at all (at least at first): I got an assertion every
time. Even for the TSVN trunk I got an assertion. And that one I
couldn't get to work by fetching the logs for / first.
Stefan
--
___
oo // \\ "De Chelonian Mobile"
(_,\/ \_/ \ TortoiseSVN
\ \_/_\_/> The coolest Interface to (Sub)Version Control
/_/ \_\ http://tortoisesvn.net
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tortoisesvn.tigris.org
For additional commands, e-mail: dev-help@tortoisesvn.tigris.org
Received on Sun Jul 8 11:14:13 2007