On 05.02.2011 13:56, Stefan Sperling wrote:
> I think we should go into this direction.
> In fact, I think we should simply change the existing APIs to use
> the fastest possible way of getting at information.
Well, currently there is no API that does what I suggested (basically
return all results of a db query without even touching any files in the
WC or having to do this for every file/folder separately).
> Most code we have still crawls the working copy, and that is an
> artifact of how the 1.6.x working copy was structured.
> We're now at single DB, but we're not yet using the single DB to
> its full potential. We're currently treating it more or less like
> a key/value store for information about paths.
>
> Have you seen r1039808 and the resulting "Sqlite and callbacks"
> thread on dev@? That thread describes some of the issues we're facing
> with the interaction of callbacks in our APIs and sqlite queries.
>
> There were two approaches discussed in that thread. I am currently
> experimenting with the "queries per-directory" approach (see r1051452
> and r1066541). I'm expecting this to be too slow, but I'm doing it
> anyway for two reasons. One is that we'll have real data to look at.
> The other is that we might need code that does per-directory queries
> anyway to satisfy backwards compatibility constraints (see the
> "sqlite and callbacks" thread for details).
>
> I think we will eventually need to query the database like people would
> normally query a database, letting sqlite do most of the work of pulling data
> out of the db. However, we need to agree on how to solve problems with
> the implications this has on the existing APIs.
I've read up on that thread. It seems the problem you're facing comes
from the fact that you need to stay compatible with pre-1.7 APIs and
clients, and the fact that you can't force clients to behave; you can
only ask them to behave and then hope for the best.
However, what I'm asking for here are *new* APIs which do something no
existing API currently does. So staying compatible wouldn't be a
problem. And if you're worried about clients not behaving properly, why
not get rid of the callback completely and just return all information
at once in one big chunk of memory?
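
To illustrate the difference between the two API shapes, here is a
minimal Python sketch. It is not the svn API: the `nodes` table, its
columns, and both function names are made up for the example, and the
real wc.db schema is more involved. It only shows that with a callback
the query is still running while client code executes, whereas a bulk
call finishes the query before the client sees any data.

```python
import sqlite3
from typing import Callable, List, Tuple

Row = Tuple[str, str]  # (path, presence); hypothetical row shape


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for a working copy database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY, presence TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [
            ("", "normal"),
            ("trunk", "normal"),
            ("trunk/a.c", "normal"),
            ("trunk/b.c", "not-present"),
        ],
    )
    return conn


QUERY = "SELECT local_relpath, presence FROM nodes ORDER BY local_relpath"


def status_with_callback(
    conn: sqlite3.Connection, receiver: Callable[[Row], None]
) -> None:
    """Callback style: client code runs while the statement is still open."""
    for row in conn.execute(QUERY):
        receiver(row)


def status_all_at_once(conn: sqlite3.Connection) -> List[Row]:
    """Bulk style: the statement is exhausted before anything is returned."""
    return conn.execute(QUERY).fetchall()
```

A UI client would typically append every callback row to a list anyway,
so both variants end up holding the same data in memory.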
Talking about UI clients, this won't be a problem because they usually
have to store all information they receive in a callback anyway so they
have it ready to show in the UI. So for them, the memory use wouldn't be
bigger at all.
Of course, those APIs I'm asking for might not be very useful for
existing APIs or other stuff that is done in the svn library. Those
might only be useful for some svn clients. But I hope that's not a
blocker for implementing those.
I also thought of just querying the SQLite db myself directly, but I
don't like doing something that's not really allowed.
However, I did a quick test with the Check-for-modifications dialog in
TSVN. It has a feature where you can enable showing all properties. To
do that, a separate thread is started which lists all properties of all
items in the working copy. On one of my working copies, this takes about
50 seconds. Using a simple SQLite query on the NODE table took 1260 ms
on average. Parsing the data and preparing it for use in the UI took
another 3.5 seconds. Now *that* is a speed improvement I really like.
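
The shape of that test can be sketched roughly as below. This is not
the code I ran: the `nodes` table is a simplified stand-in, and storing
the properties as JSON text is purely an assumption to keep the example
self-contained (the real database uses its own serialization). The
point is only that a single query returns the properties of every item,
and a separate parsing step then prepares them for the UI.

```python
import json
import sqlite3
from typing import Dict


def make_demo_db() -> sqlite3.Connection:
    """In-memory stand-in; the schema and JSON encoding are assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY, properties TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [
            ("trunk", json.dumps({"svn:ignore": "*.o"})),
            ("trunk/a.c", json.dumps({"svn:eol-style": "native"})),
            ("trunk/b.c", None),  # item without any properties
        ],
    )
    return conn


def all_properties(conn: sqlite3.Connection) -> Dict[str, Dict[str, str]]:
    """Fetch the properties of every item with one query, then parse them."""
    cursor = conn.execute(
        "SELECT local_relpath, properties FROM nodes "
        "WHERE properties IS NOT NULL ORDER BY local_relpath"
    )
    # Parsing step: turn each serialized value into a dict the UI can show.
    return {path: json.loads(blob) for path, blob in cursor}
```

No file in the working copy is touched here, and nothing is done per
file or per folder; the database does all the work of collecting rows.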
Stefan
--
___
oo // \\ "De Chelonian Mobile"
(_,\/ \_/ \ TortoiseSVN
\ \_/_\_/> The coolest Interface to (Sub)Version Control
/_/ \_\ http://tortoisesvn.net
Received on 2011-02-05 16:23:19 CET