
Re: request for new API function

From: Stefan Sperling <stsp_at_elego.de>
Date: Sat, 5 Feb 2011 13:56:41 +0100

On Sat, Feb 05, 2011 at 10:28:20AM +0100, Stefan Küng wrote:
> Hi,
>
> To find all files and folders that have a specific property set I
> need to crawl the whole working copy and fetch the properties of
> each and every item, then scan the returned property list for that
> property.
> But WC-NG uses an SQLite db so this task should be much faster with
> a lot less disk access.
>
> I'd like to ask for a new API which would allow me to search the
> database for properties much faster. Something like
>
> svn_wc_propsearch()
>
> with parameters:
>
> * string to search for
> * value indicating how the search is done (equal, contains, begins
> with, ... basically all the possible match functions SQLite offers)
> * bool indicating whether the search is done on the NODES table or
> the ACTUAL_NODE table
> * callback function which receives the search results
>
> The callback function would receive not just the properties and
> paths of the items the found properties belong to, but also any
> information that the table row contains (because when an information
> is available for free, an API should return it to avoid forcing
> clients to call yet another API to get that info - if clients don't
> need the info they can just ignore it).
>
> And while we're at it: new APIs to search for other columns in the
> NODES and ACTUAL_NODE table would also be nice and useful:
> * search for 'changed_revision' and/or 'changed_date': allows to
> show a view with all files that haven't changed for a long time, or
> files that have changed recently, or ...
> * search for 'changed_author': allows to show a quick view of who
> changed what last, gather statistics, ...
> * search for 'depth' to quickly determine whether the working copy
> is full or may be missing something
> * search for 'file_external'
>
> With such new APIs, clients could use the advantages of WC-NG too.
>
> A lot of ideas I had couldn't be done before because it would have
> been just too slow. But now with WC-NG and the database it would be
> fast enough.
>
> Thoughts?

I think we should go in this direction.
In fact, I think we should simply change the existing APIs to use
the fastest possible way of getting at information.

Most code we have still crawls the working copy, and that is an
artifact of how the 1.6.x working copy was structured.
We're now at single DB, but we're not yet using the single DB to
its full potential. We're currently treating it more or less like
a key/value store for information about paths.

Have you seen r1039808 and the resulting "Sqlite and callbacks"
thread on dev@? That thread describes some of the issues we're facing
with the interaction of callbacks in our APIs and sqlite queries.

There were two approaches discussed in that thread. I am currently
experimenting with the "queries per-directory" approach (see r1051452
and r1066541). I'm expecting this to be too slow, but I'm doing it
anyway for two reasons. One is that we'll have real data to look at.
The other is that we might need code that does per-directory queries
anyway to satisfy backwards compatibility constraints (see the
"Sqlite and callbacks" thread for details).
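
To make the trade-off concrete, here is a rough sketch in Python with
the standard sqlite3 module. It is not Subversion code: the two-table
schema (nodes, props), the sample paths, and the function names are
all made up for illustration, and the real wc.db layout differs. It
only contrasts one query per directory with one query for the whole
tree, and counts the queries each needs.

```python
import sqlite3


def make_toy_db():
    """Build a tiny in-memory stand-in for a working copy database.

    Hypothetical schema: the real wc.db stores properties differently.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY,"
               " parent_relpath TEXT, kind TEXT)")
    db.execute("CREATE TABLE props (local_relpath TEXT, name TEXT, value TEXT)")
    db.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
        ("", None, "dir"),
        ("trunk", "", "dir"),
        ("trunk/a.c", "trunk", "file"),
        ("trunk/b.sh", "trunk", "file"),
        ("trunk/lib", "trunk", "dir"),
        ("trunk/lib/c.sh", "trunk/lib", "file"),
    ])
    db.executemany("INSERT INTO props VALUES (?, ?, ?)", [
        ("trunk", "svn:ignore", "*.o"),
        ("trunk/a.c", "svn:eol-style", "native"),
        ("trunk/b.sh", "svn:executable", "*"),
        ("trunk/lib/c.sh", "svn:executable", "*"),
    ])
    return db


def find_per_directory(db, propname):
    """Walk the tree, issuing one query per directory visited."""
    found, queries = [], 0
    pending = [""]  # start at the working copy root
    while pending:
        parent = pending.pop()
        queries += 1
        rows = db.execute(
            "SELECT n.local_relpath, n.kind,"
            " EXISTS (SELECT 1 FROM props p"
            "         WHERE p.local_relpath = n.local_relpath AND p.name = ?)"
            " FROM nodes n"
            " WHERE n.local_relpath = ? OR n.parent_relpath = ?",
            (propname, parent, parent)).fetchall()
        for relpath, kind, has_prop in rows:
            if relpath != parent and kind == "dir":
                pending.append(relpath)  # checked when visited itself
            elif has_prop:
                found.append(relpath)
    return sorted(found), queries


def find_single_query(db, propname):
    """Let sqlite find every match in one statement."""
    rows = db.execute(
        "SELECT local_relpath FROM props WHERE name = ?"
        " ORDER BY local_relpath", (propname,)).fetchall()
    return [r[0] for r in rows], 1
```

Both functions return the same paths. The per-directory walk needs one
query for each directory in the tree, while the single query leaves
the traversal to sqlite.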

I think we will eventually need to query the database like people would
normally query a database, letting sqlite do most of the work of pulling data
out of the db. However, we first need to agree on how to solve the problems
this creates for the existing APIs.
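
One general way to combine a set-based query with a callback API is to
read the whole result set first and only then invoke the callbacks, so
that no statement is still being stepped while client code runs. The
sketch below (Python, standard sqlite3 module) illustrates that idea
only. It does not restate what the dev@ thread concluded, and the
propsearch() function, the props table, and the sample data are
hypothetical, not Subversion APIs.

```python
import sqlite3


def propsearch(db, propname, receiver):
    """Call receiver(relpath, value) for every node that has propname.

    Hypothetical helper: rows are fetched up front, so the receiver
    may safely run its own statements on the same connection.
    """
    rows = db.execute(
        "SELECT local_relpath, value FROM props WHERE name = ?"
        " ORDER BY local_relpath", (propname,)).fetchall()
    for relpath, value in rows:
        receiver(relpath, value)


def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE props (local_relpath TEXT, name TEXT, value TEXT)")
    db.executemany("INSERT INTO props VALUES (?, ?, ?)", [
        ("a.txt", "svn:needs-lock", "*"),
        ("b/c.txt", "svn:needs-lock", "*"),
        ("b/d.txt", "svn:eol-style", "native"),
    ])
    seen = []

    def receiver(relpath, value):
        seen.append(relpath)
        # The callback re-enters the database while results are delivered.
        db.execute("DELETE FROM props WHERE local_relpath = ? AND name = ?",
                   (relpath, "svn:needs-lock"))

    propsearch(db, "svn:needs-lock", receiver)
    remaining = db.execute("SELECT COUNT(*) FROM props").fetchone()[0]
    return seen, remaining
```

The cost of this design is memory: the full result set is held before
the first callback fires, which matters for very large working copies.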

Stefan
Received on 2011-02-05 13:57:24 CET

This is an archived mail posted to the Subversion Dev mailing list.