RE: Need fast ways to get Info once WC-NG is introduced

From: Bert Huijben <bert_at_qqmail.nl>
Date: Wed, 4 Aug 2010 10:30:25 +0200

> -----Original Message-----
> From: Stefan Küng [mailto:tortoisesvn_at_gmail.com]
> Sent: maandag 2 augustus 2010 21:52
> To: Bert Huijben
> Cc: 'Subversion Development'
> Subject: Re: Need fast ways to get Info once WC-NG is introduced
>
> On 02.08.2010 12:32, Bert Huijben wrote:
>
> > I don't think there is a specific per folder check like this, but
> > retrieving specific data about just one node (instead of its folder)
> > will be *much* faster than in the old entries store. With the entries
> > files we had to read the entire file in all cases, but a real database
> > doesn't have that limitation.
> >
> > For all metadata except for pristine files we only have to open one
> > file and sqlite just seeks to the right locations to fetch the data
> > using its indexes.
> >
> > For AnkhSVN I'm thinking about splitting the status cache in two
> > layers, instead of doing a 'svn status' per folder like we do in 1.6.
> > (I think TortoiseSVN might do the same thing, but maybe it calls
> > status with depth infinity)
>
> Yes, TSVN does the same: one 'svn st' per folder with depth immediate.
>
> > Getting information from the working copy per individual file will be
> > so much cheaper than before, that I will look for metadata changes
> > first (and cache only a fraction of the informational details I used
> > to cache before) and only when I really need to, I will perform the
> > pristine file comparison.
> > (I don't know yet if I will use svn_(client|wc)_status for this or by
> > just calling svn_wc_text_modified_p2() myself).
> >
> > I would imagine that TortoiseSVN's folder glyph status would be
> > calculated much faster by using a similar strategy: First check if
> > there is a metadata change or conflict somewhere in the tree (keeping
> > track of translated filesize + filedate as these will be useful in the
> > next step).
> > (This would be +- svn_client_infoX(). This should also inform you of
> > any property changes (I don't know if it already does that; but the
> > information in our internal API's is there now))
> > If there is such a status: just set the right glyph (early out; no
> > need to check any pristine files)
>
> So basically use svn_client_info() instead of svn_client_status(), then
> only check the status for files that don't have a defined status yet
> from that info. That seems like a good idea - a lot of work to rewrite
> the existing code, but it should be worth it.
>
> > And only if there isn't a status perform the svn_wc_text_modified_p2()
> > calls where needed.
>
> Would this API get renamed to svn_client_*? Or should I risk calling an
> svn_wc_ API? It's still not clear whether the svn_wc_ APIs will get
> made private as was discussed before.

Personally I don't see a problem with calling a wc api for this task. (It
has the same version guarantees as the client apis: we can't break it
before 2.0.) Of course we could also add a wrapper in the client layer, but
in this case that would be just a one-to-one wrapper (you can get the wc_ctx
from the svn_client_ctx_t), and we would have to maintain both until 2.0.

If you know exactly what you need for your cache, I would prefer adding a
few helpers for that task in libsvn_client over adding exact copies of the
libsvn_wc apis.

Other system or application integrations like SCPlugin, KSvn and AnkhSVN
will face the same issues and would want to use the same helper apis.
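
To make the two-step idea concrete, such a helper might be shaped roughly
like the sketch below. It is Python rather than C purely to keep it short,
and every name in it (NodeInfo, metadata_status, folder_glyph, the
text_modified callback) is made up for illustration; none of them are
Subversion APIs. The callback stands in for a
svn_wc_text_modified_p2()-style comparison.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class NodeInfo:
    """Hypothetical per-node metadata, as a cheap info-style query might return it."""
    path: str
    schedule: str = "normal"      # assumed values: "normal", "add", "delete"
    conflicted: bool = False
    props_modified: bool = False


def metadata_status(info: NodeInfo) -> Optional[str]:
    """Return a status that metadata alone can prove, or None if undecided."""
    if info.conflicted:
        return "conflicted"
    if info.schedule != "normal" or info.props_modified:
        return "modified"
    return None


def folder_glyph(nodes: Iterable[NodeInfo],
                 text_modified: Callable[[str], bool]) -> str:
    """Pick a folder glyph, touching file contents only as a last resort."""
    nodes = list(nodes)

    # Layer 1: metadata only. No pristine file is opened here.
    statuses = [s for s in map(metadata_status, nodes) if s is not None]
    if statuses:
        # Early out: the metadata already decides the glyph.
        return "conflicted" if "conflicted" in statuses else "modified"

    # Layer 2: the expensive text comparison, stopping at the first hit.
    for info in nodes:
        if text_modified(info.path):
            return "modified"
    return "normal"
```

The point of the split is that the second loop is only reached for trees
whose metadata is completely clean, so the common "something is scheduled or
conflicted" case never opens a pristine file.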

> I thought about implementing a small cache for that, so that I don't
> have to walk up the tree every time to find an .svn dir.
> But I thought I read something about such a small cache getting
> implemented in the svn library itself so I wanted to ask first - maybe
> there's already an API to use that cache. Or maybe I just remember it
> wrong.

Yes, the wc_db api has a cache for this, but it has two issues that would
make me avoid it in TortoiseSVN:
* It sees every directory below a working copy as part of the working copy.
(So it is just like keeping a cache of the top level databases. Probably
not the answer you were looking for)
* And it keeps an sqlite database handle open for you.
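
For comparison, the kind of small application-side cache you describe could
be as simple as the following sketch. It is only an illustration in Python:
WcRootCache, find_root and invalidate are invented names, and it assumes the
administrative directory is called '.svn'. It has nothing to do with how
wc_db implements its own cache.

```python
import os


class WcRootCache:
    """Hypothetical cache: directory -> nearest ancestor that holds '.svn'."""

    def __init__(self, admin_dir=".svn"):
        self._admin_dir = admin_dir
        self._roots = {}  # absolute path -> root path, or None if unversioned

    def find_root(self, path):
        """Walk up from path until an admin dir or a cached answer is found."""
        current = os.path.abspath(path)
        visited = []
        root = None
        while True:
            if current in self._roots:
                root = self._roots[current]   # cache hit, stop walking
                break
            visited.append(current)
            if os.path.isdir(os.path.join(current, self._admin_dir)):
                root = current
                break
            parent = os.path.dirname(current)
            if parent == current:             # reached the filesystem root
                break
            current = parent
        # Remember the answer for every directory passed on the way up.
        for directory in visited:
            self._roots[directory] = root
        return root

    def invalidate(self):
        """Forget everything, e.g. after a checkout or a deleted working copy."""
        self._roots.clear()
```

Note that this shares the first issue above: every directory below a root
maps to that root, whether it is versioned or not. It does avoid the second
one, since it only stats directories and never opens a database.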

If keeping the sqlite handle open is not an issue for you, I would recommend
keeping it open as long as possible. But with that handle open you can't
just delete a working copy by removing its files.
(I have some ideas on how we might fix that in a Windows specific way on
Vista+ using oplocks, but that will take quite some research and building
our own filesystem-layer for SQLite.)
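
Until something like that exists, an application that keeps handles around
would have to close them itself before deleting a working copy. A rough
sketch of that pattern, using Python's sqlite3 module only for brevity:
HandlePool, get and release are invented names, and this is not how wc_db
manages its handle.

```python
import sqlite3


class HandlePool:
    """Hypothetical pool that keeps one open sqlite handle per database file."""

    def __init__(self):
        self._handles = {}  # database path -> open sqlite3.Connection

    def get(self, db_path):
        """Return the cached handle for db_path, opening it on first use."""
        conn = self._handles.get(db_path)
        if conn is None:
            conn = sqlite3.connect(db_path)
            self._handles[db_path] = conn
        return conn

    def release(self, db_path):
        """Close and forget the handle so the file can be removed afterwards."""
        conn = self._handles.pop(db_path, None)
        if conn is not None:
            conn.close()
```

The caller would invoke release() for a working copy's database right before
removing the working copy's files, and simply get() again if it is needed
later.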

Bert
Received on 2010-08-04 10:38:21 CEST
