
RE: SQLite and callbacks

From: Bert Huijben <bert_at_qqmail.nl>
Date: Tue, 8 Feb 2011 16:34:52 +0100

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp_at_elego.de]
> Sent: dinsdag 8 februari 2011 15:18
> To: Bert Huijben
> Cc: 'Branko Čibej'; dev_at_subversion.apache.org
> Subject: Re: SQLite and callbacks
>
> On Tue, Feb 08, 2011 at 10:50:46AM +0100, Bert Huijben wrote:
> >
> >
> > > -----Original Message-----
> > > From: Branko Čibej [mailto:brane_at_xbc.nu] On Behalf Of Branko Cibej
> > > Sent: dinsdag 8 februari 2011 4:39
> > > To: dev_at_subversion.apache.org
> > > Subject: Re: SQLite and callbacks
> > >
> > > On 07.02.2011 21:51, Stefan Sperling wrote:
> > > >> A lot of wc databases out there will be
> > > >> so small that the user will hardly notice the memory increase.
> > > > > All we'd be doing is allowing sqlite to flush data to disk if needed.
> > > > > Even with a temporary table backed by a file, most operations happen in
> > > > > memory. Either in buffers managed by sqlite or the operating system's
> > > > > buffer cache (until sqlite does an fsync). So for small databases it
> > > > > shouldn't make a difference.
> >
> > On NTFS just creating a new 0 byte tempfile requires an fsync (and probably a few in a row), so using the in memory buffers instead of a tempfile improved our SQLite performance significantly (and not only on Windows). Assuming using tempfiles was cheap was one of the major slowdowns of WC-1.0 on Windows.
> >
> >
> > Please don't suggest 'just making it file backed' as an easy feature if you only measured it on a non journaling filesystem.
> >
> >
> > With our current query scheme on 'common operations', switching to file based temporary storage will require rewriting almost every sql operation and how we use it to have release acceptable performance on Windows. A simple 'OR' or using a subquery may introduce a temptable.
> >
> > In this thread we are looking at property storage... which was probably always slower than it is today.
> >
> > Yes, we can improve that, but please don't suggest introducing a 30% slowdown on the more common code paths like those used in 'svn status' or 'svn update' to improve reading many properties, without measuring the consequences.
>
> I don't think that status will be released in its current form.
> It does way too many queries.
> We'll need to look into optimizing it using queries with temporary
> tables, like Branko suggested for proplist.

There is nobody actively working on status and there are no open issues on status to block branching...

So if status is still a show-stopper we should focus on that instead of trying to improve 'svn proplist -R', which is not a very common operation in svn. (Merge works per node, so it doesn't benefit from the performance enhancements :( )

As things are today it looks like status will be released in its current form for 1.7.
I don't see a problem with the current status performance from my perspective, unless somebody decides to just disable in-memory temptables, without profiling, to fix some other issue in a different place.

I really wish we could just decide this per query as most current queries only use temptables for 0-5 rows, but I don't see that as an easy option. Maybe one of the SQLite devs has a solution here?
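One way to find out which queries involve a temporary structure at all is to ask SQLite for the query plan. A minimal Python sketch, assuming a made-up `nodes` table (not the real wc.db schema) and a hypothetical helper `uses_temp_btree`; the exact plan wording can differ between SQLite versions:

```python
import sqlite3

def uses_temp_btree(conn, sql, params=()):
    """Return True if SQLite's plan for `sql` mentions a temporary b-tree."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail text is the last column of each plan row.
    return any("TEMP B-TREE" in str(row[-1]) for row in rows)

conn = sqlite3.connect(":memory:")
# Illustrative table only; column names are assumptions.
conn.execute(
    "CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY, kind TEXT, size INTEGER)"
)

# Sorting on an unindexed column needs a transient b-tree.
sorted_needs_temp = uses_temp_btree(conn, "SELECT * FROM nodes ORDER BY size")
# A primary-key lookup is answered straight from the index.
lookup_needs_temp = uses_temp_btree(
    conn, "SELECT * FROM nodes WHERE local_relpath = ?", ("a",)
)
```

This only tells us that a temporary structure is involved, not how many rows end up in it, so it is a starting point for profiling rather than an answer.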

An even better solution would be for SQLite to try to do things completely in memory and only *create* a tempfile when needed. (It seems it now creates the file anyway but doesn't use it until needed, which introduces a heavy performance penalty on NTFS, but not on extX filesystems.)
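For reference, the knob under discussion is SQLite's `PRAGMA temp_store`, which is set per connection and not per query. A minimal Python sketch, assuming a hypothetical `open_db` helper and a made-up `targets` temp table; a library built with a compile-time setting that forces one mode may ignore the pragma:

```python
import sqlite3

# Values reported back by "PRAGMA temp_store".
TEMP_STORE_CODES = {"DEFAULT": 0, "FILE": 1, "MEMORY": 2}

def open_db(path, temp_store="MEMORY"):
    """Open a connection and pick where temporary tables and indices live."""
    if temp_store not in TEMP_STORE_CODES:
        raise ValueError("unknown temp_store mode: %r" % (temp_store,))
    conn = sqlite3.connect(path)
    # Set this before creating temp tables: changing it later drops them.
    conn.execute("PRAGMA temp_store = " + temp_store)
    return conn

conn = open_db(":memory:", "MEMORY")
conn.execute("CREATE TEMP TABLE targets (local_relpath TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO targets VALUES (?)", [("a",), ("b",), ("c",)])

mode = conn.execute("PRAGMA temp_store").fetchone()[0]
count = conn.execute("SELECT count(*) FROM targets").fetchone()[0]
```

Because the setting applies to the whole connection, choosing it per query would mean switching the pragma between statements, which throws away any existing temp tables.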

> Also, there is no need to use the same default on all platforms.
> We can use memory-backed temp-tables on windows and file-backed
> temp-tables on unix if that's what we need.

Per-platform (or possibly per-filesystem) defaults also need profiling data to make the right decisions. (RAM is not cheap in AnkhSVN, as many users fill up their 4 GB in Visual Studio, but neither are tempfiles.)
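A per-platform default along the lines Stefan suggests could look roughly like this. It is only a sketch: the helper names are hypothetical and the Windows/other split is an assumption that would have to be confirmed by profiling:

```python
import sqlite3
import sys

def default_temp_store(platform=None):
    """Pick a temp_store mode per platform (assumed policy, not measured)."""
    platform = sys.platform if platform is None else platform
    # Assumption: memory-backed on Windows, file-backed everywhere else.
    return "MEMORY" if platform.startswith("win") else "FILE"

def open_with_platform_default(path):
    """Open a connection using the platform's default temp_store mode."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA temp_store = " + default_temp_store())
    return conn

windows_mode = default_temp_store("win32")
linux_mode = default_temp_store("linux")
```

A per-filesystem policy would need more than `sys.platform`, since the same operating system can host several filesystems.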

I don't have a problem with choosing file-backed temptables over memory-backed ones, but I do have an issue with doing it for theoretical reasons which are only tested under very specific circumstances. And a recursive proplist is not a very common and/or performance-critical Subversion operation from my perspective.
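The kind of measurement I am asking for can start very small. A rough Python sketch, assuming a hypothetical `time_temp_table` helper and a made-up `scratch` table; real numbers would have to come from actual working copies on the filesystems in question:

```python
import sqlite3
import time

def time_temp_table(temp_store, rows=1000):
    """Time creating, filling, reading and dropping one temp table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA temp_store = " + temp_store)
    start = time.perf_counter()
    conn.execute("CREATE TEMP TABLE scratch (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO scratch (name) VALUES (?)",
        (("node-%d" % i,) for i in range(rows)),
    )
    total = conn.execute("SELECT count(*) FROM scratch").fetchone()[0]
    conn.execute("DROP TABLE scratch")
    elapsed = time.perf_counter() - start
    conn.close()
    return total, elapsed

# Maps mode -> (row count, seconds); run it on each target filesystem.
timings = {mode: time_temp_table(mode) for mode in ("MEMORY", "FILE")}
```

A single run like this is noisy, so it should be repeated and compared per platform before drawing any conclusion about defaults.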

        Bert
Received on 2011-02-08 16:35:42 CET

This is an archived mail posted to the Subversion Dev mailing list.
