[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Effect of indices on SQLite (optimizer) performance

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: Sat, 5 Feb 2011 19:18:05 +0100

Now attached as text files (to be renamed to .py) to prevent the
mailer software from dropping them...

Bye,

Erik.

On Sat, Feb 5, 2011 at 7:05 PM, Erik Huelsmann <ehuels_at_gmail.com> wrote:
> Yesterday or IRC, Bert, Philip and I were chatting about our SQLite
> perf issues and how Philip's findings in the past suggested that
> SQLite wasn't using its indices to optimize our queries.
>
> After searching and discussing its documentation, Philip suggested the
> -too obvious- "maybe we have the wrong indices".
>
> So, I went to work with his "fake database generator script" (attached
> as "test.py").
>
>
> The type of query we're seeing problematic performance with looks like
> the one below. The essential part is the WHERE clause.
>
> SELECT * FROM nodes WHERE wc_id = 1 AND (local_relpath = 'foo' OR
> local_relpath like 'foo%');
>
>
> We discussed 3 ways to achieve the effect of this query:
>
>  1. The query itself
>  2. The query stated as a UNION of two queries
>  3. Running the two parts of the UNION manually ourselves.
>
> Ad (1)
> This query doesn't perform as we had hoped to get from using a database.
>
> Ad (2)
> In the past, UNIONs have been explicitly removed because they were
> creating temporary tables (on disk!). However, since then we have
> changed our SQLite setup to create temporary tables in memory, so the
> option should really be re-evaluated.
>
> Ad (3)
> I'd hate to have to use two queries in all places in our source where
> we want to run queries like these. As a result, I think this scenario
> should be avoided if we can.
>
>
> So, I've created 'perf.py' to evaluate each of these scenarios,
> researching the effect on each of them under the influence of adding
> different indices.
>
> This is my finding:
>
> Scenario (1) [an AND combined with a complex OR] doesn't perform well
> under any circumstance.
>
> Scenario (2) performs differently, depending on the available indices.
>
> Scenario (3) performs roughly equal to scenario (2).
>
>
> Scenario (2) takes ~0.27 seconds to evaluate in the unmodified
> database. Adding an index on (wc_id, local_relpath) makes the
> execution time drop to ~0.000156 seconds!
>
>
> Seems Philip was right :-) We need to carefully review the indices we
> have in our database to support good performance.
>
>
> Bye,
>
>
> Erik.
>

Received on 2011-02-05 19:18:48 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.