[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Effect of indices on SQLite (optimizer) performance

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: Sat, 5 Feb 2011 19:05:52 +0100

Yesterday or IRC, Bert, Philip and I were chatting about our SQLite
perf issues and how Philip's findings in the past suggested that
SQLite wasn't using its indices to optimize our queries.

After searching and discussing its documentation, Philip suggested the
-too obvious- "maybe we have the wrong indices".

So, I went to work with his "fake database generator script" (attached
as "test.py").

The type of query we're seeing problematic performance with looks like
the one below. The essential part is the WHERE clause.

SELECT * FROM nodes WHERE wc_id = 1 AND (local_relpath = 'foo' OR
local_relpath like 'foo%');

We discussed 3 ways to achieve the effect of this query:

 1. The query itself
 2. The query stated as a UNION of two queries
 3. Running the two parts of the UNION manually ourselves.

Ad (1)
This query doesn't perform as we had hoped to get from using a database.

Ad (2)
In the past, UNIONs have been explicitly removed because they were
creating temporary tables (on disk!). However, since then we have
changed our SQLite setup to create temporary tables in memory, so the
option should really be re-evaluated.

Ad (3)
I'd hate to have to use two queries in all places in our source where
we want to run queries like these. As a result, I think this scenario
should be avoided if we can.

So, I've created 'perf.py' to evaluate each of these scenarios,
researching the effect on each of them under the influence of adding
different indices.

This is my finding:

Scenario (1) [an AND combined with a complex OR] doesn't perform well
under any circumstance.

Scenario (2) performs differently, depending on the available indices.

Scenario (3) performs roughly equal to scenario (2).

Scenario (2) takes ~0.27 seconds to evaluate in the unmodified
database. Adding an index on (wc_id, local_relpath) makes the
execution time drop to ~0.000156 seconds!

Seems Philip was right :-) We need to carefully review the indices we
have in our database to support good performance.

Bye,

Erik.
Received on 2011-02-05 19:06:25 CET

This is an archived mail posted to the Subversion Dev mailing list.