[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: SQLite and the LIKE operator

From: Branko Čibej <brane_at_e-reka.si>
Date: Mon, 16 May 2011 09:45:54 +0200

On 16.05.2011 09:38, Hyrum K Wright wrote:
> 2011/5/16 Branko Čibej <brane_at_e-reka.si>:
>> On 16.05.2011 03:13, Hyrum K Wright wrote:
>>> Several places in wc_db we use the following pattern to select all
>>> nodes with a common tree ancestor:
>>> WHERE wc_id = ?1 AND (local_relpath = ?2 OR local_relpath LIKE ?3 ESCAPE '#')
>>>
>>> While this works, there was some concern about whether or not SQLite
>>> was using the proper indicies when executing this query. By examining
>>> the output for 'EXPLAIN QUERY PLAN' on some of the relevant SELECT
>>> statements, I believe it does use the indicies as intended.
>>>
>>> However, I stumbled across an alternate implementation which I believe
>>> has some merit. Instead of the above clause, we could use:
>>> WHERE wc_id = ?1 AND substr(local_relpath, 1, length(?2)) = ?2
>>>
>>> This also avoids a table scan by making use of the indicies, but has
>>> the advantage of not having to compute a separate parameter for the
>>> LIKE clause in C. It returns the same results, and has the benefit of
>>> being a bit more clear to SQLite what we're trying to accomplish. I'm
>>> tempted to switch our code to using this new format, but wanted some
>>> comments first. I have not yet run extensive timing or other analysis
>>> on the performance.
>>>
>>> Thoughts?
>>>
>>> -Hyrum
>> Can't be right. I'm assuming the first query works correctly iff:
>> ?2 = foo
>> ?3 = foo/%
>>
>> and returns 'foo' and all its subtree.
>>
>> The second query can't return the same results; if ?2=foo, it'll match
>> foobar, which is not foo's child; if ?2=foo/, it won't return foo.
> That's what I get for writing mail at 3am.
>
> I believe the following would fix this:
> WHERE wc_id = ?1 AND (local_relpath = ?2 OR substr(local_relpath, 1,
> length(?2 + 1)) = ?2 || '/')
>
> -Hyrum

That query used to be:

    local_relpath=?2 OR local_relpath LIKE ?2 || '/%'

but, for obvious reasons, that was a potential bug since literal % were
not escaped. Your latest proposal is broken, but I'm sure you'll find
the bug eventually. :)

Whether substr can be faster than LIKE -- I have no idea.

-- Brane
Received on 2011-05-16 09:46:29 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.