[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: SQLite and the LIKE operator

From: Bert Huijben <bert_at_qqmail.nl>
Date: Mon, 16 May 2011 10:04:07 +0200

> -----Original Message-----
> From: Branko Čibej [mailto:brane_at_xbc.nu] On Behalf Of Branko Cibej
> Sent: maandag 16 mei 2011 9:46
> To: dev_at_subversion.apache.org
> Subject: Re: SQLite and the LIKE operator
>
> On 16.05.2011 09:38, Hyrum K Wright wrote:
> > 2011/5/16 Branko Čibej <brane_at_e-reka.si>:
> >> On 16.05.2011 03:13, Hyrum K Wright wrote:
> >>> Several places in wc_db we use the following pattern to select all
> >>> nodes with a common tree ancestor:
> >>> WHERE wc_id = ?1 AND (local_relpath = ?2 OR local_relpath LIKE ?3
> ESCAPE '#')
> >>>
> >>> While this works, there was some concern about whether or not SQLite
> >>> was using the proper indicies when executing this query. By examining
> >>> the output for 'EXPLAIN QUERY PLAN' on some of the relevant SELECT
> >>> statements, I believe it does use the indicies as intended.
> >>>
> >>> However, I stumbled across an alternate implementation which I believe
> >>> has some merit. Instead of the above clause, we could use:
> >>> WHERE wc_id = ?1 AND substr(local_relpath, 1, length(?2)) = ?2
> >>>
> >>> This also avoids a table scan by making use of the indicies, but has
> >>> the advantage of not having to compute a separate parameter for the
> >>> LIKE clause in C. It returns the same results, and has the benefit of
> >>> being a bit more clear to SQLite what we're trying to accomplish. I'm
> >>> tempted to switch our code to using this new format, but wanted some
> >>> comments first. I have not yet run extensive timing or other analysis
> >>> on the performance.
> >>>
> >>> Thoughts?
> >>>
> >>> -Hyrum
> >> Can't be right. I'm assuming the first query works correctly iff:
> >> ?2 = foo
> >> ?3 = foo/%
> >>
> >> and returns 'foo' and all its subtree.
> >>
> >> The second query can't return the same results; if ?2=foo, it'll match
> >> foobar, which is not foo's child; if ?2=foo/, it won't return foo.
> > That's what I get for writing mail at 3am.
> >
> > I believe the following would fix this:
> > WHERE wc_id = ?1 AND (local_relpath = ?2 OR substr(local_relpath, 1,
> > length(?2 + 1)) = ?2 || '/')
> >
> > -Hyrum
>
> That query used to be:
>
> local_relpath=?2 OR local_relpath LIKE ?2 || '/%'
>
> but, for obvious reasons, that was a potential bug since literal % were
> not escaped. Your latest proposal is broken, but I'm sure you'll find
> the bug eventually. :)
>
> Whether substr can be faster than LIKE -- I have no idea.

I think you were talking about the increment problem, but there is another problem. Local_relpath "" would give '' || '/', which is the invalid relpath '/'.

        Bert
Received on 2011-05-16 10:05:29 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.