[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [RFC] Eliminate scan_base_repos() - require repos-id and repos-relpath in all BASE nodes

From: Greg Stein <gstein_at_gmail.com>
Date: Thu, 28 Oct 2010 13:55:47 -0400

On Thu, Oct 28, 2010 at 12:36, Hyrum K. Wright
<hyrum_wright_at_mail.utexas.edu> wrote:
> On Thu, Oct 28, 2010 at 11:29 AM, Mark Phippard <markphip_at_gmail.com> wrote:
>> On Thu, Oct 28, 2010 at 12:23 PM, Hyrum K. Wright
>> <hyrum_wright_at_mail.utexas.edu> wrote:
>>> On Thu, Oct 28, 2010 at 10:09 AM, Julian Foad <julian.foad_at_wandisco.com> wrote:
>>>> Bert and Erik and Philip and I discussed on IRC today the merits, or
>>>> lack of merits, of allowing repos-id/repos-relpath to be elided in the
>>>> NODES table BASE layer (op_depth = 0).
>>>>
>>>> * The data is not currently elided.
>>>>
>>>> * Some queries for locks currently assume the data is not elided.
>>>>
>>>> * Elision could save a bit of DB size, which *might* contribute to a
>>>> little bit of general DB performance.
>>>>
>>>> * The option of elision results in the need for all users of this data
>>>> to call svn_wc__db_scan_base_repos() or the internal version
>>>> scan_upwards_for_repos(), which is an extra maintenance burden and extra
>>>> run-time cost.
>>>>
>>>> We concluded it would be better to require the columns to be always
>>>> filled in.  I'll do this soon if no objections.
>>>
>>> As I understand it, our original intention for elision was to make
>>> switch and relocate go much faster, since only one row would be
>>> updated instead of the entire tree.  If that creates too much of a
>>> burden elsewhere, then yes, we can probably nuke it (and we can
>>> probably write a good enough query to make the relatively uncommon
>>> operations of switch and relocate only negligibly slower).
>>>
>>> This is one use case, however, and isn't necessarily the only one.
>>
>> At the same time, with code written to properly leverage SQL this is
>> the sort of scenario that we should not worry about anymore.  SQLite
>> ought to be able to update all the rows in the DB faster than our old
>> code could walk the tree and open all the entries files, let alone
>> rewrite them.
>
> Correct.  And we still have to do the update crawl on a switch or relocate, no?

Dunno that we would have to do a crawl (unless you are limiting the
depth), but I think you guys are correct: on balance, the concept of
elision is costing us more than it is saving us.

+1 to eliminating the concept, and ensuring that we always write them
out (it may look like it now, but be careful... there could be edge
cases)

I would suggest making the columns NOT NULL in the schema. I don't
think we need to attempt a format bump to adjust older working copies
-- they just won't have the sqlite benefits from the additional schema
details.

Cheers,
-g
Received on 2010-10-28 19:56:22 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.