
Re: wc_db performance (was: wc_db API discussion)

From: Paul Burba <ptburba_at_gmail.com>
Date: Thu, 17 Mar 2011 15:40:41 -0400

On Tue, Mar 15, 2011 at 6:22 PM, Paul Burba <ptburba_at_gmail.com> wrote:
> On Mon, Mar 14, 2011 at 2:36 PM, Hyrum K Wright <hyrum_at_hyrumwright.org> wrote:
>> On Sat, Mar 12, 2011 at 6:47 AM, Stefan Sperling <stsp_at_elego.de> wrote:
>>> On Fri, Mar 11, 2011 at 10:43:46PM -0500, Greg Stein wrote:
>>>> 2011/3/11 Branko Čibej <brane_at_e-reka.si>:
>>>> >...
>>>> > For the second task, I think the first order of business is to change
>>>> > the wc-db tree crawler to do one query instead of zillions, or at least,
>>>> > where several queries are required, to do them all in one transaction.
>>>>
>>>> stsp has been working on this recently. Killing the node walker, and
>>>> moving to table scans.
>>>
>>> Yes. So far, I've been working in the revision status code.
>>> There are two problems left to fix before I'll move on to the next task:
>>>
>>>  - There are API layering issues (wc_db.c calls into node.c).
>>>   This is related to the API discussions in the other thread
>>>   so I'll follow up there.
>>
>> Before reading this thread, I saw the call into node.c, and have
>> subsequently removed it.
>>
>>>  - The revision status code issues about 5 separate queries,
>>>   which aren't combined via a transaction and don't use temporary tables.
>>>   This is no worse than the previous code using the node walker,
>>>   obviously :)  But I'll look at fixing this so that the results
>>>   returned correspond to the state of the DB as of the time the
>>>   svn_wc__db_revision_status() call was made.
>>
>> I wrapped this API in a txn in r1081510.
>>
>>> For others who want to jump in and help, here is a list of places
>>> where the node walker is still being used. I'm not sure if we can
>>> eliminate it everywhere before release, but each of these should
>>> be looked at to see whether we can use an alternative approach to
>>> increase performance:
>>>
>>>  subversion/libsvn_client/changelist.c
>>>  subversion/libsvn_client/commit_util.c
>>>  subversion/libsvn_client/info.c
>>>  subversion/libsvn_client/merge.c
>>>  subversion/libsvn_client/mergeinfo.c
>>>  subversion/libsvn_client/prop_commands.c
>>>   (This should be propget and propset. Proplist is already using
>>>    queries involving temporary tables. Rewriting propget on top
>>>    of the proplist code would be easy.
>
> Is anyone working on propget?  I'll take a stab at that if not.

Done, see r1082658.

Not surprisingly, this makes a big difference in the performance of
svn propget -R. What is possibly a little(?) surprising is that it's
faster than 1.6 too:

Using a checkout of
https://svn.apache.org/repos/asf/subversion/branches@1082531 at depth
immediates, with these branches then set to infinite depth (1.0.x 1.1.x
1.2.x 1.3.x 1.4.x 1.5.x 1.6.x), gives us a reasonably(?) large WC:

  272 MB
  14,044 files
  1,428 directories

svn pg svn:mergeinfo -Rv on the root:

Version                   Elapsed Time  Peak Working Set Memory
------------------------  ------------  -----------------------
1.6.x@1082537             00:00:03.859  14,400 K
trunk@1082591             00:01:15.979  12,004 K
trunk@1082591 (PATCHED)   00:00:00.581  15,448 K
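
To illustrate why moving from the node walker to a table scan can make
this kind of difference, here is a minimal sketch in Python with SQLite.
It is not the wc_db code: the "nodes" table, its columns, and both
function names are hypothetical, and the real schema and queries differ.
It only contrasts issuing queries per node with issuing a single query
for the whole subtree.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table standing in for a working copy DB.

    The schema is a simplified, hypothetical stand-in, not the real wc.db.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE nodes ("
        " local_relpath TEXT PRIMARY KEY,"
        " parent_relpath TEXT,"
        " mergeinfo TEXT)"
    )
    rows = [
        ("", None, "/trunk:1-10"),
        ("A", "", None),
        ("A/B", "A", "/branches/x:5-7"),
        ("A/B/f.c", "A/B", None),
        ("A/g.c", "A", "/trunk/A/g.c:3"),
    ]
    db.executemany("INSERT INTO nodes VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def propget_walk(db, relpath=""):
    """Node-walker style: two queries for every node visited."""
    queries = 0
    found = {}
    stack = [relpath]
    while stack:
        path = stack.pop()
        row = db.execute(
            "SELECT mergeinfo FROM nodes WHERE local_relpath = ?", (path,)
        ).fetchone()
        queries += 1
        if row and row[0] is not None:
            found[path] = row[0]
        children = db.execute(
            "SELECT local_relpath FROM nodes WHERE parent_relpath = ?", (path,)
        ).fetchall()
        queries += 1
        stack.extend(child[0] for child in children)
    return found, queries


def propget_scan(db, relpath=""):
    """Table-scan style: a single query covering the whole subtree."""
    if relpath == "":
        where, args = "1", ()
    else:
        # Descendants sort between "<path>/" and "<path>0" ('0' follows '/').
        where = "(local_relpath = ? OR (local_relpath > ? AND local_relpath < ?))"
        args = (relpath, relpath + "/", relpath + "0")
    sql = (
        "SELECT local_relpath, mergeinfo FROM nodes "
        f"WHERE {where} AND mergeinfo IS NOT NULL"
    )
    return dict(db.execute(sql, args).fetchall()), 1


if __name__ == "__main__":
    db = build_demo_db()
    print(propget_walk(db))
    print(propget_scan(db))
```

Both functions return the same path-to-mergeinfo mapping, but the walk
issues a number of queries proportional to the number of nodes, while
the scan issues one regardless of the size of the tree.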

Paul

> It
> would be useful for dealing with merge.c:merge_reintegrate_locked's
> use of svn_wc__node_walk_children().
>
> Paul
>
>>> Propset needs more work.)
>>>  subversion/libsvn_client/ra.c
>>>  subversion/libsvn_wc/update_editor.c
>>
>> I'll take a gander at some of these, too.  (But I'm not entirely sure
>> what I'm gandering at or for...)
>>
>> -Hyrum
>>
>
Received on 2011-03-17 20:41:41 CET

This is an archived mail posted to the Subversion Dev mailing list.