Re: [PATCH]: Speed up deletion of multiple files

From: Hyrum K Wright <hyrum.wright_at_wandisco.com>
Date: Thu, 24 May 2012 14:45:18 -0500

On Thu, May 24, 2012 at 4:38 AM, Mat Booth <mat.booth_at_wandisco.com> wrote:
> On 24 May 2012 00:47, Bert Huijben <bert_at_qqmail.nl> wrote:
>>
>> > -----Original Message-----
>> > From: Paul Burba [mailto:ptburba_at_gmail.com]
>> > Sent: Thursday 24 May 2012 0:43
>> > To: Subversion Development
>> > Subject: [PATCH]: Speed up deletion of multiple files
>> >
>> > On one of CollabNet's forums a user reported that a single delete command
>> > of 2000+ WC files took well over three hours to complete with 1.7.5
>> > (see
>> > http://subversion.open.collab.net/ds/viewMessage.do?dsForumId=4&dsMessageId=456214).
>> >
>> > I'm able to replicate similar behavior with a partial checkout of
>> > ^/subversion/tags
>>
>> Which Sqlite version did you use?
>>
>> If your Sqlite is below 3.7.9 I would recommend updating Sqlite first.
>> (What results do you see from the wc-queries-test result on trunk?)
>>
>> This query appears to use the indexes for me on 3.7.12 without the patch and, judging by the buildbots, on 3.7.9 as well.
>> (Which doesn't say that it can't be improved further... But if the optimizer already handles this we don't have to write dirty queries)
>>
>> I'll check my results tomorrow, but this sounds exactly like stefan2's problem last Saturday, which disappeared when he upgraded his Sqlite. (Before the upgrade it took an hour; afterwards, deleting 16k files took just seconds.)
>>
>> My testcase of deleting 2000 files took about 12 seconds last Saturday. After that I haven't done measurements on that case, but I expect many improvements in other places.
>> (And I ran the tests on a very fast network drive instead of a local harddisk)
>>
>>        Bert
>>
>> >
>> > My test WC:
>> >
>> >   WC Size: 437 MB 21,012 Files, 2,717 Folders
>> >   wc.db Size: 18,108,000 bytes
>> >   Deletion Targets: 2642 files
>> >
>> > Using 1.7.5 this takes almost 27 minutes on my machine:
>> >
>> >   C:\SVN\sandbox\subversion-tags>timethis svn delete -q --targets del-target.2463.txt
>> >
>> >   TimeThis :  Command Line :  svn delete -q --targets del-target.2463.txt
>> >   TimeThis :    Start Time :  Fri May 18 11:27:04 2012
>> >
>> >
>> >   TimeThis :  Command Line :  svn delete -q --targets del-target.2463.txt
>> >   TimeThis :    Start Time :  Fri May 18 11:27:04 2012
>> >   TimeThis :      End Time :  Fri May 18 11:53:58 2012
>> >   TimeThis :  Elapsed Time :  00:26:53.758
>> >
>> > trunk_at_1341851 is significantly faster, taking only 16 minutes:
>> >
>> >   C:\SVN\sandbox\subversion-tags>timethis svn delete -q --targets del-target.2463.txt
>> >
>> >   TimeThis :  Command Line :  svn delete -q --targets del-target.2463.txt
>> >   TimeThis :    Start Time :  Wed May 23 10:56:41 2012
>> >
>> >
>> >   TimeThis :  Command Line :  svn delete -q --targets del-target.2463.txt
>> >   TimeThis :    Start Time :  Wed May 23 10:56:41 2012
>> >   TimeThis :      End Time :  Wed May 23 11:13:17 2012
>> >   TimeThis :  Elapsed Time :  00:16:35.873
>> >
>> > Using optimizations similar to what Bert used in r1341848 and creating
>> > a new , the attached patch cuts this down to just over 4 seconds:
>> >
>> >   C:\SVN\sandbox\subversion-tags>timethis svn delete -q --targets del-target.2463.txt
>> >
>> >   TimeThis :  Command Line :  svn delete -q --targets del-target.2463.txt
>> >   TimeThis :    Start Time :  Wed May 23 17:41:12 2012
>> >
>> >
>> >   TimeThis :  Command Line :  svn delete -q --targets del-target.2463.txt
>> >   TimeThis :    Start Time :  Wed May 23 17:41:12 2012
>> >   TimeThis :      End Time :  Wed May 23 17:41:16 2012
>> >   TimeThis :  Elapsed Time :  00:00:04.116
>> >
>> > ...So, WCNG gurus, does this look ok?
>> >
>> > [[[
>> > Speed up WC deletions.
>> >
>> > * subversion/libsvn_wc/wc-queries.sql
>> >   (STMT_DELETE_NODES_ABOVE_DEPTH_RECURSIVE,
>> >    STMT_INSERT_DELETE_FROM_NODE_RECURSIVE): Make the OR operation the
>> >    outer operation by duplicating some cheap tests.
>> >
>> >   (STMT_INSERT_DELETE_LIST_RECURSIVE,
>> >    STMT_INSERT_DELETE_LIST): Split old STMT_INSERT_DELETE_LIST into two
>> >    new versions, one recursive, and one not.
>> >
>> > * subversion/libsvn_wc/wc_db.c
>> >   (delete_node): Use the faster non-recursive query when operating on a
>> >    file.
>> > ]]]
>> >
>> > --
>> > Paul T. Burba
>> > CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
>> > Skype: ptburba
>>
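
For readers unfamiliar with the "make the OR operation the outer operation" idea in the log message above, here is a minimal sketch in Python's sqlite3 module. The table layout, the two query strings and the helper names (build_db, descendants) are illustrative assumptions, not the actual statements from wc-queries.sql. It only shows that the rewritten form (the cheap wc_id test duplicated into each OR branch) selects the same rows; whether the planner picks a better index for it depends on the SQLite version.

```python
import sqlite3

# Hypothetical, simplified stand-in for the working copy's node table.
def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes ("
        " wc_id INTEGER NOT NULL,"
        " local_relpath TEXT NOT NULL,"
        " op_depth INTEGER NOT NULL,"
        " PRIMARY KEY (wc_id, local_relpath, op_depth))"
    )
    rows = [(1, "A", 0), (1, "A/f1", 0), (1, "A/B/f2", 0), (1, "AB", 0), (2, "A", 0)]
    conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", rows)
    return conn

# OR nested inside the AND: one shared wc_id test.
INNER_OR = (
    "SELECT local_relpath FROM nodes "
    "WHERE wc_id = :wc_id "
    "  AND (local_relpath = :path OR local_relpath LIKE :pattern) "
    "ORDER BY local_relpath"
)

# OR as the outer operation: the cheap wc_id test is repeated in each branch.
OUTER_OR = (
    "SELECT local_relpath FROM nodes "
    "WHERE (wc_id = :wc_id AND local_relpath = :path) "
    "   OR (wc_id = :wc_id AND local_relpath LIKE :pattern) "
    "ORDER BY local_relpath"
)

def descendants(conn, sql, wc_id, path):
    """Return the path itself plus everything below it, using the given query."""
    params = {"wc_id": wc_id, "path": path, "pattern": path + "/%"}
    return [row[0] for row in conn.execute(sql, params)]

if __name__ == "__main__":
    conn = build_db()
    params = {"wc_id": 1, "path": "A", "pattern": "A/%"}
    for name, sql in (("inner OR", INNER_OR), ("outer OR", OUTER_OR)):
        print(name, descendants(conn, sql, 1, "A"))
        # The plan text differs between SQLite versions, so it is only printed.
        for step in conn.execute("EXPLAIN QUERY PLAN " + sql, params):
            print("   ", step)
```

Comparing the two EXPLAIN QUERY PLAN listings on an old and a new SQLite is one way to see whether the optimizer already handles the nested form, which is the question Bert raises above.
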
>
> And yet in Subversion's configure.ac there are these lines:
>
> SQLITE_MINIMUM_VER="3.6.18"
> SQLITE_RECOMMENDED_VER="3.7.6.3"
>
> Maybe it would be prudent to update at least the recommended version
> if there are known problems with using versions of SQLite earlier than
> 3.7.9?

We do plan on bumping the minimum to at least 3.7.12 (per Bert's
research above) for the 1.8.x series. I don't think it would be
prudent to require such before the 1.8 release, but we can suggest it
to users, since they can upgrade sqlite independent of Subversion.
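
As a rough illustration of that kind of suggestion, the sketch below compares a SQLite version string against the thresholds mentioned in this thread. The function names and the three result labels are made up for the example, and sqlite3.sqlite_version reports the library Python is linked against, which is not necessarily the one your svn binary uses; substitute whatever version string applies to your installation.

```python
import sqlite3

# Thresholds taken from this thread; the names are only for this sketch.
MINIMUM = (3, 6, 18)    # SQLITE_MINIMUM_VER in the quoted configure.ac
SUGGESTED = (3, 7, 9)   # below this, Bert recommends updating SQLite first

def parse_version(text):
    """Turn '3.7.6.3' into (3, 7, 6, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def classify(version_text):
    """Label a SQLite version as 'unsupported', 'upgrade suggested' or 'ok'."""
    version = parse_version(version_text)
    if version < MINIMUM:
        return "unsupported"
    if version < SUGGESTED:
        return "upgrade suggested"
    return "ok"

if __name__ == "__main__":
    # Assumption: Python's own SQLite stands in for the one svn is built with.
    print(sqlite3.sqlite_version, classify(sqlite3.sqlite_version))
```

Comparing tuples of integers avoids the string-comparison trap where "3.7.12" would sort before "3.7.9".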

I almost committed the change to configure.ac to require sqlite 3.7.12
just now, but then realized that the bots would probably fail, as
3.7.12 is relatively new and it's likely the bots haven't been updated
to use it. (But then I realized that it probably won't happen until
we start breaking the bots anyway, so I'm tempted just to go ahead and
do it.)

-Hyrum

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/
Received on 2012-05-24 21:45:50 CEST

This is an archived mail posted to the Subversion Dev mailing list.