
'svnadmin load' & database sync options

From: Oliver Jowett <oliver_at_opencloud.com>
Date: 2004-07-17 03:32:14 CEST

Hi all,

I'm currently looking at converting our large (~350 MB) CVS repository to
Subversion, learning Subversion along the way.

cvs2svn happily produces a dumpfile containing ~14000 transactions:

> -rw-r--r-- 1 oliver ocstaff 708319348 Jul 16 18:53 cvs2svn-dump

Loading it via 'svnadmin load' is hideously slow, taking almost 10 hours:

> oliver@cyclone:~/svn-test$ svnadmin create repo-sync
> oliver@cyclone:~/svn-test$ time svnadmin load -q repo-sync <cvs2svn-dump
>
> real 561m59.668s
> user 14m1.379s
> sys 2m4.799s

Ok, so I'll use --bdb-txn-nosync:

> oliver@cyclone:~/svn-test$ svnadmin create --bdb-txn-nosync repo-no-sync
> oliver@cyclone:~/svn-test$ time svnadmin load -q repo-no-sync <cvs2svn-dump
>
> real 146m49.972s
> user 13m3.273s
> sys 1m36.818s

Better but still very disk-bound. Some digging with lsof/strace showed
that some fsync() calls are still done on the DB log files.
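
For anyone who wants to repeat that check, here is a minimal sketch of counting sync calls with strace. It is not necessarily the exact invocation used above. The helper name count_syncs, the TRACE_FILE variable and the repository name repo-trace are made up for illustration.

```shell
# Hypothetical helper: run a command under strace, log only its sync calls,
# and print how many fsync()/fdatasync() calls were started.
count_syncs() {
  trace_file="${TRACE_FILE:-sync.trace}"   # assumed output path
  # -f follows child processes; -o writes the trace to a file.
  strace -f -e trace=fsync,fdatasync -o "$trace_file" "$@" >&2
  # Count call starts only; "<... fsync resumed>" lines lack "fsync(".
  grep -c -E 'f(data)?sync\(' "$trace_file"
}

# Example (repo-trace is a placeholder repository name):
#   count_syncs svnadmin load -q repo-trace <cvs2svn-dump
```

The trace lines show only file descriptor numbers. Running 'lsof -p <pid>' against the svnadmin process while the load is in progress should show which descriptors belong to the DB log files.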

I experimented a bit with other DB options and ended up with this:

> oliver@cyclone:~/svn-test$ svnadmin create --bdb-txn-nosync repo-no-log
> oliver@cyclone:~/svn-test$ echo "set_flags DB_TXN_NOT_DURABLE" >>repo-no-log/db/DB_CONFIG
> oliver@cyclone:~/svn-test$ svnadmin recover repo-no-log
> Please wait; recovering the repository may take some time...
>
> Recovery completed.
> The latest repos revision is 0.
> oliver@cyclone:~/svn-test$ time svnadmin load -q repo-no-log <cvs2svn-dump
>
> real 26m40.620s
> user 12m40.711s
> sys 1m9.318s

That's more like what I originally expected!
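
To put the three runs side by side, the following sketch converts the 'time' values quoted above into seconds and prints the wall-clock ratios and the share of wall-clock time spent on the CPU. The helper names (to_seconds, ratio, cpu_share) are made up for illustration. Only the timings come from the runs above.

```shell
# Convert a bash `time` value such as "561m59.668s" to seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two wall-clock times, to one decimal place.
ratio() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", a / b }'
}

# Percentage of wall-clock time spent on the CPU: (user + sys) / real.
cpu_share() {
  awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" \
      -v s="$(to_seconds "$3")" 'BEGIN { printf "%.0f\n", 100 * (u + s) / r }'
}

sync=561m59.668s      # default settings
nosync=146m49.972s    # --bdb-txn-nosync
nolog=26m40.620s      # --bdb-txn-nosync plus DB_TXN_NOT_DURABLE

echo "nosync vs default: $(ratio "$sync" "$nosync")x"
echo "no-log vs default: $(ratio "$sync" "$nolog")x"
echo "no-log vs nosync:  $(ratio "$nosync" "$nolog")x"
echo "CPU share, default: $(cpu_share "$sync" 14m1.379s 2m4.799s)%"
echo "CPU share, nosync:  $(cpu_share "$nosync" 13m3.273s 1m36.818s)%"
echo "CPU share, no-log:  $(cpu_share "$nolog" 12m40.711s 1m9.318s)%"
```

CPU time is about the same in all three runs (roughly 14 to 16 minutes of user plus sys), so nearly all of the difference in wall-clock time is time spent waiting on the disk.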

The system these all ran on (cyclone) is a dual Athlon/MP 2800+ with 2 GB
of RAM. The OS is Debian stable with a 2.6.5 Linux kernel, and Subversion
is 1.0.5 as packaged in Debian unstable:

> ||/ Name           Version        Description
> +++-==============-==============-============================================
> ii  subversion     1.0.5-1        Advanced version control system (aka. svn)
> ii  libdb4.2       4.2.52-16      Berkeley v4.2 Database Libraries [runtime]

The Subversion repositories are on an ext3 filesystem on a commodity IDE
disk with the disk's write-caching disabled.

So, some questions:

1) Is using DB_TXN_NOT_DURABLE during the initial load a sane thing to
do? I don't care about recovery from failures during the load at all --
I'd just restart from scratch if something did go wrong.
2) Is it normal for fsync() to still be called when --bdb-txn-nosync is in use?
3) Is an option to use DB_TXN_NOT_DURABLE for the duration of a
'svnadmin load' a good idea?

-O

Received on Sat Jul 17 03:32:39 2004

This is an archived mail posted to the Subversion Users mailing list.
