
Re: some questions/comments on using fsfs for file system

From: stephen meechan <mgvk68_at_dsl.pipex.com>
Date: 2004-08-13 01:12:05 CEST

> I heard tell this morning of a change to cvs2svn to make it use
> bdb-txn-nosync, allegedly speeding up conversion by a factor of four
> or so. So you might be able to get comparable conversion performance
> between the two back ends now.

I've checked out the latest cvs2svn and will try it, as I'm still
experimenting with the repositories to see which settings give the best
overall conversion results.
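
If I understand that change correctly, the speedup comes from enabling
Berkeley DB's DB_TXN_NOSYNC flag, which skips the fsync at each
transaction commit; Subversion exposes the same thing as
svnadmin create --bdb-txn-nosync. It should also be possible to switch
it on for an existing BDB repository through the standard Berkeley DB
configuration file, db/DB_CONFIG, with one line (sensible only for a
disposable conversion repository, since a crash can lose recently
committed transactions):

    set_flags DB_TXN_NOSYNC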

> > Would the NTFS file compression work ok with FSFS? I wouldn't expect
> > it to work safely with BDB and never tried it.
>
> I assume it would work okay, but it might not buy you very much, since
> file contents in an FSFS repository are already compressed. Directory
> contents and metadata are not compressed, so you might see some
> savings, but your performance would probably degrade noticeably.

The benefit might come in the revprops directory, where the files are very
small: the smallest is 50 bytes and the largest is 528. For 13,211 files,
NTFS reports the directory size as 1.69 MB but the disk space used as
51.6 MB. Obviously this is just wasted space, but overall it's still smaller
than the equivalent DB file.
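
Those numbers line up with NTFS's default 4 KB cluster size: each file
occupies at least one whole cluster, and 13,211 x 4,096 bytes is about
51.6 MB (ignoring the case where NTFS keeps very small files resident in
the MFT, which evidently isn't happening here). A rough Python sketch of
my own, with an assumed 4 KB cluster size, that reproduces the
calculation for any directory:

    import os

    CLUSTER = 4096  # assumed NTFS cluster size

    def sizes(directory):
        """Return (apparent, allocated) byte totals for the files in
        one directory, rounding each file up to a whole cluster."""
        apparent = allocated = 0
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                size = os.path.getsize(path)
                apparent += size
                allocated += -(-size // CLUSTER) * CLUSTER  # round up
        return apparent, allocated

    # On the revprops directory above this should report roughly
    # 1.69 MB apparent vs 51.6 MB allocated for 13,211 small files.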

> > I no longer need hotbackup or recover?
>
> Recover is a no-op at the moment (and will remain so unless we
> discover common failure cases we can automatically recover from). You
> still "need" hotbackup, in that a naively copied FSFS repository might
> suffer from (easily repairable) inconsistencies between the "current"
> file and the set of revision files present under db/revs.

I tried the hotcopy; it was slower than the DB hotcopy, but not by much.
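
To make the "current" inconsistency concrete: in the 1.x FSFS layout,
db/current starts with the youngest revision number and each revision is
a file named after its number under db/revs, so a naive copy taken while
commits are landing can end up with a youngest number that doesn't match
the rev files on disk. A rough sketch of a consistency check, assuming
that unsharded layout:

    import os

    def check_fsfs(repos):
        """Compare db/current's youngest revision against the rev
        files actually present (assumes the unsharded FSFS layout)."""
        db = os.path.join(repos, 'db')
        with open(os.path.join(db, 'current')) as f:
            youngest = int(f.read().split()[0])
        missing = [n for n in range(youngest + 1)
                   if not os.path.exists(os.path.join(db, 'revs', str(n)))]
        return youngest, missing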

> > Is it good for 20,000, 30,000, and upwards revisions? Has anyone
> > tried more?
>
> I think it is. If you experience bad performance, let me know. There
> are two theoretical reasons to believe performance might suffer:

OK, I'm going to try it out. I don't think I have enough real data to reach
30,000, but combining all my repositories would at least show me that
there's enough growing room for a few years.

> * The FSFS delta storage algorithm means that as the number of file
> revs goes up, the number of deltas required to check out the head
> also goes up. However, the increase is logarithmic, so I don't
> think performance will become unacceptable even with a very large
> number of file revs.
>
> * FSFS has a big directory full of rev files (and another one full
> of revprop files), one for each revision. Although in my
> experience, most filesystems perform very well with large
> directories even if they don't use advanced techniques like
> B-trees, at least one person complained that his NFS FSFS
> repository performed poorly for non-svn operations like backup.
>
> The second factor may be addressed in 1.2 with an option to split up
> the revs and revprops directories into smaller subdirs.
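
On the first point: as I understand the skip-delta idea, each new file
rev is deltified not against its immediate predecessor but against a rev
chosen so that reconstruction chains stay logarithmic rather than linear.
A toy sketch of one common formulation (delta against the rev number with
its lowest set bit cleared; the real FSFS picking rule may differ in
detail):

    def delta_chain(n):
        """Revs visited to reconstruct file rev n when each rev
        deltas against n with its lowest set bit cleared."""
        chain = [n]
        while n > 0:
            n &= n - 1   # clear the lowest set bit
            chain.append(n)
        return chain

    # delta_chain(30000) visits only 8 revs (popcount(30000) + 1),
    # so chains grow like log2(n) instead of n.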

I've seen the second problem happen on CD: backing up 40,000 TIFF image
files to one directory on a CD worked OK, but browsing that directory in
Explorer would take around 10 minutes to come up. The solution was similar
to what you suggest: copy every 1,000 files into their own subdirectory.
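
For anyone wanting to retrofit that onto an existing flat directory, a
minimal sketch of my own (1,000 files per bucket, numeric file names as
FSFS uses) of the every-1,000-files-per-subdirectory scheme:

    import os, shutil

    BUCKET = 1000  # files per subdirectory

    def shard(directory):
        """Move numerically named files into subdirectories of at
        most BUCKET files each, e.g. 12345 -> 12/12345."""
        for name in os.listdir(directory):
            src = os.path.join(directory, name)
            if os.path.isfile(src) and name.isdigit():
                bucket = os.path.join(directory, str(int(name) // BUCKET))
                os.makedirs(bucket, exist_ok=True)
                shutil.move(src, os.path.join(bucket, name))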

