
Re: [RFC] FSFS filesystem options (long, sorry)

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2007-03-05 08:55:47 CET

On Mon, Mar 05, 2007 at 08:27:31AM +0100, Ph. Marek wrote:
> > Thoughts?
> Seems like a very useful feature.
> Having that many files in a single directory is *bad* - even if your
> filesystem allows it.
> Ever did a "ls -la" and got more than 5000 lines? How do you look at that?

Well, I was really looking for comments about the option concept, but
with regard to the number of files in a directory: if the filesystem
supports it _well_, it really is the most efficient option. Most don't, alas.

(FSFS doesn't ever do a readdir(), so it's not quite as bad as you make
out, but the large number of files doesn't help lookups generally).

> BTW: If you do some changes in FSVS, how about issue 2286
> (http://subversion.tigris.org/issues/show_bug.cgi?id=2286) "Identical files
> should share storage space in repository"? Pretty please :-) ?
>

Yes, that's something else I've looked at. This would be an especially
good idea for feature branches, with their frequent merge-from-trunk
patterns, and it'd also help increase cache locality.

It's not easy, though: you need to determine the delta base for a file,
accept the new file and write out a delta (just in case it's unique),
then quickly look up any matching representations and ditch the delta
you've just written out. Oh, and when you commit, somehow update the
MD5 index without disturbing other readers.

I'm not saying it's impossible, but it's pretty hard.
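
As a rough illustration of the steps above (write speculatively, look up a
match, ditch the duplicate, then update the index), here is a minimal
in-memory sketch. It is not FSFS code: the RepStore class, its fields and
its add() method are hypothetical names made up for this example, it stores
whole file contents rather than deltas, and it sidesteps the hard part,
which is updating the MD5 index without disturbing concurrent readers.

```python
import hashlib


class RepStore:
    """Toy stand-in for a representation store (hypothetical, not FSFS)."""

    def __init__(self):
        self.reps = {}       # rep_id -> stored bytes (full text, no deltas)
        self.md5_index = {}  # MD5 hex digest -> rep_id of the stored copy
        self.next_id = 0

    def add(self, content: bytes) -> int:
        """Store content and return the id of the representation to use."""
        digest = hashlib.md5(content).hexdigest()

        # Write the new representation first, in case it turns out unique.
        rep_id = self.next_id
        self.next_id += 1
        self.reps[rep_id] = content

        # Look up a matching representation; compare the bytes as well,
        # so an MD5 collision cannot silently alias two different files.
        existing = self.md5_index.get(digest)
        if existing is not None and self.reps[existing] == content:
            del self.reps[rep_id]  # ditch what was just written
            return existing

        # Unique content: publish it in the index (the "commit" step).
        self.md5_index[digest] = rep_id
        return rep_id
```

Adding the same bytes twice returns the same id and keeps a single stored
copy, while different bytes get their own representation.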

Regards,
Malcolm

Received on Mon Mar 5 08:56:07 2007

This is an archived mail posted to the Subversion Dev mailing list.