Re: Suggestion: preventing inode crowding with FSFS

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2005-10-07 09:34:58 CEST

On Fri, 2005-10-07 at 00:12 -0700, Kean Johnston wrote:
> Obviously, with
> hundreds, thousands or even hundreds of thousands of revs,
> this doesn't scale well on many platforms where you start
> paying a severe penalty when a single directory has too many
> inodes in it.

So far I've seen a fair number of people worrying about the problem, but
not many people experiencing it.

That said, we would be interested in code to support a more complex
layout as an option--at least if the code came with performance
measurements showing that the code would actually benefit someone. It
would go something like this:

  * Design the new layout in such a way that it can be recognized when
an FS is opened.
  * Add an option to svnadmin create to use the new layout.
  * Provide a Python script to convert an existing repository to the new
layout. (An sh script would be simpler, but wouldn't work on Windows.)

The new layout cannot be on by default until 2.0.
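
As a rough illustration of the conversion-script step, here is a minimal Python sketch. Nothing in it is a settled design: the function name, the ".shard" directory suffix, and the shard size of 1000 are all hypothetical. The suffix is only there so that a shard directory can never collide with an existing flat revision file such as rev 0.

```python
import os
import shutil


def convert_to_sharded(revs_dir, shard_size=1000):
    """Move flat files revs_dir/N into revs_dir/<N // shard_size>.shard/N.

    Hypothetical layout, for illustration only. Returns the number of
    revision files that were moved.
    """
    # Snapshot the flat revision files first, so the shard directories
    # created below are never mistaken for revisions.
    flat = [name for name in os.listdir(revs_dir)
            if name.isdigit() and os.path.isfile(os.path.join(revs_dir, name))]
    for name in sorted(flat, key=int):
        shard_name = "%d.shard" % (int(name) // shard_size)
        shard_dir = os.path.join(revs_dir, shard_name)
        os.makedirs(shard_dir, exist_ok=True)
        shutil.move(os.path.join(revs_dir, name),
                    os.path.join(shard_dir, name))
    return len(flat)
```

A real script would also have to convert the revprops directory the same way, and should only be run while nothing else is accessing the repository.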

> Did the designers consider a scheme similar to what Squid
> does (it has the same problem, wants to put thousands of
> cached files into a database)? An effective way to do this
> is thus. Convert the revision number to an 8-digit hex number.
> Take the very last digit as a top level directory. Take the
> next two digits as a second tier directory. Then create the
> file inside that directory. This spreads the load out fairly
> evenly, and I would imagine it would be pretty trivial to
> implement.

We might prefer something simpler; I'm not sure if the load-spreading
goal of the Squid cache layout is of any great value to a Subversion
repository. Also, although 2^32 is "plenty" of revisions, the current
FSFS layout does not impose an upper limit on the number of revisions,
and it would be nice to keep that property.
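
For reference, the Squid-style mapping described in the quoted text could be sketched in Python roughly as follows. The function name is hypothetical, and keeping the decimal revision number as the file name is an assumption, since the proposal does not say what the file itself is called.

```python
import posixpath


def squid_style_rev_path(revs_dir, rev):
    """Map a revision to <revs_dir>/<last hex digit>/<next two digits>/<rev>."""
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    hex_digits = format(rev, "08x")  # zero-padded to at least 8 hex digits
    top = hex_digits[-1]             # very last digit: top-level directory
    second = hex_digits[-3:-1]       # the two digits before it: second tier
    # Assumption: the file keeps its decimal revision number as its name.
    return posixpath.join(revs_dir, top, second, str(rev))
```

With this sketch, revision 74565 (hex 00012345) would land at revs/5/34/74565. Because only the three lowest hex digits choose the directory, consecutive revisions are scattered across the 16 top-level directories rather than grouped together.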

> If anyone likes the idea and can point me in the approximate
> direction of the right place in the code, I will work on a
> patch to do so. However, if this idea has been previously
> suggested and subsequently rejected for some reason, I'd
> be curious why it was rejected.

Relevant code:

  svnadmin/main.c -- to add the svnadmin option; see how existing BDB
options are handled

  libsvn_fs_fs/fs.h -- add a field to fs_fs_data_t tracking whether the
repository is using the new layout

  libsvn_fs_fs/fs.c -- fs_create and fs_open are here

  libsvn_fs_fs/fs_fs.c -- svn_fs_fs__path_rev needs adjustment, and
path_revprops probably needs to be changed to take a revision number so
that it can be similarly adjusted. Some callers would need to be
changed to ensure the containing directory exists.

When creating directories to place rev files into, make sure to copy the
mode of a reference directory (the one containing rev 0, perhaps) so
that the umask of the current user doesn't prevent other users in the
group from accessing the repository. See
libsvn_fs_fs/lock.c:ensure_dir_exists() for an example; that function
might want to move to fs_fs.c and be exported so that it can be used
within fs_fs.c.
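
To show why the mode needs to be copied explicitly, here is a minimal Python sketch of the idea. It is not the ensure_dir_exists() code from lock.c, and the function name is hypothetical. It relies on the fact that mkdir is subject to the caller's umask while a later chmod is not.

```python
import os
import stat


def ensure_dir_like(path, reference_dir):
    """Create path if missing and give it reference_dir's permission bits.

    Returns True if the directory was created, False if it already existed.
    """
    try:
        os.mkdir(path)
    except FileExistsError:
        return False  # already present; leave its mode alone
    # mkdir applied the umask; an explicit chmod afterwards is not masked.
    ref_mode = stat.S_IMODE(os.stat(reference_dir).st_mode)
    os.chmod(path, ref_mode)
    return True
```

Here reference_dir would be something like the directory containing rev 0, so that a committer with a restrictive umask still creates directories the rest of the group can use.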

Received on Fri Oct 7 09:36:06 2005

This is an archived mail posted to the Subversion Dev mailing list.
