Re: Sharded FSFS repositories - summary
From: Matthias Wächter <matthias.waechter_at_tttech.com>
 
Date: 2007-03-15 13:49:27 CET 
On 13.03.2007 13:47, Ph. Marek wrote:
 How about:
         0/
 and so on?
 Advantage: One could store each directory on a separate storage device, probably increasing bandwidth, since revisions are typically read sequentially by number. (Q: Is this true? Or does it presume pipelined file access that is not yet implemented?)
 Disadvantage 1: All top-level directories are created before the 4000th revision, so you don't see the repository "grow up" at the top level through the number of sub-directories.
 Disadvantage 2: You cannot move "finished" directories to storage that is no longer backed up (assuming a good archive copy of them exists), since each directory may receive new files every now and then.
 I like the idea of having the divisor be a power of 10 (or should the revision be stored in hex? Then take 4096, which is three hex digits :)).
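 To illustrate why such a divisor is convenient: the shard name is then just the revision number with its last three digits cut off. This is only a sketch in Python, not Subversion code; the function names and the example revision are made up for illustration.

```python
# Hypothetical helpers, not part of Subversion: they only show how the
# shard name relates to the digits of the revision number.

def shard_decimal(rev: int, per_shard: int = 1000) -> str:
    """Shard name when revisions are grouped in blocks of a power of 10."""
    return str(rev // per_shard)

def shard_hex(rev: int, per_shard: int = 4096) -> str:
    """Shard name when revisions are written in hex and grouped by 4096."""
    return format(rev // per_shard, "x")

rev = 12345                      # arbitrary example revision
print(shard_decimal(rev))        # decimal "12345" without its last 3 digits
print(format(rev, "x"))          # the same revision written in hex
print(shard_hex(rev))            # hex form without its last 3 hex digits
```

 With any other divisor the shard name can no longer be read directly off the revision number.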
 Besides that, multiple levels would be fine, too, and could reduce the impact of Disadvantage 2 above. I would suggest using the top-level directories in a round-robin fashion to maximize throughput, while the second level would be used according to the base proposal:
         0/
 and so on.
 In this scheme, I have only 10 top-level directories (easier to split over multiple disks), and each of them gains one sub-directory with each step of 10,000 revisions, but each directory contains only 1,000 revisions. With each step of 10,000 revisions, all "finished" second-level directories could be excluded from subsequent backups.
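 A rough sketch of one mapping that matches these numbers, in Python. It assumes the round-robin is per revision (revision modulo 10 picks the top-level directory) and that the second level is the revision divided by 10,000. The path layout top/second/revision and all names here are assumptions for illustration, not an actual FSFS format.

```python
# Hypothetical mapping, assuming per-revision round-robin on the top level.

def two_level_path(rev: int, top_dirs: int = 10, per_leaf: int = 1000) -> str:
    """Return an illustrative 'top/second/rev' path for a revision."""
    top = rev % top_dirs                      # round-robin over top-level dirs
    second = rev // (top_dirs * per_leaf)     # new sub-directory every 10,000 revs
    return f"{top}/{second}/{rev}"

def finished_second_levels(youngest: int, top_dirs: int = 10,
                           per_leaf: int = 1000) -> range:
    """Second-level indices that are complete in every top-level directory."""
    block = top_dirs * per_leaf
    return range((youngest + 1) // block)

# Consecutive revisions land in different top-level directories:
for r in (19998, 19999, 20000, 20001):
    print(r, "->", two_level_path(r))

# Count how many revisions of the first 20,000 end up in one leaf directory.
in_leaf = sum(1 for r in range(20000) if two_level_path(r).startswith("5/1/"))
print(in_leaf)

# Sub-directories that could be dropped from backups at revision 19999.
print(list(finished_second_levels(19999)))
```

 Under these assumptions all ten second-level directories with the same index fill up together, which is why they become "finished" only at each 10,000-revision boundary.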
 Of course, this all only makes sense if there is a performance benefit for splitting sequential accesses over multiple storage spaces.
 - Matthias
 ---------------------------------------------------------------------
This is an archived mail posted to the Subversion Dev mailing list.