Re: Sharded FSFS repositories - summary
From: Matthias Wächter <matthias.waechter_at_tttech.com>
Date: 2007-03-15 13:49:27 CET
On 13.03.2007 13:47, Ph. Marek wrote:
How about:
0/
and so on?
Advantage: Each directory could be stored on a separate storage device, probably increasing bandwidth, since revisions are typically read sequentially by number. (Q: Is this true? Or does it presume pipelined file access that is not yet implemented?)
Disadvantage 1: All top-level directories are created before the 4000th revision, so you don't see the repository "grow up" at the top level by the number of sub-directories.
Disadvantage 2: You cannot move "finished" directories to storage that is not backed up (assuming a good archive copy exists), since each directory may receive new files every now and then.
I like the idea of having the divisor be a power of 10 (or should the revision number be stored in hex? Then take 4096, which is three hex digits :)).
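To illustrate why such a divisor is convenient, here is a minimal sketch, not taken from any Subversion code. The function names `shard_decimal` and `shard_hex` are hypothetical. With a divisor of 1000 (an assumed example of a power of 10), the shard directory is simply the decimal revision number with its last three digits dropped; with 4096 and hex names, it is the hex revision number with its last three hex digits dropped.

```python
def shard_decimal(rev: int, divisor: int = 1000) -> str:
    """Shard directory name for a power-of-10 divisor (1000 is an assumed example)."""
    return str(rev // divisor)


def shard_hex(rev: int, divisor: int = 4096) -> str:
    """Shard directory name when revisions are named in hex and the divisor is 16**3."""
    return format(rev // divisor, "x")


if __name__ == "__main__":
    for rev in (999, 1000, 12345, 0x12345):
        # Print the revision in both notations next to its shard directory.
        print(f"dec {rev:>6} -> {shard_decimal(rev):>3}    "
              f"hex {rev:>5x} -> {shard_hex(rev):>2}")
```

In both cases a person can tell the directory from the revision name at a glance, which is not true for a divisor such as 4000.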
Besides that, multiple levels would be fine, too, and could reduce the impact of Disadvantage 2 above. I would suggest using the top-level directories in a round-robin fashion to maximize throughput, while the second level would be used according to the base proposal:
0/
and so on.
In this scheme, I have only 10 top-level directories (easier to split over multiple disks) and one new sub-directory in each of them with each step of 10,000 revisions, with each sub-directory containing only 1,000 revisions. After each step of 10,000 revisions, all "finished" second-level directories could be excluded from subsequent backups.
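One possible reading of this two-level scheme, as a rough sketch only: the mapping below assumes the round-robin is per revision (top level = revision modulo 10) and that the second level is the revision divided by 10,000. The function names and the path format are hypothetical and not part of any Subversion implementation.

```python
from collections import Counter

TOP_DIRS = 10      # number of top-level directories (from the text)
SPAN = 10_000      # revisions per second-level step (from the text)


def two_level_path(rev: int) -> str:
    """Map a revision to 'top/second/rev' (assumed mapping, see lead-in)."""
    top = rev % TOP_DIRS      # round-robin over the top-level directories
    second = rev // SPAN      # advances once every 10,000 revisions
    return f"{top}/{second}/{rev}"


def finished_second_levels(latest_rev: int) -> range:
    """Second-level indices that can no longer receive new revisions."""
    return range(latest_rev // SPAN)


if __name__ == "__main__":
    for rev in (0, 1, 9, 10, 9_999, 10_000, 12_345):
        print(rev, "->", two_level_path(rev))

    # Count how many revisions land in each top/second directory.
    sizes = Counter(two_level_path(r).rsplit("/", 1)[0] for r in range(20_000))
    print(len(sizes), "directories, sizes:", set(sizes.values()))
    print("finished at r12345:", list(finished_second_levels(12_345)))
```

Under these assumptions, consecutive revisions land in different top-level directories, each second-level directory ends up with exactly 1,000 revisions, and every second-level directory numbered below the current step is complete and could be dropped from later backups.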
Of course, all of this only makes sense if there is a performance benefit in splitting sequential accesses over multiple storage spaces.
- Matthias
---------------------------------------------------------------------
This is an archived mail posted to the Subversion Dev mailing list.