Re: Sharded FSFS repositories - summary

From: Michael Sinz <Michael.Sinz_at_sinz.org>
Date: 2007-03-15 21:44:48 CET

Matthias Wächter wrote:
> On 13.03.2007 13:47, Ph. Marek wrote:
>>> - We'll create shards of 4000 entries each. That's large enough that
>>> someone would have to hit 16M revisions before a larger value would be
>>> an improvement, but small enough that it has reasonable performance
>>> (and support) on all filesystems that I'm aware of. It's also a
>>> power-of-ten, so easier for humans to understand.
>> 4000 is no (integer) power of ten, so would not really be better.
>> Quick, in which directory is revision 421712? (see KDEs repository)
>>
>> If I understand you correctly, you want to have
>> 0/1
>> 0/2
>> 0/3
>> ...
>> 0/3999
>> 1/4000
>> 1/4001
>> ...
>> 2/8000
>> and so on. Right?
>>
>> I'd prefer to have a *real* (integer :-) power of ten, eg. 1000. And
>> TBH, 4000 is a bit too much (for me, at least) - 1000 would be high,
>> but acceptable.
>> (I'd really prefer 100 and 3 or 4 levels - but I seem to be alone with
>> that.)
>
>
> How about:
>
> 0/
> 1/1
> 2/2
> 3/3
> ...
> 3999/3999
> 0/4000
> 1/4001
> ...
> 3999/7999
> 0/8000
>
> and so on?

The problem with this is that you lose the "stable" or "read-only" part
of the directory tree, since every directory keeps receiving new files.
On large systems, being able to easily put the older part of the
repository onto tier-2 or tier-3 storage systems, or even onto read-only
archival storage systems, can be very powerful. For example, with
Apache, they could easily put the first 300,000 revisions onto a slower R/O
storage device and never actually notice the performance difference.
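
To make the difference concrete, here is a minimal sketch of the two
layouts discussed above, assuming the proposed shard size of 4000. The
function names are made up for illustration and are not part of any
Subversion API.

```python
SHARD_SIZE = 4000  # entries per shard, as proposed in this thread


def linear_shard(rev, shard_size=SHARD_SIZE):
    """Directory for `rev` when shards are filled one after another."""
    return rev // shard_size


def modulo_shard(rev, shard_size=SHARD_SIZE):
    """Directory for `rev` under the alternative (rev mod N) layout."""
    return rev % shard_size


def closed_linear_shards(youngest, shard_size=SHARD_SIZE):
    """Shards that will never receive another revision file.

    A shard is closed once its last revision exists, i.e. once
    youngest >= (shard + 1) * shard_size - 1.
    """
    return range(0, (youngest + 1) // shard_size)
```

With the sequential layout, revision 421712 lands in directory 105 and
directories 0 through 104 are complete, so they could be moved to slower
or read-only storage. With the modulo layout the same revision lands in
directory 1712, and no directory is ever complete, because each one gets
a new file every 4000 revisions.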

(Tier-1 storage is much more costly than tier-2/tier-3 SAN storage.)

Having the data in archival storage also reduces backup costs. A tier-1
SAN flash-copy requires a second storage area of the same size, so a
smaller tier-1 footprint means a smaller flash-copy backup and thus major
cost savings. Many SAN flash backup systems are also licensed on a storage
size basis, with the costs going up as the storage size goes up.

> Disadvantage 1: All top-level directories are created before the 4000th
> revision, you don't see the repository "grow up" on the top level by
> numbers of sub-directories.
>
> Disadvantage 2: You cannot take "finished" directories to put them on
> non-backuped storage space (considering good archive for it), since each
> directory may receive new files every now and then.

See above...

> I like the idea of having the divisor be a power of 10 (or let the
> revision stored in hex? Then take 4096 which is 3-digit hex :)).
>
> Beside that, multiple levels would be fine, too, and could reduce the
> impact of Disadvantage 2 from above. I would suggest having the
> top-level directories being used in a round-robin fashion for throughput
> maximization, the second level would be used according to the base
> proposal:

Again, the top level is where the best win is for archival behaviors.
Also, what throughput maximization? If most operations happen in the last
20 revisions or so, overall performance would be best if those 20
revisions are in the *same* directory and not different ones. The OS
and filesystem are much more likely to have cached everything you need
when dealing with a smaller set of directories. (Locality of reference
is usually a benefit.)
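
As a rough illustration of the locality argument, the sketch below counts
how many distinct directories hold the most recent revisions under each
layout. It assumes a shard size of 4000 and uses the single-level modulo
layout quoted earlier as a stand-in for round-robin placement, since the
details of the two-level proposal are elided here. The function name is
hypothetical.

```python
def dirs_touched(youngest, window=20, shard_size=4000):
    """Count distinct directories holding the last `window` revisions.

    Returns a (sequential, round_robin) pair of directory counts.
    """
    first = max(0, youngest - window + 1)
    revs = range(first, youngest + 1)
    sequential = {rev // shard_size for rev in revs}   # shards filled in order
    round_robin = {rev % shard_size for rev in revs}   # rev mod N placement
    return len(sequential), len(round_robin)
```

For a youngest revision of 421712, the last 20 revisions sit in 1
directory with sequential shards and in 20 different directories with
round-robin placement. Right at a shard boundary the sequential layout
touches 2 directories at most, as long as the window is smaller than the
shard size.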

If you want to get more performance, a better disk subsystem and more RAM
are the way to go (along with a filesystem/OS that can make use of those
resources).

[...]

> Of course, this all only makes sense if there is a performance benefit
> for splitting sequential accesses over multiple storage spaces.

In general, a SAN or good RAID setup with a filesystem that knows how to
really push the hardware will beat spreading the directories across
different mounted filesystems. The reduction in metadata caching
requirements and locking requirements alone may well provide the win here.

-- 
Michael Sinz                     Technology and Engineering Director/Consultant
"Starting Startups"                                mailto:michael.sinz@sinz.org
My place on the web                            http://www.sinz.org/Michael.Sinz

Received on Thu Mar 15 21:45:36 2007
