Re: svntar, anybody?

From: Ph. Marek <philipp.marek_at_bmlv.gv.at>
Date: 2007-07-04 10:48:10 CEST

On Wednesday, 4 July 2007, Talden wrote:
> > > A tar-based approach will seek the length of the tar to evaluate
> > > differences to sync.
> >
> > It won't seek, it will just push that to the clients to apply.
> > And linear reading is in the >100 MByte/sec range, e.g. for a cheap stripe
> > set of two hard disks.
>
> Assuming of course that only one client pulls at a time. If not, then
> you are depending upon disk cache to avoid seeking. A large RAM
> investment might be coming your way.
The difference is still between having a few (self-synchronizing) points that
are read in a single file vs. having many files being read in random order.

See here:

Tar-file ===================================================================
                  ^ ^
            Client1 |
               Client2

These two clients will synchronize, until they both send identical data -
because Client2 will have to fetch from disk, while Client1 can use already
cached data and is thus faster.

Tar-file ===================================================================
                  ^ ^ ^ ^
            Client1 | Client3&4 Client5-8
               Client2

After a while there'll be (at most) a few points in the file being read; using
the anticipatory I/O scheduler and/or read-ahead it's easy to seek only, say,
3 or 4 times a second while keeping the bandwidth at maximum speed.
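
The self-synchronizing behaviour can be illustrated with a toy simulation.
This is only a sketch and not part of the original thread: the function name
simulate, the block costs, the FIFO cache and the rule that a client never
overtakes the client directly ahead of it are all assumptions made for
illustration. It also ignores that separate read points share one disk.

```python
from collections import OrderedDict


def simulate(start_ticks, ticks=100, file_blocks=100_000,
             cache_blocks=64, budget=16, cached_cost=1, disk_cost=4):
    """Toy model of clients streaming one tar file from block 0.

    start_ticks[i] is the tick at which client i joins. Each tick a client
    may spend `budget` units; a cached block costs `cached_cost`, an
    uncached block costs `disk_cost`. All numbers are illustrative.
    Returns, per tick, the sorted list of distinct read positions.
    """
    cache = OrderedDict()   # block number -> None, oldest first (FIFO)
    pos = {}                # client index -> next block it wants
    history = []
    for t in range(ticks):
        for c, start in enumerate(start_ticks):
            if start == t:
                pos[c] = 0
        # Clients furthest ahead move first, so followers find their blocks
        # cached. Assumption: a follower never overtakes the client directly
        # ahead; once it catches up, both wait on the same disk reads.
        limit = file_blocks
        for c in sorted(pos, key=pos.get, reverse=True):
            left = budget
            while pos[c] < limit:
                cost = cached_cost if pos[c] in cache else disk_cost
                if cost > left:
                    break
                left -= cost
                if pos[c] not in cache:
                    cache[pos[c]] = None
                    if len(cache) > cache_blocks:
                        cache.popitem(last=False)  # evict the oldest block
                pos[c] += 1
            limit = pos[c]
        history.append(sorted(set(pos.values())))
    return history
```

With a cache large enough to hold everything read so far, e.g.
simulate([0, 5, 9], cache_blocks=10_000), the late joiners catch up and the
clients collapse into a single read point. With a very small cache, e.g.
simulate([0, 10], cache_blocks=8), the late joiner finds nothing cached and
stays a separate read point, which corresponds to the "few points" case.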

Compare that with 8 clients reading physically discontiguous regions from the
hard disk, with additional CPU load ....
Sure, that data *should* be cached after the first reading ... but there's a
lot more to cache: indexes from the repository, inodes for the revision
files, and just the seeking around for "Client3&4" above takes most of the
hard disk bandwidth.
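
As a rough back-of-the-envelope check of the bandwidth argument, the sketch
below compares a few seeks per second on a sequential stream with one seek per
small read. The 100 MB/s figure comes from the thread; the 10 ms seek time and
the 64 KB chunk size are assumptions chosen for illustration, as are the
function names.

```python
def effective_mb_per_s(seq_mb_per_s, seeks_per_s, seek_ms):
    """Sequential throughput left after losing seeks_per_s * seek_ms per second."""
    streaming_fraction = max(0.0, 1.0 - seeks_per_s * seek_ms / 1000.0)
    return seq_mb_per_s * streaming_fraction


def random_read_mb_per_s(seq_mb_per_s, chunk_kb, seek_ms):
    """Throughput when every chunk_kb read is preceded by one seek."""
    chunk_mb = chunk_kb / 1024.0
    seconds_per_chunk = seek_ms / 1000.0 + chunk_mb / seq_mb_per_s
    return chunk_mb / seconds_per_chunk


# Assumed numbers: 100 MB/s sequential rate, 10 ms per seek, 64 KB chunks.
few_seeks = effective_mb_per_s(100, 4, 10)      # 4 seeks per second
many_seeks = random_read_mb_per_s(100, 64, 10)  # one seek per chunk
```

Under these assumptions, four seeks a second cost only a few percent of the
sequential rate, while seeking before every 64 KB read leaves a small fraction
of it.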

Regards,

Phil

Received on Wed Jul 4 10:48:13 2007

This is an archived mail posted to the Subversion Dev mailing list.