
Re: fsfs-reshard.py - offline FSFS resharding

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2007-04-11 21:23:56 CEST

Malcolm Rowe wrote:
> All,
>
> I haven't finished testing this yet, but here's a snapshot for review of
> a tool I was working on over the weekend for offline resharding of an
> FSFS repository. It can convert between the 1.4 and the (proposed) 1.5
> format, and is safe to interrupt and re-run.
>
> I'd welcome any comments (on the approach, implementation, or my
> fantastically unPythonic coding style :-)), but this is essentially what
> I'm planning to provide (in tools/) for people with large repositories
> who can't spare the time/space to dump/load.

I haven't had a chance to really review the code, but I noted several
functions that look like error handlers using print() instead of
sys.stderr.write(). Keep error output on stderr, please.
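To illustrate the stderr point, here is a minimal sketch of an error helper. The name report_error and its stream parameter are hypothetical and do not come from fsfs-reshard.py; the only assumption is that the script wants its diagnostics kept apart from normal output.

```python
import sys


def report_error(message, stream=None):
    """Write an error message to stderr instead of stdout.

    Hypothetical helper for illustration; the optional stream argument
    exists only so the behaviour can be checked without touching the
    real sys.stderr.
    """
    if stream is None:
        stream = sys.stderr
    stream.write("error: %s\n" % message)
```

With something like this, redirecting the script's stdout to a file or a pipe still leaves the error messages visible on the terminal.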

Do you wanna just check it in with the first line of main() being:

   raise Exception("This script is unfinished and not ready to be used "
                   "on live data. Trust us.")
?

> I part-converted my copy of the ASF repository to test the
> restartability, producing 400 shards of 1000 files in about 5-6 minutes;
> going back the other way has taken about 5-6 hours so far. That's
> partly down to my filesystem (ext2), but it does mean that restarting a
> sharding operation is less than ideal, because I currently have that
> achieved by pre-converting to a linear structure.

Ouch.
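For anyone following along, the difference between the linear and sharded layouts discussed above can be sketched roughly as below. This is an illustration only: the function name rev_path and its parameters are hypothetical, and it assumes a revision file lands in a subdirectory numbered by integer division of the revision number by the shard size (1000 files per shard in Malcolm's test).

```python
import os


def rev_path(revs_dir, rev, files_per_shard=None):
    """Return the path of a revision file under revs_dir.

    Hypothetical sketch: with files_per_shard unset, the layout is
    linear (every revision file directly in revs_dir); otherwise the
    file is assumed to live in a shard directory named
    rev // files_per_shard.
    """
    if files_per_shard is None:
        # Linear layout: revs/<rev>
        return os.path.join(revs_dir, str(rev))
    # Sharded layout: revs/<shard>/<rev>
    shard = rev // files_per_shard
    return os.path.join(revs_dir, str(shard), str(rev))
```

Under that assumption, resharding amounts to moving each file from rev_path(d, r) to rev_path(d, r, 1000) or back, which is why a huge flat directory on a filesystem like ext2 makes the reverse direction so slow.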

-- 
C. Michael Pilato <cmpilato@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on Wed Apr 11 21:24:22 2007

This is an archived mail posted to the Subversion Dev mailing list.