Malcolm Rowe wrote:
> All,
>
> I haven't finished testing this yet, but here's a snapshot for review of
> a tool I was working on over the weekend for offline resharding of an
> FSFS repository. It can convert between the 1.4 and the (proposed) 1.5
> format, and is safe to interrupt and re-run.
>
> I'd welcome any comments (on the approach, implementation, or my
> fantastically unPythonic coding style :-)), but this is essentially what
> I'm planning to provide (in tools/) for people with large repositories
> who can't spare the time/space to dump/load.

I haven't had a chance to really review the code, but I noted several
functions that look like error handlers using print() instead of
sys.stderr.write(). Keep error output on stderr, please.

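Something along these lines would do, as a rough sketch only. The helper names report_error() and die() and the "error: " prefix are made up for illustration and are not taken from the script; the only real point is that diagnostics go through sys.stderr.write() rather than print().

```python
import sys


def report_error(message, stream=None):
    """Write a diagnostic to stderr (hypothetical helper, not from the script)."""
    # Look up sys.stderr at call time so that redirection still works.
    if stream is None:
        stream = sys.stderr
    stream.write("error: %s\n" % message)


def die(message, exit_code=1, stream=None):
    """Report an error on stderr and exit with a non-zero status."""
    report_error(message, stream=stream)
    sys.exit(exit_code)
```

That way a caller can redirect stdout somewhere else and still see the errors on the terminal.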
Do you wanna just check it in with the first line of main() being:

    raise Exception("This script is unfinished and not ready to be used "
                    "on live data.  Trust us.")

?

> I part-converted my copy of the ASF repository to test the
> restartability, producing 400 shards of 1000 files in about 5-6 minutes;
> going back the other way has taken about 5-6 hours so far. That's
> partly down to my filesystem (ext2), but it does mean that restarting a
> sharding operation is less than ideal, because I currently have that
> achieved by pre-converting to a linear structure.

Ouch.
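
For anyone following along, here is a rough sketch of the two layouts being converted between. It assumes (and this is an assumption, not something taken from the tool) that the linear layout keeps every revision file in a single directory and that the sharded layout files revision N under a subdirectory named N // 1000. The function names and the revs_dir parameter are hypothetical.

```python
import os

# Assumed shard size, matching the "1000 files" per shard figure quoted above.
FILES_PER_SHARD = 1000


def linear_path(revs_dir, rev):
    """Path of a revision file when all revisions share one directory."""
    return os.path.join(revs_dir, str(rev))


def sharded_path(revs_dir, rev, files_per_shard=FILES_PER_SHARD):
    """Path of a revision file when revisions are grouped into shards."""
    shard = rev // files_per_shard  # integer division picks the shard
    return os.path.join(revs_dir, str(shard), str(rev))
```

With 1000 files per shard, 400 shards would cover revisions 0 through 399999 under this scheme.
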
--
C. Michael Pilato <cmpilato@collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on Wed Apr 11 21:24:22 2007