I haven't finished testing this yet, but here's a snapshot for review of
a tool I was working on over the weekend for offline resharding of an
FSFS repository. It can convert between the 1.4 and the (proposed) 1.5
format, and is safe to interrupt and re-run.
I'd welcome any comments (on the approach, implementation, or my
fantastically unPythonic coding style :-)), but this is essentially what
I'm planning to provide (in tools/) for people with large repositories
who can't spare the time/space to dump/load.
One problem with it is that while conversion from a linear layout to a
sharded layout is fast, conversion back the other way seems glacial by
comparison.
I part-converted my copy of the ASF repository to test the
restartability, producing 400 shards of 1000 files in about 5-6 minutes;
going back the other way has taken about 5-6 hours so far. That's
partly down to my filesystem (ext2), but it does mean that restarting a
sharding operation is less than ideal, because I currently implement
restarts by first converting back to a linear structure.
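
For reviewers who want the linear-to-sharded step spelled out, here is a
minimal sketch of the idea. It is not the tool described above. It assumes
each file in the directory is named by its revision number and that the
shard index is the revision number divided by the shard size; the names
shard_directory, SHARD_SUFFIX and max_files_per_shard are hypothetical, and
updating the repository's format file is left out entirely.

```python
import os

# Hypothetical temporary suffix. It is needed because the file "0" and the
# final shard directory "0" would otherwise collide in the same directory.
SHARD_SUFFIX = ".shard"

def shard_directory(path, max_files_per_shard):
    """Move numerically named files in `path` into shard subdirectories.

    Returns the number of files moved by this call. Re-running after an
    interruption picks up whatever files are still in the linear layout.
    """
    moved = 0
    # Pass 1: move each revision file into "<shard>.shard".
    for name in os.listdir(path):
        src = os.path.join(path, name)
        # Skip directories (finished or partial shards) and unrelated files.
        if not name.isdigit() or not os.path.isfile(src):
            continue
        shard = str(int(name) // max_files_per_shard) + SHARD_SUFFIX
        shard_path = os.path.join(path, shard)
        os.makedirs(shard_path, exist_ok=True)
        os.rename(src, os.path.join(shard_path, name))
        moved += 1
    # Pass 2: every file has moved, so drop the temporary suffix.
    for name in os.listdir(path):
        if name.endswith(SHARD_SUFFIX):
            final = name[:-len(SHARD_SUFFIX)]
            os.rename(os.path.join(path, name), os.path.join(path, final))
    return moved
```

Because the first pass only touches files that are still in the top-level
directory, a second run after an interruption resumes where the first one
stopped instead of converting back to a linear layout first.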
Received on Wed Apr 11 20:21:30 2007