Re: Subversion FSFS logical addressing and packed shard

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Tue, 09 Feb 2016 00:45:52 +0000

Yves Martin <ymartin1040_at_gmail.com> writes:

> I would like to better understand how logical addressing has impact on
> packed shard. My objective is to provide a "unpack" feature with logical
> addressing in my version of "fsfs-reshard.py" script:
> https://github.com/ymartin59/svn-fsfs-reshard
>
> Do you have any hints to help me in that job or do you already know it
> is irrelevant and should be considered as a pure waste of time ?

Logical addressing refers to items in the revision file by an index
number rather than an offset. The revision file also contains an index
map that allows index numbers to be converted to offsets and offsets to
be converted to index numbers. The index map also records the length
of each item and its revision number; the revision number is trivial for
an unpacked revision file, since every item belongs to the same revision.
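
As a rough illustration of that two-way mapping, here is a minimal
Python sketch. The names IndexEntry, IndexMap, offset_of and item_at are
hypothetical and not part of Subversion; the real index is a binary
on-disk structure, which this does not attempt to model. The two sample
entries are taken from the dump-index example in this message.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """One item in a revision or pack file (hypothetical model)."""
    offset: int    # byte offset of the item in the file
    length: int    # item length in bytes
    revision: int  # revision the item belongs to
    item: int      # item index number within that revision


class IndexMap:
    """Two-way lookup between (revision, item) and file offset."""

    def __init__(self, entries: List[IndexEntry]) -> None:
        self._by_item: Dict[Tuple[int, int], IndexEntry] = {
            (e.revision, e.item): e for e in entries
        }
        self._by_offset: Dict[int, IndexEntry] = {
            e.offset: e for e in entries
        }

    def offset_of(self, revision: int, item: int) -> int:
        """Convert an item index number to a file offset."""
        return self._by_item[(revision, item)].offset

    def item_at(self, offset: int) -> Tuple[int, int]:
        """Convert a file offset back to (revision, item)."""
        entry = self._by_offset[offset]
        return (entry.revision, entry.item)


index = IndexMap([
    IndexEntry(offset=0x7F, length=0x79, revision=3, item=2),
    IndexEntry(offset=0xF8, length=0x72, revision=3, item=5),
])
```

In an unpacked revision file every entry would carry the same revision
number, so the lookup key effectively reduces to the item number.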

A pack file has a similar index map but in this case the revision number
varies.

The 1.9 tool svnfsfs can dump and load the index maps of revision and
pack files. An example (shard size 4):

  $ svnfsfs dump-index repo 1
         Start  Length Type Revision Item Checksum
             0      2a chgs        3    1 5f5b9c31
            2a      2a chgs        2    1 efee8d5b
            54      2a chgs        1    1 eee1b382
            7e       1 chgs        0    1 f28a4f1d
            7f      79 node        3    2 7e6fca28
            f8      72 drep        3    5 21933af7
           16a      55 drep        2    5 6f371fa3
           1bf      39 drep        1    5 8da855e0
           1f8      11 drep        0    3 60232b75
           209      9d node        1    4 d684e01d
           2a6      1b frep        1    3 1823e0a0
           2c1      9d node        2    4 3bd76335
           35e      1b frep        2    3 5b6fd650
           379      9d node        3    4 70fb00b0
           416      1b frep        3    3 1f9eb8e6
           431      78 node        2    2 7c048873
           4a9      78 node        1    2 cde8ee37
           521      59 node        0    2 403dbe48

Note that the items that make up a revision are not consecutive in the
pack file.
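
To make that observation concrete, the following sketch parses the dump
above and groups the items by revision. It assumes Start and Length are
hexadecimal (as values such as 2a suggest) and treats Revision and Item
as decimal, which is an assumption on my part. The helper names
parse_dump, group_by_revision and is_contiguous are hypothetical.

```python
from collections import defaultdict

DUMP = """\
Start Length Type Revision Item Checksum
0 2a chgs 3 1 5f5b9c31
2a 2a chgs 2 1 efee8d5b
54 2a chgs 1 1 eee1b382
7e 1 chgs 0 1 f28a4f1d
7f 79 node 3 2 7e6fca28
f8 72 drep 3 5 21933af7
16a 55 drep 2 5 6f371fa3
1bf 39 drep 1 5 8da855e0
1f8 11 drep 0 3 60232b75
209 9d node 1 4 d684e01d
2a6 1b frep 1 3 1823e0a0
2c1 9d node 2 4 3bd76335
35e 1b frep 2 3 5b6fd650
379 9d node 3 4 70fb00b0
416 1b frep 3 3 1f9eb8e6
431 78 node 2 2 7c048873
4a9 78 node 1 2 cde8ee37
521 59 node 0 2 403dbe48
"""


def parse_dump(text):
    """Parse dump-index style text into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] == "Start":
            continue  # skip the header and any blank lines
        start, length, kind, rev, item, checksum = fields
        rows.append({
            "start": int(start, 16),    # hexadecimal
            "length": int(length, 16),  # hexadecimal
            "type": kind,
            "revision": int(rev),       # assumed decimal
            "item": int(item),          # assumed decimal
            "checksum": checksum,
        })
    return rows


def group_by_revision(rows):
    """Map revision number to its rows, ordered by offset."""
    groups = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["start"]):
        groups[row["revision"]].append(row)
    return dict(groups)


def is_contiguous(rows):
    """True if each row starts exactly where the previous one ends."""
    ordered = sorted(rows, key=lambda r: r["start"])
    return all(
        prev["start"] + prev["length"] == cur["start"]
        for prev, cur in zip(ordered, ordered[1:])
    )


ROWS = parse_dump(DUMP)
BY_REV = group_by_revision(ROWS)
```

With this data the pack file as a whole has no gaps between items, but
is_contiguous(BY_REV[3]) is false: the items of revision 3 are scattered
between items of the other revisions.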

In principle the unpack is not hard. Read the index map from the pack
file. Then construct the revision files by extracting items from the
pack file and appending them to the revision files, keeping track of the
new offsets. Do one revision file at a time or multiple revision files
in parallel. Once all the items are present in a revision file,
construct the new index map for that revision file.
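
The steps above can be sketched in Python as follows. This is only an
illustration under strong assumptions: the index map is already
available as a list of in-memory entries, items are copied in the order
they appear in the pack file, and nothing here writes the real on-disk
index format or any other trailing data a revision file needs. Entry
and unpack are hypothetical names, and the sample data is made up.

```python
from typing import Dict, List, NamedTuple, Tuple


class Entry(NamedTuple):
    """One index map entry (hypothetical in-memory form)."""
    offset: int
    length: int
    kind: str
    revision: int
    item: int


def unpack(pack: bytes,
           entries: List[Entry]) -> Dict[int, Tuple[bytes, List[Entry]]]:
    """Split pack data into per-revision data plus new index entries."""
    data: Dict[int, bytearray] = {}
    index: Dict[int, List[Entry]] = {}
    for entry in sorted(entries, key=lambda e: e.offset):
        buf = data.setdefault(entry.revision, bytearray())
        new_offset = len(buf)  # the item lands at the current end
        buf += pack[entry.offset:entry.offset + entry.length]
        # Same item, same length, but a new offset in the revision file.
        index.setdefault(entry.revision, []).append(
            entry._replace(offset=new_offset))
    return {rev: (bytes(data[rev]), index[rev]) for rev in data}


# Made-up pack contents: two revisions with interleaved items.
PACK = b"AAABBCCCCDD"
ENTRIES = [
    Entry(0, 3, "chgs", 1, 1),
    Entry(3, 2, "chgs", 0, 1),
    Entry(5, 4, "node", 1, 2),
    Entry(9, 2, "node", 0, 2),
]
RESULT = unpack(PACK, ENTRIES)
```

Here all revisions are built in a single pass over the pack file, which
corresponds to the "multiple revision files in parallel" variant;
filtering the entries by revision first would give the one-at-a-time
variant.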

It might be tricky to implement this in Python simply because you need
code to dump and load the index maps. You would have to write that code
from scratch, or run the svnfsfs tool, or write a Python binding to the
C code.

An alternative would be to implement an unpack operation for svnadmin in
C and use the existing C code to handle the index maps.

-- 
Philip Martin
WANdisco

Received on 2016-02-09 01:46:00 CET

In reply to: Yves Martin: "Subversion FSFS logical addressing and packed shard"
