Re: svn commit: r1166489 - /subversion/branches/fs-successor-ids/subversion/libsvn_fs_fs/fs_fs.c

From: Stefan Sperling <stsp_at_elego.de>
Date: Thu, 8 Sep 2011 10:36:13 +0200

On Thu, Sep 08, 2011 at 01:08:33AM -0000, danielsh_at_apache.org wrote:
> Author: danielsh
> Date: Thu Sep 8 01:08:33 2011
> New Revision: 1166489
>
> URL: http://svn.apache.org/viewvc?rev=1166489&view=rev
> Log:
> On the fs-successor-ids branch, actually implement sharding.
>
> Found by: stsp
>
> * subversion/libsvn_fs_fs/fs_fs.c
> (FSFS_SUCCESSORS_REVISIONS_PER_SHARD): New helper.
> (path_successor_ids_shard, path_successor_ids,
> path_successor_node_revs_shard, path_successor_node_revs):
> Fix path calculations.
> (update_successor_ids_file):
> Fix checks for 'New shard' and 'New file in a shard'.
>

Just to make sure we both have the same idea:

Each file in the successor store is responsible for a fixed
number of revisions (currently 1000).

max-files-per-dir tells us how many files can be in a single directory.
Once a directory holds max-files-per-dir files, we open a new
directory and store further files there instead.

So I would expect sharding within the successors tree
to behave like this:

 filename:                   file stores successor data created in:
 db/successors/ids/0/0       r0..r999
 db/successors/ids/0/1       r1000..r1999
 ...                         ...
 db/successors/ids/0/999     r999000..r999999
 db/successors/ids/1/0       r1000000..r1000999
 ...                         ...

Data for the first million revs goes into the first shard,
data for the second million revs goes into the second shard, etc.
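
A rough sketch of that mapping in C, assuming 1000 revisions per file and
max-files-per-dir = 1000. The constant names and the function name are made
up for illustration and are not taken from fs_fs.c:

```c
#include <stdio.h>

/* Hypothetical constants, for illustration only. The real values would
   come from fs_fs.c and from the filesystem's max-files-per-dir setting. */
#define EXAMPLE_REVS_PER_FILE      1000L  /* revisions covered by one file */
#define EXAMPLE_MAX_FILES_PER_DIR  1000L  /* assumed max-files-per-dir */

/* Write the successor-ids path for revision REV (assumed >= 0) into BUF,
   using at most BUFSIZE bytes. Returns whatever snprintf() returns. */
static int
example_successor_ids_path(char *buf, size_t bufsize, long rev)
{
  /* Number of the file counted across all shards. */
  long file_index = rev / EXAMPLE_REVS_PER_FILE;

  /* Directory (shard) the file lives in. */
  long shard = file_index / EXAMPLE_MAX_FILES_PER_DIR;

  /* File numbering restarts at 0 inside each shard. */
  long file_in_shard = file_index % EXAMPLE_MAX_FILES_PER_DIR;

  return snprintf(buf, bufsize, "db/successors/ids/%ld/%ld",
                  shard, file_in_shard);
}
```

With these assumed values, every revision in the first million lands
somewhere under db/successors/ids/0/, and the next million under
db/successors/ids/1/.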

Is this what you've implemented?

I probably would have used FSFS_SUCCESSORS_FILES_PER_SHARD
instead of FSFS_SUCCESSORS_REVISIONS_PER_SHARD, and then
computed the filename based on that number. I don't like
thinking of it in terms of "revisions per shard" because
the numbers get so big :)
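
To illustrate the difference, here is a small sketch with made-up macro
names and values (not the definitions on the branch). Both ways of
computing the shard agree for non-negative revision numbers; the
files-per-shard form just keeps the constants small:

```c
/* Hypothetical names and values, for illustration only. */
#define EXAMPLE_REVS_PER_FILE        1000L
#define EXAMPLE_FILES_PER_SHARD      1000L
#define EXAMPLE_REVISIONS_PER_SHARD  (EXAMPLE_REVS_PER_FILE \
                                      * EXAMPLE_FILES_PER_SHARD)

/* Shard computed directly from a "revisions per shard" constant. */
static long
example_shard_by_revisions(long rev)
{
  return rev / EXAMPLE_REVISIONS_PER_SHARD;
}

/* Shard computed in two steps: first the file number, then the shard
   that file belongs to, using a "files per shard" constant. */
static long
example_shard_by_files(long rev)
{
  long file_index = rev / EXAMPLE_REVS_PER_FILE;
  return file_index / EXAMPLE_FILES_PER_SHARD;
}
```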
Received on 2011-09-08 10:37:15 CEST

This is an archived mail posted to the Subversion Dev mailing list.
