C. Michael Pilato wrote on Wed, Sep 21, 2011 at 11:50:40 -0400:
> On 09/21/2011 11:03 AM, Daniel Shahaf wrote:
> >> But before we press on here, I'd like to understand your bigger-picture
> >> view.
> >
> > The branch operates on the assumption that an efficiently-queryable
> > successors store should be managed by the FS. In this thread I'm
> > further assuming that creating successors would be expensive and
> > therefore 'svnadmin upgrade' should create a 'miscellaneous' table
> > record and bump the format number.
> >
> > There is a concurrent thread by Stefan2 that challenges both of these
> > assumptions. I don't know that we have consensus yet whether the design
> > in that thread or the design currently on the branch is better. (And,
> > yes, figuring that out is the second thing at the top of my list, next to
> > figuring out how to implement 'upgrade' on the branch.)
>
> Yeah, I'm not following Stefan2's thread very closely. But regardless of
> what he thinks Subversion *should* have, I don't know of any reasons why it
> should *not* have this successor-id mapping.
>

On a high level, I recall Stefan2 was suggesting a design that focuses
not on node-rev successors but on high-level copy operations, and that is
not FS-backend-specific.

> >> Why are you choosing to do this by-revision in fs_base rather than using
> >> a lower-level, largely-Subversion-ignorant approach? Is it
> >> specifically so you can have an interruptible/restartable process? Is it so
> >> you can hook into some pre-existing per-revision subsystem (notification,
> >> perhaps)?
> >
> > I was simply trying to outline an algorithm for populating the
> > successors store from scratch in a live FS. (And yes, both
> > restartability and notification are nice properties to have.)
>
> Okay. I'm not sure that I would take the same course in a live FS versus an
> offline one, and you've been referring to 'upgrade' which shouldn't be run
> on a live FS -- that is, it should make the FS effectively "not live" for
> the duration of the upgrade. So, I'm a touch confused about what
> specifically you are aiming at.
>

What I'm thinking is as follows:

  base_upgrade():
    - create 'miscellaneous' table entry
    - set the stored format number to 5

  add_successors_to_f5_fs():
    - backfill successors and remove the 'miscellaneous' table entry

  base_history_next():
    - assert format >= 5
    - assert no 'miscellaneous' table entry
    - do whatever it does today

This makes base_upgrade() a cheap operation. I was trying to make
add_successors_to_f5_fs() not block concurrent writers more than
necessary.

For add_successors_to_f5_fs(), I assumed operating by-revision would
result in smaller transactions and thus better behaviour for concurrent
readers/writers. It's also what I imagined the algorithm for FSFS would
be.
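
To make the outline above concrete, here is a rough Python sketch of the
three pieces against a toy in-memory model. None of this is Subversion
code: ToyFS, the PENDING key name, and the idea of storing the next
revision to process inside the 'miscellaneous' entry (so the backfill can
resume after an interruption) are all assumptions for illustration, and
handling of commits that land while the backfill is running is left out.

```python
# Hypothetical in-memory model; none of these names are Subversion APIs.
PENDING = "successors-backfill-pending"  # assumed 'miscellaneous' key name


class ToyFS:
    def __init__(self, revisions):
        # revisions[N] lists the (node_rev_id, predecessor_id or None)
        # pairs for the node-revisions created in revision N.
        self.revisions = revisions
        self.format = 4
        self.misc = {}        # stands in for the 'miscellaneous' table
        self.successors = {}  # predecessor_id -> list of successor ids


def base_upgrade(fs):
    """Cheap step: record that a backfill is pending, bump the format."""
    fs.misc[PENDING] = "0"  # assumption: value is the next revision to do
    fs.format = 5


def add_successors_to_f5_fs(fs, max_revs=None):
    """Backfill one revision per 'transaction'; safe to stop and rerun."""
    if PENDING not in fs.misc:
        return  # nothing to do, backfill already finished
    start = int(fs.misc[PENDING])
    done = 0
    for rev in range(start, len(fs.revisions)):
        if max_revs is not None and done >= max_revs:
            return  # interrupted; the misc entry says where to resume
        for node_rev_id, pred_id in fs.revisions[rev]:
            if pred_id is not None:
                fs.successors.setdefault(pred_id, []).append(node_rev_id)
        fs.misc[PENDING] = str(rev + 1)  # progress marker for this revision
        done += 1
    del fs.misc[PENDING]  # backfill complete


def successors_usable(fs):
    """The two checks base_history_next() would assert on."""
    return fs.format >= 5 and PENDING not in fs.misc
```

In a real backend, each iteration of the outer loop would be its own
database transaction, which is where the "smaller transactions" argument
comes from.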
> But here's the extent of my assumptions: you want to backfill successors as
> quickly, efficiently, and painlessly as possible, ideally without
> interrupting live operation of the repository. Is that fair? :-)
>
Yes :-)
> > It's not clear to me exactly what the alternatives your question refers
> > to are. Could you elaborate on them, please?
>
> Well, BDB being a real database, we can do this sort of backfill operation
> without attending to any higher-level Subversion concepts such as revisions
> at all. A cursor walk through the `nodes' table should suffice:
>
>    for key, value in nodes_table.rows()
>       successor_id = key
>       node_rev = parse_node_revision_skel(value)
>       successors_table.add_row(node_rev.predecessor_id, successor_id)
>
I see what you mean now, thanks. (See above for why I went for
a by-revision algorithm.)
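
For comparison, the table walk quoted above might look roughly like this
in runnable form. This is only a sketch: plain dicts stand in for the BDB
`nodes' and successors tables, the node-revision values are assumed to be
already parsed, and skipping node-revisions that have no predecessor is my
own addition rather than something the pseudocode spells out.

```python
def backfill_by_table_walk(nodes_table):
    """Build a predecessor -> successors mapping in one pass over 'nodes'.

    nodes_table is a hypothetical dict of node_rev_id -> parsed
    node-revision, where each node-revision is a dict that may carry a
    'predecessor_id' key.
    """
    successors_table = {}
    for successor_id, node_rev in nodes_table.items():
        predecessor_id = node_rev.get("predecessor_id")
        if predecessor_id is None:
            continue  # first node-revision of a node: nothing to record
        successors_table.setdefault(predecessor_id, []).append(successor_id)
    return successors_table
```

Unlike the by-revision loop, this never consults revision numbers, which
is what makes it "Subversion-ignorant"; the trade-off is that a single
walk has no natural per-revision point at which to commit.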
> --
> C. Michael Pilato <cmpilato_at_collab.net>
> CollabNet <> www.collab.net <> Distributed Development On Demand
>
Received on 2011-09-21 18:58:31 CEST