Re: Node origins cache rewrite

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Mon, 28 Jan 2008 13:17:08 -0500

Mark Phippard wrote:
> On Jan 28, 2008 12:55 PM, David Glasser <glasser_at_davidglasser.net> wrote:
>> On Jan 25, 2008 12:50 PM, David Glasser <glasser_at_davidglasser.net> wrote:
>>> On Jan 25, 2008 11:16 AM, David Glasser <glasser_at_davidglasser.net> wrote:
>>>> On Jan 24, 2008 6:54 PM, Mark Phippard <markphip_at_gmail.com> wrote:
>>>>> I see David has rewritten this to no longer use SQLite. Yay!
>>>> Here's an alternative implementation. In FSFS, at commit time, new
>>>> node IDs are rewritten from a temporary value like "_ab3" to a unique
>>>> value by adding "ab3" to the "start_node_id" field in the current
>>>> file. This makes them not only unique, but also part of an ordered
>>>> sequence without gaps.
>>>>
>>>> Is it actually important that node IDs be ordered and gapless? We
>>>> could just change new node-IDs (in format 3 repositories) to be built
>>>> as "<rev>-ab3". get-node-origin-rev would be trivial on these nodes.
>>>> Pre-format-3 repositories, or nodes in format 3 repositories that
>>>> aren't dumped and loaded, would require the slow crawl.
>>> Like this. Can somebody review?
>> New version, supporting "svnadmin recover". Barring objections, will
>> commit later today.
>
> I do not have objections, but I did ask some questions in this message
> that have not been answered:
>
> http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=134583

Requiring a dump and load just to get node-origins is a non-starter, in my
opinion. And it is the repositories large enough to make dumping and loading
such a pain that will suffer the most from *not* having the node-origins table.

The get-location-segments fallback logic is based on 'svn log', which
requires a loooooong time to run against, say, APR's trunk (after four
minutes, get-location-segments.py against that URL still hadn't completed).
Worse still, because of this new way David wishes to implement the
feature, all that cost would be paid *every time a merge was requested*,
rather than simply the first time as it is in the current implementation.
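
To make the cost difference concrete, here is a rough toy model in Python.
It is only a sketch and not Subversion code: OriginLookup, origin_from_id
and fake_crawl are hypothetical names, and fake_crawl merely stands in for
the slow history walk. It assumes that new-style IDs look like "<rev>-ab3"
as proposed above, and that old-style IDs carry no revision prefix.

```python
def origin_from_id(node_id):
    """Return the revision encoded in a '<rev>-xyz' ID, else None."""
    rev, sep, _ = node_id.partition("-")
    return int(rev) if sep and rev.isdigit() else None


class OriginLookup:
    """Toy lookup: `crawl` is a stand-in for the expensive history walk."""

    def __init__(self, crawl, use_cache):
        self.crawl = crawl
        self.use_cache = use_cache
        self.cache = {}
        self.crawl_count = 0  # how many times the slow path ran

    def origin_rev(self, node_id):
        # New-style IDs answer immediately under either plan.
        rev = origin_from_id(node_id)
        if rev is not None:
            return rev
        # Old-style ID: consult the cache if this plan has one.
        if self.use_cache and node_id in self.cache:
            return self.cache[node_id]
        self.crawl_count += 1
        rev = self.crawl(node_id)
        if self.use_cache:
            self.cache[node_id] = rev
        return rev


def fake_crawl(node_id):
    # Hypothetical placeholder; the returned revision is made up.
    return 42


cached = OriginLookup(fake_crawl, use_cache=True)
uncached = OriginLookup(fake_crawl, use_cache=False)
for _ in range(5):  # five merges touching the same old-style node
    cached.origin_rev("ab3")
    uncached.origin_rev("ab3")
print(cached.crawl_count, uncached.crawl_count)
```

With a cache the slow path runs once for the old-style node; without one it
runs on every request, which is the cost I am worried about for existing
repositories.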

So while I like the elegance of David's plan for storing new node-origins at
commit time, I honestly believe the compat plan for existing repositories --
that is, the utter lack thereof -- would be a horrendous mistake for this
community to make. You might as well tell big projects that merges are
effectively disabled for them until they dump and load.

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Received on 2008-01-28 19:17:34 CET
