
Re: commit crawler

From: Greg Stein <gstein_at_lyra.org>
Date: 2001-11-16 21:22:17 CET

On Fri, Nov 16, 2001 at 01:34:06PM -0600, cmpilato@collab.net wrote:
> Hey, Ben. Was wondering what you thought about us changing the commit
> crawler so that it mined information from the working copy first, storing
> relevant bits in some applicable in-memory data structure and locking
> dirs and such, then blew through that data structure to perform the
> actual commit.

Eek. That blows away all concept of streamy. -1

I would suggest writing a crawler, much like Python's os.path.walk(). It
crawls over the disk and calls a callback for each entry. You can then use
different callbacks for status, for committing, for updating, whatever.

The walker would take several flags:

* include unversioned items
* include versioned items missing from the WC
* include .svn files (a raw filesystem walk; overrides the above two)
* callback for dirs in a prefix manner
* callback for dirs in a postfix manner
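A rough sketch of what such a walker might look like, in Python since os.path.walk() is the model. Everything named here (WalkFlags, walk, the callback signature, and the entries_for hook that stands in for reading .svn/entries) is an assumption made for illustration, not existing Subversion code. It uses os.listdir() because os.path.walk() itself is gone from current Python.

```python
import os
from dataclasses import dataclass

ADMIN_DIR = ".svn"  # administrative directory name


@dataclass
class WalkFlags:
    """Hypothetical flag set mirroring the list above."""
    include_unversioned: bool = False
    include_missing: bool = False   # versioned items missing from the WC
    raw: bool = False               # raw filesystem walk; overrides the two above
    dirs_prefix: bool = True        # call back on a dir before its children
    dirs_postfix: bool = False      # call back on a dir after its children


def walk(root, callback, flags, entries_for=lambda dirpath: ()):
    """Call callback(path, kind, when) for each item under root.

    kind is "dir", "file" or "missing"; when is "pre"/"post" for dirs,
    None otherwise. entries_for(dirpath) is an assumed hook returning the
    names that are under version control in that directory.
    """
    def visit_dir(dirpath):
        if flags.dirs_prefix:
            callback(dirpath, "dir", "pre")
        on_disk = sorted(os.listdir(dirpath))
        if flags.raw:
            names = on_disk
        else:
            versioned = set(entries_for(dirpath))
            names = [n for n in on_disk
                     if n != ADMIN_DIR
                     and (n in versioned or flags.include_unversioned)]
            if flags.include_missing:
                names += sorted(versioned - set(on_disk))
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isdir(path) and not os.path.islink(path):
                visit_dir(path)
            elif os.path.lexists(path):
                callback(path, "file", None)
            else:
                callback(path, "missing", None)
        if flags.dirs_postfix:
            callback(dirpath, "dir", "post")

    visit_dir(root)
```

Status, commit and update would then differ only in the callback and the flags they pass, e.g. walk(wc_root, print_status, WalkFlags(include_unversioned=True), entries_for=read_entries), where print_status and read_entries are likewise hypothetical.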

Define a single, simple item structure that has just enough information such
that a callback can use it to get any further data. I would imagine this
structure would have a pointer to the "versioning" data. For some of the
walk types, you're going to open the .svn/entries file. That data would go
into item->vsn_info. For a raw filesystem walk, or for unversioned items,
that pointer would be NULL.

(and a simple function call would fill in vsn_info for a given item since
that item structure has enough info to "get back" to the right spot to fetch
the necessary data)
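To make the item idea concrete, here is a minimal sketch in Python. The field names other than vsn_info, the fill_vsn_info() helper, and the load_entries hook (a stand-in for whatever parses .svn/entries for a directory) are all assumptions for illustration; the real structure would be a C struct with an item->vsn_info pointer.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class Item:
    """Just enough to find the item again and fetch more data on demand."""
    dirpath: str                      # directory that contains the item
    name: str                         # entry name within that directory
    kind: str                         # "file" or "dir"
    vsn_info: Optional[dict] = None   # None for raw walks and unversioned items

    @property
    def path(self) -> str:
        return os.path.join(self.dirpath, self.name)


def fill_vsn_info(item: Item,
                  load_entries: Callable[[str], Mapping[str, dict]]) -> bool:
    """Populate item.vsn_info if it is not already set.

    load_entries(dirpath) is an assumed hook returning a mapping of
    entry name -> versioning data for that directory. Returns True if
    the item turned out to be versioned.
    """
    if item.vsn_info is None:
        item.vsn_info = load_entries(item.dirpath).get(item.name)
    return item.vsn_info is not None
```

A raw-walk callback that suddenly needs versioning data could call fill_vsn_info(item, load_entries) itself, so the walker never has to open the entries file unless someone asks.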

(the difference between prefix and postfix on dirs: consider that a copy-tree
function wants dirs in prefix order so it can mkdir; deleting a tree wants
dirs postfix so it can rmdir after the dir is empty)
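The prefix/postfix distinction is easy to see in a small Python sketch. The walk(), copy_tree() and delete_tree() names and signatures below are made up for the example; the walker is stripped down to just the two directory-ordering flags.

```python
import os
import shutil


def walk(root, callback, dirs_prefix=False, dirs_postfix=False):
    """Call callback(path, kind) for every item under root.

    Directories are reported before their children (prefix), after
    them (postfix), both, or not at all, depending on the flags.
    """
    if dirs_prefix:
        callback(root, "dir")
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            walk(path, callback, dirs_prefix, dirs_postfix)
        else:
            callback(path, "file")
    if dirs_postfix:
        callback(root, "dir")


def copy_tree(src, dst):
    """Prefix order: each directory is created before anything is put in it."""
    def cb(path, kind):
        target = os.path.normpath(os.path.join(dst, os.path.relpath(path, src)))
        if kind == "dir":
            os.mkdir(target)
        else:
            shutil.copyfile(path, target)
    walk(src, cb, dirs_prefix=True)


def delete_tree(root):
    """Postfix order: each directory is removed only once it is empty."""
    def cb(path, kind):
        if kind == "dir":
            os.rmdir(path)
        else:
            os.remove(path)
    walk(root, cb, dirs_postfix=True)
```

Swapping the flags breaks both: copy_tree would try to write files into directories that do not exist yet, and delete_tree would call rmdir on non-empty directories.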

> Reason I ask is that I'd like to recycle a lot of the logic in that
> commit process, but for working-copy-free commits (like `svn cp' is
> soon to do).

"svn cp" for two URLs would entirely skip all this gobbledy gook. I don't
think you'd use any common routines at all for it. Seems like it would be a
manual sequence of calls into the commit editor, just like you posted in
copy-planz. There isn't any reason to build up a Thing just to execute four
calls which you *know* ahead of time. In fact, you're building the Thing so
that it happens to result in precisely those calls. Skip the whole
intermediate process of a Thing and keep it simple.

> Plus, I think it would help the commit crawler not be such a mess, to
> separate the local-mod search from the commit process itself.

Agreed. See above. Note that the recent "copy the filesystem" walker as part
of the wc->wc copy could be tossed in favor of the walk() function and a
callback to copy a file/dir.

(I noticed that the code to do the copy was effectively duplicating the
other filesystem traversal stuff in the WC; no sense duplicating a basic
function everywhere, leading to possible divergence and maintenance issues)

Anyway... separating the filesystem traversal out of the various WC bits will
help quite a bit. I think there are numerous subtle benefits that are going
to pop up after this change. A nice little dividing line between filesystem
data/representation and control will be established.


Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:48 2006

This is an archived mail posted to the Subversion Dev mailing list.