Re: (FS) operational question

From: Greg Stein <gstein_at_lyra.org>
Date: 2000-12-30 00:27:20 CET

On Tue, Dec 26, 2000 at 08:52:08AM -0600, Karl Fogel wrote:
> Greg Stein <gstein@lyra.org> writes:
> > Up to this point, I've been considering a version resource URL to look
> > something like:
> >
> > http://www.lyra.org/repos/$svn/ver/67/somedir/foo.c
> >
> > However, this means that a commit *anywhere* in the tree will change the
> > version resource URL for *every* file/dir in the tree. And since these are
> > server-constructed URLs, the client can't automagically just update the
> > version URLs stored throughout the client tree. The server would actually
> > have to tell the client "here is <this> URL, here is <that> URL." For a
> > large repository, ugh...
>
> I don't think the version resource has to change for everything else
> in the tree. If foo.c has not changed from 67 to 72, then
>
> http://www.lyra.org/repos/$svn/ver/67/somedir/foo.c
> http://www.lyra.org/repos/$svn/ver/72/somedir/foo.c
>
> are the same entity. You can continue using 67 -- the number may not
> be "highest", but as far as foo.c is concerned it's still effectively
> "latest".

They refer to the same entity, but the additional URLs become available
after a commit. The updated URL is the "more official" URL to access the
resource, thus it would be "nice" to use it over the older one.

> > http://www.lyra.org/repos/$svn/ver/32.7/somedir/foo.c
> >
> > i.e. use the Node ID in there, rather than the revision number.
>
> IMHO, under no circumstances should internal node and node-revision
> numbers become visible to outside layers.

That is the "opaque ID string" and it is part of the public FS interface. We
aren't exposing what it means or its decomposition. Just a blob.

> If they're part of a URL,
> we've done something wrong. :-)
>
> The way to identify a revision uniquely is rev:path, URLified as
> desired, of course. (Yes, revN:path and revM:path might be the same
> even though N != M, but that doesn't actually hurt us anywhere.)

I understand that rev:path is the unique method. The issue that I'm
considering is that 67:foo.c and 72:foo.c can refer to the same underlying
entity. The correct way to model the URL is to have a one-to-one
correspondence between the identification mechanism (the URL) and the
entity that you're naming (the node in the FS).

[ heck, just one example is knowing that v67 and v72 are the same since they
have the same version resource URL. ]
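
To make that concrete, here is a small Python sketch of the two URL schemes.
The helper names (rev_url, node_url, same_entity) and the revision-to-ID
table are made up for illustration; only the URL layout and the 32.7 ID come
from the examples in this thread.

```python
BASE = "http://www.lyra.org/repos/$svn/ver"

def rev_url(rev, path):
    """Version resource URL keyed by revision number."""
    return f"{BASE}/{rev}/{path}"

def node_url(node_id, path):
    """Version resource URL keyed by the opaque node ID string."""
    return f"{BASE}/{node_id}/{path}"

# Hypothetical history: foo.c did not change between revisions 67 and 72,
# so both revisions map to the same opaque ID.
FOO_IDS = {67: "32.7", 72: "32.7"}
PATH = "somedir/foo.c"

def same_entity(rev_a, rev_b):
    """Two revisions name the same entity iff their ID-based URLs match."""
    return node_url(FOO_IDS[rev_a], PATH) == node_url(FOO_IDS[rev_b], PATH)
```

With revision-based URLs the two strings differ even though the file is
unchanged; with ID-based URLs a plain string comparison answers the question.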

>...
> > The alternative, of course, is to leave things at an old revision number
> > until a real change arrives, but then our state reporting grows and grows as
> > we get more exceptions throughout the tree (by "exception" I mean a child
> > needing to report a revision that is different than the parent's).
>
> That's been the plan, yup, but it's not such an awful price to pay, is
> it?

Actually, I think it will be. Commits do not involve every file, so only a
few files would have their state updated each time. And since a change
*anywhere* in the repository bumps the number, the numbers just zoom
upwards. The state in your working copy will get highly fragmented, thus
creating a large amount of data to report to the server.
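
A toy Python sketch of how the exceptions pile up. This is not the real
working copy format; the path-to-revision dict and the report_exceptions
name are assumptions for illustration only.

```python
import posixpath

def report_exceptions(wc_revs):
    """Return the paths whose revision differs from their parent's.

    wc_revs maps a working-copy path to its revision; "" is the root,
    and every parent directory must be present as a key.
    """
    exceptions = []
    for path, rev in sorted(wc_revs.items()):
        if path == "":
            continue  # the root has no parent to differ from
        parent = posixpath.dirname(path)
        if wc_revs[parent] != rev:
            exceptions.append(path)
    return exceptions

# Hypothetical working copy, everything at revision 67.
wc = {"": 67, "somedir": 67, "somedir/foo.c": 67, "somedir/bar.c": 67}
wc["somedir/bar.c"] = 68   # commit bar.c alone
wc["somedir/foo.c"] = 72   # later, commit foo.c alone
fragmented = report_exceptions(wc)
```

Each isolated commit adds one more entry that has to be reported to the
server, until a full update brings everything back to a single revision.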

> And it's not so simple to avoid -- in general, we can't
> automatically tweak the revision numbers for unaffected files, because
> for all we know they *might* have changed in the repository. Suppose
> bar.c and foo.c are siblings in dir D:
>
> 1. If we commit bar.c and get a new rev in return, we cannot assume
> that foo.c also moves to that rev, because we haven't updated
> foo.c.
>
> 2. Likewise, if we update bar.c individually, we still can't change
> foo.c's rev, because we haven't updated foo.c itself. Maybe it
> has changed in the repository and we just don't know it.
>
> 3. If we update the whole dir D, of course, then both foo.c and
> bar.c will move to the latest available revision, along with
> everything else.

Agreed, but a "svn update" will bring all the revs up to par with each
other *if* we report all the changes on *each* rev change. If we don't
report them, then an "svn update" will remain fragmented.

Basically, a quick explanation would be:

1) Lazy updating of rev information. This minimizes the server *response*
   during an update (it doesn't have to say "update <that> but you don't
   need to fetch a new copy"), but it fragments the WC, which maximizes the
   *request* sent to the server.

2) Always update the rev information. This minimizes the *request* since we
   end up with fewer revision exceptions, but the *response* is larger since
   we need to send statements for the client to update the rev but not
   fetch.

I'm suggesting that the model that I'm interested in using for DAV is based
on URLs that change only when the resource itself changes on the server.
Thus, we skip both of the above problems. The client state is minimized
because we can always update the rev numbers, and they can be updated
without also updating the version resource URLs. That in turn means we can
use an exception-style model for responses, thus minimizing both requests
and responses.

[ actually, on an update response, there is a singular rev number for the
whole response, plus any necessary changes in the version resource URLs ]
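
Roughly, such a response could be computed as below. This is only a sketch:
the dict layout and the build_update_response name are assumptions, not the
DAV wire format.

```python
def build_update_response(new_rev, client_urls, server_urls):
    """Build an exception-style update response.

    client_urls and server_urls map a path to its version resource URL.
    The response carries one revision number for the whole tree, plus
    only the URLs that differ from what the client already holds.
    """
    changed = {
        path: url
        for path, url in server_urls.items()
        if client_urls.get(path) != url
    }
    return {"rev": new_rev, "changed": changed}
```

Unchanged files cost nothing in the response, and the client can still stamp
the whole tree with the single revision number.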

> > So... back to the original question about the FS. I believe that I'm going
> > to need a revision number, huh? An ID and path isn't enough to do the work?
> > Ooh... or could I go ahead and do an "open_root" on the latest rev, then an
> > "open_node" with an ID/path. If that ID/path does not occur within that rev,
> > then I get an error. But having the revision at the open_root also means
> > that I have all the information needed (revision, path, with ID as a bonus).
> > Will that work?
>
> Not sure I understand the question/problem... (?)
>
> Sorry for being dense. Could you describe it untersely?

Sorry about that. Kind of a stream-of-consciousness there.

Presume that I have an ID and a path, extracted from the URL:

http://www.lyra.org/repos/$svn/ver/32.7/somedir/foo.c

I cannot simply open the node using the ID/path pair (it is not unique). The
revision is required.

However, during a commit, we must operate against the latest revision. Thus,
when I go to "open" the provided ID/path, I can instead open latest/path and
validate the ID matches. If not, then we punt the commit (it means the
client does not have the latest revision).
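
A sketch of that check in Python. The in-memory "revisions" table stands in
for the FS, and open_for_commit and OutOfDateError are hypothetical names,
not the actual FS interface.

```python
class OutOfDateError(Exception):
    """The client's ID for a path does not match the latest revision."""

def open_for_commit(revisions, node_id, path):
    """Open latest/path and validate that its ID matches the client's.

    revisions maps a revision number to a {path: node_id} dict.
    Returns (revision, path, node_id) on success.
    """
    latest = max(revisions)
    current_id = revisions[latest].get(path)
    if current_id != node_id:
        # Punt the commit: the client is not working from the latest node.
        raise OutOfDateError(
            f"{path}: client has {node_id}, latest is {current_id}")
    return latest, path, current_id
```

Opening at the latest revision yields the revision, path, and ID together,
so nothing beyond the ID/path pair from the URL is needed from the client.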

This scheme will also require a way to fetch a node given an ID/path (the
path is used for ACLs; the content only needs the ID). The fetching is done
during an update -- the client does a GET using the ID/path URL.
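
For illustration, the fetch side might look like the following. The ACL
representation (a path-to-readers mapping) and the fetch_node name are
assumptions; the only point taken from above is that the path drives the
access check while the ID alone selects the content.

```python
def fetch_node(node_contents, readers, user, node_id, path):
    """Return the content for node_id after checking read access on path.

    node_contents maps an opaque node ID to its content.
    readers maps a path to the set of users allowed to read it.
    """
    if user not in readers.get(path, set()):
        raise PermissionError(f"{user} may not read {path}")
    return node_contents[node_id]
```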

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Received on Sat Oct 21 14:36:18 2006
