[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Don't use entries for checking status in svn_wc_status3_t. Was: Re: [WIP] Some quirky parts of libsvn_wc/status.c:assemble_status() for retrieving revisions

From: Greg Stein <gstein_at_gmail.com>
Date: Fri, 23 Apr 2010 06:23:49 -0400

On Fri, Apr 23, 2010 at 05:38, Daniel Näslund <daniel_at_longitudo.com> wrote:
> I set out to replace the entry field in svn_wc_statusX_t. I thought that
> ideally I could just keep the semantics from the entry but move out the
> fields to svn_wc_status2_t.


I would say, "if callers want entry_t semantics, then they should use
status->entry. new fields should use wc-ng semantics ONLY."

Migrate away from entry_t and its semantics. Creating new fields with
the same semantics doesn't get us very far. The status->entry can
linger around and hold its old semantics.

> I wanted to keep the current behaviour since the status code is involved
> in so _many_ tests and rewriting all of those would cause me to loose
> momentum. Guess, what? I've already lost that momentum. :)

I suspect you mean svn/status-cmd.c rather than "status code". The
real point is "how does 'svn status' present the values in
svn_wc_status3_t?". You can provide wc-ng semantics in that structure,
and then let status-cmd map those into old-style semantics and

Then, in a future step, we can discuss altering the output of
status-cmd to make more sense and to eliminate all the ugliness.

> I guess I should do what you suggested: Make assemble_status() simpler
> to insure we that we can easily describe what it's returning. On the

Bingo. entry->revision is very difficult to describe, which means most
users get it wrong. They want one idea of "revision", but
entry->revision doesn't always match that idea.

Answer: have a very clear and simple definition of "status->revision",
and if they want something else, then they ask that clearly and
specifically via other APIs.

> other hand, I don't see the scanning process as something expensive
> comparing to 'text compares'.

Scanning up the tree will get to be expensive when you do it for EVERY
node in a large tree. But I'm not so worried about the time-cost, as I
am about the semantics of these things. As I mentioned, the code in
your patch had a long and complicated meaning behind "revision". When
it gets that complicated, it means the field cannot reliably be used.
"Is the node in the right status for me to use status->revision?

> And I question if the API we're exposing
> really should demand that the callers do any extra work for fetching
> something as fundamental as a revision.

Heh. "fundamental". Define "revision". read_info defines it as "what
is the revision of the node I'm looking at right now?" And that has a
simple answer "what you checked out" or "doesn't have one cuz you
haven't committed your changes".

> AFAIK, the only thing that is
> not clear is how we handle working changes below a deleted node. If we
> get that one settled, then why not return the _right_ revision at once?

Again, what the hell is the right revision? The child below that
deleted node has at least three revisions:

1) undefined. you haven't committed
2) a BASE node was here, but isn't now. provide that BASE node's revision
3) the deleted node was copied from somewhere, and has a corresponding
repository node. what is that revision.

So tell me... which of those three definitions do you want for your
"revision". And is this truly "fundamental" or a difficult and
scenario-specific question?

entry->revision blended many of these simple definitions into one,
making the field very hard to use. The code generally used it
*unaware* of the real semantics and what the value truly meant. I am
hoping to alter that code so it *knows* what it is looking at. And if
"BASE revision, or -1 cuz you haven't committed" is not right for that
code, then it can ask for exactly what it truly needs.

> I mean, for a VCS where the notion of globally recognized identifiers
> for change sets is a central part, there should be little vagueness as
> to which one we really mean for a path! The users of our API will hardly
> want to use another behaviour than what the CLI exposes.
> Since I'm easily persuaded, below is my idea of what to do to get to the
> point where assemble_status() only returns the most basic info about the
> revision of a node. (I really should skip step (i) and rewrite it to be
> part (ii) at once but I've put so much effort into that patch. I just
> want to commit it! Hardly a good argument, though.)
> Suggested work flow
> ====================
> i) Commit the current patch with a few mods to correct the behaviour for
> changed_rev, entries-checking and some '###' comments in
> assemble_status() clearly stating our intent to return a more
> well-defined revision (less ambiguity).

The work in that patch to define status->revision doesn't move us
forward from status->entry->revision. If current code wants the old
semantics, then status->entry is still there.

> ii) Create a svn_wc__node_get_revision_of_deleted() or similar to invoke
> when status->text_status == svn_wc_status_deleted. That one should be
> able to cope with changes below the base_del_abspath when returning the
> abspath. Remove relevant parts from assemble_status() and invoke the
> node function instead from svn/status.c.

First, "deleted" is a concept of the node, not the "text" or the props
(ie. prop_status). That is seriously bogus, and should be ripped out.
The status structure should have something similar to
svn_wc__db_status_t. And then text/props can have simple flags like
"have these been modified?"

> iii) Replace the status->entry->lock_{token, comment,...} with
> status->lock_{token, comment, ... }

Yes. An alternative would be "svn_boolean_t locked;" .. but provided
the lock information is easy/cheap, so we may as well. ie. we gotta
try and fetch the lock info to report it exists, so let's just provide
all the info.

> iv) Remove the entry field and use another way for checking if a path is
> versioned or not instead of just checking if status->entry is set.

svn_boolean_t versioned;

> v) Create the svn_status2_from_3() logic to create an entry for
> svn_wc_status2_t.

svn_wc__get_entry(&status2->entry, ...);

> vi) Change the output of 'svn status' to set uncommitted paths as having
> rev ' ' instead of '0'.

Do anything you like in status-cmd to retain compatibility. My concern
is around a cogent and understandable set of semantics for

> vii) Do the same for 'svn info' after removing the uses of entries in
> there. build_info_for_entry() is a rat nest of entry usages.

Eesh. That structure is nearly as bad, and my head aches from "sheesh.
MORE stuff to clean up?!?!"

Received on 2010-04-23 12:24:25 CEST

This is an archived mail posted to the Subversion Dev mailing list.