Re: [PATCH] Don't use entries for checking status in svn_wc_status3_t. Was: Re: [WIP] Some quirky parts of libsvn_wc/status.c:assemble_status() for retrieving revisions

From: Daniel Näslund <daniel_at_longitudo.com>
Date: Fri, 23 Apr 2010 11:38:55 +0200

On Thu, Apr 22, 2010 at 05:07:56PM -0400, Greg Stein wrote:
> On Thu, Apr 22, 2010 at 15:48, Daniel Näslund <daniel_at_longitudo.com> wrote:
>
> >> read_info() returns changed_* information for copied nodes. A copied
> >> node has a corresponding node in the repository, and the changed_*
> >> information reflects that. (thus, your assemble_status should not be
> >> wiping the values out... if the caller wants to interpret added nodes
> >> that way, then it can)
>
> >> Personally, I would suggest that assemble_status() does NOTHING more
> >> than returning read_info's REVISION result. That is the "ideal" value
> >> for the revision of the node LOCAL_ABSPATH.
> >>
> >> If that revision is NOT what you want (ie. you have some other rules
> >> like "well... if it is added, then ..." or "if it is deleted, then
> >> ..."), then the caller should perform those calculations explicitly.
> >> The status code should not pre-suppose what rules the caller may want
> >> for revision handling. It should simply report the "ideal" revision
> >> for that node and be done with it.
> >
> > WHAT? You're saying that all the scanning should take place in the
> > client callback?
>
> No. I said "don't create strange semantics".
>
> The code you had in there had the following definition for status->revision:
>
> "The revision of the BASE node, unless that has been deleted, then: if
> nothing was added, then still the same BASE revision, but if something
> was added, then SVN_INVALID_REVNUM. BUT! If you delete a child of that
> addition, then the BASE version MAY be returned again... we say "may"
> because if a BASE doesn't correspond to the deleted child of the add,
> then you'll get an error instead."
>
> The definition that read_info uses is:
>
> "The revision of the unmodified (BASE) node, or SVN_INVALID_REVNUM if
> any (structural) changes have been made to that node."
>
> (where "structural" means an add/copy/move/delete; we don't mean
> content edits like propchanges or editing the text)
>
> > That's a major change from what I've done so far. That
> > I should just return SVN_INVALID_REVNUM for cases where I've scanned for
> > deletion? Is that a behaviour that everyone agrees on? You're saying
> > personally so I'm assuming there hasn't been a discussion about it.
>
> No, I don't think it should do a scan_add, nor a scan_deletion. Report
> the add/delete, and leave it at that.
>
> repos_relpath, repos_root_url, and repos_uuid are all inherited
> values, so I believe a scan_base_repos makes sense: you're simply
> returning those values. You're not looking for further details about
> operations (which the caller may not be interested in anyways).
>
> (and note that scan_base_repos doesn't even have to be used cuz
> assemble_status could be passed the parent repos_* values; if the BASE
> node hasn't been switched and (therefore) returns NULL for repos_*,
> then its repos_* values are easily derived within wc/status.c; but
> repos_* values will always be NULL for modified nodes (ie. if a
> WORKING node exists))
>
> Scanning additional rows is not "free", so we shouldn't be looking up
> information from the ancestors unless/until we find that most/all of
> the callers need that information. And in some cases, we can even make
> status.c fill in the info, since it has seen the ancestors already.
>
> In the future, we may return added/copied/moved_here distinctly so
> that a scan won't be needed to resolve that (tho you'd still have to
> scan to locate the root of the operation, tho we may be able to
> optimize that inside wc_db).

I set out to replace the entry field in svn_wc_statusX_t. I thought that
ideally I could just keep the semantics from the entry but move out the
fields to svn_wc_status2_t.

I wanted to keep the current behaviour since the status code is involved
in so _many_ tests and rewriting all of those would cause me to loose
momentum. Guess, what? I've already lost that momentum. :)

I guess I should do what you suggested: Make assemble_status() simpler
to insure we that we can easily describe what it's returning. On the
other hand, I don't see the scanning process as something expensive
comparing to 'text compares'. And I question if the API we're exposing
really should demand that the callers do any extra work for fetching
something as fundamental as a revision. AFAIK, the only thing that is
not clear is how we handle working changes below a deleted node. If we
get that one settled, then why not return the _right_ revision at once?
I mean, for a VCS where the notion of globally recognized identifiers
for change sets is a central part, there should be little vagueness as
to which one we really mean for a path! The users of our API will hardly
want to use another behaviour than what the CLI exposes.

Since I'm easily persuaded, below is my idea of what to do to get to the
point where assemble_status() only returns the most basic info about the
revision of a node. (I really should skip step (i) and rewrite it to be
part (ii) at once but I've put so much effort into that patch. I just
want to commit it! Hardly a good argument, though.)

Suggested work flow
====================
i) Commit the current patch with a few mods to correct the behaviour for
changed_rev, entries-checking and some '###' comments in
assemble_status() clearly stating our intent to return a more
well-defined revision (less ambiguity).

ii) Create a svn_wc__node_get_revision_of_deleted() or similar to invoke
when status->text_status == svn_wc_status_deleted. That one should be
able to cope with changes below the base_del_abspath when returning the
abspath. Remove relevant parts from assemble_status() and invoke the
node function instead from svn/status.c.

iii) Replace the status->entry->lock_{token, comment,...} with
status->lock_{token, comment, ... }

iv) Remove the entry field and use another way for checking if a path is
versioned or not instead of just checking if status->entry is set.

v) Create the svn_status2_from_3() logic to create an entry for
svn_wc_status2_t.

vi) Change the output of 'svn status' to set uncommitted paths as having
rev ' ' instead of '0'.

vii) Do the same for 'svn info' after removing the uses of entries in
there. build_info_for_entry() is a rat nest of entry usages.

Thanks,
Daniel
Received on 2010-04-23 11:40:00 CEST

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]