[14:56] CIA-7: pburba * r22139 /branches/ood-status-info/subversion/ (include/svn_repos.h libsvn_repos/rev_hunt.c): Cure(?) for cripplingly slow performance when finding deleted rev of ood path. ... [15:14] pburba: rooneg: Did you see my post a few minutes back to the 'Re: [PATCH] Was: Enhancement needed in svn status -u' thread? I think that anwers your question re r22139. If not let me know and I'll attempt a better/more detailed explanation. [15:15] rooneg: i still don't see how an API that professes to get you the most recent time this thing was deleted can possibly work via binary search. [15:16] glasser: rooneg: "this thing" is the key phrase [15:16] glasser: paul changed the definition of "this thing" in this patch :) [15:16] cmpilato: to what? [15:16] glasser: or, hmm, i forget, does svn_fs_compare_ids include the copy id? [15:17] glasser: IIUC, he was first asking for "last time this path was deleted", now it's "the time that PATH@REV was deleted" [15:17] cmpilato: no. [15:17] cmpilato: wait. sorry. [15:18] cmpilato: yes, it looks at copy ids. [15:18] cmpilato: (for the equivalence check) [15:18] pburba: glasser is right, I changed the function, possibly my new doc string is misleading [15:20] glasser: so equivalence means that "modified a file and its props" is equivalent but "copied" or (current definition of) "moved" or "resurrected" (really copied) is not equivalent, so I think it works [15:21] rooneg: ok, i guess that makes sense [15:22] pburba: svn_fs_compare_ids() returns -1 (unrelated), 0 (equivalent), or 1(otherwise related, e.g. moved, copied) [15:23] glasser: (note that my last statement was more of a question :) ) [15:23] pburba: Though my understanding regarding FS nodes is limited, I might be misunderstanding the meaning of the 1 [15:24] cmpilato: 1 means the nodes share a "line of history" [15:24] cmpilato: one is a ancestor of the other, or they share a common ancestor. [15:24] cmpilato: (second cousins thrice removed, or something) [15:26] pburba: Ok, then I think I'm using the return from svn_fs_compare_ids() correctly, at least as regards what I state svn_repos_deleted_rev() does. [15:27] cmpilato: paul, can you describe the algorithm you're using? i saw something about a binary search, and can't fathom how that would pan out. [15:29] pburba: Ok, the problem is given a path, a start rev and end rev, find the rev that path@start was deleted (if it was) [15:30] pburba: I order start and end, so start < end, and then make sure path@start exists, and path@end doesn't. [15:30] pburba: If either of the last two conditions are true I just return SVN_INVALID_REVNUM [15:30] cmpilato: ok. [15:33] * cmpilato now thinks he can fathom a way a binary search could pan out. [15:33] pburba: Store the node id for path@start, set mid_rev = start + end / 2, get the revision root for path@mid_rev, compare with () [15:34] pburba: If compare result is -1 then we know path was deleted at a lower rev, [15:34] pburba: If it is 0 we know it's a higher rev... [15:34] pburba: If 1 it's also a *lower* rev, since this is "otherwise related" but not equivalent [15:34] cmpilato: If 1, you don't know. [15:35] pburba: Oh, damn, really... [15:35] pburba: shit [15:35] glasser: Well, if 1 it might not have been deleted, right? [15:35] glasser: or does replace-with-history count as deleted? [15:35] cmpilato: if 1, it might simply have been modified another time before being deleted. [15:35] cmpilato: or, it might have been deleted and resurrected. [15:35] cmpilato: so the deletion point remains a mystery. [15:36] pburba: If you jsut change it the compare returns 0 [15:36] cmpilato: now, fortunately, in the 1 case, you can ask, "when was the last time this mid-point was copied?" [15:37] * cmpilato checks himself. [15:37] glasser: cmpilato: well, it if something is deleted in rD, then comparing any of the IDs between 'start' and rD will give 0, and comparing any of those to any later will give non-zero (no matter what comes after), right? [15:38] cmpilato: no. [15:38] cmpilato: 0 means "this is the exact same node-revision" [15:38] cmpilato: -1 means "no relation" [15:38] glasser: Oh, so modifies *do* change nodeid? [15:38] cmpilato: no, they change node-revision-ID (of which the first component is node-id) [15:39] cmpilato: CAUTION: confusing terminology ahead. [15:39] pburba: So svn_fs_compare_ids() is 0 in that case right? [15:40] cmpilato: nope, 1. [15:40] cmpilato: related, but not equal. [15:41] cmpilato: svn_fs_compare_ids() *cannot* distinguish between simple mods and copies. [15:41] pburba: Ok, I must have screwed up when I was debugging this, I thought it was 0. But all is not lost no, you said I can check when last copied...is there a possible solution there? [15:41] cmpilato: it's is basically an is_related() check, with a bonus third return value that means, "not just related, but the same dang thing." [15:42] pburba: Ok [15:42] cmpilato: possible solutions depend heavily on the problem's definition. [15:42] cmpilato: i'm not convince the definition is tight enough. [15:43] cmpilato: if a node is moved (copy+del), then moved back where it was, then deleted, which rev are you interested in? [15:43] pburba: The delete that is part of the first move, as far as status is concerned that's enough I think, because... [15:44] pburba: the path's parent dir will have the out of date info for the subsequent delete. [15:44] pburba: But maybe Im rationalizing the support that limited case [15:45] cmpilato: well, certainly that's the easiest of the options. [15:45] rooneg: yeah, i think that's wrong, you'd be hoping for the second delete, since that's the most recent, right? [15:47] pburba: Hmmm...ok, you might be right [15:47] pburba: Thinking... [15:47] cmpilato: i think this might depend on what ood_* is used for. [15:48] cmpilato: i mean, what utility is in that whole set of status data? [15:49] cmpilato: back in the day, the whole point of svn status -u was to answer the question, "If I were to update, what changes will I get?" [15:50] cmpilato: for paths that still exist in HEAD, it makes sense to populate based on the last-committed information for those paths. [15:50] pburba: But for Subclipse it's useful to know, what revision to I need to update to so my WC is not out of date [15:50] cmpilato: but for deleted stuff, i think we're in sufficiently vague territory. [15:50] pburba: But if it doesn't exist in head? [15:51] cmpilato: pburba: why? will folks choose to update to something < HEAD? [15:52] *** markphip has joined #svn-dev. [15:53] cmpilato: pburba: watch out. the boss is looking. [15:53] markphip: Paul said you are being mean [15:53] pburba: I did not! [15:53] cmpilato: . o O ( or did i get the association backwards? ) [15:53] pburba: No you got it right [15:53] markphip: Why does Subclipse need accurate delete info? [15:54] peterS: pburba: btw, for the linear search backward for a deleted file, you can at least skip every other revision, on the theory that something can't appear and then disappear in less than 2 revisions [15:54] cmpilato: phrased like that, the question is uninteresting. [15:54] * cmpilato prefers, "What is the definition of 'accurate delete info'?" [15:54] markphip: The revision when the item was deleted? [15:54] peterS: markphip: you mean when it was _last_ deleted? [15:54] cmpilato: or _first_ deleted? [15:55] markphip: We have this graphical Synchronize view that is basically driven by svn status and lets you update/compare/commit [15:55] pburba: peterS: Yes, I realized that today, but it would still be too slow I think in many cases no? [15:55] markphip: In my case first [15:55] markphip: But any would do [15:55] peterS: pburba: I don't see any way around it, unless you cache "when was a file last seen" somewhere [15:55] cmpilato: nono. "any would do" is not gonna pan out well for a public API. [15:55] markphip: One of the features of the view is to group the items by commit. So for deletes, they will not go into the correct grouping with an accurate revision [15:56] markphip: The second issue is update. Right now, if you select a deleted item we force the revision to HEAD [15:56] markphip: But this is a GUI, and you could have selected other items where HEAD is bad [15:56] markphip: Especially if you have spent an hour using the GUI merge tool to merge/inspect the changes [15:56] markphip: and HEAD is not the same as when you started [15:57] cmpilato: there are three points of possible interest that i can see: [15:57] pburba: peterS: glasser suggested that as well, if it comes to that...:-( [15:57] cmpilato: (1) first time after REV that PATH@REV was deleted [15:58] cmpilato: (2) last REV in which PATH@REV existed [15:58] pburba: i.e. (1) what Im doing now in that most recent commmit [15:58] cmpilato: (3) last REV in which PATH@REV existed and was related to PATH@START_REV [15:58] markphip: In Subclipse either #1 or 3 would be acceptable [15:59] peterS: 1 and 3 are much easier to find than 2, right? [15:59] cmpilato: heh. i just remembered that we did something like this in viewvc. [16:00] * cmpilato reads the viewvc algorithm. [16:00] markphip: I see no little value in 2 [16:00] pburba: (3) is going to be O(n) [16:00] markphip: If you have 1 or 3 you can get it [16:00] peterS: yes, 2 is useful for "I deleted this file a long time ago, now I need it back" [16:00] cmpilato: def last_rev(svnrepos, path, peg_revision, limit_revision=None): [16:01] cmpilato: """Given PATH, known to exist in PEG_REVISION, find the youngest [16:01] cmpilato: revision older than, or equal to, LIMIT_REVISION in which path [16:01] cmpilato: exists. Return that revision, and the path at which PATH exists in [16:01] cmpilato: that revision.""" [16:02] markphip: My issue is that I am showing them that an item has been deleted, and I am showing them a revision number, If user looks at that revision, I'd want the affected paths to show that file was deleted in that rev [16:02] cmpilato: ah, sweet. apparently i hit the same issue: [16:02] cmpilato: ### Not quite right. Need a comparison function that only returns [16:02] cmpilato: ### true when the two nodes are the same copy, not just related. [16:02] cmpilato: cmp = fs.compare_ids(orig_id, mid_id) [16:04] markphip: Not to complicate things, it would be nice once this is settled if we could make svn log work for a deleted revision [16:04] rooneg: what do you mean by "work for a deleted revision"? [16:05] markphip: If you have a file in your WC that has been deleted in repos, svn log will only work if you know the revision it was deleted and do not go beyond it in your range [16:05] markphip: In other words, cannot use HEAD [16:06] rooneg: ahh, yeah, that would be nice [16:06] cmpilato: these problems all fall into the domain of "forward history searching", and that's complicated stuff. [16:06] cmpilato: if your file has been copied 15 times between the revision you have and HEAD, which one are you asking for logs on? [16:06] markphip: If people did not want to redefine HEAD, perhaps a new DELETED keyword could be added that resolved to HEAD for non-deleted items [16:09] cmpilato: of course, the definition of "deleted" plays there, too. [16:09] cmpilato: meaning, it's not as simple as saying HEAD == DELETED when object exists in HEAD, right? [16:09] markphip: In the simple case it does. [16:09] markphip: What do you mean by copied? [16:10] cmpilato: object must exist in HEAD and not have been copied between PEG_REV and HEAD [16:10] cmpilato: 'svn copy' [16:10] markphip: When you say copied, I think of being copied to another location. Why would that matter? [16:10] markphip: If it were moved and then moved back or something I could see complicatiobn [16:11] pburba: Because if you deleted it, then moved/copied the copy back? [16:11] markphip: That is why I asked what he meant by copied. Obviously copy is not the issue [16:11] cmpilato: svn log follows changes in paths backwards in time; why would we make a blanket disallowance of following changes in path forwards in time, too (unless we simply deemed the implementation and/or UI too cumberson) [16:11] cmpilato: s/cumberson/cumbersome. [16:13] markphip: If you copy a file, change the original, change the copy, delete the original, copy the current copy back to the original location, I'd assume that is not the same file [16:14] cmpilato: that's a fair assumption. [16:15] cmpilato: implementing such a check doesn't come for free, but... [16:15] cmpilato: anyway. we're off-topic here. [16:15] markphip: Of course if you move a file, edit it and eventually move it back...from a user's point of view they might think it the same [16:15] cmpilato: we want to solve your deleted-rev problem. [16:15] markphip: I understand the relationship to the issue though. Paul and I had talked about it. [16:16] markphip: My feeling was that if I have a rev of a file in my WC, that the Synch view is telling me it is going to delete when I update, I'd only really care about the first time since my REV when it was deleted [16:17] cmpilato: that's fine. [16:17] pburba: Also, if it was moved, then moved back, then deleted, the path's parent dir would know when it was deleted [16:17] pburba: The second and final delete that is [16:17] pburba: If that's worth anything anyway... [16:17] cmpilato: so, paul's algorithm is close to correct, then, i think. [16:18] cmpilato: when it gets '1' for compare_ids(), it needs to run closest_copy on PATH@MID to see if that sucker has been copied. [16:18] cmpilato: if not, MID is the new START; loop [16:19] cmpilato: if so, the revision it was copied in becomes the new END; loop [16:20] markphip: I am going to bow out. I have to go lower some people's morale before they leave for the weekend :) [16:20] pburba: He's a joy to work for! [16:20] markphip: since I am being portrayed as your phb I figured I need to live up to it [16:21] pburba: cmpilato: Ok, I follow. That is a fairly minor tweak. [16:25] pburba: Are there any outstanding objections to this then, rooneg? peterS? [16:26] peterS: don't ask me, I'm still slightly confused about the exact requirements. if others don't object, I won't either [16:27] cmpilato: :-) [16:27] cmpilato: i [16:27] cmpilato: i'm not sure how the corner cases pan out. [16:27] pburba: peterS: It will be in the ood branch first anyway, awaiting further abuse :-) [16:27] cmpilato: such as: path is replaced in some revision with a copy of itself. [16:28] cmpilato: well, i guess that's no worse than being off-by-1 during the binary search ... whittling the space away to nothingness. [16:29] pburba: Any others bothering you? [16:29] cmpilato: people? or corner cases? [16:29] pburba: corner cases [16:29] cmpilato: not at the moment.