Re: svn_ra_get_file_revs2 vs. blame vs. FS API
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Mon, 17 Feb 2014 12:28:42 +0000 (GMT)
Stefan Fuhrmann wrote:
BDB and FSFS and FSX all implement the comparison in exactly the same way: "does it have the same rep-key?". This does not mean "is the content identical?", but it does not necessarily mean "has there been an intervening change to the content?" either. It only means "was the same representation instance chosen to represent this content?". For example, I think it might be possible for this to return "different" even if there has been no intervening change to the content, if the rep cache was disabled and re-enabled during a series of commits, although that depends on even more implementation details.
One thing we can say for certain is that, despite the doc string, these APIs have always been implemented to return "the same" in some cases when the content is known to be the same, and return "different" in cases where the content might be different but might not be.
Some of the callers use the call to optimize whether to fetch and compare the content. That's fine of course.
Some other callers expose this behaviour to higher level APIs, where the caller might be expecting a definite answer to "Does the content differ?", and that's where the problems begin.
Both semantics are valid, of course, so we might end up wanting to provide both.
> svn_ra_get_file_revs2 uses svn_fs_contents_changed to decide whether
And note that the doc string for svn_ra_get_file_revs2() explicitly says these false positives can be returned.
> However, our blame implementation relies on those false positives because
Without following the details, I think what you mean about merges is that the blame algorithm is trying to report an event whenever a merge happened, even if that merge did not result in a content change. The gist of it seems to be that the blame implementation assumes:
* that a new content rep-key is assigned whenever a merge is committed, even if the content is unchanged, but not at other times when the content is unchanged; and
* that svn_ra_get_file_revs2() will notify if and only if a new rep-key is assigned.
or something very much like that. Those assumptions look like the real problem here.
> I see the following options:
We should deprecate them, and adjust their doc strings with a historical note about the discrepancy, and replace them with one or two new (pairs of) APIs:
-- a quick optimization API -- the definitely/maybe question -- like the existing implementations but documented properly and named appropriately; and
-- (perhaps) a definite content comparison API: "has the content changed?"
> * Make svn_ra_get_file_revs2 always accept a delta handler unless the
I don't understand this one at all. Do you mean "call the delta handler unless..."? What does "same branch" mean in Subversion? Why would a prop change be considered an empty delta?
> * Make svn_ra_get_file_revs2 accept delta handlers (roughly) as today
Not sure what you mean there, exactly.
> * Make the blame implementation update the mainline blame info even
Sorry, I can't follow that one either without digging into the blame code.
Clearly the blame implementation needs to be changed, but if we can start by determining what question it needs to know the answer to and define suitable APIs for answering it, that would be a good way to proceed.
- Julian
> I suggest to do any of the last two or even both.
|
This is an archived mail posted to the Subversion Dev mailing list.
This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.