[ moving to dev@ ]
Following up on a discussion on the users list about the lack of an
easy way to find the rev number in which a file was deleted...
I already referred there to issue #3627 (FS API support for
oldest-to-youngest history traversal) and to FS-NG, as mentioned on the
roadmap. But the discussion continued about why this is so hard right
now, and whether there are alternative approaches. See below...
On Mon, Nov 29, 2010 at 3:51 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> Johan Corveleyn wrote on Sun, Nov 28, 2010 at 21:20:28 +0100:
>> On Sun, Nov 28, 2010 at 6:35 PM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> > Stefan Sperling wrote on Sun, Nov 28, 2010 at 16:48:30 +0100:
>> >> The real problem is that we want to be able to answer these questions
>> >> very fast, and some design aspects work against this. For instance,
>> >> FSFS by design does not allow modifying old revisions. So where do
>> >> we store the copy-to information for a given path@N?
>> >
>> > copy-to information is immutable (never changes once created), so we
>> > could add another hierarchy (parallel to revs/ and revprops/) in which
>> > to store that information. Any 'cp foo@N bar' operation would need to
>> > create/append a file in that hierarchy.
>> >
>> > Open question: how to organize $new_hierarchy/16/16384/** to make it
>> > efficiently appendable and queryable (and for what queries? "Iterate
>> > all copied-to places" is one).
>> >
>> > Makes sense?
>>
>> I'm not sure. But there is another alternative: while we wait for
>> FS-NG (or another solution like you propose), one could implement the
>> "slow" algorithm within the current design.
>
> Are you advocating implementing it in the core (as an svn_fs_* API) or
> as a third-party script? The latter is certainly fine, but regarding
> the former I don't see the point of adding an API that cannot be
> implemented efficiently at this time.
Why not in the core? "We can't do this quickly, so we don't do it" is
not a very strong argument against having this very useful
functionality IMHO.
Having it in the core is vastly more useful for people like me (and my
colleagues): works on Windows, regardless of whether or not one has
perl/python installed, no need to distribute an additional script,
guaranteed to be available everywhere an svn client is installed, ...
It's actually quite similar to the way "blame" is implemented
currently: we don't really have the design (line-based information) to
do this quickly, but we calculate it from the other information that
we have available (in a way that could also be done by a script on the
client: diffing every interesting revision against the next,
remembering the lines that were added/removed in every step). Can you
imagine not having blame in svn core just because we can't do it
quickly? Ok, blame may be a more important use case than "finding the
rev number where a file was deleted", but still ...
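To make the comparison concrete, here is a rough sketch of that
"diff every revision against the next" idea. It is only an
illustration, not how svn actually implements blame: the function name
blame() and the in-memory list of (rev, lines) pairs are made up for
the example, and Python's difflib stands in for svn's own diff code.

```python
from difflib import SequenceMatcher


def blame(versions):
    """Annotate each line of the youngest version with the rev that added it.

    `versions` is a hypothetical input: a list of (rev, lines) pairs,
    oldest first, where `lines` is the file content at that revision.
    """
    first_rev, prev_lines = versions[0]
    # Every line of the oldest version is attributed to that revision.
    annotated = [(first_rev, line) for line in prev_lines]

    for rev, lines in versions[1:]:
        matcher = SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        new_annotated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep the revision they already had.
                new_annotated.extend(annotated[i1:i2])
            else:
                # 'insert' and 'replace' introduce lines in this revision;
                # for 'delete' the range j1:j2 is empty, so nothing is added.
                new_annotated.extend((rev, line) for line in lines[j1:j2])
        annotated, prev_lines = new_annotated, lines

    return annotated
```

The cost grows with the number of revisions that touched the file,
which is exactly the "not quick, but still useful" trade-off.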
So I still think it's definitely worth it to have this in the core and
offer an API, and implement it slowly now because that's the only way
we can do it (besides, I don't think it will be *that* slow). And
"optimize" it later when we have FS-NG, or another way to retrieve
this info quickly...
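For what it's worth, the "slow" approach (the binary search mentioned
in the quoted text) could look roughly like the sketch below. The names
are hypothetical: find_deletion_rev() is not an existing API, and
exists_at(rev) is a callback the caller would have to supply, for
example by asking the repository whether the path exists at that
revision. It also assumes the path, once deleted, is not re-added at
the same location within the searched range; otherwise the search only
finds one of the deletions.

```python
from typing import Callable, Optional


def find_deletion_rev(exists_at: Callable[[int], bool],
                      known_good: int,
                      head: int) -> Optional[int]:
    """Return the first revision in (known_good, head] where the path is gone.

    Returns None if the path still exists at `head`.
    """
    if not exists_at(known_good):
        raise ValueError("path does not exist at known_good")
    if exists_at(head):
        return None  # not deleted anywhere in the range

    # Invariant: the path exists at `lo` and is missing at `hi`.
    lo, hi = known_good, head
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exists_at(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

That needs about log2(head - known_good) existence checks, so even for
a repository with a million revisions it is around 20 round trips.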
However, all of that doesn't change the fact that someone still needs
to implement it, and I must admit I don't have the cycles for that
currently :-(.
Cheers,
Johan
>> Just automating what a
>> user (or script) currently does when looking for this information,
>> i.e. a binary search.
>>
>> Of course it would be slow, but it would certainly already provide
>> value. At the very least, it saves users a lot of time searching FAQs
>> and list archives, wondering why this doesn't work, understanding the
>> design limitations, and then finally implementing their own script or
>> doing a one-time manual search.
>>
>> Then, when FS-NG arrives, or someone comes up with a way to index this
>> information, it can be "optimized".
>>
>> I don't know if there would be fundamental problems with that, apart
>> from the fact that someone still needs to implement it of course ...
>>
>> Cheers,
>> --
>> Johan
>
Received on 2010-11-29 10:14:58 CET