[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Deprecating svn_fs_id_t

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Tue, 31 Dec 2013 18:20:38 +0100

On Sun, Dec 22, 2013 at 3:01 PM, Branko Čibej <brane_at_wandisco.com> wrote:

> On 22.12.2013 13:56, Stefan Fuhrmann wrote:
> For some time now, this thing has been a nuisance to me
> in various situations. Recent discussions prompted me
> to have a closer look at this and now I want to see it
> removed from the API (maintaining it in the deprecated
> section). Here is why; further down is how.
> First of all, we've been discussing at length using the FS IDs
> (effectively, svn_fs_id_t) in mergeinfo, in order to avoid the mess that
> arises from trying to track renames based on the mergeinfo source path. If
> you throw them out, we'll just have to re-introduce them again.

Fair point. IIRC, however, there may be changes to the way
nodeIDs will be assigned etc., i.e. the relatedness reported
by compare(idA,idB) may be different for those kinds of IDs
useful in merge tracking.

So, chances are that this will be an svn_fs_id2_t - if only
for changed semantics. But I will keep the the svn_fs_id_t
in the API for now and only reduce its usage.

> ---
> svn_fs_id_t is ill-conceived and unnecessary concept with
> an under-developed API:
> * Ill-conceived - as this is not used as an ID / key to
> anything in the rest of the API. The API looks like
> get_infoA(root, path)
> get_infoB(root, path)
> instead of
> get_ID(root, path)
> get_infoA(id)
> get_infoB(id)
> So, we rather exposed an implementation detail and had
> to deprecate svn_fs_parse_id() early on to prevent the
> most severe consequences.
> That's not an argument for deprecating FS IDs; it's an argument for
> deprecating the whole FS tree API.

It is an argument for "FS ID is not what you would a generic
FS layer ID to be" and that it is only a minor element of the
API that can be deprecated with reasonable effort.

You are right that we should revise the usage of paths and IDs
in FS2. I'm slightly in favour of using IDs (e.g. as glorified pointers
to a DAG node) over paths in FS2.

 * Unnecessary - as it gets in the way instead of actually
> contributing to an operation. We use svn_fs_id_t to test
> for noderev identity and relationship but all code uses
> the following pattern:
> id_a = get_ID(rootA, pathA)
> id_b = get_ID(rootB, pathB)
> compare(id_a, id_b)
> Shorter and in keeping with the rest of the API, we should
> simply do
> compare(rootA, pathA, rootB, pathB)
> Again, as above; this is an argument for adding and using "compare"
> function variant. Or just designing a completely new API.

I don't think we have enough information to start just yet.
Some things that may impact the API design: rename
tracking, in-memory commits, multi-threaded commits,
operation sequence numbers, meta-data storage.

Maybe, the Berlin hackathon would be a good place to
collect input start discussion.

> There are two more uses. One for "debugging" (svnlook)
> which could use a similar private API call. The other
> one is as a basis for an optimization and there is an
> alternative implementation for that.
> * Under-developed - as it does not provide the API that
> it could.
> Still not an argument for derpecation; it's an argument for writing new
> APIs, if/when it turns out they're needed.
> For instance, it might return a root object
> (telling us the txn / rev that the noderev was created in).
> There is also no check on the copy id sub-struct that
> might answer a question like: refer these noderevs to
> the same node with no copy operation in between?
> id_a = get_ID(rootA, path)
> id_b = get_ID(rootB, path)
> if (same_node(id_a, id_b) && same_copy(id_a, id_b))
> doIt()
> Note that the IDs in this scenario are still derived
> from Subversion's versioned file system semantics and
> independent from any concrete implementation.
> How / what
> -----------
> * Introduce a root+path based svn_fs_compare
> Makes sense ...

I've got a patch prepared adding the above path-based
compare function. Will be committed shortly.

> and deprecate
> svn_fs_compare_ids as well as svn_fs_check_related.
> ... but this doesn't. See above about mergeinfo.

Ok. I will leave them for now. However, I found another
issue with the way we use these today and will discuss
that in a separate post.

 * Deprecate svn_fs_dirent_t and replace it with svn_fs_dirent2_t
> containing no ID member. Bump svn_fs_dir_entries API as well.
> I suspect you want this in order to reduce chache footprint. Maybe we just
> shouldn't cache whole svn_fs_dirent_t structs?

Yes, memory footprint (FSFS + FSX), storage and parsing
overhead (can only be changed in FSX). The original idea
was to create proper svn_fs_id_t only at the API level as
most directory access is purely internal as part of some path

If we *were* to deprecate svn_fs_id_t, however, even most
API users wouldn't suffer the overhead (only deprecated API
users were still hurt).

> * Similarly bump svn_fs_path_change2_t plus related API.
> The current implementation of svn_fs_path_change2_t and
related API is completely broken (and nobody cared). Yet
another separate post to come.

> * Deprecate svn_fs_unparse_id and svn_fs_node_id. Introduce
> svn_fs__unparse_id and svn_fs__node_id as private API for
> debugging and tools like svnlook. Also, use these to
> implement the deprecated directory and changed path list APIs.
> Again, see above about mergeinfo. Also, I can't see a good reason for
> deprecating this at all. You're not required to use any of these functions.
> You don't deprecate a public API just because you don't like it.

If you need *this* ID type, we will keep it of course.
But there are various issues with its API that I like
to fix. Deprecating that API entirely would simply be
the easiest and cleanest approach.

-- Stefan^2.
Received on 2013-12-31 18:21:21 CET

This is an archived mail posted to the Subversion Dev mailing list.