
Re: RFE: API for an efficient retrieval of server-side mergeinfo data

From: Marc Strapetz <marc.strapetz_at_syntevo.com>
Date: Fri, 14 Feb 2014 17:39:58 +0100

On 14.02.2014 14:18, Marc Strapetz wrote:
>>> Can we think of a better way to design the API so that it returns the
>>> interesting data without all the redundancy? Basically I think we want to
>>> describe changes to mergeinfo, rather than raw mergeinfo.
>>
>> Marc,
>>
>> Perhaps a better way to ask the question is: Can I encourage you to
>> write the API that you want? You already designed a cache for the
>> data. What is the shape of the data in your cache, and can the API
>> get the data you want in the form you want it, directly? We'd be glad
>> to help implement it. Even if you start with an API which simply
>> iterates over a range of revisions, at least that would allow for the
>> possibility of improving the efficiency internally at various layers.
>
> Looks like our emails have crossed :) I'll dig into the cache code and
> will try to come back with a more detailed API suggestion soon.

I have done that now, and the storage is quite simple: we have a main
file which contains the mergeinfo diff (added, removed) for every path
in every revision, and a revision-based index file with constant record
length (to quickly locate entries in the main file).

This storage allows us to efficiently query the mergeinfo diff for a
path in a certain revision. That's sufficient to build the merge arrows.
Assembling the complete mergeinfo for a certain revision is hard with
this cache, but it is not actually necessary for our use case.
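
To illustrate the lookup, here is a minimal in-memory sketch of the
two-file idea. It is not our actual cache code: the class name
MergeinfoDiffCache, the record layout (8-byte offset plus 4-byte length)
and the string serialization of the diff are all assumptions made for
the example, and byte buffers stand in for the two files.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the cache: a "main file" plus a fixed-record "index file". */
final class MergeinfoDiffCache {
    // Assumed index record layout: 8-byte offset into the main file + 4-byte payload length.
    static final int RECORD_LENGTH = Long.BYTES + Integer.BYTES;

    private final ByteArrayOutputStream main = new ByteArrayOutputStream();
    private final ByteArrayOutputStream index = new ByteArrayOutputStream();

    /** Appends the serialized diff of the next revision; revisions are appended in order from 0. */
    void append(String serializedDiff) {
        byte[] payload = serializedDiff.getBytes(StandardCharsets.UTF_8);
        ByteBuffer record = ByteBuffer.allocate(RECORD_LENGTH);
        record.putLong(main.size());      // where this revision's entry starts in the main file
        record.putInt(payload.length);    // how many bytes the entry occupies
        index.write(record.array(), 0, RECORD_LENGTH);
        main.write(payload, 0, payload.length);
    }

    /** Returns the serialized diff of a revision; an empty string means "no mergeinfo diff". */
    String read(long revision) {
        byte[] indexBytes = index.toByteArray();
        if (revision < 0 || revision >= indexBytes.length / RECORD_LENGTH) {
            throw new IllegalArgumentException("revision not cached: " + revision);
        }
        // Constant record length: the index position is computed, not searched for.
        int position = Math.toIntExact(revision * RECORD_LENGTH);
        ByteBuffer record = ByteBuffer.wrap(indexBytes, position, RECORD_LENGTH);
        long offset = record.getLong();
        int length = record.getInt();
        return new String(main.toByteArray(), Math.toIntExact(offset), length,
                          StandardCharsets.UTF_8);
    }
}
```

Because every index record has the same length, locating revision r
needs only one multiplication and one read of the index, followed by one
read of the main file. A real implementation would presumably seek in
files instead of copying byte arrays.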

Hence an API like the following should work well for us:

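interface MergeinfoDiffCallback {
  // Invoked once for every revision that has a mergeinfo diff.
  // 'revision' is a long, matching fromRev/toRev below.
  void mergeinfoDiff(long revision,
                     Map<String, Mergeinfo> pathToAddedMergeinfo,
                     Map<String, Mergeinfo> pathToRemovedMergeinfo);
}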

void getMergeinfoDiff(String rootPath,
                      long fromRev, long toRev,
                      MergeinfoDiffCallback callback)
                      throws ClientException;

This should give us all mergeinfo which affects any path at or below
rootPath.

Disregarding our particular use case, a more consistent API could be:

void getMergeinfoDiff(Iterable<String> paths,
                      long fromRev, long toRev,
                      Mergeinfo.Inheritance inherit,
                      boolean includeDescendants,
                      MergeinfoDiffCallback callback)
                      throws ClientException;

The mergeinfo diffs should be delivered starting at fromRev and ending
at toRev. No callback is expected for a revision which has no mergeinfo
diff. Depending on the server-side storage, the API may need to require
that always fromRev >= toRev or that always fromRev <= toRev. If it
doesn't matter, it's better to always have fromRev <= toRev (for reasons
given below).

Regarding the usage, let's assume that always fromRev <= toRev. We would
then invoke

getMergeinfoDiff(cacheRoot, 0, head, callback)

This should start returning mergeinfo diffs immediately, starting at
revision 0, so we quickly make at least a bit of progress. Now, if the
cache building process is shut down and restarted later, it will resume
with the latest known revision:

getMergeinfoDiff(cacheRoot, latestKnownRevision, head, callback)

This procedure will be repeated until we have caught up with head.
Note that latestKnownRevision is the last revision for which we have
received a callback. Depending on the server-side storage, this may
differ from the revision which the server is processing at the time the
cache building process is shut down. Hence it is important that ranges
without any mergeinfo diff are processed quickly on the server side;
otherwise we could run into a kind of endless loop if the cache
building process is shut down and resumed frequently.
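
The resume procedure can be sketched roughly as follows. Everything here
is a stand-in invented for the example: FakeServer plays the role of the
proposed getMergeinfoDiff (without rootPath), the diffs are plain
strings instead of Mergeinfo maps, and a Shutdown exception thrown from
the callback simulates the cache building process being stopped.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical cache builder that resumes from the last revision it got a callback for. */
final class ResumableCacheBuild {
    /** Simplified stand-in for the proposed callback; the diff is a plain string here. */
    interface DiffCallback {
        void mergeinfoDiff(long revision, String diff);
    }

    /** Thrown from the callback to simulate a shutdown of the cache building process. */
    static final class Shutdown extends RuntimeException {}

    /** Fake server which only stores revisions that actually changed mergeinfo. */
    static final class FakeServer {
        final NavigableMap<Long, String> diffs = new TreeMap<>();

        void getMergeinfoDiff(long fromRev, long toRev, DiffCallback callback) {
            // Assumes fromRev <= toRev: ascending order, no callback for revisions without a diff.
            for (Map.Entry<Long, String> e : diffs.subMap(fromRev, true, toRev, true).entrySet()) {
                callback.mergeinfoDiff(e.getKey(), e.getValue());
            }
        }
    }

    final NavigableMap<Long, String> cache = new TreeMap<>();
    long latestKnownRevision = 0;

    /** Runs one session; returns true if the whole range up to head was delivered. */
    boolean runSession(FakeServer server, long head, int maxCallbacks) {
        int[] remaining = {maxCallbacks};
        try {
            server.getMergeinfoDiff(latestKnownRevision, head, (revision, diff) -> {
                if (remaining[0]-- == 0) {
                    throw new Shutdown();       // simulated shutdown before storing this revision
                }
                cache.put(revision, diff);      // a re-delivered revision is simply overwritten
                latestKnownRevision = revision;
            });
            return true;
        } catch (Shutdown e) {
            return false;
        }
    }
}
```

Note that resuming at latestKnownRevision delivers that revision a
second time, so in this sketch storing a diff has to be idempotent, and
a session only makes progress if it survives at least one callback
beyond the re-delivered revision.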

-Marc
Received on 2014-02-14 17:41:24 CET

This is an archived mail posted to the Subversion Dev mailing list.
