
Re: RFE: API for an efficient retrieval of server-side mergeinfo data

From: Branko Čibej <brane_at_wandisco.com>
Date: Fri, 14 Feb 2014 12:31:12 +0100

On 14.02.2014 11:38, Julian Foad wrote:
> Marc Strapetz wrote:
>> For SmartSVN we are optionally displaying merge arrows in the Revision
>> Graph. Here is a sample image showing how this looks:
>>
>> http://imgur.com/MzrLq00
>>
>> From the JavaHL sources I understand that there is currently only one
>> method to retrieve server-side mergeinfo and this one works on a single
>> revision only:
>>   Map<String, Mergeinfo> getMergeinfo(Iterable<String> paths,
>>                                       long revision,
>>                                       Mergeinfo.Inheritance inherit,
>>                                       boolean includeDescendants)
> Right. This is a wrapper around the core library function svn_ra_get_mergeinfo().
>
>> This makes the Merge Arrow feature practically unusable for larger graphs.
>>
>> To improve performance, in earlier versions we were using a client-side
>> mergeinfo cache (similar to the main log cache, which TSVN uses as
>> well). However, populating this cache (i.e. querying for mergeinfo for
>> *every* revision of the repository) often resulted in bringing the
>> entire Apache server down, especially if many users were building their
>> log cache at the same time.
>>
>> To address these problems, it would be great to have a more powerful
>> API that can retrieve all mergeinfo either for a *revision range* or
>> for a *set of revisions*.
> The request for a more powerful API certainly makes sense, but what form of API?
>
> In the Subversion project source code:
>
> # How many lines/bytes of mergeinfo in trunk, right now?
> $ svn pg -R svn:mergeinfo | wc -lc
> 245 24063
>
> # How many branches and tags?
> $ svn ls ^/subversion/tags/ ^/subversion/branches/ | wc -l
> 288
>
> # Approx. total lines/bytes of mergeinfo per revision (288 branches/tags + trunk = 289 trees)?
> $ echo $((245 * 289)) $((24063 * 289))
> 70805 6954207
>
> So in each revision there are roughly 70,000 lines of mergeinfo, occupying 7 MB in plain text representation.
>
> The mergeinfo properties change whenever a merge is done. All other commits leave all the mergeinfo unchanged. So mergeinfo is unchanged in, what, 99% of revisions?
>
> It doesn't seem logical to simply request all the mergeinfo for each revision in turn, and return it all in raw form.
>
> Can we think of a better way to design the API so that it returns the interesting data without all the redundancy? Basically I think we want to describe changes to mergeinfo, rather than raw mergeinfo.
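
To make "changes to mergeinfo, rather than raw mergeinfo" concrete, here is a minimal Python sketch of what such a delta could look like. It is an illustration only, not a proposed API: the function names parse_mergeinfo and mergeinfo_delta are hypothetical, the parser is simplified, and it deliberately drops the non-inheritable '*' marker.

```python
def parse_mergeinfo(text):
    """Parse svn:mergeinfo text into {source_path: set_of_revisions}.

    Simplified on purpose: the non-inheritable '*' marker is dropped.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The revision list follows the last colon on the line.
        path, _, ranges = line.rpartition(":")
        revs = result.setdefault(path, set())
        for item in ranges.split(","):
            item = item.strip().rstrip("*")
            if not item:
                continue
            if "-" in item:
                start, end = item.split("-")
                revs.update(range(int(start), int(end) + 1))
            else:
                revs.add(int(item))
    return result


def mergeinfo_delta(old, new):
    """Return (added, removed) revisions per merge source path."""
    added, removed = {}, {}
    for path in old.keys() | new.keys():
        before = old.get(path, set())
        after = new.get(path, set())
        if after - before:
            added[path] = after - before
        if before - after:
            removed[path] = before - after
    return added, removed


old = parse_mergeinfo("/branches/a:10-12\n/branches/b:20")
new = parse_mergeinfo("/branches/a:10-14\n/branches/b:20\n/branches/c:30*")
added, removed = mergeinfo_delta(old, new)
```

For a revision that performs no merge, both dictionaries come back empty, so under this scheme nothing would need to be transferred for the large majority of revisions.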

I wonder, Julian, could something like this be useful for improving
merge in general?

We know that clients can cache most of the mergeinfo in the repository,
if they want to; I just don't have any feeling for how much sense it
would make to maintain such a cache, and if it can be made smart enough
to speed up merging significantly.
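
As a rough illustration of what such a client-side cache might look like, here is a minimal Python sketch that stores mergeinfo only for revisions where it actually changed and answers lookups for any revision in between. The class MergeinfoCache and its methods are hypothetical names, not part of any Subversion API, and the sketch says nothing about whether this would speed up merging in practice.

```python
import bisect


class MergeinfoCache:
    """Keeps a mergeinfo snapshot only for revisions where it changed."""

    def __init__(self):
        self._revisions = []    # sorted revisions with a stored snapshot
        self._snapshots = {}    # revision -> mergeinfo text
        self._last_seen = None  # newest revision passed to record()

    def record(self, revision, mergeinfo):
        """Record mergeinfo for a revision; return True if it was stored."""
        if self._last_seen is not None and revision <= self._last_seen:
            raise ValueError("revisions must be recorded in increasing order")
        self._last_seen = revision
        if self._revisions:
            latest = self._snapshots[self._revisions[-1]]
            if latest == mergeinfo:
                return False  # unchanged, nothing new to store
        self._revisions.append(revision)
        self._snapshots[revision] = mergeinfo
        return True

    def lookup(self, revision):
        """Return the mergeinfo in effect at a revision, or None if unknown."""
        if self._last_seen is None or revision > self._last_seen:
            return None  # beyond what the cache has seen
        index = bisect.bisect_right(self._revisions, revision)
        if index == 0:
            return None  # older than the first cached revision
        return self._snapshots[self._revisions[index - 1]]


cache = MergeinfoCache()
cache.record(100, "/branches/a:90-99")
cache.record(101, "/branches/a:90-99")          # not stored, unchanged
cache.record(150, "/branches/a:90-99,120-149")  # stored, a merge happened
```

With mergeinfo unchanged in most revisions, a cache of this shape would hold one entry per merge commit rather than one per revision.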

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com
Received on 2014-02-14 12:31:49 CET

This is an archived mail posted to the Subversion Dev mailing list.
