
Re: RFE: API for an efficient retrieval of server-side mergeinfo data

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 14 Feb 2014 10:38:00 +0000 (GMT)

Marc Strapetz wrote:
> For SmartSVN we are optionally displaying merge arrows in the Revision
> Graph. Here is a sample image showing how this looks:
>
> http://imgur.com/MzrLq00
>
> From the JavaHL sources I understand that there is currently only one
> method to retrieve server-side mergeinfo and this one works on a single
> revision only:
>
> Map<String, Mergeinfo> getMergeinfo(Iterable<String> paths,
>                                     long revision,
>                                     Mergeinfo.Inheritance inherit,
>                                     boolean includeDescendants)

Right. This is a wrapper around the core library function svn_ra_get_mergeinfo().

> This makes the Merge Arrow feature practically unusable for larger graphs.
>
> To improve performance, in earlier versions we were using a client-side
> mergeinfo cache (similar to the main log cache, which TSVN uses as
> well). However, populating this cache (i.e. querying for mergeinfo for
> *every* revision of the repository) often brought the entire Apache
> server down, especially if many users were building their log cache at
> the same time.
>
> To address these problems, it would be great to have a more powerful
> API, which allows retrieving all mergeinfo either for a *revision
> range* or for a *set of revisions*.

The request for a more powerful API certainly makes sense, but what form of API?

In the Subversion project source code:

  # How many lines/bytes of mergeinfo in trunk, right now?
  $ svn pg -R svn:mergeinfo | wc -lc
    245   24063

  # How many branches and tags?
  $ svn ls ^/subversion/tags/ ^/subversion/branches/ | wc -l
  288

  # Approx. total lines/bytes of mergeinfo per revision, assuming every branch
  # and tag carries as much as trunk (288 + 1 for trunk itself)?
  $ echo $((245 * 289)) $((24063 * 289))
  70805 6954207

So in each revision there are roughly 70,000 lines of mergeinfo, occupying about 7 MB in plain text representation.

The mergeinfo properties change whenever a merge is done. All other commits leave all the mergeinfo unchanged. So mergeinfo is unchanged in, what, 99% of revisions?

It doesn't seem logical to simply request all the mergeinfo for each revision in turn, and return it all in raw form.
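
To put a rough number on that, here is a back-of-the-envelope sketch in Python using the per-revision estimate from above. The revision count of 10,000 is a made-up figure for illustration only, not a measurement of any repository, and the 1% figure is just the guess from the previous paragraph.

```python
# Rough cost of fetching raw mergeinfo for every revision in turn.
BYTES_PER_REVISION = 24063 * 289   # per-revision estimate from the figures above


def raw_transfer_bytes(bytes_per_revision, revisions):
    """Total bytes if the full mergeinfo is returned for each revision."""
    return bytes_per_revision * revisions


def revisions_with_changes(revisions, changed_percent):
    """Number of revisions that touch mergeinfo at all (integer arithmetic)."""
    return revisions * changed_percent // 100


REVISIONS = 10_000       # hypothetical repository size, for illustration
CHANGED_PERCENT = 1      # the "99% unchanged" guess, not a measured value

total = raw_transfer_bytes(BYTES_PER_REVISION, REVISIONS)
changed = revisions_with_changes(REVISIONS, CHANGED_PERCENT)
print(f"{total / 1e9:.1f} GB of raw mergeinfo, "
      f"but only {changed} revisions differ from their predecessor")
```

Under those assumptions nearly all of the transferred data would be a repeat of what the client already received for the previous revision.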

Can we think of a better way to design the API so that it returns the interesting data without all the redundancy? Basically I think we want to describe changes to mergeinfo, rather than raw mergeinfo.
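
As a minimal sketch of what "changes to mergeinfo" could mean, the Python below compares the svn:mergeinfo property value of one path before and after a commit and reports only the revisions that were added or removed per merge source. The function names are hypothetical and not part of any existing Subversion API; the sketch ignores the non-inheritable marker ("*") and assumes well-formed input.

```python
def parse_rangelist(text):
    """Expand a rangelist such as '3-5,8,10-11*' into a set of revisions.

    The trailing '*' (non-inheritable marker) is dropped in this sketch.
    """
    revs = set()
    for item in text.split(","):
        item = item.strip().rstrip("*")
        if not item:
            continue
        if "-" in item:
            start, end = item.split("-", 1)
            revs.update(range(int(start), int(end) + 1))  # ranges are inclusive
        else:
            revs.add(int(item))
    return revs


def parse_mergeinfo(prop_value):
    """Parse an svn:mergeinfo value: one 'source-path:rangelist' per line."""
    info = {}
    for line in prop_value.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon, since the source path may contain colons.
        source, _, rangelist = line.rpartition(":")
        info[source] = parse_rangelist(rangelist)
    return info


def mergeinfo_delta(old_value, new_value):
    """Return {source: (added_revs, removed_revs)} for sources that changed."""
    old, new = parse_mergeinfo(old_value), parse_mergeinfo(new_value)
    delta = {}
    for source in sorted(old.keys() | new.keys()):
        before, after = old.get(source, set()), new.get(source, set())
        if before != after:
            delta[source] = (sorted(after - before), sorted(before - after))
    return delta


# Illustrative values only, not taken from a real repository.
before = "/trunk:10-20\n/branches/feature:25"
after = "/trunk:10-22\n/branches/feature:25"
print(mergeinfo_delta(before, after))   # {'/trunk': ([21, 22], [])}
print(mergeinfo_delta(before, before))  # {}
```

A revision that performs no merge yields an empty delta, so a response built this way would carry data only for the small share of revisions in which mergeinfo actually changes.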

- Julian

> Querying a set of revisions would be more flexible and would allow
> generating merge arrows on the fly. On the other hand, to relieve the
> server, it's desirable to cache retrieved mergeinfo on the client side
> anyway, hence a range query would be fine as well.
>
> -Marc
>
Received on 2014-02-14 11:38:40 CET

This is an archived mail posted to the Subversion Dev mailing list.
