
svn log improvement proposals

From: Eirik Bjørsnøs <eirbjo_at_gmail.com>
Date: Sat, 6 Sep 2008 00:39:23 +0200


I recently mentioned[1] some issues I was facing adding merge tracking
features to my version control history search tool [2].

Having spent the last few days doing some actual work on this feature,
I have run into some issues with the Subversion log APIs/protocols.

Using the current API there's no way for me to specify the exact kind
and amount of information I want from svn log.

Specifically, here are the issues I'm facing:

A) Can't specify depth of include-merged-revisions (-g)

If a merged revision also has a merged revision it will be included in
the log. (And that revision can again be a merge of a merge of a merge
many levels down..) Since I look at the repository log as a whole, it
doesn't make sense to log merged revisions more than one level deep.
Any information about "nested" merges will have appeared earlier in
the log.

The following thread discusses a --depth parameter, but I can't seem
to find it leading to any conclusion:

Proposal: Add a --depth switch to svn log (and to the protocols; simple
client-side filtering won't be very useful, since the server would still
send everything).
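
To make the intended meaning of a depth limit concrete, here is a minimal
Python sketch of the client-side equivalent. It assumes that "svn log --xml
-g" nests each merged revision as a child <logentry> of the revision that
merged it. The function name limit_merge_depth and its max_depth parameter are
made up for this illustration. It only shows the semantics; the server still
sends the full tree, which is why the switch needs to exist in the protocols.

```python
import xml.etree.ElementTree as ET


def limit_merge_depth(log_xml: str, max_depth: int = 1) -> ET.Element:
    """Drop nested <logentry> elements deeper than max_depth.

    Depth 0 is a top-level ("actual") revision, depth 1 is a revision
    merged by it, depth 2 is a merge of a merge, and so on.
    Assumes merged revisions are nested <logentry> children.
    """
    root = ET.fromstring(log_xml)

    def prune(entry: ET.Element, depth: int) -> None:
        # Copy the child list first because we remove while iterating.
        for child in list(entry.findall("logentry")):
            if depth >= max_depth:
                entry.remove(child)
            else:
                prune(child, depth + 1)

    for top in root.findall("logentry"):
        prune(top, 0)
    return root
```

With max_depth=1 a merge of a merge is dropped, while directly merged
revisions are kept.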

B) Can't limit information returned specifically about merged revisions.

The -v option currently applies to merged revisions as well as the
"actual" revisions. I need full information about the actual
revisions, but for merged revisions I only need the number of the
revision. Everything else I already have cached.

For my specific use case a special "--just-revision-for-merged" would
be ok. However, for consistency it's probably better to add versions
of --verbose, --with-all-revprops and --with-revprop that apply
specifically to merged revisions.

Proposal: Add merge-specific versions of --verbose,
--with-all-revprops and --with-revprop.

C) SVN protocol documentation: "rev-props excludes author, date, and log"

Eric Gillespie suggested I pass an empty array for revprops to
svn_ra_get_log2. I don't use the API but implement the SVN and DAV
protocols directly; however, the idea still applies. Using the SVN
protocol I can do this by sending an empty list of revprops like " (
) " in the log request.

The SVN protocol documentation [3] states that "rev-props excludes
author, date, and log; they are sent separately for
backwards-compatibility", but this doesn't seem to be quite up to
date. Using Subversion 1.5.1 I'm very much able to specify which of
these I want transferred. Older servers seem to send them anyway.
Perhaps the protocol spec should be updated?

Proposal: Make it clear in the SVN protocol spec that you actually can
select which (if any) of author, date and log you want returned.
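
For readers unfamiliar with the wire syntax, here is a small Python sketch of
how items are serialized in the SVN protocol as I understand it from the
protocol document [3]: words and numbers are followed by a space, strings are
length-prefixed, and lists are wrapped in "( ... ) ". The Word class and the
encode function are helper names invented for this example, and the sketch
deliberately does not reproduce the full parameter layout of the log command.

```python
class Word(str):
    """Marks a bare protocol word, as opposed to a length-prefixed string."""


def encode(item) -> bytes:
    """Serialize one item in the ra_svn wire syntax (illustrative only)."""
    if isinstance(item, bool):  # check before int: bool is a subclass of int
        return b"true " if item else b"false "
    if isinstance(item, Word):  # check before str: Word is a subclass of str
        return item.encode("ascii") + b" "
    if isinstance(item, int):
        return str(item).encode("ascii") + b" "
    if isinstance(item, str):
        data = item.encode("utf-8")
        return str(len(data)).encode("ascii") + b":" + data + b" "
    if isinstance(item, (list, tuple)):
        return b"( " + b"".join(encode(i) for i in item) + b") "
    raise TypeError(f"cannot encode {type(item).__name__}")


no_revprops = encode([])               # the empty revprops list
only_author = encode(["svn:author"])   # ask for a single revprop
```

Here no_revprops is the " ( ) " form mentioned above, and only_author shows
what a one-element revprops list looks like.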

D) DAV protocol: Can't specify "return no revprops at all"

This isn't an issue for me right now, but more of a lack of protocol
consistency that I've noticed. Using DAV against a 1.5 server I'm able
to specify which revprops I want (or that I want all revprops). But
unlike the SVN protocol there doesn't seem to be an easy way of
specifying that I want none of them. (Adding an empty or nonexistent
<S:revprop> works, but I assume that's more of a side effect than an
intentional part of the protocol.)

Proposal: Include a new XML tag "<S:revprops>" containing any
<S:revprop> the client wants. An empty "<S:revprops>" would mean "no,
don't send any revprops". Alternatively, document that
"<S:revprop></S:revprop>" gives no revprops and allow it in the CLI.

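To show what proposal D could look like in a request, here is a hedged Python
sketch that builds a log report body. The <S:revprops> wrapper is the
hypothetical element proposed above and is not understood by any server today.
The other element names (log-report, start-revision, end-revision,
all-revprops, revprop) are written from my reading of what a 1.5 client sends
and should be checked against mod_dav_svn. The function name build_log_report
is made up for this example.

```python
import xml.etree.ElementTree as ET

SVN_NS = "svn:"
ET.register_namespace("S", SVN_NS)


def build_log_report(start, end, revprops=None) -> str:
    """Build a log report body.

    revprops=None   -> ask for all revprops
    revprops=[...]  -> ask only for the named revprops, wrapped in the
                       PROPOSED (hypothetical) <S:revprops> element;
                       an empty list means "send no revprops at all".
    """
    def tag(name):
        return "{%s}%s" % (SVN_NS, name)

    root = ET.Element(tag("log-report"))
    ET.SubElement(root, tag("start-revision")).text = str(start)
    ET.SubElement(root, tag("end-revision")).text = str(end)
    if revprops is None:
        ET.SubElement(root, tag("all-revprops"))
    else:
        wrapper = ET.SubElement(root, tag("revprops"))  # hypothetical element
        for name in revprops:
            ET.SubElement(wrapper, tag("revprop")).text = name
    return ET.tostring(root, encoding="unicode")


print(build_log_report(1, 30000, revprops=[]))
```

The point of the wrapper is that "no revprops" becomes an explicit, empty
element instead of relying on the side effect described above.
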
E) mergeinfo ranges in svn log

As a tool vendor integrating with Subversion I try very hard to hit
the servers as lightly as I can. That means doing as few and
lightweight requests as possible. I noticed that very often merged
revisions returned by "svn log -g" aren't cherry-picked, but part of a
range. For wire efficiency it would be great if these ranges could be
expressed as ranges on the wire.

So something like:

<logentry revision="1"/>
<logentry revision="2"/>
<logentry revision="3"/>
<logentry revision="7"/>

Could become something like:

<logentries range="1-3"/>
<logentry revision="7"/>

Or perhaps just:

<merged-logentries revisions="1-3,7"/>

(This would obviously not work if revprops or changed paths are
requested, but for my use case where this is the only info I need
about merged revisions, this would be just perfect!)

Proposal: Add <merged-logentries revisions="1-3,7"/>, perhaps
triggered by an alternative "-g" switch like "--g-revisions"
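
As an illustration of the range notation, here is a short Python sketch that
collapses a list of revision numbers into a string such as "1-3,7" and expands
it again. The function names compress_revisions and expand_revisions are
invented for this example; the string format is just the one suggested above,
not an existing Subversion format.

```python
def compress_revisions(revs):
    """Collapse revision numbers into a compact string such as '1-3,7'."""
    ranges = []
    for rev in sorted(set(revs)):
        if ranges and rev == ranges[-1][1] + 1:
            ranges[-1][1] = rev          # extend the current run
        else:
            ranges.append([rev, rev])    # start a new run
    return ",".join(
        str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges
    )


def expand_revisions(spec):
    """Expand a string such as '1-3,7' back into a list of revisions."""
    revs = []
    for part in spec.split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        revs.extend(range(int(lo), int(hi or lo) + 1))
    return revs


print(compress_revisions([1, 2, 3, 7]))  # prints 1-3,7
print(expand_revisions("1-3,7"))         # prints [1, 2, 3, 7]
```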

---- end of issues ---

While I fully understand that these issues are probably not at the top
of the list of things to fix right now, I'd very much appreciate a bit
of discussion around them. In particular, issues A and B combined make
it pretty much impossible for me to implement merge tracking in
SVNSearch without major performance hits to both the indexed
Subversion servers and to SVNSearch itself.

To illustrate the performance problem I did some testing using

First, I tested verbose svn log without merge tracking:

svn log --xml -v -r1:HEAD
25MB. (Took 50 seconds)

Then I tested with merge tracking turned on using -g:

svn log --xml -v -g -r1:30000 (actually broken into several smaller requests)
883MB in total. (Took an order of magnitude more than 50 seconds. I got
hungry waiting :-)


[1] http://subversion.tigris.org/servlets/BrowseList?list=dev&by=thread&from=674885
[2] http://svnsearch.org/svnsearch/repos/SVN/search
[3] http://svn.collab.net/repos/svn/trunk/subversion/libsvn_ra_svn/protocol

Received on 2008-09-06 00:39:36 CEST

This is an archived mail posted to the Subversion Dev mailing list.