
RFC: Flattening out 'svn log -g'

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Thu, 10 Jan 2008 02:07:46 -0500

I was trying to help Hyrum out by looking into the log-tests 18 failure.
This was my first time seeing 'svn log -g' output in a while, and it
took a while to understand what I was reading. (Karl assisted greatly
by changing the text used to report merged-via info.) Anyway, I
noticed an absurd amount of duplicated data in the output stream of
'svn log -g' as run on the test data for this test. (See
http://paste.lisp.org/display/53925 to see what I see.) It's a hard
problem, trying to take naturally tree-like, nested data and display
it in a flat output. But here's a half-baked idea for doing so that I
wanted to bounce off of folks.

First, a nasty ASCII branch diagram which is supposed to correspond to
my later sample output:

       --|-----|-----------------|--------------
      /  2     3                 6              \
     /                                           \
   |-----------------|-----|-----------------|-----|-----|------> /trunk
   1                 4 \   5                 8     /9   / 10
                        \                         /    /
                         --|---------------------------
                           5       \            /
                                    \          /
                                     ----------
                                     7

Disclaimer: I'm quite sleepy, so try not to get bogged down in
specific revision numbers used in my sample output. I probably goofed
up in a few spots.

Now, today's 'svn log -g' code will, I believe, transmit the following
stream of revisions (shown nested based on each log entry's has-children bit)
across the wire for a log of /trunk in the diagram above. (The nesting
indicates that some revisions are part of the history of the requested
object because they were merged into that history as part of some
natural revision of the object in question.)

    r10
       r5
       r4
       r1
       --
    r9
       r6
       r3
       r2
       r1
       --
       r7
       r5
       r4
       r1
       --
    r8
    r5
    r4
    r1

I was thinking tonight about all those duplicated revisions, and
wondering if they couldn't be flattened out by the client, with dupes
removed and revisions kept in order. The proposed output would be
something like:

    ------------------------------------------------------------------------
    r10 | ...
    Changed paths:
       ...
    Lineage: natural

    ------------------------------------------------------------------------
    r9 | ...
    Changed paths:
       ...
    Lineage: natural

    ------------------------------------------------------------------------
    r8 | ...
    Changed paths:
       ...
    Lineage: natural

    ------------------------------------------------------------------------
    r7 | ...
    Changed paths:
       ...
    Lineage: merged via r9

    ------------------------------------------------------------------------
    r6 | ...
    Changed paths:
       ...
    Lineage: merged via r9

    ------------------------------------------------------------------------
    r5 | ...
    Changed paths:
       ...
    Lineage: natural, merged via r10, r9

    ------------------------------------------------------------------------
    r4 | ...
    Changed paths:
       ...
    Lineage: natural, merged via r10, r9

    ------------------------------------------------------------------------
    r3 | ...
    Changed paths:
       ...
    Lineage: merged via r9

    ------------------------------------------------------------------------
    r2 | ...
    Changed paths:
       ...
    Lineage: merged via r9

    ------------------------------------------------------------------------
    r1 | ...
    Changed paths:
       ...
    Lineage: natural, merged via r10, r9

    ------------------------------------------------------------------------
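
As a rough illustration of the client-side flattening described above,
here is a minimal Python sketch. It is not Subversion code: the
(revision, merged_via) tuple form of the stream and the names
NESTED_STREAM, flatten and format_lineage are all assumptions made up
for this example. It buffers the whole stream before printing anything,
because a revision's natural occurrence can arrive after its merged ones.

```python
# Hypothetical in-memory form of the nested stream shown earlier:
# (revision, merged_via), where merged_via is None for a natural revision
# and otherwise the top-level revision that merged it in.
NESTED_STREAM = [
    (10, None), (5, 10), (4, 10), (1, 10),
    (9, None), (6, 9), (3, 9), (2, 9), (1, 9),
    (7, 9), (5, 9), (4, 9), (1, 9),
    (8, None), (5, None), (4, None), (1, None),
]


def flatten(stream):
    """Collapse duplicates and return (revision, lineage) pairs, newest first."""
    lineage = {}
    for rev, via in stream:
        entry = lineage.setdefault(rev, {"natural": False, "via": []})
        if via is None:
            entry["natural"] = True
        elif via not in entry["via"]:
            entry["via"].append(via)
    return [(rev, lineage[rev]) for rev in sorted(lineage, reverse=True)]


def format_lineage(entry):
    """Render one 'Lineage:' line in the style of the proposed output."""
    parts = []
    if entry["natural"]:
        parts.append("natural")
    if entry["via"]:
        merged = ", ".join(f"r{v}" for v in sorted(entry["via"], reverse=True))
        parts.append("merged via " + merged)
    return "Lineage: " + ", ".join(parts)


if __name__ == "__main__":
    for rev, entry in flatten(NESTED_STREAM):
        print(f"r{rev} | ...")
        print(format_lineage(entry))
```

Sorting by revision number is the simplest way to keep revisions in
order once duplicates are dropped; the per-revision dictionary plays the
role of the cache of not-yet-printed metadata.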

It would require the client to keep a cache of the merge tree and the
not-yet-printed revision metadata. Knowing that our client was
responsible for maintaining such a thing, though, could possibly allow
us to teach the server to stop duplicating that data on the wire.
(Part of the contract would be that, in -g mode, the server sends real
metadata only once per revision, and only revision numbers for
duplicates.)
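
To make that contract concrete, here is a small Python sketch of the
idea, again purely illustrative. The tuple layout and the names STREAM,
METADATA, encode and decode are hypothetical and say nothing about the
real wire protocol: the sender attaches metadata only the first time it
emits a revision, and the receiver fills in repeats from its cache.

```python
# Hypothetical stream of (revision, merged_via) pairs; not the real protocol.
STREAM = [
    (10, None), (5, 10), (4, 10), (1, 10),
    (9, None), (6, 9), (3, 9), (2, 9), (1, 9),
]

# Stand-in for the real per-revision metadata (author, date, message, paths).
METADATA = {rev: f"log message for r{rev}" for rev, _ in STREAM}


def encode(stream, metadata):
    """Sender side: yield (rev, via, meta), with meta=None for repeats."""
    sent = set()
    for rev, via in stream:
        if rev in sent:
            yield rev, via, None          # duplicate: revision number only
        else:
            sent.add(rev)
            yield rev, via, metadata[rev]  # first sighting: full metadata


def decode(wire):
    """Receiver side: restore metadata for repeats from a local cache."""
    cache = {}
    for rev, via, meta in wire:
        if meta is not None:
            cache[rev] = meta
        yield rev, via, cache[rev]


if __name__ == "__main__":
    wire = list(encode(STREAM, METADATA))
    full = sum(1 for _, _, meta in wire if meta is not None)
    print(f"{len(wire)} entries on the wire, {full} carrying metadata")
```

This only works if the receiver is guaranteed to see the full entry
before any number-only repeat, which is why the client-side cache has to
be part of the deal.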

Oh -- and this would only apply to 'svn log -g' when not in --xml output mode.

Comments? Questions? Tomatoes?

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on 2008-01-10 08:07:57 CET

This is an archived mail posted to the Subversion Dev mailing list.
