May I jump into the discussion?
Background: For TortoiseSVN, I implemented log caching (due for release
with version 1.5.0) based on the 1.4.x API. The current implementation
is quite capable of dealing with really large histories of one million
revisions and more. The KDE.org and Apache.org repositories are used
for benchmarking. If those are handled with reasonable time and space
requirements, all others should be fine as well.
It seems reasonable to include merge information in the log. For
instance, it should show up in the revision graph. To do that, it seems
imperative to keep the amount of data transmitted at O(#revs).
On 2007-06-03, Hyrum K. Wright <hyrum_wright_at_mail.utexas.edu> wrote:
> Currently, there isn't a whole lot of path-based filtering going on with
> the merged revisions. If the mergeinfo has something like '/trunk:1-9',
> you'll get all nine revisions, 1-9, as child revisions. And since every
> copy starts out with something like '/copysource:1-rev_before_copy', we
> end up pulling back *a lot* of data. That data will need to be
> filtered, not based upon the destination path, but upon the merge source
> Another reason for the large volume of data, is that by running the
> command on the root of the repository, you're going to get multiple
> copies of some log messages, both the original message, as well as the
> copies of the message pulled in as the result of a merge. (See
> http://svn.haxx.se/dev/archive-2007-05/0446.shtml for a previous mail on
> the issue.) This is exacerbated by the fact that we're already pulling
> in extra revisions already due to the first problem.
> I'll brainstorm about these issues and try to get a workable solution
> sometime next week.
Here is what I think. My concern is basically with the svn_client_log4
* Instead of the include_merged_revisions parameter there should be
a merged_revisions_depth parameter. Valid values would be 0, 1
and -1 (i.e. unlimited). Value 1 would return the list of child
revisions immediately merged into the respective parent revision.
* Introduce a merged_revision_limit parameter. If not 0, it restricts
the size of the merged revision sub-tree in the following way.
    merged_revision_count = 0
    while (revisions_to_report.count > 0)
        revision = revisions_to_report.pop
        if (merged_revision_count < merged_revision_limit)
            sub_revisions = revisions_merged_into (revision)
            merged_revision_count += sub_revisions.count
Hence, for every node either all or none of its children are reported.
Rationale: the list of merge-inputs of a given revision is part of
that revision just like the list of changed paths.
This parameter may also be used to replace merged_revisions_depth:
only -1 requires a special check. Due to the exponential growth of
the tree, the number of nodes may exceed the range of a 32 bit counter.
* Report not only the merged revision(s) but also the path that has
been merged. TSVN would use that information to draw the revision
graph *for a given path*. Btw, that would be enough the reconstruct
the content of svn:mergeinfo.
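To make the merged_revision_limit rule from the second point concrete, here is a minimal, self-contained sketch in C. Everything in it is an assumption for illustration: merge_graph_t, merge_graph_add and report_merged_revisions are hypothetical names, not Subversion API, and the merge history is a toy in-memory table. It also assumes a FIFO order for the pending revisions (the pseudocode only says "pop") and reads 0 as "no merged revisions" and -1 as "unlimited".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REVS 16   /* capacity of this toy example only */

/* Hypothetical in-memory merge graph: children[r] holds the revisions
   merged directly into revision r.  Not a Subversion data structure. */
typedef struct merge_graph_t
{
  size_t child_count[MAX_REVS];
  int children[MAX_REVS][MAX_REVS];
} merge_graph_t;

void
merge_graph_init(merge_graph_t *graph)
{
  memset(graph, 0, sizeof(*graph));
}

/* Record that CHILD was merged into PARENT. */
void
merge_graph_add(merge_graph_t *graph, int parent, int child)
{
  assert(parent >= 0 && parent < MAX_REVS);
  assert(child >= 0 && child < MAX_REVS);
  assert(graph->child_count[parent] < MAX_REVS);
  graph->children[parent][graph->child_count[parent]++] = child;
}

/* Write ROOT and its merged revisions to OUT (capacity MAX_REVS) and
   return how many were written.  LIMIT 0 reports no merged revisions,
   -1 is unlimited; otherwise a node is expanded only while fewer than
   LIMIT merged revisions have been queued, so each node gets either
   all of its children or none. */
size_t
report_merged_revisions(const merge_graph_t *graph, int root,
                        long limit, int *out)
{
  size_t head = 0, tail = 0;
  long merged_revision_count = 0;

  assert(root >= 0 && root < MAX_REVS);
  out[tail++] = root;             /* OUT doubles as the FIFO queue */
  while (head < tail)
    {
      int revision = out[head++];
      size_t n = graph->child_count[revision];

      if (limit != -1 && merged_revision_count >= limit)
        continue;                 /* limit reached: none of the children */
      if (tail + n > MAX_REVS)
        continue;                 /* toy capacity guard */
      memcpy(out + tail, graph->children[revision], n * sizeof(int));
      tail += n;
      merged_revision_count += (long)n;
    }
  return tail;
}
```

Because the check happens before a node is expanded, the number of reported merged revisions can overshoot the limit by the child count of the last expanded node, which is the price of the all-or-none rule.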
For maximum efficiency, I propose the following API change. Add an
apr_hash_t *changed_revisions to svn_log_entry_t with a structure
analogous to svn_log_changed_path_t:
typedef struct svn_log_merged_revision_t
const char *mergedfrom_path;
We could introduce discover_merged_revisions to control this new
member in a way symmetric to discover_changed_paths.
If set, it would let SVN fetch all direct merges, even if
is 0. Likewise, discover_merged_revisions may be false while
merged_revision_limit is not 0, causing the merged revisions
to be reported as children. Of course, merged_revision_limit>0
and discover_merged_revisions=true is valid as well.
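The combinations above can be written down as a small decision function. This is only a sketch of how I read the proposal: merge_report_mode_t and merge_report_mode are hypothetical names, not part of any Subversion header, and the function assumes that the two parameters act independently of each other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what a log call would deliver per entry. */
typedef struct merge_report_mode_t
{
  bool fill_merged_list;     /* direct merges attached to the log entry */
  bool report_as_children;   /* merged revisions sent as child entries */
} merge_report_mode_t;

/* Derive the reporting mode from the two proposed parameters. */
merge_report_mode_t
merge_report_mode(bool discover_merged_revisions, long merged_revision_limit)
{
  merge_report_mode_t mode;

  /* -1 is the "unlimited" marker, so nothing below -1 is meaningful. */
  assert(merged_revision_limit >= -1);
  mode.fill_merged_list = discover_merged_revisions;
  mode.report_as_children = (merged_revision_limit != 0);
  return mode;
}
```

With both switched off, the call would behave like a plain log without any merge information.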
In summary, my "ideal, wished-for" svn_client_log4 would look like this:
svn_client_log4 (const apr_array_header_t *targets,
const svn_opt_revision_t *peg_revision,
const svn_opt_revision_t *start,
const svn_opt_revision_t *end,
Please don't feel offended by the lengthy this-needs-to-be-changed post.
I will be fine with any solution that does not result in exponential
Received on Sun Jun 3 16:28:44 2007