[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1338350 - in /subversion/trunk/subversion: libsvn_subr/io.c libsvn_subr/skel.c libsvn_wc/wc_db.c

From: Greg Stein <gstein_at_gmail.com>
Date: Mon, 14 May 2012 23:43:40 -0400

On Mon, May 14, 2012 at 7:05 PM, Stefan Sperling <stsp_at_elego.de> wrote:
> On Mon, May 14, 2012 at 05:42:01PM -0400, Greg Stein wrote:
>> On Mon, May 14, 2012 at 3:12 PM,  <stsp_at_apache.org> wrote:
>> > Author: stsp
>> > Date: Mon May 14 19:12:13 2012
>> > New Revision: 1338350
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1338350&view=rev
>> > Log:
>> > Use svn_hash__make() instead of apr_hash_make() in select places to ensure
>> > stable ordering of output. This commit makes the output of 'svn status'
>> > and 'svn proplist' stable.
>>
>> I think you're solving this at the wrong level. Why should the
>> functions below care about stable ordering? These are hash tables,
>> which *by definition* are "unordered". With these changes, you're
>> affecting *all* users rather than just the places that are important.
>>
>> If you want stable ordering, then at some higher level _where ordering
>> matters_, get an array of sorted keys and walk the hash table that
>> way.
>>
>> I'm somewhere between -0.5 and -1 on this commit. I feel pretty
>> strongly that this is not a correct solution.
>>
>> Thanks,
>> -g
>
> Yes, it is more of UI level issue that does not belong into the
> core libraries. It should be done by the UI.
>
> But the UI gets one path per callback invocation.
> So it cannot control the order itself without buffering everything.
> That is not acceptable either.

Right. What are the callback drivers? I'm guessing the status walker.
Throwing an O(n log n) key sort into that will not cause any
appreciable overheads, relative to the I/O that is driving the overall
operation.

There is a *user* benefit for sorted keys, rather than just stable keys.

(ra_serf will always have some variability due to network performance,
but we can fix 'svn st' output at a minimum)

> I had a patch that used a sorted array of hash table keys (still in
> libsvn_wc, not in 'svn') but Bert objected to that due to performance
> concerns and suggested to use the newly added stable hash table instead.

I'd like to see that patch. I really can't imagine that we're talking
any systemic performance here.

My concern is keeping the lower levels free to do what is best for
them. The change in this commit is *very* tenuous to its intended
purpose. That is going to break at some point. Basically cause it
introduces some awful coupling between our subsystems.

IOW, I'll take a couple milliseconds to perform a sort over the
long-term maintenance burden of coupled systems.

[ to beat this dead horse some more: not a single one of your changes
mentioned WHY svn_hash__make() was used instead of apr_hash_make();
somebody coming along will have absolutely no idea why that was done;
so they might switch it back because there is zero apparent effect at
doing so; at a minimum, a comment saying "we sort our skel proplists
so that 'svn status' (yah, that thing 15 layers up) can have stable
output." ]

> The unordered output sucks. This commit was the result of a user
> complaining about unordered output of 'svn status' after he upgraded
> to ubuntu 12.04 which includes apr-1.4.6. Unordered output which used
> to be ordered is perceived as a regression by our users. IMO we can't
> just ignore this problem.

Sure.

> What would you suggest to do instead? I'm open to suggestions.

Move the code further up. Adding a qsort() somewhere is not going to
be an issue except for some of the most pathological directory sizes.
And I will bet directories of that size have other problems relative
to that qsort().

Cheers,
-g
Received on 2012-05-15 05:44:14 CEST

This is an archived mail posted to the Subversion Dev mailing list.