
RE: Trival merge of big text file: Dismal performance, 540x faster if binary.

From: Andreas Krüger, DV-RATIO <andreas.krueger_at_hp.com>
Date: Fri, 14 Jan 2011 14:53:19 +0000

Hello, Johan and all,

First, for the record, here is another comparison of binary and
text merge performance, this time with the files generated by my
script (repeated below):

Binary merge took 3.56 seconds; text merge took 3:45:45.202 (h:mm:ss).
In this particular case, binary merge was about 3805 times faster
than text merge.
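
For anyone who wants to check the ratio, here is a small sketch of the
arithmetic. The helper name hms_to_seconds is made up for this
illustration; the two timings are the ones quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert an "H:MM:SS.sss" duration into seconds.
sub hms_to_seconds {
    my ($hms) = @_;
    my ($h, $m, $s) = split /:/, $hms;
    return $h * 3600 + $m * 60 + $s;
}

my $text_seconds   = hms_to_seconds("3:45:45.202");  # text merge, as measured
my $binary_seconds = 3.56;                           # binary merge, as measured

printf "text merge: %.3f s\n", $text_seconds;
printf "speedup:    %.0f x\n", $text_seconds / $binary_seconds;
```

The text merge comes to 13545.202 seconds, and dividing by 3.56 gives
the factor of roughly 3805.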



> Textual merging in svn makes use of a variant of the standard diff
> algorithm, namely diff3. Just a couple of days ago, I finally
> succeeded in making diff3 take advantage of ... performance
> improvements ... .

Good news! Excellent! Thank you!

But... does this relate to my problem?

The improved diff3 will give a nice performance improvement in the
*general* case.

I certainly want that improvement!


Another nice performance improvement, by a factor of several hundred
(or even several thousand), could be obtained for big files in the
*trivial* case if SVN skipped diff3 altogether and simply copied the
result.

I also want this other improvement!


Finally:

SVN already contains the logic needed to decide whether a merge is
trivial. After all, in the binary case, the trivial merges are
precisely the ones that SVN knows how to perform.
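
To illustrate what "trivial" means here, the following is a minimal
sketch and not Subversion's actual code. It assumes a three-way merge
over three files on disk (base, mine, theirs). The function names
files_equal and try_trivial_merge are hypothetical. If one side is
byte-identical to the base, the other side is the merge result and can
simply be copied; only otherwise would a real diff3 be needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Compare qw(compare);
use File::Copy qw(copy);

# Hypothetical helper: true if both files have identical content.
sub files_equal {
    my ($x, $y) = @_;
    my $rc = compare($x, $y);          # 0 = equal, 1 = different, -1 = error
    die "cannot compare $x and $y: $!" if $rc < 0;
    return $rc == 0;
}

# Hypothetical helper: if the merge is trivial, copy the result to
# $out and return 1. Return 0 if a real diff3 would be needed.
sub try_trivial_merge {
    my ($base, $mine, $theirs, $out) = @_;
    my $source;
    if (files_equal($base, $mine)) {
        $source = $theirs;             # no local change: take theirs
    } elsif (files_equal($base, $theirs)) {
        $source = $mine;               # no incoming change: keep mine
    } elsif (files_equal($mine, $theirs)) {
        $source = $mine;               # both sides made the same change
    } else {
        return 0;                      # non-trivial: fall back to diff3
    }
    copy($source, $out) or die "copy failed: $!";
    return 1;
}
```

Each comparison is a single linear pass over the files, which is why
this check would be cheap compared with running diff3 on a file of a
million lines.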


Johan (or anyone else), please enlighten me if I am missing
something!

Regards, Andreas
--
Dr. Andreas Krüger, Senior Developer

Tel. (+49) (211) 280 69-1132
andreas.krueger_at_hp.com

DV-RATIO NORDWEST GmbH, Habsburgerstraße 12, 40547 Düsseldorf, Germany
 
on behalf of

Hewlett-Packard GmbH, Herrenberger Str. 140, 71034 Böblingen, www.hp.com/de
Managing Directors: Volker Smid (Chairman), Michael Eberhardt, Thorsten Herrmann,
Martin Kinne, Heiko Meyer, Ernst Reichart, Rainer Sterk
Chairman of the Supervisory Board: Jörg Menno Harms
Registered office: Böblingen, District Court of Stuttgart HRB 244081, WEEE Reg. No. DE 30409072

-----Original Message-----
From: krueger, Andreas (Andreas Krüger, DV-RATIO)
Sent: Thursday, January 13, 2011 4:08 PM
To: users_at_subversion.apache.org
Subject: RE: Trival merge of big text file: Dismal performance, 540x faster if binary.

...


#!/usr/bin/perl -w

# Generate stupid files on stdout.

use strict;

# For the overhauled file, set to 1:
my $overhaul = 0;

my $number = 1;

for (1 .. 1000000) {

    # 1073741824 (2**30) and 910111213 are coprime, so the sequence
    # only repeats after 2**30 steps, far more than the lines we print.
    $number = ($number + 910111213) % 1073741824;

    my $printme;
    if($overhaul) {
        $printme = ($number % 4 != 0 ? $number * 13 % 1073741824 : $number);
    } else {
        $printme = $number;
    }
    print $printme,"\n";
}
Received on 2011-01-14 15:54:51 CET

This is an archived mail posted to the Subversion Users mailing list.