
large changes to large files

From: Bill Mann <bmann_at_vertica.com>
Date: 2007-09-28 20:40:05 CEST

I have regression tests that produce large output files and then compare
them with files saved in svn. They look like ordinary text files and
have the extension .out. (That could be changed.)

When a test changes, these files may change a little or a lot. When
it's a lot, I have performance problems with diff3. The expected
performance of diff3 is O(N + PD), where N is the size of the larger
version in lines, D is the number of lines added or deleted, and P is
the number of lines deleted.

In one case, my 5M file shrank to 2.5M, with about 100K lines changed,
almost all of them deleted. The performance was O(200K + 100K * 100K),
or roughly O(10G); the merge ran for several hours. But it was entirely
unnecessary -- there was only one set of changes, so the result was
simply the new, shorter file.
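
To make the estimate concrete, the short sketch below plugs the figures
from this case into the O(N + PD) bound quoted earlier. The function name
diff_cost is made up for illustration, and the result is only a rough
count of steps, not a measurement of what svn actually did.

```python
def diff_cost(n_lines: int, deleted: int, changed: int) -> int:
    """Rough step count for the O(N + P*D) bound quoted in the message.

    n_lines -- N, size of the larger version in lines
    deleted -- P, number of lines deleted
    changed -- D, number of lines added or deleted
    """
    return n_lines + deleted * changed


# Figures from the message: N = 200K, P = 100K, D = 100K.
cost = diff_cost(200_000, 100_000, 100_000)
print(f"{cost:,} steps")  # the P*D term dwarfs the N term
```

The quadratic P*D term accounts for essentially all of the work, which
fits a merge that runs for hours on a file of this size.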

I think the code should check for that. I haven't tried for a patch,
but it doesn't look difficult.
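
The kind of check I have in mind could look roughly like the sketch
below. It is written in Python for brevity and is not Subversion's
actual diff3 code; the function name trivial_merge and its arguments are
hypothetical. The idea is that when only one side differs from the
common ancestor, the merge result is known without running diff3 at all.

```python
from typing import List, Optional, Sequence


def trivial_merge(base: Sequence[str],
                  mine: Sequence[str],
                  theirs: Sequence[str]) -> Optional[List[str]]:
    """Return the merged lines when no real three-way merge is needed.

    Returns None when both sides changed differently, in which case the
    caller would fall back to the full diff3 algorithm.
    """
    if mine == base:
        # Only "theirs" changed: the result is simply their version.
        return list(theirs)
    if theirs == base:
        # Only "mine" changed: the result is simply my version.
        return list(mine)
    if mine == theirs:
        # Both sides made the identical change.
        return list(mine)
    return None


base = ["a", "b", "c", "d"]
print(trivial_merge(base, base, ["a", "d"]))  # only one side changed
```

Each comparison is linear in the file size, so the check costs O(N) in
the worst case and would skip the expensive path in a case like mine.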

I tried /usr/bin/diff3, and that worked quickly, but obviously it would
be nice to fix our version.

For now, I've used propset to add svn:mime-type application/octet-stream
to all my existing .out files. Maybe this is right in principle, since
merging these files doesn't make sense; changes in my .out files are
derived from changes in other files in a revision. But this is not a
good solution unless I force all my users to modify their .subversion
config files so that this property is set automatically (auto-props)
when they add new .out files.
That's doable too, but a hassle.
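
For the existing files, the propset step can be scripted. The sketch
below only builds the svn propset command lines for paths ending in
.out; the helper name propset_commands is hypothetical, and actually
running the commands (for example with subprocess.run inside a working
copy) is left to the caller.

```python
from typing import Iterable, List


def propset_commands(paths: Iterable[str],
                     suffix: str = ".out") -> List[List[str]]:
    """Build one 'svn propset' argument list per matching path."""
    return [
        ["svn", "propset", "svn:mime-type", "application/octet-stream", path]
        for path in paths
        if path.endswith(suffix)
    ]


for cmd in propset_commands(["tests/a.out", "tests/a.sql", "tests/b.out"]):
    print(" ".join(cmd))
```

This covers files already in the repository; new files still need the
per-user configuration change described above.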

I'm using svn+ssh:, so apache solutions don't help.

Am I missing something? Should I work on a patch? Anyone want to help?

-Bill Mann


Received on Fri Sep 28 20:41:06 2007

This is an archived mail posted to the Subversion Users mailing list.
