
RE: compression

From: Edward Ned Harvey <svn_at_nedharvey.com>
Date: Wed, 30 Jun 2010 12:01:01 -0400

> From: Daniel Shahaf [mailto:d.s_at_daniel.shahaf.name]
>
> > I've had the greatest complaints for >15min commits.
> >
> So, a commit takes 1min when the server is idle, and 15min when the
> server is busy.

Actually, I'm surprised by what I'm learning now. Although it matters whether
the server is busy, that's not the root cause of the problem. Also, changing
the compression level makes a difference, but it's not the difference we
were hoping for.


I finally found a rock-solid and reproducible test case, as follows:


. There is a binary file, approx 45M, called layout.oa

. There are only 14 revs where it was changed. It varies slightly
in size: 44M, 45M, 46M...

. If I export each rev of that file (parent dir, no subdirs), some
revs repeatably take less than 11sec to export, while other revs
repeatably take 15min.

. During all of the above exports, I am the only user on the
system. On the server, I see precisely one svnserve process jump to 100% cpu
utilization for the duration of the export.
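
For anyone who wants to repeat this kind of measurement, here is a minimal Python sketch of the timing loop. It is not the script I used. The URL and revision numbers are placeholders, the helper names (svn_export, time_exports, slow_revisions) are made up for illustration, and it assumes a stock svn client on the PATH.

```python
import subprocess
import tempfile
import time
from pathlib import Path


def svn_export(url, rev, dest):
    """Export one revision with the stock svn command-line client."""
    subprocess.run(
        ["svn", "export", "--quiet", "--force", "-r", str(rev), url, str(dest)],
        check=True,
    )


def time_exports(url, revs, export=svn_export):
    """Return {rev: seconds} for exporting `url` at each revision in `revs`."""
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        for rev in revs:
            dest = Path(tmp) / f"r{rev}"
            start = time.perf_counter()
            export(url, rev, dest)
            timings[rev] = time.perf_counter() - start
    return timings


def slow_revisions(timings, factor=10.0):
    """Revisions that took at least `factor` times longer than the fastest one."""
    fastest = min(timings.values())
    return sorted(rev for rev, secs in timings.items() if secs >= factor * fastest)
```

Calling time_exports() with the repository URL and the 14 revision numbers, then passing the result to slow_revisions(), would list the revs worth comparing. Running it several times is what shows whether the slow ones are repeatable.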


I can't imagine any reason for such an enormous difference. I'm not sure
what I should look at next. I'll have to just start reading and reading and
reading code & documentation to get an idea precisely what other
possibilities may be going on.


Any ideas or suggestions?


This thread is no longer development related, unless this is a bug. We're
running svn 1.5.7 via svnserve, built from source on CentOS 5.1 x86_64. If
anyone cares, I'll happily move to the users list.


Please see attachments.



> > The way things are right now, svndiff, zlib_encode() take a chunk of data,
> (in svndiff.c)
> > performs compression on it, and writes (a) the size of the data, and (b)
> > whichever is smaller: the data, or the compressed data.
> >
> Note that the correctness of zlib_decode() depends on this check being
> done by the encoder.
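
To make the point about the decoder concrete, here is a small Python sketch of the scheme as described above. It is not the code from svndiff.c. The fixed 4-byte length prefix is a simplification chosen for the example and is not the real svndiff encoding, and the function names encode() and decode() are hypothetical.

```python
import struct
import zlib


def encode(data, level=zlib.Z_DEFAULT_COMPRESSION):
    """Write the original length, then whichever is smaller: raw or compressed."""
    compressed = zlib.compress(data, level)
    # Strictly smaller: a compressed body must never be as long as the original,
    # because the decoder uses the body length to tell the two cases apart.
    body = compressed if len(compressed) < len(data) else data
    return struct.pack(">I", len(data)) + body


def decode(blob):
    """Inverse of encode(); only correct if the encoder did the size check."""
    (orig_len,) = struct.unpack(">I", blob[:4])
    body = blob[4:]
    if len(body) == orig_len:
        return body  # stored uncompressed
    out = zlib.decompress(body)
    if len(out) != orig_len:
        raise ValueError("decompressed size does not match length prefix")
    return out
```

If the encoder ever emitted a compressed body whose length happened to equal the original length, this decoder would treat it as raw data. That is why the "whichever is smaller" check cannot be dropped on the encoding side alone.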


I built Subversion once with the default compression, once with compression
set to 1, and once with compression disabled. We're currently using the one
that's built with compression set to 1. But for the sake of continuing an
engaging discussion, here's how I built the one with compression disabled:


In zlib_encode() there is an if() statement, to see if len <
MIN_COMPRESS_SIZE, in which case, compression will not be done. I simply
commented out the if() {} else {} statement, to make zlib_encode()
unconditionally behave as if len < MIN_COMPRESS_SIZE. That is: no
compression.
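
A rough Python sketch of that threshold logic, for illustration only. The value 512 is an assumption for the example and the function name maybe_compress() is hypothetical; neither is copied from svndiff.c.

```python
import zlib

MIN_COMPRESS_SIZE = 512  # assumed threshold, for illustration only


def maybe_compress(data, min_size=MIN_COMPRESS_SIZE):
    """Return (payload, was_compressed)."""
    if len(data) < min_size:
        # Tiny inputs are stored as-is: zlib's framing overhead would
        # make the "compressed" form larger than the input.
        return data, False
    compressed = zlib.compress(data)
    if len(compressed) < len(data):
        return compressed, True
    return data, False
```

Passing min_size=float("inf") makes every input take the first branch, which mimics the effect of my change: the data is always written uncompressed.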


Aside from svndiff.c, there are a million places where zlib is called, but
they all use Z_DEFAULT_COMPRESSION. So I also edited zlib/deflate.c:

      if (level == Z_DEFAULT_COMPRESSION) level = 0; // was formerly 6

(occurs two times.)
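
To see what the level setting trades off without rebuilding anything, the following Python sketch compresses one buffer at levels 0, 1 and 6 and reports size and time. It uses Python's zlib module rather than the C library directly, and compare_levels() is a made-up helper name.

```python
import time
import zlib


def compare_levels(data, levels=(0, 1, 6)):
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        out = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        assert zlib.decompress(out) == data  # every level round-trips
        results[level] = (len(out), elapsed)
    return results
```

Level 0 still wraps the data in zlib framing, so its output is slightly larger than the input; it just skips the expensive matching work. Feeding compare_levels() a real revision of the file would show how much CPU time each level costs on that data.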


Received on 2010-06-30 18:01:51 CEST

This is an archived mail posted to the Subversion Dev mailing list.