
Re: binary file size limit? (4.2GB retry - success)

From: Ben Collins-Sussman <sussman_at_collab.net>
Date: 2004-03-12 06:29:56 CET

> I guess most of the time was spent in the DB doing some "cleanup" work?
> because the 4GB upload took just about 10 minutes, I could already
> ls and checkout files, but it still took about 40 minutes until the
> commit command finally completed.

Let me explain what's going on here when you commit a file:

  1. copy the working file to .svn/tmp/, in case the user changes it
during the commit. If EOL or keyword translation is on, 'detranslate'
the file while copying.

  2. send a binary diff over the network by comparing the text-base of
the file with the temporary file.

  3. the repository applies the binary diff to the file as it receives
it; after the new file is fully constructed & committed in the
repository, it compares the new file with the previous version and
computes another binary diff. The previous version of the file is then
stored as a binary diff against its successor.

  4. the client gets the new revision number from the server; it then
copies the .svn/tmp/ file into .svn/text-base again.
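
To make the storage rule in step 3 concrete, here is a minimal sketch of a store that keeps only the newest version as full text and every older version as a diff against its successor. It is an illustration only: the names (make_delta, apply_delta, ReverseDeltaStore) are hypothetical, and the common-prefix/common-suffix delta is a stand-in I picked for brevity, not the binary diff algorithm svn actually uses.

```python
def make_delta(source: bytes, target: bytes):
    """Describe how to turn `source` into `target` (toy prefix/suffix delta)."""
    limit = min(len(source), len(target))
    prefix = 0
    while prefix < limit and source[prefix] == target[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and source[len(source) - 1 - suffix] == target[len(target) - 1 - suffix]):
        suffix += 1
    # Only the bytes between the shared prefix and shared suffix are stored.
    return prefix, suffix, target[prefix:len(target) - suffix]


def apply_delta(source: bytes, delta) -> bytes:
    """Rebuild the target from `source` and a delta made by make_delta."""
    prefix, suffix, middle = delta
    return source[:prefix] + middle + source[len(source) - suffix:]


class ReverseDeltaStore:
    """Hypothetical store: newest revision as full text, older ones as deltas."""

    def __init__(self):
        self.head = None   # full text of the newest revision
        self.deltas = []   # deltas[i] rebuilds revision i from revision i + 1

    def commit(self, new_text: bytes) -> int:
        if self.head is not None:
            # Re-express the previous version as a diff against its successor.
            self.deltas.append(make_delta(new_text, self.head))
        self.head = new_text
        return len(self.deltas)  # revision number of the new head

    def fetch(self, rev: int) -> bytes:
        text = self.head
        # Walk backwards from the head, one delta per revision.
        for delta in reversed(self.deltas[rev:]):
            text = apply_delta(text, delta)
        return text
```

The design choice this shows: fetching the newest revision needs no delta work at all, while each commit pays for one extra diff between the new and the previous full text.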

Step #1 takes a very long time on a 4GB file.

Step #2 sends only a teeny tiny binary diff over the network, but it
takes a horribly long time to *generate* the diff... the binary diff
algorithm needs to scan two 4GB files, comparing them byte-for-byte!

Step #3 takes just as long as step #2, because now the repository is
*rederiving* the same binary diff in reverse!

Step #4 takes just as long as step #1.
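
As a rough illustration of why a tiny diff can still be expensive to produce, the sketch below compares two buffers block by block and counts how many bytes it had to read. Everything here is an assumption made for the example: block_delta, apply_block_delta, and the 4096-byte block size are hypothetical, the buffers are 1 MiB rather than 4GB, and fixed-offset block comparison is much cruder than svn's real binary diff.

```python
def block_delta(base: bytes, new: bytes, block_size: int = 4096):
    """Return (ops, bytes_scanned) for a toy fixed-offset block delta."""
    ops = []
    scanned = 0
    for off in range(0, len(new), block_size):
        new_block = new[off:off + block_size]
        base_block = base[off:off + block_size]
        # Both blocks have to be read before we know whether they match.
        scanned += len(new_block) + len(base_block)
        if new_block == base_block:
            ops.append(("copy", off, len(new_block)))  # reuse bytes from base
        else:
            ops.append(("data", new_block))            # ship literal bytes
    return ops, scanned


def apply_block_delta(base: bytes, ops) -> bytes:
    """Rebuild the new file from the base file and the delta ops."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            parts.append(base[off:off + length])
        else:
            parts.append(op[1])
    return b"".join(parts)


def literal_bytes(ops) -> int:
    """Bytes of new data the delta has to carry over the network."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


# A 1 MiB "file" with a single byte changed.
base = bytes(1 << 20)
changed = bytearray(base)
changed[500_000] = 1
new = bytes(changed)

ops, scanned = block_delta(base, new)
```

In this example `scanned` is twice the file size while `literal_bytes(ops)` is a single block: the network cost is small, but the scan touches every byte of both files, and the repository repeats similar work in step #3.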

We've gone over these things in the past, and had many discussions about
how to optimize them. It would be nice if svn let users optimize for
either network or CPU: in other words, much like the CVS '-z' flag, you
could tell svn not to do any diffing at all in steps #2 and #3.

Received on Fri Mar 12 06:31:20 2004

This is an archived mail posted to the Subversion Users mailing list.
