
Re: Problem with large files

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: 2006-08-28 17:35:10 CEST

Mark Phippard wrote:
> sussman@gmail.com wrote on 08/28/2006 09:57:43 AM:
>
>> On 8/28/06, Daniel Berlin <dberlin@dberlin.org> wrote:
>>
>>> The report from the one person who has ever tried it with large files
>>> was that it sped up commit times from 45 minutes to less than 5 ;)
>> I don't think that this is rocket science which requires testing. :-)
>> Of course, if you just insert new data directly into the stream
>> without trying to deltify it, it's gonna be way way faster.

Well, careful now -- a user's experience isn't just "speed of inserting
new data directly into the stream" versus "speed of deltifying the data
and then sticking it into the stream". The user's experience also
includes cost of the respective wire transfers.

>> What's tricky here is coming up with a design. Should the svn client
>> be magically deciding when to deltify or not, based on some heuristic?
>> Or should it be controlled by the user via switches or
>> config-options? We have a really long standing issue filed (like...
>> years old) about giving users the option to toggle compression on the
>> fly (something akin to 'cvs -zN'). Is that the interface we want?
>
> I would like to see an svn: property used for this so that it is not something
> that has to be entered into configuration files (with the exception of
> auto props to set the property in the first place).
>
> Perhaps the property could be named something like "svn:delta" or
> "svn:deltify" with values of "none" and "normal". This would allow us to
> introduce specialty algorithms later if we wanted to add custom algorithms
> that worked better on certain file types.

We do client->server deltas using our pristine text-bases for a reason
-- to reduce the cost of network transfer. We know that on some
networks, this matters (trust me -- many folks I know still use 56k
dialup lines), and on some, it really doesn't. But Subversion doesn't
know which networks are which. Only humans do. And the type of network
in use isn't a property of the file in question, or even of the working
copy in question (hello, laptops moving from place to place) -- it's a
condition as ever-changing as the weather in Chicago that has to be
evaluated anew each time a commit occurs.
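
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It is not Subversion code: the function name, the delta ratio, and every throughput figure are assumptions chosen only for illustration. It estimates commit time as CPU time spent deltifying plus time spent on the wire.

```python
def commit_seconds(file_bytes, delta_ratio, cpu_bytes_per_sec,
                   net_bytes_per_sec, deltify):
    """Estimate wall-clock seconds to send one file (hypothetical model).

    delta_ratio is the delta size relative to the full text (0..1).
    """
    if deltify:
        # Pay CPU time to compute the delta, then send only the delta.
        cpu = file_bytes / cpu_bytes_per_sec
        wire = file_bytes * delta_ratio / net_bytes_per_sec
    else:
        # No CPU cost, but the whole file goes over the wire.
        cpu = 0.0
        wire = file_bytes / net_bytes_per_sec
    return cpu + wire


# Assumed figures: 100 MB file, delta is 10% of the full text,
# deltification runs at 1 MB/s.
FILE = 100_000_000
RATIO = 0.1
CPU = 1_000_000

DIALUP = 7_000        # roughly a 56k line, in bytes per second
LAN = 12_500_000      # roughly 100 Mbit/s, in bytes per second

for name, net in (("dialup", DIALUP), ("lan", LAN)):
    with_delta = commit_seconds(FILE, RATIO, CPU, net, deltify=True)
    without = commit_seconds(FILE, RATIO, CPU, net, deltify=False)
    print(f"{name}: delta={with_delta:.1f}s full={without:.1f}s")
```

Under these made-up numbers the delta wins by a wide margin on the slow link and loses on the fast one, which is why the choice can't be pinned to the file itself.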

I think CVS has the right idea here, allowing folks to specify in both
personal configuration files and at the command-line what their
compression options should be. To some extent, we expose the same sort
of thing in our runtime configuration area (http-compression). But we
only let folks play with compression (only one of several things we
employ to try to reduce network usage), and we only let them do so via
the runtime configuration.

My current thinking here is that we should add (and honor) runtime and
real-time options for disabling text deltas on the wire as a whole.
Alternatively, maybe allow for disabling text deltas on binary files (as
determined by svn:mime-type).
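
A minimal sketch of how such a decision might look, purely as an illustration. The mode names ("always", "never", "text-only") and both function names are hypothetical, not existing Subversion options or APIs, and the binary test is simplified to "svn:mime-type is set and does not begin with text/".

```python
from typing import Optional


def is_binary(mime_type: Optional[str]) -> bool:
    """Simplified rule: a file is binary if svn:mime-type is set
    and does not start with 'text/'."""
    return mime_type is not None and not mime_type.startswith("text/")


def should_send_delta(mime_type: Optional[str],
                      config_mode: str = "always",
                      cli_mode: Optional[str] = None) -> bool:
    """Decide whether to send a text delta for one file.

    config_mode stands in for a runtime-configuration setting and
    cli_mode for a per-invocation switch; the switch wins when given.
    """
    mode = cli_mode if cli_mode is not None else config_mode
    if mode == "always":
        return True
    if mode == "never":
        return False
    if mode == "text-only":
        # Skip deltas only for files marked as binary.
        return not is_binary(mime_type)
    raise ValueError(f"unknown delta mode: {mode!r}")


# A binary file with deltas limited to text files: send it whole.
print(should_send_delta("application/octet-stream", config_mode="text-only"))
# The per-invocation switch overrides the configured default.
print(should_send_delta("text/plain", config_mode="always", cli_mode="never"))
```

Letting the per-invocation switch override the configured default matches the point above that network conditions have to be judged anew at each commit.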

-- 
C. Michael Pilato <cmpilato@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on Mon Aug 28 18:04:33 2006

This is an archived mail posted to the Subversion Dev mailing list.
