Re: eliminating sequential bottlenecks for huge commit and merge ops

From: Joe Schaefer <joe_schaefer_at_yahoo.com>
Date: Wed, 4 Jan 2012 16:20:40 -0800 (PST)

>________________________________
> From: Greg Stein <gstein_at_gmail.com>
>To: Joe Schaefer <joe_schaefer_at_yahoo.com>
>Cc: dev_at_subversion.apache.org
>Sent: Wednesday, January 4, 2012 7:08 PM
>Subject: Re: eliminating sequential bottlenecks for huge commit and merge ops
>
>
>
>On Jan 4, 2012 1:34 PM, "Joe Schaefer" <joe_schaefer_at_yahoo.com> wrote:
>>
>> As Daniel mentioned to me on irc, subversion doesn't use threading
>> internally, so things like client side commit processing and merge
>> operations are done one file at a time IIUC.
>>
>> Over in the openoffice podling we have a use-case for a 9GB working copy
>> that regularly sees churn on each file in the tree.  commit and merge
>> operations for such changes take upwards of 20min, and I'm wondering
>> if there's anything we could do here to reduce that processing time
>> by 2x or better by threading the per-dir processing somehow.
>>
>> Thoughts?
> We've always taken the position that the amount of effort or size of
> delta/data is proportional to the size of the change. If you change all
> of a 9GB working copy, then you should expect svn to take a good chunk
> of time and space.
> IOW, stop doing that :-)
> That said, even if we were desirous of "fixing" this(*), we would have
> a hard time doing it using threads. The Subversion client is pretty solidly
> single-threaded. We take no precautions for operation in a multi-threaded app.
>
> Cheers,
> -g
> (*) I'd be interested in what they are doing. Is this a use case we might see
> elsewhere? Or is this something silly they are doing, that would not be seen elsewhere?

They're using the ASF CMS to manage the www.openoffice.org website, which is full
of 10 years' worth of accumulated legacy spanning 50 or so different natural languages.
The CMS is "too slow" during commits to template files or such which change
the generated html content of virtually every file on the site.

There are 2 ways I could mitigate this issue with them if subversion isn't interested
in working on this use case:

1) convert the templating system to use SSI, which would eliminate most of the
sledgehammer-type commits.

2) deploy the CMS on an SSD-backed system.

FWIW (2) is scheduled to happen in the not too distant future anyway, and I personally
don't want to encourage the use of SSI with the CMS even for oddball situations
like this one.
Received on 2012-01-05 01:21:18 CET

This is an archived mail posted to the Subversion Dev mailing list.
