[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RE: Opening the repository hooks environment file

From: Bert Huijben <bert_at_qqmail.nl>
Date: Thu, 28 Mar 2013 22:33:09 +0100

> -----Original Message-----
> From: Branko Čibej [mailto:brane_at_wandisco.com]
> Sent: donderdag 28 maart 2013 22:06
> To: dev_at_subversion.apache.org
> Subject: Re: Opening the repository hooks environment file
>
> On 28.03.2013 21:50, Bert Huijben wrote:
> >
> >> -----Original Message-----
> >> From: MARTIN PHILIP [mailto:codematters_at_ntlworld.com] On Behalf Of
> >> Philip Martin
> >> Sent: donderdag 28 maart 2013 19:32
> >> To: Bert Huijben
> >> Cc: dev_at_subversion.apache.org
> >> Subject: Opening the repository hooks environment file
> >>
> >> "Bert Huijben" <bert_at_qqmail.nl> writes:
> >>
> >>> The reading of one file for each access to the repository is a more
> >>> than measurable slowdown when profiling operations. (Reading
> fsfs.conf
> >>> over and over again is one of the most expensive things apache worker
> >>> processes do when I profiled them. I think stefan2 optimized some of
> >>> this away)
> >> We have already picked up one new file on every access in 1.8: the hooks
> >> environment file. This appears to be opened and parsed for every time
> >> mod_dav_svn opens the repository, both read and write operations.
> >>
> >> Perhaps we should require an explict config setting to enable the hooks
> >> file so that we can avoid opening it when it is empty? Or perhaps we
> >> could make the opening/parsing lazy and delay it until running a hook
> >> thus avoiding it for read operations?
> > Would be nice if we can read it on first use (after the hook exists check?)
> > and then cache it.
> >
> > I'm not sure why this didn't turn op in the performance traces though.
> Maybe
> > because this file doesn't exist by default?
>
> Sure it does, "svnadmin create" will create a template. I think you're
> overestimating the cost of reading such small files; it'll mostly stay
> in RAM once it's been read, and parsing it is not all that expensive.

I'm pretty sure you remember that with Subversion 1.6 running 'svn update' on ^/subversion/branches took one and a half minutes on Windows to obtain the directory locks, before it even started doing anything update related?

(And this was without a virusscanner on a for that time fast desktop harddisk)

Simple file operations may be cheap on one system (E.g. linux) but don't use that as a measurement for other systems. That one and a half minute operation took less than half a second on ext3/ext4.
And this problem existed on many Subversion releases without anybody noticing... The common suggestion was "Windows is slow".

Loading 'fsfs.conf' is a slow operation. Maybe because it is too big to fit in a buffer; maybe because our parsing is inefficient and even more inefficient on systems where we use "\r\n" as EOL, but it really is performance critical on Windows.

I don't say that we should stop reading fsfs.conf, but why should we introduce a 'fs.conf' at the repos level for things that don't belong at that level and most likely 99% of our users will never use?

fsfs contains things like sharding information which are essential for functionality. It is not a diagnostics configuration file.

Reading a very tiny file hundreds of thousands times during a typical ra-serf test run as on our buildbot makes its impact huge. (Even when using a ramdrive). This will certainly have an impact on production Subversion servers with similar loads.

But as noted earlier... our testsuite uses uncommon small files.
But the changes applied to the files might be in the more common space.

        Bert
Received on 2013-03-28 22:33:48 CET

This is an archived mail posted to the Subversion Dev mailing list.