
Re: FSFS format 6

From: Stefan Fuhrmann <eqfox_at_web.de>
Date: Tue, 25 Jan 2011 00:43:40 +0100

On 24.01.2011 03:12, Johan Corveleyn wrote:
> On Wed, Dec 29, 2010 at 8:37 PM, Stefan Fuhrmann<eqfox_at_web.de> wrote:
>> On 29.12.2010 01:58, Johan Corveleyn wrote:
>>> The current code is written in a certain way, not particularly
>>> optimized for this new format (I seem to remember "log" does around 10
>>> fopen calls for every interesting rev file, each time reading a
>>> different part of it). Also, if an operation currently needs to access
>>> many revisions (like log or blame), it doesn't take advantage at all
>>> of the fact that they might be in a single packed rev file. The pack
>>> file is opened and seeked in just as much as the sum of the individual
>>> rev files.
>> The fopen() calls should be eliminated by the
>> file handle cache. IOW, they should already be
>> addressed on the performance branch. Please
>> let me know if that is not the case.
> Ok, finally got around to verifying this.
Thanks for taking the time.
> You are completely correct: the performance branch avoids the vast
> amount of repeated fopen() calls. With a simple test (testfile with 3
> revisions, executing "svn log" of it) (note: this is an unpacked 1.6
> repository):
>
> - trunk: opens each rev file between 19 and 21 times.
>
> - performance branch: opens each rev file 2 times.
>
> (I don't know why it's not simply 1 time, but ok, 2 times is already a
> factor 10 better than trunk :-)).
The file cache won't hand out the same handle
twice at the same time. If one part of FSFS opens
a revision file and keeps it open for some reason
while a sub-routine also needs to read the same
file without having access to the parent's handle,
the file is opened a second time.
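
A minimal sketch of that behaviour, not the actual FSFS
code: the `HandleCache` class and all of its method names
are hypothetical and only illustrate why a handle that is
still checked out forces a second real open, while a
released handle gets reused.

```python
import os
import tempfile


class HandleCache:
    """Toy cache of idle file handles, keyed by path (hypothetical)."""

    def __init__(self):
        self._idle = {}       # path -> handles that nobody is using right now
        self.open_calls = {}  # path -> number of real open() calls

    def acquire(self, path):
        """Hand out an idle handle if there is one, else really open the file."""
        idle = self._idle.get(path)
        if idle:
            handle = idle.pop()
            handle.seek(0)
            return handle
        self.open_calls[path] = self.open_calls.get(path, 0) + 1
        return open(path, "rb")

    def release(self, path, handle):
        """Give a handle back; it stays open for the next caller."""
        self._idle.setdefault(path, []).append(handle)

    def close_all(self):
        for handles in self._idle.values():
            for handle in handles:
                handle.close()
        self._idle.clear()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rev")
        with open(path, "wb") as f:
            f.write(b"data")

        cache = HandleCache()
        parent = cache.acquire(path)  # real open no. 1
        child = cache.acquire(path)   # parent's handle is busy: real open no. 2
        cache.release(path, child)
        cache.release(path, parent)
        again = cache.acquire(path)   # an idle handle exists: no new open
        cache.release(path, again)

        count = cache.open_calls[path]
        cache.close_all()
        return count
```

In this sketch `demo()` counts two real opens for three
acquisitions, matching the "2 times" pattern above.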
> I tested this simply by adding one line of printf instrumentation
> inside libsvn_subr/io.c#svn_io_file_open (see patch in attachment, as
> well as the output for trunk and for perf-branch).
When developing the file handle cache, I used
a similar method (also counting some other
low-level file operation statistics).
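
The counting idea can be sketched as follows. This is an
illustration only, not the attached patch: `counted_open`
and `open_counts` are made-up names standing in for a
counter added to a single low-level open routine.

```python
import os
import tempfile
from collections import Counter

# path -> number of times the file was opened through the wrapper
open_counts = Counter()


def counted_open(path, mode="rb"):
    """Open a file and record the call, like one line of instrumentation."""
    open_counts[path] += 1
    return open(path, mode)


def demo_count():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rev")
        with open(path, "wb") as f:  # setup only, deliberately not counted
            f.write(b"data")

        # Three independent readers each open the file themselves.
        for _ in range(3):
            with counted_open(path) as f:
                f.read()

        return open_counts[path]
```

Routing every open through one function is what makes the
per-file totals easy to compare between two builds.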
> Now, if only that file-handle cache could be merged to trunk :-) ...
As opposed to the full text cache, the file handle
cache may have unknown side effects as it keeps
files open longer than may be expected.

-- Stefan^2.
Received on 2011-01-25 00:44:22 CET

This is an archived mail posted to the Subversion Dev mailing list.