Re: Improving the performance of libsvn_wc for checkouts/updates/switches

From: Josh Pieper <jjp_at_pobox.com>
Date: 2004-05-22 22:11:23 CEST

Philip Martin wrote:
> Josh Pieper <jjp@pobox.com> writes:
>
> > Philip Martin wrote:
> >> >> What I observe with the current code is typically
> >> >>
> >> >> A dir
> >> >> A dir/file
> >> >> A dir/subdir
> >> >> A dir/subdir/file
> >>
> >> Interrupt here. Is dir/subdir versioned? If not then I don't think
> >> cleanup will find subdir's log file. I suppose one might be able to
> >> run cleanup repeatedly.
> >
> > If the interrupt is hard, i.e. kill -9, the log files will not have
> > been moved into the live position yet. That could be a problem as
> > there will now be unversioned obstructions lying around, but no data
> > should be lost.
> >
> > If the log file for the inner directory has been made live, but not
> > the one for the parent directory, cleanup may have a hard time finding
> > it. If this is a big problem, we could run the parent's log file
> > before recursing into subdirectories, and would still gain performance
> > if there were many text files in a directory.
>
> But how do we run the log file if it is in .svn/tmp/log?

What I meant is that cleanup would have a hard time finding the
sub-directory, since there is no log-file for the parent directory to
run. However, as I mention below, this cannot happen.

> At present each log file contains a set of several operations and, in
> general, all the operations must be completed before the wc is
> consistent. We cannot simply run arbitrary log files from .svn/tmp/
> since such a log file may contain an incomplete set of operations, we
> don't know where the interrupt occurred.

Log files are still only run if they have been made live; nothing from
.svn/tmp would ever be run.
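
As a rough illustration of that rule (not the actual libsvn_wc code), the
sketch below collects only the log files that cleanup would be allowed to
run. The live position `.svn/log`, the use of a plain directory walk instead
of the entries file, and the helper names `logs_cleanup_would_run` and
`make_demo_wc` are all assumptions made for the example.

```python
import os
import tempfile

LIVE_LOG = os.path.join(".svn", "log")            # assumed live position
PENDING_LOG = os.path.join(".svn", "tmp", "log")  # pending position from this thread


def logs_cleanup_would_run(wc_root):
    """Return the live log files under wc_root; pending logs are never returned."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(wc_root):
        # Do not walk into the administrative area itself.
        if ".svn" in dirnames:
            dirnames.remove(".svn")
        live = os.path.join(dirpath, LIVE_LOG)
        if os.path.isfile(live):
            found.append(os.path.relpath(live, wc_root))
    return sorted(found)


def make_demo_wc():
    """Build a throwaway tree: 'dir' has a live log, 'dir/subdir' only a pending one."""
    root = tempfile.mkdtemp()
    for rel in ("dir", os.path.join("dir", "subdir")):
        os.makedirs(os.path.join(root, rel, ".svn", "tmp"))
    open(os.path.join(root, "dir", LIVE_LOG), "w").close()
    open(os.path.join(root, "dir", "subdir", PENDING_LOG), "w").close()
    return root
```

In the demo tree, only the log of `dir` is reported; the pending log of
`dir/subdir` is ignored because it never left `.svn/tmp`.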

> > If both log-files were made live before the interrupt, I believe
> > cleanup would run the parent directory's logfile first, then use the
> > new state of its entries to recurse and would thus correctly recurse
> > into subdir.
>
> Not at present, cleanup explicitly runs children first.

OK, I looked into this a bit further, and this case can never occur:
1: The child log file is made live and run before the parent log file
is made live. 2: The operation that enters the subdir into the parent
entries file (update_editor.c:add_directory) doesn't use the log file
mechanism, but instead modifies the entry directly.
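
The ordering in point 1 could be sketched roughly as follows. This is only
an illustration under assumptions: the live position `.svn/log`, the removal
of a log once it has run, and the names `promote_and_run` and `demo` are
hypothetical, and "running" a log is reduced to recording the directory name.

```python
import os
import tempfile


def promote_and_run(directory, ran):
    """Make pending logs live and run them, children before parents."""
    for name in sorted(os.listdir(directory)):
        child = os.path.join(directory, name)
        if name != ".svn" and os.path.isdir(child):
            promote_and_run(child, ran)

    pending = os.path.join(directory, ".svn", "tmp", "log")
    live = os.path.join(directory, ".svn", "log")  # assumed live position
    if os.path.isfile(pending):
        os.rename(pending, live)  # the log becomes live in a single rename
        ran.append(directory)     # stand-in for actually running the log
        os.remove(live)           # assumption: a log that has run is removed


def demo():
    """Build dir/ and dir/subdir/, each with a pending log, and flush them."""
    root = tempfile.mkdtemp()
    parent = os.path.join(root, "dir")
    child = os.path.join(parent, "subdir")
    for d in (parent, child):
        os.makedirs(os.path.join(d, ".svn", "tmp"))
        open(os.path.join(d, ".svn", "tmp", "log"), "w").close()
    ran = []
    promote_and_run(parent, ran)
    return [os.path.relpath(d, root) for d in ran]
```

Because the recursion finishes each child before touching the parent's
pending log, there is never a moment when the parent's log is live while a
child's log is live but not yet run.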

> > The pending logs are kept in .svn/tmp/log the same as they are
> > currently. When either
> > a) a delete operation occurs
> > b) the editor closes the directory or
> > c) the cancellation function returns an error
> > the log files are moved into the live position in a depth first
> > fashion and run one at a time.
>
> That's a step backwards for restartable checkouts. All the files
> downloaded will be stored somewhere (in .svn/tmp/ perhaps?) until the
> log file is run. If the log file is in .svn/tmp/log it will not get
> run by cleanup, so there will be no way to promote the downloaded
> files into versioned items. Restarting a checkout will need to
> download those files again :-(

It is only a step backward if you 'kill -9' the process. If you use
CTRL-C, the cancellation function route kicks in and the log files are
run before exiting. I have restarted checkouts on my machine with no
problem and without having to re-download large portions. Is there a
big reason to expect that if you 'kill -9' a checkout it will remember
anything it did?
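
A minimal sketch of the CTRL-C route, assuming a cancellation check that is
polled between files: when it reports a cancel, the pending work is flushed
before the error propagates. The names `Cancelled`, `checkout`,
`is_cancelled` and `flush_pending_logs` are hypothetical and are not taken
from the Subversion API. A 'kill -9' gives the process no chance to reach
the flush at all, since SIGKILL cannot be caught.

```python
class Cancelled(Exception):
    """Raised when the (hypothetical) cancellation check reports a cancel."""


def checkout(files, is_cancelled, flush_pending_logs):
    """Process files; on cancellation, flush pending logs before propagating."""
    pending = []
    try:
        for name in files:
            if is_cancelled():
                raise Cancelled()
            pending.append(name)  # stand-in for writing a pending log entry
    except Cancelled:
        flush_pending_logs(pending)  # make the completed work live first
        raise
    flush_pending_logs(pending)


def demo():
    """Simulate a cancel arriving after two files have been processed."""
    flushed = []
    calls = {"n": 0}

    def is_cancelled():
        calls["n"] += 1
        return calls["n"] > 2

    try:
        checkout(["a", "b", "c", "d"], is_cancelled, flushed.extend)
    except Cancelled:
        pass
    return flushed
```

In the demo, the two files handled before the cancel are still flushed, so a
restarted checkout would not need to fetch them again.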

> Remember that the log file itself is not the performance bottleneck,
> (every log file operation gets written once) it's the entries file
> that is the problem. You are combining log files to solve the entries
> file problem, you must be careful not to break the log file atomic
> guarantees in the process.

I know... but putting all the information into one log file makes the
entries problem pretty trivial, so I'll try to solve the problems
there first. So far I believe it still meets all the criteria you've
stated, except for not remembering previous work after a hard kill or
system crash.

-Josh

Received on Sat May 22 22:11:46 2004

This is an archived mail posted to the Subversion Dev mailing list.