I'd like to give a short status report on the wc-propcaching branch. The
original plan for wc-propcaching is now implemented, and it seems to work.

The following things have changed regarding how properties are stored:
- There is no base-props file if there are no properties.
- There is no working props file if there are no prop changes.
- Three new fields have been added to the entries file:
  - has-props: keeps track of whether the entry has any (working) props.
  - cached-props: a space-separated list of property names.
    If a property is mentioned here, the working props for this entry have
    a property of this name. Only svn:needs-lock, svn:special and
    svn:externals may be present in this field.
  - prop-mods: true, or false (attribute absent), depending on whether
    this entry has property modifications.
- The prop-time field isn't present anymore.
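
To illustrate how the new fields let a caller answer a property question without opening the props file, here is a minimal sketch. It is not the libsvn_wc code: the Entry class, parse_cached_props and has_property are hypothetical names, and the sketch assumes the cached-props field is authoritative for the three cacheable property names.

```python
from dataclasses import dataclass

# The only properties that may appear in cached-props, per the list above.
CACHEABLE_PROPS = frozenset({"svn:needs-lock", "svn:special", "svn:externals"})


@dataclass
class Entry:
    """Hypothetical in-memory view of one entry (not the real struct)."""
    has_props: bool = False
    prop_mods: bool = False
    cached_props: frozenset = frozenset()


def parse_cached_props(value):
    """Split the space-separated cached-props attribute into a set of names."""
    return frozenset(value.split())


def has_property(entry, name, read_props_file):
    """Return True if the entry's working props contain `name`.

    `read_props_file` is a callable returning the working props as a dict.
    It is only called when the entries data cannot answer the question.
    """
    if not entry.has_props:
        # No working props at all, so there is no props file to read.
        return False
    if name in CACHEABLE_PROPS:
        # Assumption: the cache is complete for these three names.
        return name in entry.cached_props
    # Uncached property (e.g. svn:eol-style): fall back to the props file.
    return name in read_props_file()
```

With this shape, a question about svn:needs-lock never touches the disk, while a question about svn:eol-style still reads the working props file, but only if has-props is set.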
The WC format number has been bumped to 6 (*) and loggy auto-upgrading
from earlier formats is implemented. Functions that don't require a
write-lock on the WC directory work with old format WCs.
*) We already have a WC format bump in 1.4, but I added another one to not
break trunk-using developers' working copies. Format numbers are cheap.
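
The upgrade behaviour described above can be sketched roughly as follows. This reflects my reading that an old-format WC is upgraded only when a write lock is held. WorkingCopyDir and open_wc_dir are hypothetical names, and the list used as a log only mimics the idea of logging the upgrade before applying it.

```python
CURRENT_FORMAT = 6  # the WC format number mentioned above


class WorkingCopyDir:
    """Hypothetical stand-in for a WC admin area; tracks only the format."""

    def __init__(self, fmt):
        self.format = fmt
        self.upgrade_log = []


def open_wc_dir(wc, write_lock):
    """Open a WC directory, upgrading an old format only under a write lock."""
    if wc.format < CURRENT_FORMAT and write_lock:
        # Record the step first, then apply it (a stand-in for the loggy
        # upgrade; the real log format is not shown here).
        wc.upgrade_log.append(("upgrade", wc.format, CURRENT_FORMAT))
        wc.format = CURRENT_FORMAT
    # Without a write lock, the old format is left as it is and only read.
    return wc
```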
I've done some performance experiments (see below).
It has been suggested to store properties in one single file per
directory, both for regular props and wcprops. I think that seems like a
good idea, but I think it falls outside of the scope of the propcaching
branch. Also, I want to merge this work as early as possible in the 1.4
cycle to get it wider tested.
What we are waiting for now is that Erik wants to require checksums on all
files. Since that would require yet another WC format bump, we think it
is best to do that on wc-propcaching before merging.
So, in short, I think wc-propcaching is approaching its merge back to
trunk and I want to encourage people to review it. If no one objects, I
want to merge as soon as Erik's work is done. (Also, if anyone has a good
reason to add (or remove) any property from the cached-props field, it is
easier to do so before people start using this code in their working
copies.)

Now for some performance numbers:
I've done some experiments. I'm not very experienced in this area and I
only did tests on my local system (a Pentium Celeron 1.7 GHz with 256 MB
of RAM), so these numbers are only rough indications of the
performance improvements one could expect. The tests were done on Linux
2.6.8 using the ext3 filesystem.
I have tested with one checked out GCC tree (from
svn://gcc.gnu.org/svn/gcc/trunk), one GCC tree with svn:eol-style property
added on each file and one Subversion working copy. Note that
svn:eol-style is not cached. I used it because none of
these commands read this property (I tested with a WC without any
modifications). The interesting difference is that when there were no
props (which is almost the case in the GCC tree), we don't need to read
the property files in some cases (in the old format), but instead use the
file size to detect that the property file is empty.
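
The difference between the two checks can be sketched like this. The function names are hypothetical. The sketch assumes an "empty" props file is either 0 bytes long or contains only the 4-byte terminator "END\n", and that has-props is stored as the attribute value "true"; the real code may differ on both points.

```python
import os

# Assumption: sizes at which a props file holds no properties
# (0 bytes, or just the "END\n" terminator).
EMPTY_PROPS_SIZES = (0, 4)


def old_format_has_props(props_path):
    """Old format: stat the props file and judge emptiness by its size."""
    try:
        size = os.stat(props_path).st_size
    except FileNotFoundError:
        return False
    return size not in EMPTY_PROPS_SIZES


def new_format_has_props(entry_attrs):
    """New format: the answer is already in the entries data, no stat needed."""
    return entry_attrs.get("has-props") == "true"
```

In the old format, the stat call is cheap when the file turns out to be empty, but every file that does have props (as in the GCC tree with svn:eol-style) still has to be opened and parsed.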
I tested three commands, which are common local operations: status,
wc-to-wc diff and commit. I tested with no local modifications at all in
the WC (that's why I call commit a local operation:-). This is the case I
want to modify, because most of the time, most of the working copy will be
unmodified.

In each column below, there are two numbers. The first indicates the
performance improvement when the disk cache was flushed. The second
number is an approx. average of four runs of the same command after the
first one, i.e. when the data is in memory.
           GCC tree          GCC tree w/prop   svn tree
           flushed  cached   flushed  cached   flushed  cached
svn st       11%     33%       64%     92%       53%     25%
svn diff     12%     27%       30%     45%       40%      0% (*)
svn ci       40%     86%       80%     95%       78%     50%
*) There were small time differences, but it was hard to measure.
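
For anyone who wants to repeat the measurements, here is one way the two numbers per column could be derived from raw timings. This is an assumption about the arithmetic, not a record of how the figures above were produced: it treats "improvement" as the relative reduction in wall-clock time, and the function names are hypothetical.

```python
def improvement_pct(old_seconds, new_seconds):
    """Relative time saved by the new format, as a percentage of the old time."""
    return 100.0 * (old_seconds - new_seconds) / old_seconds


def summarize(old_runs, new_runs):
    """Return (flushed-cache %, in-memory %) for one command.

    Each list holds five timings in seconds: the first run right after
    flushing the disk cache, followed by four runs with the data in memory.
    """
    flushed = improvement_pct(old_runs[0], new_runs[0])
    cached_old = sum(old_runs[1:5]) / 4
    cached_new = sum(new_runs[1:5]) / 4
    return round(flushed), round(improvement_pct(cached_old, cached_new))
```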
One could go into detail analyzing these numbers, but I'm not sure how
much that would give us. There are too many factors that can have an
effect on the numbers. If someone wants to test this in more depth, feel
free to. It would also be nice to have some final test results on Windows.
(If you want my raw data for some reason, just ask.)
What I think is interesting is that we have improved performance for all
operations. On some operations (e.g. commit), we have dramatic
improvements. In summary, I feel that this work has been worth it.
Comments, flames, questions?
Received on Sun Nov 27 23:30:59 2005