[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: More libsvn_wc efficiency improvements.

From: Philip Martin <philip_at_codematters.co.uk>
Date: 2004-05-24 19:05:26 CEST

Josh Pieper <jjp@pobox.com> writes:

> Philip, could you look over this patch and see if anything is
> obviously wrong with my entries caching modifications? Essentially
> lock.c:prune_deleted was getting called for every single modification
> made to the entries file. To generate a pruned list, it needs to
> iterate over all the currently existing entries. It just wasn't very
> noticable before because of the O(n^2) disk problems. While the patch
> does cause a little extra work to be done in svn_wc__entry_modify, the
> reduction in work in prune_deleted seems to negate the loss for small
> directories, for big directories it's an obvious performance win.
>
> With this patch and the current (r9870) trunk FSFS, I see no
> non-linearities in checkouts when going up to at least 25,000 files in
> a single directory. At that point my hard drive or filesystem started
> to groan, so I stopped.

It's hard to argue with measurements, but I don't understand why it's
an obvious performance win. The current prune_deleted function has
two loops over the entries, now your patch removes one loop but that
still leaves the other so any O(N^2) behaviour is still there. Have
you removed a call to prune_deleted from the critical path?

> * subversion/libsvn_wc/lock.c
> (prune_deleted): Do not attempt to take a shortcut and assign the
> full entries list to the pruned one. This was only a very small
> performance win before and now that we need to update the two
> caches separately it would be incorrect.

I never measured it but the optimisation was reduced memory use rather
than any runtime CPU effect. The memory used by the entries cache is
allocated for the lifetime of the access batons, so I tried to
minimise it where possible. I suppose the extra memory used by having
two hashes instead of one is small compared to the memory used by the
entries themselves, but it does seem a waste since in most cases there
are no deleted entries.

>
> * subversion/libsvn_wc/entries.c
> (svn_wc__entry_modify): Get both the full and pruned entries lists,
> perform the requested modifications on both of them.
>
> Index: subversion/libsvn_wc/entries.c
> ===================================================================
> --- subversion/libsvn_wc/entries.c (revision 9870)
> +++ subversion/libsvn_wc/entries.c (working copy)
> @@ -1549,7 +1549,7 @@
> svn_boolean_t do_sync,
> apr_pool_t *pool)
> {
> - apr_hash_t *entries;
> + apr_hash_t *entries, *entries_nohidden;
> svn_boolean_t entry_was_deleted_p = FALSE;
>
> /* ENTRY is rather necessary, and ENTRY->kind is required to be valid! */
> @@ -1557,7 +1557,11 @@
>
> /* Load PATH's whole entries file. */
> SVN_ERR (svn_wc_entries_read (&entries, adm_access, TRUE, pool));
> + SVN_ERR (svn_wc_entries_read (&entries_nohidden, adm_access, FALSE, pool));
>
> + /* These two hashes cannot be the exact same ones. */
> + assert (entries != entries_nohidden);
> +
> /* Ensure that NAME is valid. */
> if (name == NULL)
> name = SVN_WC_ENTRY_THIS_DIR;
> @@ -1565,6 +1569,8 @@
> if (modify_flags & SVN_WC__ENTRY_MODIFY_SCHEDULE)
> {
> svn_wc_entry_t *entry_before, *entry_after;
> + apr_uint32_t orig_modify_flags = modify_flags;
> + svn_wc_schedule_t orig_schedule = entry->schedule;
>
> /* Keep a copy of the unmodified entry on hand. */
> entry_before = apr_hash_get (entries, name, APR_HASH_KEY_STRING);
> @@ -1574,6 +1580,9 @@
> SVN_ERR (fold_scheduling (entries, name, &modify_flags,
> &entry->schedule, pool));
>
> + SVN_ERR (fold_scheduling (entries_nohidden, name, &orig_modify_flags,
> + &orig_schedule, pool));

How about a sanity check?

         assert(orig_modify_flags == modify_flags);
         assert(orig_schedule == entry->schedule);

-- 
Philip Martin
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon May 24 19:07:00 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.