[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

The cost of svn_io_get_dirents2 and early error messages

From: Daniel Berlin <dberlin_at_dberlin.org>
Date: 2005-11-11 18:19:17 CET

So yesterday i spent some time tracking down why update felt the need to
stat every file. The result was somewhat surprising, so i thought i'd
share it and ask what we should do.

It turns out the stat'ing comes from calling svn_io_get_dirents2 in
report_revisions.

svn_io_get_dirents2 asks apr_dir_read for info APR_FINFO_NAME |
APR_FINFO_TYPE.

On some of our "advanced" filesystems, like reiserfs and xfs, the type
isn't returned in a readdir dirent (legal by posix), so APR fetches it
for each file using stat.

However, the *name* is always available from readdir, and can be used to
determine missingness without the type.

report_revisions uses the type for two reasons:

1. To see if something changed from a directory to a file and issue an
error message:
<if entries says it's a dir>

   /* No excuses here. If the user changed a versioned
             directory into something else, the working copy is hosed.
             It can't receive updates within this dir anymore. Throw
             a real error. */
          if ((dirent_kind != svn_node_none) && (dirent_kind !=
svn_node_dir))
            {
              return svn_error_createf
                (SVN_ERR_WC_OBSTRUCTED_UPDATE, NULL,
                 _("The entry '%s' is no longer a directory; "
                   "remove the entry before updating"),
                 svn_path_local_style (this_path, iterpool));
            }

2. It also *thinks* that changing a file to a directory will issue an
error messages later, and marks the path as deleted in certain cases,
but it doesn't actually do anything in that case:

<if entries say it's a file>
 /* If the dirent changed kind, report it as missing and
             move on to the next entry. Later on, the update
             editor will return an 'obstructed update' error. :) */
          if ((dirent_kind != svn_node_none)
              && (dirent_kind != svn_node_file)
              && (! report_everything))
            {
              SVN_ERR (reporter->delete_path (report_baton, this_path,
                                              iterpool));
              continue;
            }

dberlin@linux:/mnt/gccstuff/gcc-clean> mv depcomp foobar
dberlin@linux:/mnt/gccstuff/gcc-clean> mkdir depcomp
dberlin@linux:/mnt/gccstuff/gcc-clean> svn update depcomp
At revision 106785.
dberlin@linux:/mnt/gccstuff/gcc-clean> svn status depcomp
svn: Working copy 'depcomp' is missing or not locked

Note that whatever obstructed update error message it seems to think is
going to be issued, never is :)

If you remove this check for file->directory, you get the exact same
errors.

If you just remove the check for *directory->file* here, you get

dberlin@linux:/mnt/gccstuff/gcc-clean> mv gcc foobar
dberlin@linux:/mnt/gccstuff/gcc-clean> touch gcc
dberlin@linux:/mnt/gccstuff/gcc-clean> svn update gcc
svn: Working copy 'gcc' is missing or not locked

instead of

dberlin@linux:/mnt/gccstuff/gcc-clean> svn update gcc
svn: The entry 'gcc' is no longer a directory; remove the entry before
updating

Note: we could still keep this error message, and not have it cost so
much, if we pushed it deeper to the point where we fail to find the
working copy.

My question is thus:

Is it worth the expense of stat'ing every file on every update on some
common file systems, just to issue an error message *early* about an
incredibly uncommon case that issues an error message anyway?

Like I said, i think we can push these error messages down into when we
discover the working copy missing, or try to receive updates to a file
that is now a directory. Would anyone object to moving these out of the
"hot path" that is report_revisions?

This would mean they would be issued a bit later. In the
file->directory case, it wouldn't happen until you received updates to
that file.

For the directory->file case, we are still going to write lock the
repository, and when we do that, we'll discover the directory has no wc,
and can issue our better error message then, which is still before we
attempt to receive updates.

--Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Nov 11 18:20:58 2005

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.