
RE: Subversion performance (issue #1429 et al)

From: Sander Striker <striker_at_apache.org>
Date: 2003-08-17 17:33:53 CEST

> From: cmpilato@localhost.localdomain
> [mailto:cmpilato@localhost.localdomain]On Behalf Of cmpilato@collab.net
> Sent: Sunday, August 17, 2003 5:13 PM

> cmpilato@collab.net writes:
>
>> I'll take some time to see who all uses svn_fs_dir_entries(), and how
>> they use it, and determine if this change would be Goodness or not.
>> If Goodness, I'll code the change and commit.
>
> Oh, geez, there are only a couple of uses where the caller doesn't
> loop over the entries calling is_dir() on every one. I'll make
> coding this change my highest priority, Sander.

Thanks!

> For the record, here's how the cost trade-off will be.
>
> Today:
>
> svn_fs_dir_entries() uses the not-cheap open_path() routine to walk
> the DAG down to the directory, then reads the dirents list from the
> reps/strings data and returns it.
>
> svn_fs_is_dir/file() use the not-cheap open_path() routine to walk
> the DAG down to the dir/file, then read the kind out of its
> node-revision record, and return an answer.

Yep, that's what I was seeing too. Unfortunately I'm not as familiar
with the internals of libsvn_fs as you are.

There are other functions that use the same open_path() pattern. If
these are used in the same way as svn_fs_is_dir/file, we can apply the
same trick. Although I guess we don't want to cache proplists and
such, since that may be way too expensive in terms of memory use.
 
> Tomorrow:
>
> svn_fs_dir_entries() uses the not-cheap open_path() routine to walk
> the DAG down to the directory, then reads the dirents list from the
> reps/strings data, then opens each dirent directly and asks the
> kind question, then returns the dirents with their kinds. Note
> that no open_path() is needed here because we already have the
> node-rev-ids of the entries. In other words, this is *far* cheaper
> to do inside the API than the equivalent work outside the API.

Can't we cache those node-rev-ids? So that if we have already called
open_path() we don't need to do it again? I am quite likely missing
something here...
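
A minimal sketch of what such a cache could look like, in plain C. This
is not libsvn_fs code: the struct and function names (id_cache,
id_cache_get, id_cache_put), the fixed table size, and the treatment of
node-rev-ids as opaque strings are all assumptions made for
illustration. It also assumes the cache lives only as long as a single
root, since a path may resolve to a different node under another
revision or transaction.

```c
#include <string.h>

/* Hypothetical sizes, chosen only to keep the sketch small. */
#define CACHE_SLOTS  64
#define MAX_PATH_LEN 256
#define MAX_ID_LEN   64

/* One slot: a path and the id string that a path walk resolved it to. */
struct id_cache_entry {
  char path[MAX_PATH_LEN];
  char id[MAX_ID_LEN];
  int used;
};

/* Direct-mapped cache; assumed valid for one root only. */
struct id_cache {
  struct id_cache_entry slots[CACHE_SLOTS];
};

/* Simple string hash (djb2 style) reduced to a slot index. */
static unsigned hash_path(const char *path)
{
  unsigned h = 5381;
  while (*path)
    h = h * 33u + (unsigned char) *path++;
  return h % CACHE_SLOTS;
}

void id_cache_init(struct id_cache *c)
{
  memset(c, 0, sizeof(*c));
}

/* Return the cached id for PATH, or NULL on a miss. */
const char *id_cache_get(const struct id_cache *c, const char *path)
{
  const struct id_cache_entry *e = &c->slots[hash_path(path)];
  if (e->used && strcmp(e->path, path) == 0)
    return e->id;
  return NULL;
}

/* Remember PATH -> ID, overwriting whatever occupied the slot.
   Returns 0 on success, -1 if either string is too long to store. */
int id_cache_put(struct id_cache *c, const char *path, const char *id)
{
  struct id_cache_entry *e;
  if (strlen(path) >= MAX_PATH_LEN || strlen(id) >= MAX_ID_LEN)
    return -1;
  e = &c->slots[hash_path(path)];
  strcpy(e->path, path);
  strcpy(e->id, id);
  e->used = 1;
  return 0;
}
```

A caller would try id_cache_get() first and fall back to the expensive
path walk only on a miss, storing the result with id_cache_put().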
 
> svn_fs_is_dir/file() is not changed, but isn't called *nearly* as
> often. :-)

Good. That will make a big difference for performance.
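
To put rough numbers on the trade-off described above, here is a toy
cost model in C. It does not call or reproduce libsvn_fs; the names
(model_open_path, walks_today, walks_tomorrow) are hypothetical, and
the only thing it counts is how many expensive path walks a caller
triggers when listing a directory and asking for the kind of each entry.

```c
/* Toy cost model. These are NOT libsvn_fs functions; every name here
   is a hypothetical stand-in that only counts expensive path walks. */

static int path_walks;  /* simulated open_path() calls so far */

/* Stand-in for the not-cheap walk down the DAG. */
static void model_open_path(void)
{
  path_walks++;
}

/* "Today": one walk to list the directory, then the caller asks the
   is-dir question for every entry, and each question walks again. */
int walks_today(int n_entries)
{
  int i;
  path_walks = 0;
  model_open_path();            /* listing the directory */
  for (i = 0; i < n_entries; i++)
    model_open_path();          /* one kind query per entry */
  return path_walks;
}

/* "Tomorrow": one walk to list the directory; each entry's kind is
   read directly through its already-known id, with no further walk.
   If DIRECT_LOOKUPS is non-NULL it receives the number of such reads. */
int walks_tomorrow(int n_entries, int *direct_lookups)
{
  int i, lookups = 0;
  path_walks = 0;
  model_open_path();            /* listing the directory */
  for (i = 0; i < n_entries; i++)
    lookups++;                  /* direct read, no path walk */
  if (direct_lookups)
    *direct_lookups = lookups;
  return path_walks;
}
```

In this model a directory with N entries costs N + 1 walks today and a
single walk plus N direct reads tomorrow.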

Other functions to look at in this light:
 - svn_fs_node_created_rev
 - svn_fs_file_md5_checksum
 - svn_fs_file_contents
 - svn_fs_node_proplist
 - svn_fs_get_file_delta_stream

Sander

Received on Sun Aug 17 17:34:40 2003

This is an archived mail posted to the Subversion Dev mailing list.
