[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: 1.7.x - merge now accesses all files in WC?

From: Hyrum K. Wright <hyrum_wright_at_mail.utexas.edu>
Date: Wed, 8 Dec 2010 22:17:29 +0000

On Wed, Dec 8, 2010 at 9:27 PM, Daniel Becroft <djcbecroft_at_gmail.com> wrote:
> On Tue, Dec 7, 2010 at 12:39 PM, Hyrum K. Wright
> <hyrum_wright_at_mail.utexas.edu> wrote:
>>
>> On Mon, Dec 6, 2010 at 4:28 PM, Hyrum K. Wright
>> <hyrum_wright_at_mail.utexas.edu> wrote:
>> > On Mon, Dec 6, 2010 at 3:15 PM, Daniel Becroft <djcbecroft_at_gmail.com>
>> > wrote:
>> >> On Mon, Dec 6, 2010 at 8:50 PM, Daniel Becroft <djcbecroft_at_gmail.com>
>> >> wrote:
>> >>
>> >>> On Mon, Dec 6, 2010 at 7:13 PM, Daniel Shahaf
>> >>> <d.s_at_daniel.shahaf.name>wrote:
>> >>>
>> >>>> Instead of guessing which function causes the lstat() calls, could we
>> >>>> have a tool tell?
>> >>>>
>> >>>> I've looked at 'ltrace -S', but it seems to me that will only tell us
>> >>>> which public svn_wc_* API is responsible for the calls.  Perhaps
>> >>>> there
>> >>>> is another tool that can help us follow the entire call chain down to
>> >>>> the lstat() call?
>> >>>>
>> >>>> Daniel
>> >>>> . o O (gdb with a breakpoint in lstat() and some fu to extract
>> >>>> statistics about the stack trace?)
>> >>>>
>> >>>
>> >>> Very true.
>> >>>
>> >>> Okay, after a crash course in using GDB, I've run some further
>> >>> testing.
>> >>> Running with breakpoints set on the lstat() system call, it looks to
>> >>> be the
>> >>> known issue that was originally pointed to. A partial stack trace is
>> >>> below:
>> >>>
>> >>> #0  0x00007ffff644ab45 in __lxstat (vers=<value optimised out>,
>> >>>     name=<value optimised out>, buf=0x7fffffffcbf0)
>> >>>     at ../sysdeps/unix/sysv/linux/wordsize-64/lxstat.c:38
>> >>> #1  0x00007ffff69269b6 in lstat (finfo=0x7fffffffccf0,
>> >>>     fname=0x6bd638 "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     wanted=33137, pool=0x6bd588) at /usr/include/sys/stat.h:464
>> >>> #2  apr_stat (finfo=0x7fffffffccf0,
>> >>>     fname=0x6bd638 "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     wanted=33137, pool=0x6bd588) at file_io/unix/filestat.c:292
>> >>> #3  0x00007ffff6fb1199 in io_check_path (
>> >>>     path=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     resolve_symlinks=0, is_special_p=0x7fffffffced8,
>> >>> kind=0x7fffffffcedc,
>> >>>     pool=0x6bd588) at subversion/libsvn_subr/io.c:222
>> >>> #4  0x00007ffff6fb1346 in svn_io_check_special_path (
>> >>>     path=0x6bd608 "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     kind=0x7fffffffcedc, is_special=0x7fffffffced8, pool=0x6bd588)
>> >>>     at subversion/libsvn_subr/io.c:283
>> >>> #5  0x00007ffff794513d in svn_wc__db_pdh_parse_local_abspath (
>> >>>     pdh=0x7fffffffd038, local_relpath=0x7fffffffd030, db=0x64ad38,
>> >>>     local_abspath=0x6bd608
>> >>> "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     smode=svn_sqlite__mode_readwrite, result_pool=0x6bd588,
>> >>>     scratch_pool=0x6bd588) at subversion/libsvn_wc/wc_db_pdh.c:382
>> >>> #6  0x00007ffff7935d58 in svn_wc__db_read_info (status=0x7fffffffd184,
>> >>>     kind=0x7fffffffd188, revision=0x0, repos_relpath=0x0,
>> >>> repos_root_url=0x0,
>> >>>     repos_uuid=0x0, changed_rev=0x0, changed_date=0x0,
>> >>> changed_author=0x0,
>> >>>     last_mod_time=0x0, depth=0x0, checksum=0x0, translated_size=0x0,
>> >>>     target=0x0, changelist=0x0, original_repos_relpath=0x0,
>> >>>     original_root_url=0x0, original_uuid=0x0, original_revision=0x0,
>> >>>     props_mod=0x0, have_base=0x0, have_work=0x0, conflicted=0x0,
>> >>> lock=0x0,
>> >>>     db=0x64ad38,
>> >>>     local_abspath=0x6bd608
>> >>> "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     result_pool=0x6bd588, scratch_pool=0x6bd588)
>> >>>     at subversion/libsvn_wc/wc_db.c:5261
>> >>>
>> >>> However, I then put the breakpont in the svn_wc__db_read_info function
>> >>> and,
>> >>> just for A/beta.txt, it gets called 7 times, when in terms leads to 7
>> >>> additional lstat() calls. Looking further the call stack appears as
>> >>> follows:
>> >>>
>> >>> #0  svn_wc__db_read_info (status=0x7fffffffd184, kind=0x7fffffffd188,
>> >>>     revision=0x0, repos_relpath=0x0, repos_root_url=0x0,
>> >>> repos_uuid=0x0,
>> >>>     changed_rev=0x0, changed_date=0x0, changed_author=0x0,
>> >>> last_mod_time=0x0,
>> >>>     depth=0x0, checksum=0x0, translated_size=0x0, target=0x0,
>> >>> changelist=0x0,
>> >>>     original_repos_relpath=0x0, original_root_url=0x0,
>> >>> original_uuid=0x0,
>> >>>     original_revision=0x0, props_mod=0x0, have_base=0x0,
>> >>> have_work=0x0,
>> >>>     conflicted=0x0, lock=0x0, db=0x64ad38,
>> >>>     local_abspath=0x6bd608
>> >>> "/home/djcbecroft/dev/workingcopy/A/beta.txt",
>> >>>     result_pool=0x6bd588, scratch_pool=0x6bd588)
>> >>>     at subversion/libsvn_wc/wc_db.c:5259
>> >>> #1  0x00007ffff78fab96 in walker_helper (db=0x64ad38,
>> >>>     dir_abspath=0x6ba288 "/home/djcbecroft/dev/workingcopy/A",
>> >>> show_hidden=1,
>> >>>     walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>,
>> >>>     walk_baton=0x7fffffffd4e0, depth=svn_depth_infinity,
>> >>>     cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0,
>> >>>     scratch_pool=0x6ba208) at subversion/libsvn_wc/node.c:679
>> >>> #2  0x00007ffff78facb5 in walker_helper (db=0x64ad38,
>> >>>     dir_abspath=0x67d258 "/home/djcbecroft/dev/workingcopy",
>> >>> show_hidden=1,
>> >>>
>> >>>     walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>,
>> >>>     walk_baton=0x7fffffffd4e0, depth=svn_depth_infinity,
>> >>>     cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0,
>> >>>     scratch_pool=0x6a0dc8) at subversion/libsvn_wc/node.c:716
>> >>> #3  0x00007ffff78faf9a in svn_wc__internal_walk_children (db=0x64ad38,
>> >>>     local_abspath=0x67d258 "/home/djcbecroft/dev/workingcopy",
>> >>> show_hidden=1,
>> >>>     walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>,
>> >>>     walk_baton=0x7fffffffd4e0, walk_depth=svn_depth_infinity,
>> >>>     cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0,
>> >>>     scratch_pool=0x6a0dc8) at subversion/libsvn_wc/node.c:763
>> >>> #4  0x00007ffff78fb06d in svn_wc__node_walk_children (wc_ctx=0x64ad20,
>> >>>     local_abspath=0x67d258 "/home/djcbecroft/dev/workingcopy",
>> >>> show_hidden=1,
>> >>>     walk_callback=0x7ffff7ba9026 <get_mergeinfo_walk_cb>,
>> >>>     walk_baton=0x7fffffffd4e0, walk_depth=svn_depth_infinity,
>> >>>     cancel_func=0x4119c3 <svn_cl__check_cancel>, cancel_baton=0x0,
>> >>>     scratch_pool=0x6a0dc8) at subversion/libsvn_wc/node.c:785
>> >>>
>> >>> Using additional breakpoints, I have verified that
>> >>> svn_wc__node_walk_children, svn_wc__internal_walk_children and
>> >>> walker_helper
>> >>> only get called once. It seems to be an issue within the walker_helper
>> >>> itself.
>> >>>
>> >>> I'll continue looking at it further - let me know if I'm barking up
>> >>> the
>> >>> wrong tree.
>> >>>
>> >>> Cheers,
>> >>> Daniel B.
>> >>> ... who now has a much greater appreciation for those that debug in
>> >>> GDB on
>> >>> a regular basis.
>> >>>
>> >>
>> >> I think I've found the culprit. As we are walking the children
>> >> (svn_wc__internal_walk_children, node.c), we call the walk_callback for
>> >> each
>> >> node. At the moment, this is the get_mergeinfo_walk_cb function in
>> >> merge.c.
>> >> Inside this, we call the following:
>> >>
>> >> svn_wc__node_is_status_present
>> >> svn_wc__node_is_status_deleted
>> >> svn_wc__node_is_status_absent
>> >> svn_wc__path_switched
>> >> svn_wc_read_kind
>> >> svn_wc__node_get_depth
>> >>
>> >> Each of these will run a separate svn_wc__db_read_info call, and
>> >> therefore a
>> >> separate lstat(). There's also a couple of other calls that might call
>> >> it
>> >> indirectly.
>> >
>> > So it looks an improvement would be something like
>> > svn_wc__get_info_for_merge_callback() which does it all in one go (or
>> > some other less-verbosely-named function).
>>
>> Picked the low-hanging fruit in r1042892.  More work could be done,
>> but I think the problems lie deeper, namely:
>> * read_info() was originally envisioned as being cheap to call.
>> Apparently that isn't the case anymore, part of which is no doubt
>> caused by...
>> * the call to svn_io_check_special_path() in
>> vn_wc__db_pdh_parse_local_abspath().  There is a comment in that file
>> that:
>>
>>  /* ### at some point in the future, we may need to find a way to get
>>     ### rid of this stat() call. it is going to happen for EVERY call
>>     ### into wc_db which references a file. calls for directories could
>>     ### get an early-exit in the hash lookup just above.  */
>>
>> It's really unfortunate that every wc_db call on a file which fetches
>> a pdh has to call stat().  I tried to implement an optimization that
>> looks at the parent (which is much more likely to succeed, now that we
>> have single-db), but it failed for externals (which have their own
>> wc.db databases).  We could work around that, but to do so, we'd need
>> to know if we are looking at a file or a directory...and the way to do
>> that is by calling stat().
>>
>> Improvements, patches, suggestions encouraged.
>
> Is it possible to look at it a different way? Maybe the issue isn't that we
> need to call stat() for each parse_local_abspath(), but that we are calling
> that function so often? When processing recursively, do we need to
> continually find the wc.db file? It seems to be residual from the 1.6.x
> algorith, where the contents of a folder weren't known until svn actually
> descended into that folder (and parsed the entries file). For example:
>
>  - Enter A. Read entries. Process A\mu
>  - Enter A\B. Read entries. Process lamba
>  - Enter A\B\E. Read entries. Process beta.
>  - Rinse and repeat.
>
> The same process has been maintained for wc-ng, so it becomes (extremely
> high level):
>
>  - Enter A. Find A\.svn\wc.db. Process files.
>  - Enter A\B. Find A\B\.svn\wc.db. If nothing, find A\.svn\wc.db. Process
> files.
>  - Enter A\B\E. Find A\B\E\.svn\wc.db. If nothing, find A\B\.svn\wc.db. If
> nothing, find A\.svn\wc.db. Process files.
> This happens for every call to parse_local_abspath(), and if there are 10 or
> 15 levels deep, it adds up.

Actually, it will (in the common case) only recurse up one level,
since it will hit the cached directory handle for that parent. But
there should still be a more intelligent way to handle this. (And
yes, "more intelligent" could be "don't try to find the pdh so much.")

> Using a single metadata store, it could be switched to something like:
>
>  - Enter A. Find metadata store, and find entries to process under A
> (recursively if applicable).
>  - Iterate over entries and process.
>
> For directories, it should be known if we are processing an external or not
> (since this would be in the source wc.db).

The optimization I tried was "assume the parent is cached with the
information we need, and if you get a failure, go back to stat()'ing
the file." This didn't work because the failure wasn't encountered
until long after the call in this function.

I think the same problem will happen when trying to determine if a
thing is an external or not. Maybe the real question here is "how can
we get externals to share the same db as the rest of the working
copy?"

-Hyrum
Received on 2010-12-08 23:18:08 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.