On 02.12.2009 00:27, Jonathan Potter wrote:
> Hi everyone, I hope this is the best place to post this.
>
> Lately we have been reports from users of our software (a file manager -
> Directory Opus) that they are getting "lock ups" seemingly at random since
> they upgraded to Tortoise 1.6.3 (and beyond). From our investigations it
> would seem that the latest versions of Tortoise SVN have a flaw in the way
> they use a critical section, which can easily manifest as a deadlock when
> Tortoise is accessed from multiple threads simultaneously.
>
> Consider the two following partial stack traces, from 2 different threads.
> You can see they are both waiting to enter critical sections.
>
>
> -------------------- THREAD 1 ---------------------
> ntdll!ZwWaitForSingleObject+0xc
> ntdll!RtlpWaitForCriticalSection+0x132
> ntdll!RtlEnterCriticalSection+0x46
> TortoiseSVN!ATL::CComCritSecLock<ATL::CComCriticalSection>::Lock+0x1b
> [c:\program files\microsoft visual studio 9.0\vc\atlmfc\include\atlbase.h @
> 337]
> TortoiseSVN!ShellCache::IsPathAllowed+0x47
> [d:\development\svn\releases\tortoisesvn-1.6.6\src\tortoiseshell\shellcache.
> h @ 321]
> TortoiseSVN!CShellExt::IsMemberOf+0x20e
> [d:\development\svn\releases\tortoisesvn-1.6.6\src\tortoiseshell\iconoverlay
> .cpp @ 124]
> WARNING: Stack unwind information not available. Following frames may be
> wrong.
> TortoiseOverlays!DllGetClassObject+0xb4c
> shell32!CFSIconOverlayManager::_FindMatchingID+0x3b
> shell32!CFSIconOverlayManager::GetFileOverlayInfo+0x3e
> shell32!CFSFolder::_GetOverlayInfo+0x121
> shell32!CFSFolder::GetOverlayIndex+0x31
> shell32!_GetFileInfoSections+0x1ec
> shell32!SHGetFileInfoW+0x13a
> dopus!SHGetFileInfoSmart+0x46
> <....>
> -------------------- THREAD 1 ---------------------
>
>
> -------------------- THREAD 2 ---------------------
> ntdll!ZwWaitForSingleObject+0xc
> ntdll!RtlpWaitForCriticalSection+0x132
> ntdll!RtlEnterCriticalSection+0x46
> shell32!LookupFileClass+0x15
> <... a few lines deleted...>
> shell32!CRegFolder::BindToObject+0x45
> shell32!SHBindToObjectEx+0x3f
> shell32!SHBindToFolderIDListParent+0x26
> shell32!SHBindToIDListParent+0x18
> shell32!SHGetPathFromIDListEx+0x54
> shell32!SHGetPathFromIDListW+0x12
> TortoiseSVN!ShellCache::ExcludeListValid+0x42a
> [d:\development\svn\releases\tortoisesvn-1.6.6\src\tortoiseshell\shellcache.
> h @ 573]
> TortoiseSVN!ShellCache::IsPathAllowed+0x4ec
> [d:\development\svn\releases\tortoisesvn-1.6.6\src\tortoiseshell\shellcache.
> h @ 343]
> TortoiseSVN!CShellExt::GetItemData+0x40
> [d:\development\svn\releases\tortoisesvn-1.6.6\src\tortoiseshell\columnprovi
> der.cpp @ 195]
> dopus!SafeIColumnProviderGetItemData+0x51
> <....>
> -------------------- THREAD 2 ---------------------
>
>
> Our supposition is that shell32.dll has an internal critical section that is
> used by many shell functions.
>
> Thread 1 is calling the SHGetFileInfo function. Now our supposition is that
> as part of the internal workings of the SHGetFileInfo function, the shell
> enters its critical section at this point. It then calls into Tortoise,
> presumably via its IShellIconOverlayIdentifier interface (we couldn't find
> the debug symbols for TortoiseOverlay.dll so can't be 100% sure of this),
> and in ShellCache::IsPathAllowed, Tortoise attempts to enter its own
> "m_critSec" critical section. This call blocks, because Thread 2 already has
> that critical section held (see below).
>
>
> Thread 2 is calling Tortoise directly (via its IColumnProvider interface).
> In ShellCache::ExcluseListValid, Tortoise enters the critical section
> "m_critSec", and then with that critical section still held, calls into the
> shell via SHGetPathFromIDList. The shell, as part of its internal workings,
> calls shell32!LookupFileClass, which attempts to enter the critical section
> that is currently held by thread 1. Therefore we have a deadlock.
>
>
> From what we can tell, the code in ShellCache::ExcludeListValid that calls
> SHGetPathFromIDList was merged into the trunk as part of the 1.6.3 revision,
> which fits exactly with our users' reports that the problem goes away if
> they downgrade to 1.6.2.
>
>
> I've attached a suggested solution. Basically the code that adds the three
> default paths to exclude has been moved to the top of IsPathAllowed, and the
> path strings are added to a new local vector. Crucially this takes place
> BEFORE the critical section is entered. This vector is then passed to
> ExcludeListValid which now simply copies the strings rather than querying
> the shell directly.
>
> It would be great if this problem could be fixed in the next update to
> Tortoise, as we have more than a few users experiencing this issue.
>
> Please let me know if you have any comments or questions about any of the
> above.
Thanks a lot for your analysis and your patch!
You're absolutely right with your analysis. I also found the thread
about this issue here:
http://resource.dopus.com/viewtopic.php?t=11450&postdays=0&postorder=asc&start=0
After reviewing your patch and testing it I've committed it in r17777.
I've started the build for the 1.6.x nightly (32-bit). It should be
online in about 2-3 hours. You can tell your users that they can try
this nightly build:
http://nightlybuilds.tortoisesvn.net/1.6.x/win32/
(currently the build that's online is from yesterday, but as I said in a
few hours the new build will be there).
I'm not sure when we will make another official release (1.6.7), but in
the meantime the nightly build from the stable branch should be enough
for your users (the nightly build from the stable branch is what will
become the next release, so no risk in using it).
Stefan
--
___
oo // \\ "De Chelonian Mobile"
(_,\/ \_/ \ TortoiseSVN
\ \_/_\_/> The coolest Interface to (Sub)Version Control
/_/ \_\ http://tortoisesvn.net
------------------------------------------------------
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=757&dsMessageId=2426460
To unsubscribe from this discussion, e-mail: [dev-unsubscribe_at_tortoisesvn.tigris.org].
Received on 2009-12-02 21:34:32 CET