[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [ISSUE] Cancelling svn checkout leaves an open file handle

From: Daniel Sahlberg <daniel.l.sahlberg_at_gmail.com>
Date: Wed, 9 Sep 2020 10:35:54 +0200

Den ons 9 sep. 2020 kl 09:39 skrev Branko Čibej <brane_at_apache.org>:

> On 09.09.2020 08:18, Daniel Sahlberg wrote:
>
> Den ons 9 sep. 2020 kl 06:44 skrev Nathan Hartman <
> hartman.nathan_at_gmail.com>:
>
>> On Tue, Sep 8, 2020 at 6:08 AM Uroš Jovanović <urosh3d_at_gmail.com> wrote:
>> >
>> > Then, mid downloading some of the larger files a temp file will appear
>> in .svn\tmp. Once that happens, hit the Cancel button.
>> > It will signal the cancellation to the svn client and it will throw the
>> SvnOperationCanceledException, the SvnClient gets disposed BUT an open file
>> handle remains on ".svn\tmp\svn-XYZ123" file.
>> > If you try to delete it, Windows will complain that it is used by our
>> test app. :(
>>
>> Moving this to the dev@ list...
>>
>> Potentially long-running APIs such as 'checkout' allow the client to
>> provide a 'cancel_func' callback, which is called at various strategic
>> places to ask the client whether the operation should be canceled.
>>
>> It sounds to me like one of those places sees a cancel request and
>> returns to its caller, forgetting to do some cleanup.
>>
>> Last night I tried to find such a place by reading code.
>>
>> The 'checkout' command sets up a working copy (if necessary) and then
>> calls the 'update' logic to do the heavy lifting.
>>
>> The 'update' logic is quite involved as it handles all sorts of
>> possibilities, which means the number of branches of the call tree
>> that need to be checked are too numerous for my code reading approach
>> to be sensible.
>>
>> My thoughts for an automated approach, provided there is a way for a
>> process to inquire how many open file handles it has (I assume there
>> is a way; I've just never had to do this): The idea is to write a
>> minimal client that does the following (on a ramdrive):
>>
>> 1. Check out a working copy of a repository, giving a cancel_func 'A'
>> that increments a global variable 'n' each time it is called and
>> always returns "don't cancel."
>>
>> 2. Loop n times, the loop counter being a global variable 'x':
>>
>> 2.1: Delete the working copy.
>>
>> 2.2: Check out a working copy of the same repository, giving a
>> different cancel_func 'B' that returns "don't cancel" the
>> first (x - 1) times it is called, and returns "cancel" the
>> x-th time it is called.
>>
>> 2.3: Test whether there are open file handles. If there are, we
>> know at which iteration the cleanup is not done, and we break
>> out of the loop.
>>
>> 3. If x >= n, quit; we didn't find the problem.
>>
>> 4. Delete the working copy.
>>
>> 5. Check out a working copy of the same repository, giving a different
>> cancel_func 'C' that returns "don't cancel" the first (x - 1) times
>> it is called, and traps the x-th time it is called, allowing the
>> call stack to be examined.
>>
>> Notes and caveats:
>>
>> 1. This could run for days (or years).
>>
>> 2. Then again, if it can be exposed pretty reliably by a user hitting
>> a Cancel button in a GUI, that means cancel_func is called
>> frequently enough from the offending location that it should
>> (hopefully) be caught relatively soon in the process.
>>
>> 3. I think a huge repository isn't needed. The Greek Tree used by the
>> test suite may suffice. If it doesn't expose the bug, I'd retry
>> with a larger file thrown in. If that doesn't expose it, add
>> increasing complexity such as externals, etc.
>>
>> 4. This relies on the logic being executed identically for each
>> checkout (i.e., cancel_func is called the same number of times from
>> the same call sites).
>>
>> 5. No idea how this could be turned into a regression test.
>>
>> 6. If there's a better way, I'd love to hear it!
>>
>
> For a regression test (as well as trying to pinpoint what goes wrong),
> wouldn't it be enough if the cancel_func check for the presence of a file
> in .svn/tmp (maybe even checking if it is open - in Linux that should be
> easy enough to check in /proc/$PID/fd) and then signal to cancel. That
> would "only" need a repository/file that is large enough to trigger calling
> the cancel_func.
>
>
> Not even that. The cancel check function is provided by the caller (in the
> client context) and can return 'true' when the conditions are right.
> Finding the correct temp directory is a bit more involved though because
> it's platform-dependent (i.e., I'm not sure if APR can reliably tell us
> that).
>

>
> I checked quickly and I also see the open file when checking out using
> TortoiseSVN and cancelling and it seems to occur all the time.
>
>
> It could be a but in our library, or it could be a bug in TortoiseSVN,
> since the API caller controls the lifetime of pools. The easiest way to
> check would be to run a checkout from the command line client built with
> APR pool debugging enabled (that means a custom build of APR is needed,
> too) and examining the debug output.
>

Thanks for your input! If it is in TortoiseSVN, then the same bug also
exists in SharpSVN, thus I'm leaning towards a bug in Subversion itself.
I'll try to find some time to check this later this week.

Would it be a better approach to enumerate all open fd:s before and after
the call to 'checkout' and compare the list of open fd:s (after any pool
cleanup required). Of course enumerating open fd:s most probably require
platform specific code in the regression test.

>
Daniel

>
Received on 2020-09-09 10:36:09 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.