Re: PATCH: Check for cancellation before doing something, not afterwards
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2004-03-07 23:22:17 CET
Sorry for the long delay in replying.
kfogel@collab.net wrote:
Right - there aren't. I'm not sure if we'd want more than one check per iteration if there were, but that point is moot for now.
I have since thought long and hard about this patch, doubting it for a while, and am finally happy with it again and will commit it. The rest of this mail records my thoughts for future reference.
I had originally thought that my reasoning was simple and self-evident. Now I can see that the existing implementation was probably also considered obvious: perform an operation that takes a significant time; if the user tries to cancel during that operation, we can only find out afterwards, not before. However, the cancellation request is asynchronous: the user does not know for sure whether their request is received during one particular function call or iteration. Therefore, rather than "Did the user try to cancel this part of the operation?", we can think of the cancellation check as asking, from time to time, "Has the user yet requested cancellation of the whole thing?"
The effect of this patch is to check for cancellation before the first iteration of a loop (as well as between iterations), rather than after the last iteration (and between iterations). I consider the opportunity of cancelling before the operation starts its main task to be useful, but cancelling after it finishes not to be useful. Situations where cancelling just before the end can be useful are rare, as mentioned below.
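A minimal sketch of the difference, in Python rather than Subversion's C for brevity. All names here (`Cancelled`, `check_cancel`, `process_check_last`, `process_check_first`, `run`) are invented for illustration and are not Subversion's API. The sketch assumes the user has already pressed Ctrl-C by the time the loop starts.

```python
class Cancelled(Exception):
    """Raised when a cancellation check finds a pending request."""


def check_cancel(cancel_requested):
    # Hypothetical helper: poll the callback and abort if it reports a request.
    if cancel_requested():
        raise Cancelled()


def process_check_last(items, cancel_requested, done):
    # Old placement: do the work, then ask whether to stop.
    for item in items:
        done.append(item)
        check_cancel(cancel_requested)


def process_check_first(items, cancel_requested, done):
    # New placement: ask whether to stop, then do the work.
    for item in items:
        check_cancel(cancel_requested)
        done.append(item)


def run(process, items, cancel_requested):
    # Return the items that were processed before cancellation took effect.
    done = []
    try:
        process(items, cancel_requested, done)
    except Cancelled:
        pass
    return done


def already_cancelled():
    # Stand-in for "the user hit Ctrl-C during set-up".
    return True


old = run(process_check_last, ["a", "b", "c"], already_cancelled)
new = run(process_check_first, ["a", "b", "c"], already_cancelled)
```

Under those assumptions, `old` still contains the first item, because the old placement processes it before noticing the request, while `new` is empty.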
Discussion:
Cancellation is required when a program or function may be long-running; to be long-running it must invoke long-running sub-functions (which should have their own cancellation checks within them, so there's little need to consider them at the caller's level) and/or it must iterate over a number of (possibly short) operations. The iteration is the important part. The typical phases of operation are:
1) Set-up: e.g. the program is loaded, a connection is made to the repository, and perhaps the local WC is scanned;
2) Work: an iteration over the files (or other items) that are to be listed, updated, or whatever;
3) Epilogue: closing the network connection, tidying up local files, committing a transaction.
Cancelling during the iteration is fine; it doesn't matter whether the cancellation point is at the start or the end of the loop body, as all the user needs is to be able to cancel between one iteration and the next.
I often want to cancel during the initial set-up phase. Either, just after I have issued the command, I realise that I don't want it after all, or I notice that it is taking a long time to get started, and this gives me a clue that something is wrong. In either case, if I press Ctrl-C while it is still starting up, I don't want it to go ahead and process the first file that it comes to, and then quit. I want it to quit before it processes any of the files.
I decided that moving the check to the top of the loop is a good way to achieve this. An alternative would be to add another check before the loop.
Another advantage of checking at the top of the loop is, as with clearing a sub-pool, to avoid the check being skipped by a "continue" statement.
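The "continue" hazard can be shown with a small sketch, again in Python rather than C. The names (`CountingCheck`, `walk_check_at_bottom`, `walk_check_at_top`) and the rule of skipping dot-files are assumptions made up for this example, not taken from the patch.

```python
class Cancelled(Exception):
    """Raised when a cancellation check finds a pending request."""


class CountingCheck:
    """Hypothetical callback that never cancels but counts how often it is polled."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return False


def walk_check_at_bottom(names, cancel_requested):
    handled = []
    for name in names:
        if name.startswith("."):
            continue  # jumps to the next item, past the check below
        handled.append(name)
        if cancel_requested():
            raise Cancelled()
    return handled


def walk_check_at_top(names, cancel_requested):
    handled = []
    for name in names:
        if cancel_requested():  # runs on every iteration, skipped or not
            raise Cancelled()
        if name.startswith("."):
            continue
        handled.append(name)
    return handled


names = [".svn", ".cache", ".tmp", "README"]

bottom = CountingCheck()
walk_check_at_bottom(names, bottom)

top = CountingCheck()
walk_check_at_top(names, top)
```

With three of the four items skipped, the bottom-of-loop version polls for cancellation only once, whereas the top-of-loop version polls on every iteration. A long run of skipped items would therefore be uninterruptible with the check at the bottom.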
The Epilogue phase (3) can also be of significant duration or effect. It may be desirable to have a cancellation check at that point so that the user can quit between the processing of the last file and the Epilogue phase. However, to benefit from this, the user has to request cancellation after the processing of the last file begins, but before it is complete, and then this cancellation will take effect after that last file is complete, but before the Epilogue phase. In most cases, that phase does no useful work from the user's point of view, and doesn't take long, and so there is no benefit to skipping it. When the number of iterations is large, the chance of the user hitting Ctrl-C during the very last iteration is small, and so it doesn't matter much if we consider that time to be too late. The benefit is only really significant when there are only one or two iterations, and the Epilogue does something permanent like a commit.
Conclusion:
I concluded that a single check at the beginning of a loop body is more useful than a single check at the end. In some cases we might want to ensure that there is a check at both ends - e.g. at the beginning of the loop body and after the end of the loop - so as not to lose the final check that currently exists (and also to ensure that there is a cancellation opportunity even with no items?). With this extra check outside the loop, the only benefit of moving the inside check to the beginning of the loop would be the avoidance of being skipped by "continue" statements, which is still worthwhile.
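For the case where a check at both ends is wanted, the shape would be roughly as follows. This is a sketch in Python with hypothetical names (`run_operation`, `cancel_after`, `attempt`); the log strings stand in for the real set-up, work and Epilogue phases.

```python
class Cancelled(Exception):
    """Raised when a cancellation check finds a pending request."""


def cancel_after(n):
    """Return a hypothetical callback that reports cancellation from poll n+1 onwards."""
    state = {"calls": 0}

    def cancel_requested():
        state["calls"] += 1
        return state["calls"] > n

    return cancel_requested


def run_operation(items, cancel_requested, log):
    log.append("set-up")
    for item in items:
        if cancel_requested():  # check at the top of the loop body
            raise Cancelled()
        log.append("work:" + item)
    if cancel_requested():  # extra check after the loop, before the Epilogue
        raise Cancelled()
    log.append("epilogue")  # e.g. something permanent, such as a commit


def attempt(items, cancel_requested):
    # Run the operation and record whether it was cancelled.
    log = []
    try:
        run_operation(items, cancel_requested, log)
    except Cancelled:
        log.append("cancelled")
    return log


no_items = attempt([], cancel_after(0))        # Ctrl-C during set-up, nothing to iterate
one_item = attempt(["a"], cancel_after(1))     # Ctrl-C while the only item is processed
uncancelled = attempt(["a"], cancel_after(99)) # no Ctrl-C at all
```

In the first two runs the check after the loop is the only thing that stops the Epilogue: with no items the loop body never executes, and with one item the request arrives after the top-of-loop check has already passed.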
None of the functions affected by this patch appear to do anything significant in their Epilogue phase, such as a commit, so I will commit it as it is. I don't see a need for an extra check in these places.
Committed in r8918.
- Julian
---------------------------------------------------------------------
This is an archived mail posted to the Subversion Dev mailing list.