[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes the unversioned 'MissingItem' from disk)?

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Sun, 22 Jan 2012 21:24:44 +0100

On Sun, Jan 22, 2012 at 9:12 AM, Bert Huijben <bert_at_qqmail.nl> wrote:
> This introduces the same problem as reported earlier this week (copy
> gives error on invalid casing) when the casing of the wcroot doesn't
> match but the suffix after the root does match.

Hm, interesting. I hadn't looked into that problem in detail yet, but
I see now how that could be a problem. Thanks for pointing that out.

> The other 'fix' was in svn, but this one can't be worked around at the
> client level.
>
> In this case I'm not sure if we should really fix it if it has any
> unwanted side effects as a normal 'del fILE.eXT' would just delete
> file.ext.

Ok, but in the svn working copy, even on Windows, we can have a valid
fILE.eXT in wc.db, while there is an unrelated file.ext present on the
filesystem. We should be able to do useful things with the svn item
fILE.eXT in one way or another, without harming file.ext.

The root of this new copy-with-invalid-wcroot-casing problem is
actually the fix for issue #3865 ('svn' on Windows cannot address
scheduled-for-delete file, if another file differing only in case is
present on disk), which was a pretty essential part of the case-only
renaming stuff of #3702. Without #3865, even when a case-only rename
('svn mv A a') would work, we wouldn't be able to commit the move
('svn ci A a' would have been transformed into 'svn ci a a').

For #3865, we introduced the following special case:

    If the target was truepath'ed, but there is also an exact match to the
    original target in wc.db, then the user probably meant that
    'case-shadowed' wc.db item. So we pass on the original target instead
    of the truepath.

The problem, apparently, is that the "original target" can also
contain path components from above the wcroot, which will then be
passed through unchanged. It seems to be essential that these
above-wcroot-path-components are case-normalized, otherwise we run
into problems.

I don't know how to do it, but I guess this means that the only robust
way to fix this is that, for the #3865 special case, we should still
apply case-normalization to the part above the wcroot, and only keep
the original version of the part inside the wc.

If this could be fixed, we can probably also rework the patch for
#4023 to only look at the after-wcroot-part to check for
case-clashes...

> We can never 100% emulate a case sensitive filesytem on a case
> insensitive one.

Heh. But we can still try to emulate it as good as possible :-).

-- 
Johan
> Bert Huijben (Cell phone)
> From: Johan Corveleyn
> Sent: 22-1-2012 2:48
> To: Bert Huijben
> Cc: Subversion Development
> Subject: Re: How to fix issue #4023 (on Windows, 'svn rm missingItem'
> deletes the unversioned 'MissingItem' from disk)?
> On Tue, Jan 17, 2012 at 3:02 PM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
>> On Sun, Jan 15, 2012 at 10:14 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Johan Corveleyn [mailto:jcorvel_at_gmail.com]
>>>> Sent: zondag 15 januari 2012 21:26
>>>> To: Subversion Development
>>>> Subject: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes
>>>> the unversioned 'MissingItem' from disk)?
>>>>
>>>> Hi,
>>>>
>>>> I'm looking into fixing issue #4023. I'd like to have some feedback on
>>>> a possible solution, and some hints on how to best implement it.
>>>>
>>>> The problem
>>>> -----------
>>>> - 'svn rm missingItem' (while there is a 'MissingItem' on disk, and
>>>> not a 'missingItem') will eventually call
>>>> adm.ops.c#svn_wc__delete_many, with 'missingItem' as one of the
>>>> targets (so far so good).
>>>>
>>>> - svn_wc__delete_many, after updating wc.db, will eventually call
>>>> erase_unversioned_from_wc, which will call
>>>> svn_io_remove_file2('missingItem').
>>>>
>>>> - At this point, we have a problem, because
>>>> svn_io_remove_file2('missingItem') will eventually call the Windows
>>>> API DeleteFileW, which will happily delete 'MissingItem' when given
>>>> the 'missingItem' argument.
>>>>
>>>>
>>>> Proposed solution
>>>> -----------------
>>>> Inside svn_wc__delete_many, we could find out that 'missingItem' is,
>>>> you know, missing, before we try to delete it from disk. In that case,
>>>> there is no need to call erase_unversioned_from_wc, so no call to
>>>> DeleteFileW occurs.
>>>>
>>>>
>>>> I can't come up with a better way, but if someone has other ideas, shoot.
>>>>
>>>> If this is indeed the way to proceed, then I'd appreciate a small
>>>> hint/pointer: how do I find out (as cheaply as possible) whether the
>>>> target in question is missing? In the beginning of
>>>> svn_wc__delete_many, a call is done to svn_wc__db_read_info, but this
>>>> doesn't seem to give this information (status == normal, kind == file,
>>>> whether or not the file in question is missing from disk).
>>>
>>> The cheapest way to check would be (indirectly) calling the
>>> apr_filepath_merge call to find the paths true name. If the basename of the
>>> returned path is (case-)different than the path you entered, then the file
>>> would be missing.
>>> If the file is really missing, you would get the identical path or an error
>>> from that call.
>>
>> Thanks. I'll look into that.
>>
>> I'll try to complete it during this week, but these are busy times, so
>> if someone beats me to it: no problem.
>>
>>> On Windows this is cheaper than retrieving all directory entries in the
>>> parent directory and checking yourself and on other platforms getting the
>>> truename is a no-op.
>>>
>>>
>>> We might have a wrapper for that apr_filepath_merge, for handling the path
>>> encoding conversion.
>>> If we have, then we probably used it for the file move fix we programmed in
>>> Berlin. If not, then we should probably add one :)
>>
>> Yes, apr_filepath_merge (with APR_FILEPATH_TRUENAME flag) is wrapped
>> inside svn_opt__arg_canonicalize_path. Am I allowed to call this
>> function from inside svn_wc__delete_many? In that case, I can just
>> call that function and compare its output with the output of
>> svn_dirent_internal_style, just like we did for the 'case-only move'
>> code.
>>
>> Actually, during the Berlin hackathon last year, we fixed the
>> case-only moves without adding another call to
>> svn_opt__arg_canonicalize_path. Instead, we did it by inserting some
>> additional logic inside cmdline.c#svn_client_args_to_target_array,
>> where the truepath conversion is done anyway, for all the path
>> arguments. At that point we already had knowledge of both the original
>> form and the truepath-converted form, so we could compare the two.
>> (see r1124469 and r1131326)
>
> Hi Bert,
>
> Ok, the following patch fixes issue #4023 (and passes make check). But
> I'm not sure if it's a "good fix" (more below). So I'd like some extra
> eyes if possible ...
>
> [[[
> Index: subversion/libsvn_wc/adm_ops.c
> ===================================================================
> --- subversion/libsvn_wc/adm_ops.c      (revision 1233695)
> +++ subversion/libsvn_wc/adm_ops.c      (working copy)
> @@ -59,6 +59,7 @@
>  #include "svn_private_config.h"
>  #include "private/svn_io_private.h"
>  #include "private/svn_wc_private.h"
> +#include "private/svn_opt_private.h"
>
>
>
> @@ -742,14 +743,29 @@ svn_wc__delete_many(svn_wc_context_t *wc_ctx,
>    * unversioned. */
>   if (!keep_local)
>     {
> +      const char *local_truepath;
> +
>       for (i = 0; i < versioned_targets->nelts; i++)
>         {
>           svn_pool_clear(iterpool);
>
>           local_abspath = APR_ARRAY_IDX(versioned_targets, i, const char *);
> -          SVN_ERR(erase_unversioned_from_wc(local_abspath, TRUE,
> -                                            cancel_func, cancel_baton,
> -                                            iterpool));
> +
> +          /* Canonicalization will, on Windows, return the 'truepath' of the
> +             given target (the actual casing on disk corresponding to the
> +             given path).  If this differs from the original target, we know
> +             that the latter is missing from disk, and another case-clashing
> +             file is present.  In that case, we avoid the call to
> +             erase_unversioned_from_wc (which would otherwise erase the wrong
> +             file from disk).  See also issue #4023. */
> +          SVN_ERR(svn_opt__arg_canonicalize_path(&local_truepath,
> +                                                 local_abspath, iterpool));
> +          if (strcmp(local_abspath, local_truepath) != 0)
> +            continue;
> +          else
> +            SVN_ERR(erase_unversioned_from_wc(local_abspath, TRUE,
> +                                              cancel_func, cancel_baton,
> +                                              iterpool));
>         }
>     }
>
> ]]]
>
> Some things on my mind:
>
> - Is it ok to call svn_opt__arg_canonicalize_path from libsvn_wc, or
> is this some kind of layering violation? (if so, we'd have to write
> another wrapper for apr_filepath_merge (if so, where should we put
> this?))
>
> - It can only detect real 'case-clashes' (which is sufficient to fix
> #4023), not 'missing' in general (like we discussed above). That's
> because, going through svn_opt__arg_canonicalize_path, one cannot see
> whether apr_filepath_merge returned an error, or just canonicalized to
> the identical path (meaning the exact path was present). Both will
> return the given path as is. OTOH, this is not really a problem, it's
> enough to get #4023 fixed ...
>
> --
> Johan
Received on 2012-01-22 21:25:38 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.