Re: UTF-8 NFC/NFD paths issue

From: Greg Stein <gstein_at_gmail.com>
Date: Sat, 18 Sep 2010 15:55:57 -0400

On Sat, Sep 18, 2010 at 04:42, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> Greg Stein wrote on Fri, Sep 17, 2010 at 07:22:12 -0400:
>> On Thu, Sep 16, 2010 at 19:26, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> > Greg Stein wrote on Thu, Sep 16, 2010 at 00:59:59 -0400:
>> >> On Wed, Sep 15, 2010 at 23:35, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> >...
>> >> > If yes, then we infer that no two in-repository paths (which are
>> >> > bytewise different) canonicalize to the same byte sequence. Which is
>> >> > a pretty useful precondition to have, i.e., what /can/ svn do on a legacy
>> >> > repository where some two paths are bytewise-different yet Unicode-equal?
>> >>
>> >> This will be *very* difficult to manage. Even if a given repository
>> >> somehow manages to rewrite history to "fix" the paths, then you may
>> >> have an unknown number of downstream synchronized repositories to
>> >> similarly fix.
>> >>
>> >> I think an answer might be to rely on the upcoming obliterate
>> >> feature's "out of band" change descriptions. For example, a repository
>> >> might tell a working copy, "hey: file XYZ was obliterated since you
>> >> last talked to me. if you happen to have it, then get rid of it. I
>> >> won't recognize it henceforth." You can see a similar descriptor sent
>> >> to working copies or repositories that says "I recoded XYZ. update to
>> >> the new encoding."
>> >>
>> >
>> > I don't see why this needs to be special-cased? The server can simply
>> > send "rename(NFD(), NFC())" and the wc library can figure for itself
>> > that it's inoperative for her in the same place she determines that
>> > "rename('foo','FOO')" is inoperative for her (when the filesystem is
>> > case-insensitive).
>>
>> When does the server send that? If the wc is at r1000, and the server
>> is at r1000, then the standard update response is nil.
>>
>> Yet if an administrator comes along and renames the repository paths
>> to NFC, then *something* needs to return in an update response. I see
>> it as "not part of the update request", and that there is an
>> out-of-band response that details such changes, i.e., changes that occur
>> outside the revision numbering flow.
>>
>
> i.e., out-of-band is one choice, and making it a proper revision is the
> other choice.
>
> ?

That doesn't solve the historical revisions containing "bad" paths. My
understanding of the problem was that we'd go into the past and
rewrite the paths into a single, canonical form.
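
To make the "bytewise-different yet Unicode-equal" case concrete, here is a minimal sketch using Python's standard unicodedata module. It is only an illustration of the problem, not Subversion code: the helper name find_normalization_collisions and the sample paths are hypothetical.

```python
import unicodedata
from collections import defaultdict


def find_normalization_collisions(paths, form="NFC"):
    """Return {canonical_path: [variants]} for paths that differ as
    strings but become identical after Unicode normalization.

    Hypothetical helper for illustration; not part of any svn API.
    """
    groups = defaultdict(set)
    for path in paths:
        groups[unicodedata.normalize(form, path)].add(path)
    # Keep only canonical forms reached by more than one distinct path.
    return {canon: sorted(variants)
            for canon, variants in groups.items()
            if len(variants) > 1}


# The same visible name in two encodings.
nfc_path = "caf\u00e9.txt"    # precomposed e-acute (NFC)
nfd_path = "cafe\u0301.txt"   # 'e' followed by combining acute (NFD)

# The UTF-8 byte sequences differ...
assert nfc_path.encode("utf-8") != nfd_path.encode("utf-8")
# ...but both normalize to the same canonical form.
assert unicodedata.normalize("NFC", nfd_path) == nfc_path

collisions = find_normalization_collisions([nfc_path, nfd_path, "README"])
```

A scan like this over the paths in a legacy repository would show which pairs collide, and therefore which historical paths a rewrite to a single canonical form would have to merge or rename.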

Cheers,
-g
Received on 2010-09-18 21:56:52 CEST

This is an archived mail posted to the Subversion Dev mailing list.