
Re: Let's discuss about unicode compositions for filenames!

From: Hiroaki Nakamura <hnakamur_at_gmail.com>
Date: Fri, 3 Feb 2012 06:46:07 +0900

2012/2/3 Peter Samuelson <peter_at_p12n.org>:
> [Hiroaki Nakamura]
>> For existing repositories, I think it would be better to convert them too, using
>> svnadmin dump/load, with the load step changed to convert filenames to NFC.
>> However, in reality we cannot force users to convert every existing repository.
> Also note that if you convert a repository (via dump/load or whatever),
> all working copies based on the repository are invalidated and need to
> be re-checked-out. Avoiding _that_ problem would be really hairy, I
> think, very similar to the sort of work that would be needed to support
> obliterate without losing working copies.
>> We also need to change the servers in order to deal with existing 1.x
>> clients. When mod_dav_svn and svnserve receive filenames from
>> clients, they must first convert those filenames to NFC.
> You keep saying what we "must" do on the server side. I propose
> something that is purely on the client side. It will solve the OS X /
> non-OS X interoperability problem. It will not solve every problem
> ever faced by a Subversion user. That's a job for 2.0.

OK. When I started this thread, I assumed we would focus on a
long-term solution for 2.x. That's because the short-term solution option (4)
written in
seems too difficult and complex to me.

But if a modified version of my proposal would fit into the short-term
1.x work, I will gladly modify it.

>> Yes, like I said above, "clients" actually includes components that
>> run on servers, like mod_dav_svn, and it should be read as any component
>> that accesses repositories and working copies.
> No. By "clients" I mean components that run on the client side. If my
> proposal had required changes to mod_dav_svn, I would not have said
> "strictly client-side". I do not propose any change to mod_dav_svn,
> svnserve, svnadmin, libsvn_repos, libsvn_fs, the repository data, or
> anything else on the server side.
>> If you think in analogy to ASCII uppercase and lowercase examples,
>> you miss the point. Please reread the Unicode Standard Annex #15
>> UAX #15: Unicode Normalization Forms
>> http://unicode.org/reports/tr15/
> Thanks, I've read it. The analogy stands. We could prevent NFC/NFD
> collisions as an additional service to users, something we have not
> done for the past 10 years. This would be along the lines of
> preventing users from shooting themselves in the foot.
> The actual _software_ problem that is solved by preventing collisions
> is the same as the software problem solved by preventing upper/lower
> case collisions: certain clients are unable to check out a folder that
> has such collisions. (Windows clients, in the case of upper/lower
> collisions; OS X clients, in the case of NFC/NFD collisions.)

Yes, I agree with that.
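
To make the collision concrete, here is a minimal Python sketch (not Subversion code). The filename is an invented example and `collide` is a hypothetical helper; the only library call is the standard `unicodedata.normalize`. It shows that the NFC and NFD spellings of one visible name are different byte sequences, which a repository can store side by side but a normalizing filesystem cannot keep apart.

```python
import unicodedata

# Invented example name: "é" as one precomposed code point (NFC form).
nfc_name = "caf\u00e9.txt"
# The same visible name with "e" followed by a combining acute accent (NFD form).
nfd_name = unicodedata.normalize("NFD", nfc_name)


def collide(a, b):
    """True if two distinct names become identical after NFC normalization."""
    return a != b and unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # The raw strings (and their UTF-8 bytes) differ...
    print(nfc_name == nfd_name)
    print(nfc_name.encode("utf-8"))
    print(nfd_name.encode("utf-8"))
    # ...but they are the same name once normalized.
    print(collide(nfc_name, nfd_name))
```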

> I think we are talking past each other. You are trying to solve two
> distinct but related problems: 1. OS X client-side confusion when faced
> with a non-NFD repository path; 2. NFC/NFD collisions. I am only
> trying to solve problem 1. I'm ignoring problem 2 for two reasons:
> (a) Problem 2 requires server-side work and complex compatibility /
> upgrade scenarios (dump/load, re-check-out all wcs, etc).
> (b) Problem 2 can be worked around, for new repositories (or
> repositories with no existing collisions), with a pre-commit hook.
> ...neither of which are true for my proposal to solve problem 1.
> So long as you continue to insist that, to solve problem 1, we must
> also solve problem 2, I'm pretty sure we will never come to any
> agreement.

OK. So how about changing my proposal as follows:
(1) No server modification. Modify svn_path_cstring_to_utf8 only.
(2) Let users install a pre-commit hook which rejects any non-NFC filenames.
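
A rough sketch of the check that the hook in (2) could perform, written in Python. The function names are hypothetical, and it assumes the hook has already captured the text printed by `svnlook changed -t TXN REPOS` (status letters, then the path starting at column 5) and decoded it as UTF-8. Skipping deleted paths is my own assumption, so that an existing non-NFC name can still be removed.

```python
import sys
import unicodedata


def paths_to_check(svnlook_output):
    """Extract paths from `svnlook changed`-style lines such as "A   trunk/x".

    Assumption: lines whose status starts with "D" (deletions) are skipped.
    """
    paths = []
    for line in svnlook_output.splitlines():
        if len(line) > 4 and not line.startswith("D"):
            paths.append(line[4:])
    return paths


def non_nfc_paths(paths):
    """Return the paths that are not already in NFC."""
    return [p for p in paths if unicodedata.normalize("NFC", p) != p]


def check(svnlook_output):
    """Return the hook exit status: 0 to accept the commit, 1 to reject it."""
    bad = non_nfc_paths(paths_to_check(svnlook_output))
    for p in bad:
        print("non-NFC path rejected: %r" % p, file=sys.stderr)
    return 1 if bad else 0
```

A real hook script would run svnlook itself and pass the result to `check`, exiting with the returned status; that wiring is left out here.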

This way, we only need to change one function. The modification is just like
the original OS X unicode path patch:

The only difference between the original patch and mine is that mine will
use utf8proc, so that it works on all platforms: Mac OS X, Windows
and Linux.
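
As an illustration of the idea only, here is a Python analogue of normalizing at the point where a native path is converted to UTF-8. It is not the actual change to svn_path_cstring_to_utf8 and it does not use utf8proc; `native_path_to_utf8_nfc` and its `encoding` parameter are hypothetical names, and the standard `unicodedata` module stands in for the normalization library.

```python
import unicodedata


def native_path_to_utf8_nfc(native, encoding="utf-8"):
    """Decode a native-encoded path and return it as NFC-normalized UTF-8 bytes.

    Assumption: `encoding` is the native (locale) encoding of the path.
    """
    text = native.decode(encoding)
    return unicodedata.normalize("NFC", text).encode("utf-8")


if __name__ == "__main__":
    # A decomposed (NFD-style) name, as an OS X filesystem might hand it back.
    decomposed = "cafe\u0301.txt".encode("utf-8")
    print(native_path_to_utf8_nfc(decomposed))
```

Because the conversion is idempotent, paths that are already NFC pass through unchanged, which is why a single conversion function is enough on the client side.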

)Hiroaki Nakamura) hnakamur_at_gmail.com
Received on 2012-02-02 22:46:38 CET

This is an archived mail posted to the Subversion Dev mailing list.