
Re: Encoding problems in subversion under Mac OS X (HFS+)

From: Balázs Szabó (dLux) <dlux_at_dlux.hu>
Date: 2005-12-04 22:36:48 CET

Hi,

Thank you for the explanation and the idea.

But what can I do about it as a Subversion user? Does anyone have a
patch or something similar for this problem?

Thanks,

Balázs Szabó (dLux)
-- -- - - - -- -

On 2005.12.03., at 18:01, Paul Koning wrote:

>>>>>> "Balázs" == Balázs Szabó writes:
>
> Balázs> Hi, I have problems using Subversion on OSX (10.4.3). I have
> Balázs> tried a few different versions and the problem is always the
> Balázs> same.
>
> Balázs> I have checked out a repository, which I created on Linux,
> Balázs> and it contained filenames like "statisztikák.sxc"
>
> Balázs> I set up the environment before I did anything:
>
> Balázs> export LC_CTYPE="hu_HU.UTF-8"
>
> Balázs> The checkout worked fine, but right after the checkout, I had
> Balázs> the following output for svn status (SVN 1.3RC4, but the
> Balázs> results are similar with 1.2.3 as well):
>
> Balázs> ? statisztikák.sxc
> Balázs> ! statisztikák.sxc
>
> Balázs> The problem may be that (as I read elsewhere) HFS+ stores
> Balázs> filenames in decomposed form, and since "á" has different
> Balázs> UTF-8 encodings in its composed and decomposed forms, SVN
> Balázs> thinks this file is different from the one it just checked out...
>
> That sounds plausible. This problem can appear anytime you deal with
> strings that aren't plain English text -- accents, for example.
>
> There's a standard solution designed in the IETF called Stringprep
> (it's an RFC, I don't have the number handy). Basically it involves
> translating the string into a single "canonical" format, so that no
> matter which choice of encoding you start with, after Stringprep there
> is only one possible outcome. Think of it as the Unicode analog of
> case-insensitive comparison.
>
> So in order to compare Unicode strings, you first run the two through
> Stringprep, and after that you compare them. That way, two strings
> that are the same to the user will also be the same to the program,
> and any irrelevant transformations done in storing the strings (like
> the HFS+ one) will not confuse things.
>
> paul
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: users-help@subversion.tigris.org
>
>
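
To make the composed/decomposed mismatch concrete, here is a minimal sketch in Python. It is not Stringprep itself and not anything Subversion ships: it only uses plain Unicode normalization (NFC/NFD) from the standard library, which is the part of the idea that matters for this filename problem. The helper name same_name is hypothetical, and the filename is the one from the report above.

```python
import unicodedata


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two filenames after bringing both to one normalization form.

    Hypothetical helper for illustration; not part of Subversion.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Composed form: "á" is the single code point U+00E1.
composed = "statisztik\u00e1k.sxc"
# Decomposed form: "a" followed by the combining acute accent U+0301,
# which is how the report above describes HFS+ storing the name.
decomposed = unicodedata.normalize("NFD", composed)

print(composed.encode("utf-8"))         # b'statisztik\xc3\xa1k.sxc'
print(decomposed.encode("utf-8"))       # b'statisztika\xcc\x81k.sxc'
print(composed == decomposed)           # False: a naive comparison sees two names
print(same_name(composed, decomposed))  # True: equal once both are normalized
```

The two byte strings differ, which would explain why svn status lists the same visible name once as unversioned (?) and once as missing (!). Normalizing both sides before comparing removes the difference.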

Received on Sun Dec 4 22:39:41 2005

This is an archived mail posted to the Subversion Users mailing list.