Re: [RFC] Non-normalizing Unicode Composition Awareness

From: Branko Čibej <brane_at_wandisco.com>
Date: Fri, 09 Nov 2012 13:49:59 +0100

On 09.11.2012 12:28, Thomas Åkesson wrote:
> Today, I noticed that Branko started some implementation in a branch. Looks like a collation based on utf8proc is in the making? I think that would make a lot of sense because the ICU extension poses some challenges in the build process and we might not need all that functionality that it provides.

Hi Thomas,

Yes, I started a branch that's intended to fix the normalization
problem. I selected utf8proc because we really don't need ICU (I can't
see a serious need for language-specific case folding, for example, nor
for Unicode regular expressions). Furthermore, utf8proc can be easily
embedded into Subversion so it doesn't present another dependency that
users would have to worry about.

I'm currently doing the grunt work of implementing the collation (done)
and the LIKE and GLOB operators that we'll need (in progress). The next,
and biggest, step will be to review the client and WC libraries to make
sure that paths sent to the server always come from the wc.db, not from
disk.
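
To illustrate the kind of collation described above, here is a minimal
sketch, not the code on the branch. It assumes Python's sqlite3 and
unicodedata modules as stand-ins for the SQLite C API and utf8proc; the
collation name "composition_aware" and the "nodes" table are made up for
the example. Stored paths are never rewritten; only the comparison sees
a normalized form.

```python
import sqlite3
import unicodedata


def _key(text):
    # Compare on the NFD form; the stored value itself is left untouched.
    return unicodedata.normalize("NFD", text)


def composition_aware(a, b):
    """Collation callback: canonically equivalent strings compare equal."""
    ka, kb = _key(a), _key(b)
    return (ka > kb) - (ka < kb)


# Hypothetical schema, for illustration only.
db = sqlite3.connect(":memory:")
db.create_collation("composition_aware", composition_aware)
db.execute("CREATE TABLE nodes (path TEXT COLLATE composition_aware)")

composed = "\u00c5kesson.txt"     # U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
decomposed = "A\u030akesson.txt"  # 'A' followed by U+030A COMBINING RING ABOVE

db.execute("INSERT INTO nodes VALUES (?)", (composed,))
rows = db.execute(
    "SELECT path FROM nodes WHERE path = ?", (decomposed,)
).fetchall()
```

The lookup with the decomposed name finds the row that was stored in
composed form, and the row comes back exactly as it was stored. SQLite's
built-in LIKE and GLOB do not consult a column's collation, which is
presumably why those operators need separate implementations.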

One open question is what to do about (historical) collisions in
existing repositories, but I don't think that issue is important enough
to resolve now.
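
For readers unfamiliar with what such a collision looks like, the
following sketch groups paths that differ byte-for-byte but share one
normalized form. It is only an illustration: find_collisions is a
hypothetical helper, NFC is an arbitrary choice of normalization form,
and Python's unicodedata again stands in for utf8proc.

```python
import unicodedata
from collections import defaultdict


def find_collisions(paths):
    """Return groups of distinct paths that share the same NFC form."""
    groups = defaultdict(set)
    for path in paths:
        groups[unicodedata.normalize("NFC", path)].add(path)
    # Keep only normalized forms that more than one raw spelling maps to.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


paths = ["trunk/\u00c5.txt", "trunk/A\u030a.txt", "trunk/b.txt"]
collisions = find_collisions(paths)
```

Here the first two paths collide because both normalize to the same
name, while "trunk/b.txt" is unaffected.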

It'll take a while, but I hope to be able to finish the work in time for
1.8. If not ... well then, it'll be in 1.9.

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com

Received on 2012-11-09 13:50:41 CET
