Re: Comments on 'notes/unicode-composition-for-filenames'

From: Stefan Sperling <stsp_at_elego.de>
Date: Tue, 22 Feb 2011 19:56:42 +0100

On Tue, Feb 22, 2011 at 07:41:12PM +0100, Branko Čibej wrote:
> On 22.02.2011 18:17, Julian Foad wrote:
> >> Proposed Support Library
> >> ========================
> >>
> >> Assumptions
> >> -----------
> >>
> >> The main assumption is that we'll keep using APR for character set
> > s/character set/character encoding/.
> >
> >> conversion, meaning that the recoding solution to choose would not
> >> need to provide any other functionality than recoding.
> > s/recoding/converting between NFD and NFC UTF8 encodings/.
>
> Actually -- you have to go all the way and support complete
> normalization, even if your normalization targets are only NFC and NFD.
> That's because there isn't a sane way to detect whether a string is
> normalized or not -- "sane" in the sense that it should take about as
> long to discover that as to just normalize it.

To put it differently, the only way to figure out whether a given
UTF-8 sequence is valid (or, by extension, uses NFC and/or NFD)
is to parse the entire sequence.
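
A minimal Python sketch of that point, for illustration only. It assumes
the standard `unicodedata` module as a stand-in for whatever normalization
library ends up being used; `classify` is a hypothetical helper name, not
part of Subversion or APR. Both the validity check and the form check have
to walk the whole input, and the form check is done here by simply
normalizing and comparing:

```python
import unicodedata


def classify(raw: bytes) -> str:
    """Classify a byte string as 'invalid', 'NFC+NFD', 'NFC', 'NFD' or 'neither'."""
    try:
        # Validation has to look at every byte: a bad byte at the very
        # end is only found after everything before it has been parsed.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "invalid"

    # Checking the form costs about as much as normalizing, so this
    # sketch just normalizes and compares against the input.
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text

    if is_nfc and is_nfd:
        return "NFC+NFD"  # e.g. plain ASCII is already in both forms
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"
```

For example, `classify("e\u0301".encode("utf-8"))` reports the decomposed
form, while a string that mixes a precomposed and a decomposed "é" is in
neither form and needs a full normalization pass either way.
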
Received on 2011-02-22 19:57:25 CET
