
Re: log --search test failures on trunk and 1.8.x

From: Branko Čibej <brane_at_wandisco.com>
Date: Sun, 21 Apr 2013 19:57:34 +0200

On 21.04.2013 15:37, Bert Huijben wrote:
>
>> -----Original Message-----
>> From: Branko Čibej [mailto:brane_at_wandisco.com]
>> Sent: Sunday, 21 April 2013 14:48
>> To: dev_at_subversion.apache.org
>> Subject: Re: log --search test failures on trunk and 1.8.x
>>
>> On 21.04.2013 14:05, Stefan Sperling wrote:
>>> On Sun, Apr 21, 2013 at 01:53:43PM +0200, Bert Huijben wrote:
>>>> I'd rather pull the case insensitive search part of this new in 1.8 search
>>>> feature and do it right in 1.9.
>>> What's the issue with the current implementation apart from the
>>> test failures on Windows?
>>>
>>> The behaviour of 'svn log --search' regarding case-sensitivity
>>> isn't even documented, so we're not really promising anything.
>>>
>>> It is possible that some users who are using languages other than
>>> English will complain, since ASCII is being matched case-insensitively,
>>> and all other characters are being matched case-sensitively.
>>> But this is due to a missing feature in APR's implementation of fnmatch().
>>>
>>> Provided we can fix the 1.8.x tests on Windows I see no reason to
>>> change our implementation of log --search. We can simply wait for
>>> APR to grow the necessary support for multibyte strings.
>> The wc-collate-path branch has an svn_utf__glob function that's mainly
>> intended for use by SQLite, however, it can be a replacement for
>> apr_fnmatch. It uses apr_fnmatch internally, but decomposes the inputs
>> to Unicode normalization form D, which keeps diacriticals separate from
>> the base letters. In other words, we could easily extend that to do
>> completely diacritical-agnostic case-folding matching for Latin
>> alphabets (and probably also for Cyrillic scripts).
>>
>> The idea to manually hack things to work with western Latin alphabets
>> seems completely wrong-headed to me.
>>
>> But yes; in general, case folding is locale-specific. If we wanted to
>> support that, we'd need ICU instead of utf8proc. I can imagine that
>> eventually being an option, but not a mandatory dependency.
> Summarizing: What would it help to include utf8proc on trunk now for this issue?
>
> Your conclusion is (similar to mine) that we need more for case folding than what we have now and/or what utf8proc will offer us.
>
> Do we want case folding (or at least case variant compare) support in our libraries for 1.8?
>
> Or is this 1.9+ scope?

It's 1.9+ scope. It would be madness to merge such an invasive change to
trunk at this late date (IMHO).
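
For reference, the matching idea quoted above (decompose both inputs to
normalization form D, ignore the diacriticals, fold case, then glob) can be
sketched roughly as follows. This is only an illustration in Python, not
svn_utf__glob itself: the names fold() and loose_glob() are hypothetical, and
Python's unicodedata and fnmatch modules stand in for utf8proc and
apr_fnmatch.

```python
import fnmatch
import unicodedata


def fold(text: str) -> str:
    """Decompose to NFD, drop combining marks, then case-fold.

    Hypothetical helper; a stand-in for what a utf8proc-based
    implementation might do.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # In NFD, diacriticals are separate combining characters,
    # so they can be filtered out and only base letters remain.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def loose_glob(pattern: str, text: str) -> bool:
    """Glob match that ignores case and diacriticals (assumed semantics)."""
    # fnmatchcase does no case normalization of its own; all folding
    # has already been applied to both arguments.
    return fnmatch.fnmatchcase(fold(text), fold(pattern))
```

With this sketch, loose_glob("*cibej*", "Branko Čibej") matches. The folding
is locale-independent, so it does not address the locale-specific cases
mentioned above.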

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com
Received on 2013-04-21 19:58:11 CEST

This is an archived mail posted to the Subversion Dev mailing list.
