RE: svn commit: r1731300 - in /subversion/trunk/subversion: include/private/svn_utf_private.h libsvn_repos/dump.c libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c

From: Bert Huijben <bert_at_qqmail.nl>
Date: Sat, 20 Feb 2016 12:09:02 +0100

> -----Original Message-----
> From: kotkov_at_apache.org [mailto:kotkov_at_apache.org]
> Sent: vrijdag 19 februari 2016 23:11
> To: commits_at_subversion.apache.org
> Subject: svn commit: r1731300 - in /subversion/trunk/subversion:
> include/private/svn_utf_private.h libsvn_repos/dump.c
> libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c
> tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c
>
> Author: kotkov
> Date: Fri Feb 19 22:11:11 2016
> New Revision: 1731300
>
> URL: http://svn.apache.org/viewvc?rev=1731300&view=rev
> Log:
> Make svn log --search case-insensitive.
>
> Use utf8proc to do the normalization and locale-independent case folding
> (UTF8PROC_CASEFOLD) for both the search pattern and the input strings.
>
> Related discussion is in http://svn.haxx.se/dev/archive-2013-04/0374.shtml
> (Subject: "log --search test failures on trunk and 1.8.x").
>
> * subversion/include/private/svn_utf_private.h
> (svn_utf__normalize): Add new boolean argument to perform case folding.

Usually it is far more efficient to perform the comparison on the unnormalized strings using the APIs than to normalize first and perform the operation afterwards. I'm not sure whether utf8proc supports this, though.
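For reference, the "normalize first, compare later" approach from the log message can be sketched as follows. This is only an illustration in Python using the standard unicodedata module, not the utf8proc code from the commit; fold_key and search_matches are hypothetical names, and a plain substring test stands in for whatever matching the real code performs.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Build a comparison key: decompose, case fold, decompose again.

    This allocates a full copy of the input, which is the cost the
    paragraph above is concerned about.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # Folding can produce characters that need decomposing again,
    # hence the second normalization pass.
    return unicodedata.normalize("NFD", decomposed.casefold())


def search_matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: case-insensitive substring search."""
    return fold_key(pattern) in fold_key(text)


if __name__ == "__main__":
    print(search_matches("FIX", "Fix the bug"))
    # Precomposed e-acute versus "E" followed by a combining acute accent.
    print(search_matches("caf\u00e9", "CAFE\u0301 menu"))
```

Both strings are copied and transformed in full before a single character is compared, even when they differ at the first character.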

But I'm wondering why you added this feature to an existing function?

I don't think it is recommended practice to perform the normalization this way, and adding a boolean to an existing function makes it easier to do things in a way that is not recommended.

Locale-independent case folding is not that well defined... Take the Turkish 'i', which doesn't fold the same way: any decision on how to handle it makes the result locale dependent. (In this case probably by choosing the non-Turkish behavior, but that doesn't make it 'locale independent'.)
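To make the Turkish case concrete, here is a small sketch using Python's str.casefold(), which applies the default (non-Turkic) Unicode folding. I am assuming that this is comparable to what UTF8PROC_CASEFOLD does, but I have not verified that; same_when_folded is a hypothetical helper.

```python
# Default Unicode case folding as implemented by str.casefold().
# The Turkic-specific mappings are not applied here.
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def same_when_folded(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after default case folding."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # Plain ASCII pair: folds to the same string.
    print(same_when_folded("I", "i"))
    # Dotted capital I folds to "i" plus a combining dot above, not to "i".
    print(same_when_folded(DOTTED_CAPITAL_I, "i"))
    # Dotless small i folds to itself, so it never matches "I".
    print(same_when_folded(DOTLESS_SMALL_I, "I"))
```

A Turkish user would expect the upper case of 'i' to be the dotted capital and the lower case of 'I' to be the dotless one, so the default folding is itself a choice of locale behavior.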

Just folding the Western European characters would be much easier to explain and document.

Bert
Received on 2016-02-20 12:14:15 CET
