
'svn log --search': forcing case sensitivity? (was: svn commit: r1731300 - in /subversion/trunk/subversion: include/private/svn_utf_private.h libsvn_repos/dump.c libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c)

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Tue, 15 Mar 2016 00:08:38 +0000

kotkov_at_apache.org wrote on Fri, Feb 19, 2016 at 22:11:11 -0000:
> Author: kotkov
> Date: Fri Feb 19 22:11:11 2016
> New Revision: 1731300
>
> URL: http://svn.apache.org/viewvc?rev=1731300&view=rev
> Log:
> Make svn log --search case-insensitive.
>
> Use utf8proc to do the normalization and locale-independent case folding
> (UTF8PROC_CASEFOLD) for both the search pattern and the input strings.
>
> Related discussion is in http://svn.haxx.se/dev/archive-2013-04/0374.shtml
> (Subject: "log --search test failures on trunk and 1.8.x").
>
> +++ subversion/trunk/subversion/svn/log-cmd.c Fri Feb 19 22:11:11 2016
> @@ -38,6 +38,7 @@
> @@ -110,6 +111,24 @@
> +/* Return TRUE if STR matches PATTERN. Else, return FALSE. Assumes that
> + * PATTERN is a UTF-8 string normalized to form C with case folding
> + * applied. Use BUF for temporary allocations. */
> +static svn_boolean_t
> +match(const char *pattern, const char *str, svn_membuf_t *buf)
> +{
> +  svn_error_t *err;
> +
> +  err = svn_utf__normalize(&str, str, strlen(str), TRUE /* casefold */, buf);
> +  if (err)
> +    {
> +      /* Can't match invalid data. */
> +      svn_error_clear(err);
> +      return FALSE;
> +    }
> +
> +  return apr_fnmatch(pattern, str, 0) == APR_SUCCESS;

Should there be a command-line flag to disable casefolding?

E.g., to allow users to grep for identifiers (function/variable/file
names) using their exact case? Do people who use 'log --search' need it
to be case-sensitive? (I don't use 'log --search' often.)

Even if casefolding is disabled, we should still apply Unicode
normalization to form C.
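
To make the question concrete, here is a rough sketch of what such a switch could look like. It is not the Subversion code: match_opt() and prepare() are hypothetical names, POSIX fnmatch() stands in for apr_fnmatch(), and an ASCII-only tolower() stands in for the utf8proc case folding. The form C normalization, which should happen in both modes, is left out and only marked by a comment.

```c
#include <ctype.h>
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of S, lowercased if CASEFOLD
 * is nonzero.  ASCII-only stand-in for Unicode case folding; a real
 * implementation would also normalize to form C here in either mode. */
static char *
prepare(const char *s, int casefold)
{
  size_t len = strlen(s);
  char *copy = malloc(len + 1);
  size_t i;

  if (!copy)
    return NULL;

  /* Copy including the terminating NUL (tolower(0) is 0). */
  for (i = 0; i <= len; i++)
    copy[i] = casefold ? (char)tolower((unsigned char)s[i]) : s[i];

  return copy;
}

/* Hypothetical: return 1 if STR matches the glob PATTERN, else 0.
 * A nonzero CASEFOLD selects case-insensitive matching. */
static int
match_opt(const char *pattern, const char *str, int casefold)
{
  char *p = prepare(pattern, casefold);
  char *s = prepare(str, casefold);
  int matched = p && s && fnmatch(p, s, 0) == 0;

  free(p);
  free(s);
  return matched;
}
```

With this sketch, match_opt("*svn_utf*", "Fix SVN_UTF handling", 1) matches, while the same call with casefold set to 0 does not; that second mode is what someone grepping for an exact identifier would want.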

Cheers,

Daniel

P.S. This patch introduces a minor behaviour change: before this patch,
the search pattern «foo[A-z]bar» would match the log message «foo_bar»,
whereas after this change it does not. (This is because the pattern is
now casefolded before being passed to APR, so it becomes «foo[a-z]bar»,
and '_' is between 'A' and 'z' but not between 'a' and 'z' when compared
as C chars.) I doubt anyone will notice this behaviour change; I'm just
mentioning it for completeness.
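
The range arithmetic can be checked with a small sketch. It assumes ASCII and the default "C" locale, and uses POSIX fnmatch() as a stand-in for apr_fnmatch(); in_range() and glob_matches() are hypothetical helper names for illustration only.

```c
#include <fnmatch.h>

/* Return 1 if byte C lies in the inclusive range [LO, HI]. */
static int
in_range(char c, char lo, char hi)
{
  return lo <= c && c <= hi;
}

/* Return 1 if STR matches PATTERN under POSIX fnmatch(), else 0.
 * Stand-in for apr_fnmatch(); assumes the default "C" locale, where
 * bracket ranges follow byte values. */
static int
glob_matches(const char *pattern, const char *str)
{
  return fnmatch(pattern, str, 0) == 0;
}
```

In ASCII, 'A' is 0x41, 'Z' is 0x5A, '_' is 0x5F, 'a' is 0x61 and 'z' is 0x7A, so in_range('_', 'A', 'z') holds while in_range('_', 'a', 'z') does not. Accordingly, glob_matches("foo[A-z]bar", "foo_bar") succeeds and glob_matches("foo[a-z]bar", "foo_bar") fails.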

> +}
Received on 2016-03-15 01:08:44 CET

This is an archived mail posted to the Subversion Dev mailing list.