Re: svn commit: r1731300 - in /subversion/trunk/subversion: include/private/svn_utf_private.h libsvn_repos/dump.c libsvn_subr/utf8proc.c svn/cl-log.h svn/log-cmd.c svn/svn.c tests/cmdline/log_tests.py tests/libsvn_subr/utf-test.c

From: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
Date: Mon, 7 Mar 2016 19:57:51 +0300

Branko Čibej <brane_at_apache.org> writes:

> The big question here is what we'll use the API for. Currently we have a
> 'normalize' function that's used by svn_fs_verify (IIRC). Since we're
> talking about a function that transforms a UTF-8 string to a shape
> suitable for stuff-insensitive comparison, we could follow the example
> of the standard strxfrm() -> svn_utf__xfrm(); but if that's too ugly, my
> preference is for svn_utf__fold().
> However, I'd not add arguments for normalization/case folding/etc; I'd
> just make this function DTRT without any additional flags, because
> otherwise we'll always be second-guessing the correct invocation.

One use case that I keep in mind is doing server-side search or filtering,
where a client tells the server what kind of comparison and matching she
expects to get.

The strxfrm() function doesn't define the transformation in terms of
preserving case or diacritical marks. Hence, we can't have svn_utf__xfrm()
doing the right thing for svn log --search, as that would mean that a
libsvn_subr function controls the behavior of the command-line client.
And while a private function somewhere around svn.c could be doing that,
hardcoding this kind of behavior in libsvn_subr doesn't sound proper to me.

We can drop the `normalize' argument, since keeping denormalized strings
around is dangerous and unnecessary, but I'd leave the other two and let the
caller specify the wanted behavior:

    svn_error_t *
    svn_utf__xfrm(const char **result,
                  const char *str,
                  apr_size_t len,
                  svn_boolean_t case_insensitive,
                  svn_boolean_t accent_insensitive,
                  svn_membuf_t *buf);
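
To illustrate why the two flags are left to the caller, here is a minimal,
self-contained sketch. It is not the attached patch and does not use
utf8proc: toy_xfrm(), toy_base_letter() and toy_equal() are hypothetical
names, and the transform only understands ASCII plus a few precomposed
Latin-1 letters encoded as UTF-8. It performs no normalization at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the unaccented base letter for a few
   accented vowels in U+00C0..U+00FF, or 0 if CP is not one of them. */
char
toy_base_letter(unsigned cp)
{
  unsigned lower = cp | 0x20;  /* U+00C0..U+00DF -> U+00E0..U+00FF */
  char base;

  if (lower >= 0xE0 && lower <= 0xE5) base = 'a';
  else if (lower >= 0xE8 && lower <= 0xEB) base = 'e';
  else if (lower >= 0xEC && lower <= 0xEF) base = 'i';
  else if (lower >= 0xF2 && lower <= 0xF6) base = 'o';
  else if (lower >= 0xF9 && lower <= 0xFC) base = 'u';
  else return 0;

  /* Preserve the case of the original code point. */
  return (cp < 0xE0) ? (char)(base - ('a' - 'A')) : base;
}

/* Hypothetical toy transform: write a comparison key for STR into OUT
   (at most OUT_SIZE bytes including the terminator; longer input is
   truncated).  The caller decides which distinctions to erase. */
size_t
toy_xfrm(char *out, size_t out_size, const char *str,
         int case_insensitive, int accent_insensitive)
{
  const unsigned char *p = (const unsigned char *)str;
  size_t n = 0;

  if (out_size == 0)
    return 0;

  /* Keep room for a two-byte sequence plus the terminator. */
  while (*p != '\0' && n + 2 < out_size)
    {
      if (p[0] == 0xC3 && p[1] >= 0x80 && p[1] <= 0xBF)
        {
          /* Two-byte UTF-8 sequence for U+00C0..U+00FF. */
          unsigned cp = 0xC0 + (p[1] - 0x80);
          char base = accent_insensitive ? toy_base_letter(cp) : 0;

          p += 2;
          if (base != 0)
            {
              if (case_insensitive && base >= 'A' && base <= 'Z')
                base = (char)(base + ('a' - 'A'));
              out[n++] = base;
            }
          else
            {
              /* Latin-1 uppercase letters map to lowercase at +0x20;
                 U+00D7 (multiplication sign) is not a letter. */
              if (case_insensitive && cp <= 0xDE && cp != 0xD7)
                cp += 0x20;
              out[n++] = (char)0xC3;
              out[n++] = (char)(0x80 + (cp - 0xC0));
            }
        }
      else
        {
          unsigned char c = *p++;

          if (case_insensitive && c >= 'A' && c <= 'Z')
            c = (unsigned char)(c + ('a' - 'A'));
          out[n++] = (char)c;
        }
    }

  out[n] = '\0';
  return n;
}

/* Hypothetical caller: compare two strings under the requested rules. */
int
toy_equal(const char *a, const char *b,
          int case_insensitive, int accent_insensitive)
{
  char xa[256], xb[256];

  toy_xfrm(xa, sizeof(xa), a, case_insensitive, accent_insensitive);
  toy_xfrm(xb, sizeof(xb), b, case_insensitive, accent_insensitive);
  return strcmp(xa, xb) == 0;
}
```

With both flags set, toy_equal("Caf\xC3\xA9", "cafe", 1, 1) reports a
match. Clearing either flag makes the same pair compare as different, so
the policy stays with the caller instead of being fixed in the library.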

I attached the patch that does that. What do you think?

Evgeny Kotkov

Received on 2016-03-07 17:58:22 CET

This is an archived mail posted to the Subversion Dev mailing list.